Home » Who’s got the other end of this unix socketpair?

Who’s got the other end of this unix socketpair?

Solutons:


Note: I now maintain a lsof wrapper that combines both approaches described here and also adds information for peers of loopback TCP connections at https://github.com/stephane-chazelas/misc-scripts/blob/master/lsofc

Linux-3.3 and above.

On Linux, since kernel version 3.3 (and provided the UNIX_DIAG feature is built in the kernel), the peer of a given unix domain socket (includes socketpairs) can be obtained using a new netlink based API.

lsof since version 4.89 can make use of that API:

lsof +E -aUc Xorg

Will list all the Unix domain sockets that have a process whose name starts with Xorg at either end in a format similar to:

Xorg       2777       root   56u  unix 0xffff8802419a7c00      0t0   34036 @/tmp/.X11-unix/X0 type=STREAM ->INO=33273 4120,xterm,3u

If your version of lsof is too old, there are a few more options.

The ss utility (from iproute2) makes use of that same API to retrieve and display information on the list of unix domain sockets on the system including peer information.

The sockets are identified by their inode number. Note that it’s not related to the filesystem inode of the socket file.

For instance in:

$ ss -x
[...]
u_str  ESTAB    0    0   @/tmp/.X11-unix/X0 3435997     * 3435996

it says that socket 3435997 (that was bound to the ABSTRACT socket /tmp/.X11-unix/X0) is connected with socket 3435996. The -p option can tell you which process(es) have that socket open. It does that by doing some readlinks on /proc/$pid/fd/*, so it can only do that on processes you own (unless you’re root). For instance here:

$ sudo ss -xp
[...]
u_str  ESTAB  0  0  @/tmp/.X11-unix/X0 3435997 * 3435996 users:(("Xorg",pid=3080,fd=83))
[...]
$ sudo ls -l /proc/3080/fd/23
lrwx------ 1 root root 64 Mar 12 16:34 /proc/3080/fd/83 -> socket:[3435997]

To find out what process(es) has 3435996, you can look up its own entry in the output of ss -xp:

$ ss -xp | awk '$6 == 3435996'
u_str  ESTAB  0  0  * 3435996  * 3435997 users:(("xterm",pid=29215,fd=3))

You could also use this script as a wrapper around lsof to easily show the relevant information there:

#! /usr/bin/perl
# lsof wrapper to add peer information for unix domain socket.
# Needs Linux 3.3 or above and CONFIG_UNIX_DIAG enabled.

# retrieve peer and direction information from ss
my (%peer, %dir);
open SS, '-|', 'ss', '-nexa';
while (<SS>) {
  if (/s(d+)s+*s+(d+) ([<-]-[->])$/) {
    $peer{$1} = $2;
    $dir{$1} = $3;
  }
}
close SS;

# Now get info about processes tied to sockets using lsof
my (%fields, %proc);
open LSOF, '-|', 'lsof', '-nPUFpcfin';
while (<LSOF>) {
  if (/(.)(.*)/) {
    $fields{$1} = $2;
    if ($1 eq 'n') {
      $proc{$fields{i}}->{"$fields{c},$fields{p}" .
      ($fields{n} =~ m{^([@/].*?)( type=w+)?$} ? ",$1" : "")} = "";
    }
  }
}
close LSOF;

# and finally process the lsof output
open LSOF, '-|', 'lsof', @ARGV;
while (<LSOF>) {
  chomp;
  if (/sunixs+S+s+S+s+(d+)s/) {
    my $peer = $peer{$1};
    if (defined($peer)) {
      $_ .= $peer ?
            " ${dir{$1}} $peer[" . (join("|", keys%{$proc{$peer}})||"?") . "]" :
            "[LISTENING]";
    }
  }
  print "$_n";
}
close LSOF or exit(1);

For example:

$ sudo that-lsof-wrapper -ad3 -p 29215
COMMAND   PID     USER   FD   TYPE             DEVICE SIZE/OFF    NODE NAME
xterm   29215 stephane    3u  unix 0xffff8800a07da4c0      0t0 3435996 type=STREAM <-> 3435997[Xorg,3080,@/tmp/.X11-unix/X0]

Before linux-3.3

The old Linux API to retrieve unix socket information is via the /proc/net/unix text file. It lists all the Unix domain sockets (including socketpairs). The first field in there (if not hidden to non-superusers with the kernel.kptr_restrict sysctl parameter) as already explained by @Totor contains the kernel address of a unix_sock structure that contains a peer field pointing to the corresponding peer unix_sock. It’s also what lsof outputs for the DEVICE column on a Unix socket.

Now getting the value of that peer field means being able to read kernel memory and know the offset of that peer field with regards to the unix_sock address.

Several gdb-based and systemtap-based solutions have already been given but they require gdb/systemtap and Linux kernel debug symbols for the running kernel being installed which is generally not the case on production systems.

Hardcoding the offset is not really an option as that varies with kernel version.

Now we can use a heuristic approach at determining the offset: have our tool create a dummy socketpair (then we know the address of both peers), and search for the address of the peer around the memory at the other end to determine the offset.

Here is a proof-of-concept script that does just that using perl (successfully tested with kernel 2.4.27 and 2.6.32 on i386 and 3.13 and 3.16 on amd64). Like above, it works as a wrapper around lsof:

For example:

$ that-lsof-wrapper -aUc nm-applet
COMMAND    PID     USER   FD   TYPE             DEVICE SIZE/OFF  NODE NAME
nm-applet 4183 stephane    4u  unix 0xffff8800a055eb40      0t0 36888 type=STREAM -> 0xffff8800a055e7c0[dbus-daemon,4190,@/tmp/dbus-AiBCXOnuP6]
nm-applet 4183 stephane    7u  unix 0xffff8800a055e440      0t0 36890 type=STREAM -> 0xffff8800a055e0c0[Xorg,3080,@/tmp/.X11-unix/X0]
nm-applet 4183 stephane    8u  unix 0xffff8800a05c1040      0t0 36201 type=STREAM -> 0xffff8800a05c13c0[dbus-daemon,4118,@/tmp/dbus-yxxNr1NkYC]
nm-applet 4183 stephane   11u  unix 0xffff8800a055d080      0t0 36219 type=STREAM -> 0xffff8800a055d400[dbus-daemon,4118,@/tmp/dbus-yxxNr1NkYC]
nm-applet 4183 stephane   12u  unix 0xffff88022e0dfb80      0t0 36221 type=STREAM -> 0xffff88022e0df800[dbus-daemon,2268,/var/run/dbus/system_bus_socket]
nm-applet 4183 stephane   13u  unix 0xffff88022e0f80c0      0t0 37025 type=STREAM -> 0xffff88022e29ec00[dbus-daemon,2268,/var/run/dbus/system_bus_socket]

Here’s the script:

#! /usr/bin/perl
# wrapper around lsof to add peer information for Unix
# domain sockets. needs lsof, and superuser privileges.
# Copyright Stephane Chazelas 2015, public domain.
# example: sudo this-lsof-wrapper -aUc Xorg
use Socket;

open K, "<", "/proc/kcore" or die "open kcore: $!";
read K, $h, 8192 # should be more than enough
 or die "read kcore: $!";

# parse ELF header
my ($t,$o,$n) = unpack("x4Cx[C19L!]L!x[L!C8]S", $h);
$t = $t == 1 ? "L3x4Lx12" : "Lx4QQx8Qx16"; # program header ELF32 or ELF64
my @headers = unpack("x$o($t)$n",$h);

# read data from kcore at given address (obtaining file offset from ELF
# @headers)
sub readaddr {
  my @h = @headers;
  my ($addr, $length) = @_;
  my $offset;
  while (my ($t, $o, $v, $s) = splice @h, 0, 4) {
    if ($addr >= $v && $addr < $v + $s) {
      $offset = $o + $addr - $v;
      if ($addr + $length - $v > $s) {
        $length = $s - ($addr - $v);
      }
      last;
    }
  }
  return undef unless defined($offset);
  seek K, $offset, 0 or die "seek kcore: $!";
  my $ret;
  read K, $ret, $length or die "read($length) kcore @$offset: $!";
  return $ret;
}

# create a dummy socketpair to try find the offset in the
# kernel structure
socketpair(Rdr, Wtr, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
 or die "socketpair: $!";
$r = readlink("/proc/self/fd/" . fileno(Rdr)) or die "readlink Rdr: $!";
$r =~ /[(d+)/; $r = $1;
$w = readlink("/proc/self/fd/" . fileno(Wtr)) or die "readlink Wtr: $!";
$w =~ /[(d+)/; $w = $1;
# now $r and $w contain the socket inodes of both ends of the socketpair
die "Can't determine peer offset" unless $r && $w;

# get the inode->address mapping
open U, "<", "/proc/net/unix" or die "open unix: $!";
while (<U>) {
  if (/^([0-9a-f]+):(?:s+S+){5}s+(d+)/) {
    $addr{$2} = hex $1;
  }
}
close U;

die "Can't determine peer offset" unless $addr{$r} && $addr{$w};

# read 2048 bytes starting at the address of Rdr and hope to find
# the address of Wtr referenced somewhere in there.
$around = readaddr $addr{$r}, 2048;
my $offset = 0;
my $ptr_size = length(pack("L!",0));
my $found;
for (unpack("L!*", $around)) {
  if ($_ == $addr{$w}) {
    $found = 1;
    last;
  }
  $offset += $ptr_size;
}
die "Can't determine peer offset" unless $found;

my %peer;
# now retrieve peer for each socket
for my $inode (keys %addr) {
  $peer{$addr{$inode}} = unpack("L!", readaddr($addr{$inode}+$offset,$ptr_size));
}
close K;

# Now get info about processes tied to sockets using lsof
my (%fields, %proc);
open LSOF, '-|', 'lsof', '-nPUFpcfdn';
while (<LSOF>) {
  if (/(.)(.*)/) {
    $fields{$1} = $2;
    if ($1 eq 'n') {
      $proc{hex($fields{d})}->{"$fields{c},$fields{p}" .
      ($fields{n} =~ m{^([@/].*?)( type=w+)?$} ? ",$1" : "")} = "";
    }
  }
}
close LSOF;

# and finally process the lsof output
open LSOF, '-|', 'lsof', @ARGV;
while (<LSOF>) {
  chomp;
  for my $addr (/0x[0-9a-f]+/g) {
    $addr = hex $addr;
    my $peer = $peer{$addr};
    if (defined($peer)) {
      $_ .= $peer ?
            sprintf(" -> 0x%x[", $peer) . join("|", keys%{$proc{$peer}}) . "]" :
            "[LISTENING]";
      last;
    }
  }
  print "$_n";
}
close LSOF or exit(1);

Since kernel 3.3, it is possible using ss or lsof-4.89 or above — see Stéphane Chazelas’s answer.

In older versions, according to the author of lsof, it was impossible to find this out: the Linux kernel does not expose this information. Source: 2003 thread on comp.unix.admin.

The number shown in /proc/$pid/fd/$fd is the socket’s inode number in the virtual socket filesystem. When you create a pipe or socket pair, each end successively receives an inode number. The numbers are attributed sequentially, so there is a high probability that the numbers differ by 1, but this is not guaranteed (either because the first socket was N and N+1 was already in use due to wrapping, or because some other thread was scheduled between the two inode allocations and that thread created some inodes too).

I checked the definition of socketpair in kernel 2.6.39, and the two ends of the socket are not correlated except by the type-specific socketpair method. For unix sockets, that’s unix_socketpair in net/unix/af_unix.c.

Since kernel 3.3

You can now get this information with ss:

# ss -xp

Now you can see in the Peer column an ID (inode number) which corresponds to another ID in the Local column. Matching IDs are the two ends of a socket.

Note: The UNIX_DIAG option must be enabled in your kernel.

Before kernel 3.3

Linux didn’t expose this information to userland.

However, by looking into kernel memory, we can access this information.

Note: This answer does so by using gdb, however, please see @StéphaneChazelas’ answer which is more elaborated in this regard.

# lsof | grep whatever
mysqld 14450 (...) unix 0xffff8801011e8280 (...) /var/run/mysqld/mysqld.sock
mysqld 14450 (...) unix 0xffff8801011e9600 (...) /var/run/mysqld/mysqld.sock

There is 2 different sockets, 1 listening and 1 established. The hexa number is the address to the corresponding kernel unix_sock structure, having a peer attribute being the address of the other end of the socket (also a unix_sock structure instance).

Now we can use gdb to find the peer within kernel memory:

# gdb /usr/lib/debug/boot/vmlinux-3.2.0-4-amd64 /proc/kcore
(gdb) print ((struct unix_sock*)0xffff8801011e9600)->peer
$1 = (struct sock *) 0xffff880171f078c0

# lsof | grep 0xffff880171f078c0
mysql 14815 (...) unix 0xffff880171f078c0 (...) socket

Here you go, the other end of the socket is held by mysql, PID 14815.

Your kernel must be compiled with KCORE_ELF to use /proc/kcore. Also, you need a version of your kernel image with debugging symbols. On Debian 7, apt-get install linux-image-3.2.0-4-amd64-dbg will provide this file.

No need for the debuggable kernel image…

If you don’t have (or don’t want to keep) the debugging kernel image on the system, you can give gdb the memory offset to “manually” access the peer value. This offset value usually differ with kernel version or architecture.

On my kernel, I know the offset is 680 bytes, that is 85 times 64 bits. So I can do:

# gdb /boot/vmlinux-3.2.0-4-amd64 /proc/kcore
(gdb) print ((void**)0xffff8801011e9600)[85]
$1 = (void *) 0xffff880171f078c0

Voilà, same result as above.

If you have the same kernel running on several machine, it is easier to use this variant because you don’t need the debug image, only the offset value.

To (easily) discover this offset value at first, you do need the debug image:

$ pahole -C unix_sock /usr/lib/debug/boot/vmlinux-3.2.0-4-amd64
struct unix_sock {
  (...)
  struct sock *              peer;                 /*   680     8 */
  (...)
}

Here you go, 680 bytes, this is 85 x 64 bits, or 170 x 32 bits.

Most of the credit for this answer goes to MvG.

Related Solutions

Joining bash arguments into single string with spaces

[*] I believe that this does what you want. It will put all the arguments in one string, separated by spaces, with single quotes around all: str="'$*'" $* produces all the scripts arguments separated by the first character of $IFS which, by default, is a space....

AddTransient, AddScoped and AddSingleton Services Differences

TL;DR Transient objects are always different; a new instance is provided to every controller and every service. Scoped objects are the same within a request, but different across different requests. Singleton objects are the same for every object and every...

How to download package not install it with apt-get command?

Use --download-only: sudo apt-get install --download-only pppoe This will download pppoe and any dependencies you need, and place them in /var/cache/apt/archives. That way a subsequent apt-get install pppoe will be able to complete without any extra downloads....

What defines the maximum size for a command single argument?

Answers Definitely not a bug. The parameter which defines the maximum size for one argument is MAX_ARG_STRLEN. There is no documentation for this parameter other than the comments in binfmts.h: /* * These are the maximum length and maximum number of strings...

Bulk rename, change prefix

I'd say the simplest it to just use the rename command which is common on many Linux distributions. There are two common versions of this command so check its man page to find which one you have: ## rename from Perl (common in Debian systems -- Ubuntu, Mint,...

Output from ls has newlines but displays on a single line. Why?

When you pipe the output, ls acts differently. This fact is hidden away in the info documentation: If standard output is a terminal, the output is in columns (sorted vertically) and control characters are output as question marks; otherwise, the output is...

mv: Move file only if destination does not exist

mv -vn file1 file2. This command will do what you want. You can skip -v if you want. -v makes it verbose - mv will tell you that it moved file if it moves it(useful, since there is possibility that file will not be moved) -n moves only if file2 does not exist....

Is it possible to store and query JSON in SQLite?

SQLite 3.9 introduced a new extension (JSON1) that allows you to easily work with JSON data . Also, it introduced support for indexes on expressions, which (in my understanding) should allow you to define indexes on your JSON data as well. PostgreSQL has some...

Combining tail && journalctl

You could use: journalctl -u service-name -f -f, --follow Show only the most recent journal entries, and continuously print new entries as they are appended to the journal. Here I've added "service-name" to distinguish this answer from others; you substitute...

how can shellshock be exploited over SSH?

One example where this can be exploited is on servers with an authorized_keys forced command. When adding an entry to ~/.ssh/authorized_keys, you can prefix the line with command="foo" to force foo to be run any time that ssh public key is used. With this...

Why doesn’t the tilde (~) expand inside double quotes?

The reason, because inside double quotes, tilde ~ has no special meaning, it's treated as literal. POSIX defines Double-Quotes as: Enclosing characters in double-quotes ( "" ) shall preserve the literal value of all characters within the double-quotes, with the...

What is GNU Info for?

GNU Info was designed to offer documentation that was comprehensive, hyperlinked, and possible to output to multiple formats. Man pages were available, and they were great at providing printed output. However, they were designed such that each man page had a...

Set systemd service to execute after fstab mount

a CIFS network location is mounted via /etc/fstab to /mnt/ on boot-up. No, it is not. Get this right, and the rest falls into place naturally. The mount is handled by a (generated) systemd mount unit that will be named something like mnt-wibble.mount. You can...

Merge two video clips into one, placing them next to each other

To be honest, using the accepted answer resulted in a lot of dropped frames for me. However, using the hstack filter_complex produced perfectly fluid output: ffmpeg -i left.mp4 -i right.mp4 -filter_complex hstack output.mp4 ffmpeg -i input1.mp4 -i input2.mp4...

How portable are /dev/stdin, /dev/stdout and /dev/stderr?

It's been available on Linux back into its prehistory. It is not POSIX, although many actual shells (including AT&T ksh and bash) will simulate it if it's not present in the OS; note that this simulation only works at the shell level (i.e. redirection or...

How can I increase the number of inodes in an ext4 filesystem?

It seems that you have a lot more files than normal expectation. I don't know whether there is a solution to change the inode table size dynamically. I'm afraid that you need to back-up your data, and create new filesystem, and restore your data. To create new...

Why doesn’t cp have a progress bar like wget?

The tradition in unix tools is to display messages only if something goes wrong. I think this is both for design and practical reasons. The design is intended to make it obvious when something goes wrong: you get an error message, and it's not drowned in...