How do you use the command coproc in various shells?

Solutions:


Co-processes are a ksh feature (already in ksh88). zsh has had the feature from the start (early 90s), while bash only gained it in 4.0 (2009).

However, the behaviour and interface are significantly different among the 3 shells.

The idea is the same, though: it lets you start a job in the background, send it input, and read its output without having to resort to named pipes.

That is done with unnamed pipes with most shells and socketpairs with recent versions of ksh93 on some systems.

In a | cmd | b, a feeds data to cmd and b reads its output. Running cmd as a co-process allows the shell to be both a and b.

ksh co-processes

In ksh, you start a coprocess as:

cmd |&

You feed data to cmd by doing things like:

echo test >&p

or

print -p test

And read cmd’s output with things like:

read var <&p

or

read -p var

cmd is started like any background job. You can use fg, bg, and kill on it, and refer to it by %job-number or via $!.

To close the writing end of the pipe cmd is reading from, you can do:

exec 3>&p 3>&-

And to close the reading end of the other pipe (the one cmd is writing to):

exec 3<&p 3<&-

You cannot start a second co-process unless you first save the pipe file descriptors to some other fds. For instance:

tr a b |&
exec 3>&p 4<&p
tr b c |&
echo aaa >&3
echo bbb >&p

zsh co-processes

In zsh, co-processes are nearly identical to those in ksh. The only real difference is that zsh co-processes are started with the coproc keyword.

coproc cmd
echo test >&p
read var <&p
print -p test
read -p var

Note that doing:

exec 3>&p

doesn’t move the coproc file descriptor to fd 3 (like in ksh), but duplicates it. So there’s no explicit way to close the feeding or reading pipe, other than starting another coproc.

For instance, to close the feeding end:

coproc tr a b
echo aaaa >&p # send some data

exec 4<&p     # preserve the reading end on fd 4
coproc :      # start a new short-lived coproc (runs the null command)

cat <&4       # read the output of the first coproc

In addition to pipe based co-processes, zsh (since 3.1.6-dev19, released in 2000) has pseudo-tty based constructs like expect. To interact with most programs, ksh-style co-processes won’t work, since programs start buffering when their output is a pipe.

Here are some examples.

Start the co-process x:

zmodload zsh/zpty
zpty x cmd

(Here, cmd is a simple command. But you can do fancier things with eval or functions.)

Feed a co-process data:

zpty -w x some data

Read co-process data (in the simplest case):

zpty -r x var

Like expect, it can wait for some output from the co-process matching a given pattern.

bash co-processes

The bash syntax is a lot newer, and builds on top of a new feature recently added to ksh93, bash, and zsh that provides a syntax to allow handling of dynamically-allocated file descriptors above 10.

bash offers a basic coproc syntax, and an extended one.

Basic syntax

The basic syntax for starting a co-process looks like zsh’s:

coproc cmd

In ksh or zsh, the pipes to and from the co-process are accessed with >&p and <&p.

But in bash, the file descriptors of the pipe from the co-process and the other pipe to the co-process are returned in the $COPROC array (respectively ${COPROC[0]} and ${COPROC[1]}). So…

Feed data to the co-process:

echo xxx >&"${COPROC[1]}"

Read data from the co-process:

read var <&"${COPROC[0]}"
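As a minimal sketch of those two steps put together, here is a complete round trip. It assumes bash 4.1 or newer (for the {fd} redirection) and uses cat purely as a stand-in co-process, because it copies its input without buffering. The variable names line and fd are only illustrative.

```bash
#!/usr/bin/env bash
# Sketch only: cat stands in for a real co-process.
coproc cat                            # fds land in the default COPROC array

echo hello >&"${COPROC[1]}"           # write to the co-process's stdin
IFS= read -r line <&"${COPROC[0]}"    # read one line from its stdout
echo "co-process answered: $line"

fd=${COPROC[1]}
exec {fd}>&-                          # close our writing end so cat sees EOF and exits
```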

With the basic syntax, you can start only one co-process at a time.

Extended syntax

In the extended syntax, you can name your co-processes (like in zsh zpty co-processes):

coproc mycoproc { cmd; }

The command has to be a compound command. (Notice how the example above is reminiscent of function f { ...; }.)

This time, the file descriptors are in ${mycoproc[0]} and ${mycoproc[1]}.
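Because the command is a compound command, it can also be a shell loop. A minimal sketch, assuming bash 4.1 or newer; the name ADDER and the toy protocol (two integers per line in, their sum out) are made up for illustration:

```bash
#!/usr/bin/env bash
# Hypothetical co-process: reads "x y" lines, answers with x + y.
coproc ADDER {
    while read -r x y; do
        echo "$((x + y))"
    done
}

echo "2 3" >&"${ADDER[1]}"       # send a request
read -r sum <&"${ADDER[0]}"      # read the reply
echo "2 + 3 = $sum"

fd=${ADDER[1]}
exec {fd}>&-                     # EOF on the loop's stdin ends the co-process
```

The echo builtin writes its line as soon as it runs, so this particular loop should not show the output buffering that external commands tend to apply on a pipe.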

You can start more than one co-process at a time—but you do get a warning when you start a co-process while one is still running (even in non-interactive mode).

You can close the file descriptors when using the extended syntax.

coproc tr { tr a b; }
echo aaa >&"${tr[1]}"

exec {tr[1]}>&-

cat <&"${tr[0]}"

Note that closing that way doesn’t work in bash versions prior to 4.3 where you have to write it instead:

fd=${tr[1]}
exec {fd}>&-
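Putting the pieces together, here is a hedged sketch that closes descriptors only through plain variables, so it should not depend on the 4.3 behaviour (it still assumes bash 4.1 or newer for the {var} redirections). Duplicating the reading end onto a descriptor of our own (from_up here) is a defensive choice of mine, not something the shell requires: it keeps the output readable even if bash tidies up the UP array once the co-process has exited. All names are illustrative.

```bash
#!/usr/bin/env bash
coproc UP { tr a-z A-Z; }

exec {from_up}<&"${UP[0]}"    # our own duplicate of the reading end
to_up=${UP[1]}                # plain copy of the writing fd number

printf '%s\n' foo bar >&"$to_up"
exec {to_up}>&-               # close the writing end: tr sees EOF, flushes, exits

result=$(cat <&"$from_up")    # drain everything tr wrote
exec {from_up}<&-
printf '%s\n' "$result"
```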

As in ksh and zsh, those pipe file descriptors are marked as close-on-exec.

But in bash, the only way to pass those to executed commands is to duplicate them to fds 0, 1, or 2. That limits the number of co-processes you can interact with for a single command. (See below for an example.)

yash process and pipeline redirection

yash doesn’t have a co-process feature per se, but the same concept can be implemented with its pipeline and process redirection features. yash has an interface to the pipe() system call, so this kind of thing can be done relatively easily by hand there.

You’d start a co-process with:

exec 5>>|4 3>(cmd >&5 4<&- 5>&-) 5>&-

Which first creates a pipe(4,5) (5 the writing end, 4 the reading end), then redirects fd 3 to a pipe to a process that runs with its stdin at the other end, and stdout going to the pipe created earlier. Then we close the writing end of that pipe in the parent which we won’t need. So now in the shell we have fd 3 connected to the cmd’s stdin and fd 4 connected to cmd’s stdout with pipes.

Note that the close-on-exec flag is not set on those file descriptors.

To feed data:

echo data >&3 4<&-

To read data:

read var <&4 3>&-

And you can close fds as usual:

exec 3>&- 4<&-

hardly any benefit over using named pipes

Co-processes can easily be implemented with standard named pipes. I don’t know when exactly named pipes were introduced but it’s possible it was after ksh came up with co-processes (probably in the mid 80s, ksh88 was “released” in 88, but I believe ksh was used internally at AT&T a few years before that) which would explain why.

cmd |&
echo data >&p
read var <&p

Can be written with:

mkfifo in out

cmd <in >out &
exec 3> in 4< out
echo data >&3
read var <&4
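For a self-contained run of that named-pipe variant, here is a sketch that creates the FIFOs in a temporary directory and removes them afterwards. The use of mktemp -d, the file names, and tr a b as the command are my assumptions, not part of the recipe above.

```bash
#!/usr/bin/env bash
dir=$(mktemp -d) || exit 1
mkfifo "$dir/in" "$dir/out"

tr a b <"$dir/in" >"$dir/out" &   # the "co-process"
exec 3>"$dir/in" 4<"$dir/out"     # fd 3 feeds it, fd 4 reads from it

echo banana >&3
exec 3>&-                         # EOF for tr, which then flushes its output
read -r var <&4
exec 4<&-

wait                              # reap tr
rm -rf "$dir"                     # the clean-up that coproc would spare you
echo "$var"
```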

Interacting with those is more straightforward—especially if you need to run more than one co-process. (See examples below.)

The only benefit of using coproc is that you don’t have to clean up those named pipes after use.

deadlock-prone

Shells use pipes in a few constructs:

  • shell pipes: cmd1 | cmd2,
  • command substitution: $(cmd),
  • and process substitution: <(cmd), >(cmd).

In those, the data flows in only one direction between different processes.

With co-processes and named pipes, though, it’s easy to run into deadlock. You have to keep track of which command has which file descriptor open, to prevent one staying open and holding a process alive. Deadlocks can be tricky to investigate, because they may occur non-deterministically; for instance, only when enough data is sent to fill up a pipe.

works worse than expect for what it’s been designed for

The main purpose of co-processes was to provide the shell with a way to interact with commands. However, it does not work so well.

The simplest form of deadlock mentioned above is:

tr a b |&
echo a >&p
read var<&p

Because its output doesn’t go to a terminal, tr buffers it. So it won’t output anything until either it sees end-of-file on its stdin, or it has accumulated a buffer-full of data to output. So above, after the shell has output a\n (only 2 bytes), the read will block indefinitely because tr is waiting for the shell to send it more data.

In short, pipes aren’t good for interacting with commands. Co-processes can only be used to interact with commands that don’t buffer their output, or commands which can be told not to buffer their output; for example, by using stdbuf with some commands on recent GNU or FreeBSD systems.
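The stall is easy to reproduce in bash with a timeout, so that it doesn't hang for good. This is a sketch assuming bash 4.1 or newer and a tr that buffers its output through stdio (GNU tr does); the names TR, in, out and reply are illustrative. When the first read times out, closing tr's stdin forces the flush:

```bash
#!/usr/bin/env bash
coproc TR { tr a b; }
exec {out}<&"${TR[0]}"     # private duplicate of the reading end
in=${TR[1]}

echo a >&"$in"
if ! IFS= read -r -t 1 -u "$out" reply; then
    echo "no reply within 1s: tr is sitting on its buffered output"
    exec {in}>&-           # EOF on stdin makes tr flush and exit
    IFS= read -r -u "$out" reply
fi
echo "reply: $reply"
```

Where stdbuf is available, starting the co-process as stdbuf -oL tr a b instead should let the first read succeed without closing anything.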

That’s why expect or zpty use pseudo-terminals instead. expect is a tool designed for interacting with commands, and it does it well.

File descriptor handling is fiddly, and hard to get right

Co-processes can be used to do some more complex plumbing than what simple shell pipes allow.

That other Unix.SE answer has an example of coproc usage.

Here’s a simplified example: Imagine you want a function that feeds a copy of a command’s output to 3 other commands, and then have the output of those 3 commands get concatenated.

All using pipes.

For instance: feed the output of printf '%s\n' foo bar to tr a b, sed 's/./&&/g', and cut -b2- to obtain something like:

foo
bbr
ffoooo
bbaarr
oo
ar

First, it’s not necessarily obvious, but there’s a possibility for deadlock there, and it will start to happen after only a few kilobytes of data.

Then, depending on your shell, you’ll run into a number of different problems that have to be addressed differently.

For instance, with zsh, you’d do it with:

f() (
  coproc tr a b
  exec {o1}<&p {i1}>&p
  coproc sed 's/./&&/g' {i1}>&- {o1}<&-
  exec {o2}<&p {i2}>&p
  coproc cut -c2- {i1}>&- {o1}<&- {i2}>&- {o2}<&-
  tee /dev/fd/$i1 /dev/fd/$i2 >&p {o1}<&- {o2}<&- &
  exec cat /dev/fd/$o1 /dev/fd/$o2 - <&p {i1}>&- {i2}>&-
)
printf '%s\n' foo bar | f

Above, the co-process fds have the close-on-exec flag set, but not the ones that are duplicated from them (as in {o1}<&p). So, to avoid deadlocks, you’ll have to make sure they’re closed in any processes that don’t need them.

Similarly, we have to use a subshell and use exec cat in the end, to ensure there’s no shell process lying about holding a pipe open.

With ksh (here ksh93), that would have to be:

f() (
  tr a b |&
  exec {o1}<&p {i1}>&p
  sed 's/./&&/g' |&
  exec {o2}<&p {i2}>&p
  cut -c2- |&
  exec {o3}<&p {i3}>&p
  eval 'tee "/dev/fd/$i1" "/dev/fd/$i2"' >&"$i3" {i1}>&"$i1" {i2}>&"$i2" &
  eval 'exec cat "/dev/fd/$o1" "/dev/fd/$o2" -' <&"$o3" {o1}<&"$o1" {o2}<&"$o2"
)
printf '%s\n' foo bar | f

(Note: That won’t work on systems where ksh uses socketpairs instead of pipes, and where /dev/fd/n works like on Linux.)

In ksh, fds above 2 are marked with the close-on-exec flag, unless they’re passed explicitly on the command line. That’s why we don’t have to close the unused file descriptors like with zsh, but it’s also why we have to do {i1}>&$i1 and use eval for that new value of $i1, to be passed to tee and cat.

In bash this cannot be done, because you can’t avoid the close-on-exec flag.

Above, it’s relatively simple, because we use only simple external commands. It gets more complicated when you want to use shell constructs in there instead, and you start running into shell bugs.

Compare the above with the same using named pipes:

f() {
  mkfifo p{i,o}{1,2,3}
  tr a b < pi1 > po1 &
  sed 's/./&&/g' < pi2 > po2 &
  cut -c2- < pi3 > po3 &

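  # <&0 keeps the function's stdin: bash otherwise gives a job started
  # with & in a subshell (or without job control) /dev/null as stdin
  tee pi{1,2} > pi3 <&0 &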
  cat po{1,2,3}
  rm -f p{i,o}{1,2,3}
}
printf '%s\n' foo bar | f
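For reference, here is a variant of the named-pipe function that can be run from a bash script without leaving FIFOs in the current directory. The temporary-directory handling is my addition. So is the explicit <&0 on tee: bash gives a command started with & in a subshell (or without job control) /dev/null as stdin unless the command redirects stdin itself, which would starve tee of the function's input.

```bash
#!/usr/bin/env bash
f() (
  dir=$(mktemp -d) || exit 1
  cd "$dir" || exit 1
  mkfifo pi1 pi2 pi3 po1 po2 po3

  tr a b         <pi1 >po1 &
  sed 's/./&&/g' <pi2 >po2 &
  cut -c2-       <pi3 >po3 &

  tee pi1 pi2 >pi3 <&0 &     # <&0: keep the function's stdin in the background job
  cat po1 po2 po3            # concatenate the three outputs in order

  cd / && rm -rf "$dir"      # remove the FIFOs once everything has been read
)

result=$(printf '%s\n' foo bar | f)
printf '%s\n' "$result"
```

With this small input, the result is the six lines listed earlier (foo, bbr, ffoooo, bbaarr, oo, ar). The caveat about deadlocks on larger inputs still applies, since cat reads the three outputs one after the other.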

Conclusion

If you want to interact with a command, use expect, or zsh’s zpty, or named pipes.

If you want to do some fancy plumbing with pipes, use named pipes.

Co-processes can do some of the above, but be prepared to do some serious head scratching for anything non-trivial.

Co-processes were first introduced in a shell scripting language with the ksh88 shell (1988), and later in zsh at some point before 1993.

The syntax to launch a co-process under ksh is command |&. Starting from there, you can write to command standard input with print -p and read its standard output with read -p.

More than a couple of decades later, bash, which was lacking this feature, finally introduced it in its 4.0 release. Unfortunately, an incompatible and more complex syntax was selected.

Under bash 4.0 and newer, you can launch a co-process with the coproc command, e.g.:

$ coproc awk '{print $2;fflush();}'

You can then pass something to the command’s stdin this way:

$ echo one two three >&${COPROC[1]}

and read awk output with:

$ read -ru ${COPROC[0]} foo
$ echo $foo
two

Under ksh, that would have been:

$ awk '{print $2;fflush();}' |&
$ print -p "one two three"
$ read -p foo
$ echo $foo
two

Here is another good (and working) example: a simple server written in bash. Please note that you need OpenBSD’s netcat; the classic one won’t work. Of course, you could use an inet socket instead of a Unix one.

server.sh:

#!/usr/bin/env bash

SOCKET=server.sock
PIDFILE=server.pid

(
    exec </dev/null
    exec >/dev/null
    exec 2>/dev/null
    coproc SERVER {
        exec nc -l -k -U $SOCKET
    }
    echo $SERVER_PID > $PIDFILE
    {
        while read ; do
            echo "pong $REPLY"
        done
    } <&${SERVER[0]} >&${SERVER[1]}
    rm -f $PIDFILE
    rm -f $SOCKET
) &
disown $!

client.sh:

#!/usr/bin/env bash

SOCKET=server.sock

coproc CLIENT {
    exec nc -U $SOCKET
}

{
    echo "$@"
    read
} <&${CLIENT[0]} >&${CLIENT[1]}

echo $REPLY

Usage:

$ ./server.sh
$ ./client.sh ping
pong ping
$ ./client.sh 12345
pong 12345
$ kill $(cat server.pid)
$
