How do you use the command coproc in various shells?

Co-processes are a ksh feature (already in ksh88). zsh has had the feature from the start (early 90s), while it was only added to bash in 4.0 (2009).

However, the behaviour and interface is significantly different between the 3 shells.

The idea is the same, though: it allows starting a job in the background while being able to send it input and read its output, without having to resort to named pipes.

Most shells do that with unnamed pipes; recent versions of ksh93 use socketpairs on some systems.

In a | cmd | b, a feeds data to cmd and b reads its output. Running cmd as a co-process allows the shell to be both a and b.

ksh co-processes

In ksh, you start a coprocess as:

cmd |&

You feed data to cmd by doing things like:

echo test >&p

or

print -p test

And read cmd's output with things like:

read var <&p

or

read -p var

cmd is started like any other background job; you can use fg, bg, and kill on it, and refer to it by %job-number or via $!.
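For instance (a sketch):

sleep 100 |&       # started like any other background job
jobs               # lists it like any other job
kill %1            # or: kill "$!"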

To close the writing end of the pipe cmd is reading from, you can do:

exec 3>&p 3>&-

And to close the reading end of the other pipe (the one cmd is writing to):

exec 3<&p 3<&-
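For example, closing the writing end is how you get a command that buffers its output (see the buffering discussion below) to flush what it has. A sketch:

tr a b |&          # tr buffers its output when writing to a pipe
print -p aaa
exec 3>&p 3>&-     # close the writing end: tr sees end-of-file and flushes
read -p var        # now succeeds; var is "bbb"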

You cannot start a second co-process unless you first save the pipe file descriptors to some other fds. For instance:

tr a b |&          # first co-process
exec 3>&p 4<&p     # move its pipes to fd 3 (write side) and fd 4 (read side)
tr b c |&          # now a second co-process can be started
echo aaa >&3       # feeds the first co-process
echo bbb >&p       # feeds the second co-process

zsh co-processes

In zsh, co-processes are nearly identical to those in ksh. The only real difference is that zsh co-processes are started with the coproc keyword.

coproc cmd
echo test >&p
read var <&p
print -p test
read -p var

Doing:

exec 3>&p

Note: This doesn't move the coproc file descriptor to fd 3 (as it would in ksh), but duplicates it. So there's no explicit way to close the feeding or reading pipe, other than starting another coproc.

For instance, to close the feeding end:

coproc tr a b
echo aaaa >&p # send some data

exec 4<&p     # preserve the reading end on fd 4
coproc :      # start a new short-lived coproc (runs the null command)

cat <&4       # read the output of the first coproc

In addition to pipe-based co-processes, zsh (since 3.1.6-dev19, released in 2000) has pseudo-tty-based co-processes, like expect, via its zsh/zpty module. ksh-style co-processes won't work for interacting with most programs, since programs start buffering their output when it goes to a pipe.

Here are some examples.

Start the co-process x:

zmodload zsh/zpty
zpty x cmd

(Here, cmd is a simple command. But you can do fancier things with eval or functions.)

Feed a co-process data:

zpty -w x some data

Read co-process data (in the simplest case):

zpty -r x var

Like expect, it can wait for some output from the co-process matching a given pattern.
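For instance (a sketch; the interactive command, the prompt pattern, and $password are illustrative assumptions):

zmodload zsh/zpty
zpty x some-interactive-cmd
zpty -r x line '*assword:*'   # block until the output matches the pattern
zpty -w x "$password"         # then answer the prompt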

bash co-processes

The bash syntax is a lot newer, and builds on top of a feature recently added to ksh93, bash, and zsh: a syntax that allows handling dynamically allocated file descriptors above 10.

bash offers a basic coproc syntax, and an extended one.

Basic syntax

The basic syntax for starting a co-process looks like zsh's:

coproc cmd

In ksh or zsh, the pipes to and from the co-process are accessed with >&p and <&p.

But in bash, the file descriptors of the pipe from the co-process and of the other pipe to the co-process are returned in the $COPROC array (respectively ${COPROC[0]} and ${COPROC[1]}). So…

Feed data to the co-process:

echo xxx >&"${COPROC[1]}"

Read data from the co-process:

read var <&"${COPROC[0]}"

With the basic syntax, you can start only one co-process at a time.
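Putting the basic syntax together, here is a minimal round-trip sketch. It uses cat, since typical cat implementations write each chunk as soon as it is read, which avoids the buffering deadlock discussed further below:

coproc cat                      # cat echoes its input back
echo xxx >&"${COPROC[1]}"       # feed the co-process
read var <&"${COPROC[0]}"       # var is now "xxx"
exec {COPROC[1]}>&-             # close the writing end when done (bash 4.3+, see below)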

Extended syntax

In the extended syntax, you can name your co-processes (as with zsh's zpty co-processes):

coproc mycoproc { cmd; }

The command has to be a compound command. (Notice how the example above is reminiscent of function f { ...; }.)
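If you put a name before a simple command instead, it is not treated as a co-process name; bash parses it as the first word of the command. A sketch:

coproc mycoproc { cmd; }    # named co-process: fds in ${mycoproc[0]} and ${mycoproc[1]}
coproc mycoproc cmd         # runs "mycoproc cmd" as a co-process with the default name COPROC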

This time, the file descriptors are in ${mycoproc[0]} and ${mycoproc[1]}.

You can start more than one co-process at a time—but you do get a warning when you start a co-process while one is still running (even in non-interactive mode).
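The warning looks something like this (the PID is illustrative and the exact wording varies between bash versions):

$ coproc sleep 100
[1] 4621
$ coproc sleep 100
bash: warning: execute_coproc: coproc [4621:COPROC] still exists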

You can close the file descriptors when using the extended syntax.

coproc tr { tr a b; }
echo aaa >&"${tr[1]}"

exec {tr[1]}>&-     # close the writing end so tr sees end-of-file

cat <&"${tr[0]}"

Note that closing the fd that way doesn't work in bash versions prior to 4.3, where you have to write instead:

fd=${tr[1]}
exec {fd}>&-

As in ksh and zsh, those pipe file descriptors are marked as close-on-exec.

But in bash, the only way to pass those to executed commands is to duplicate them to fds 0, 1, or 2. That limits the number of co-processes you can interact with for a single command. (See below for an example.)

yash process and pipeline redirection

yash doesn't have a co-process feature per se, but the same concept can be implemented with its pipeline and process redirection features. yash has an interface to the pipe() system call, so this kind of thing can be done relatively easily by hand there.

You'd start a co-process with:

exec 5>>|4 3>(cmd >&5 4<&- 5>&-) 5>&-

This first creates a pipe(4,5) (5 the writing end, 4 the reading end), then redirects fd 3 to a pipe connected to a process whose stdin is at the other end and whose stdout goes to the pipe created earlier. The parent then closes the writing end of that first pipe, which it won't need. So now the shell has fd 3 connected to cmd's stdin and fd 4 connected to cmd's stdout, both via pipes.

Note that the close-on-exec flag is not set on those file descriptors.

To feed data:

echo data >&3 4<&-

To read data:

read var <&4 3>&-

And you can close fds as usual:

exec 3>&- 4<&-
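Putting those pieces together (a sketch using tr as the co-process, which flushes once it sees end-of-file):

exec 5>>|4 3>(tr a b >&5 4<&- 5>&-) 5>&-
echo aaa >&3 4<&-   # feed the co-process
exec 3>&-           # close its stdin so tr flushes and exits
read var <&4        # var is now "bbb"
exec 4<&-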

Now, why they are not so popular

hardly any benefit over using named pipes

Co-processes can easily be implemented with standard named pipes. I don't know exactly when named pipes were introduced, but it's possible it was after ksh came up with co-processes (probably in the mid 80s; ksh88 was "released" in 88, but I believe ksh was in use internally at AT&T a few years before that), which would explain why ksh got the feature.

cmd |&
echo data >&p
read var <&p

Can be written with:

mkfifo in out

cmd <in >out &
exec 3> in 4< out   # keep both fifos open on fds 3 and 4
echo data >&3
read var <&4

Interacting with those is more straightforward—especially if you need to run more than one co-process. (See examples below.)

The only benefit of using coproc is that you don't have to clean up those named pipes after use.
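That clean-up is typically only a few extra lines (a sketch; the temporary directory is just one common approach):

dir=$(mktemp -d) || exit
trap 'rm -rf "$dir"' EXIT       # remove the fifos even on early exit
mkfifo "$dir/in" "$dir/out"
cmd < "$dir/in" > "$dir/out" &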

deadlock-prone

Shells use pipes in a few constructs:

  • shell pipes: cmd1 | cmd2,
  • command substitution: $(cmd),
  • and process substitution: <(cmd), >(cmd).

In those, the data flows in only one direction between different processes.

With co-processes and named pipes, though, it's easy to run into deadlock. You have to keep track of which command has which file descriptor open, to prevent one staying open and holding a process alive. Deadlocks can be tricky to investigate, because they may occur non-deterministically; for instance, only when enough data to fill up one pipe has been sent.

works worse than expect for what it was designed for

The main purpose of co-processes was to provide the shell with a way to interact with commands. However, it does not work so well.

The simplest form of deadlock mentioned above is:

tr a b |&
echo a >&p
read var <&p

Because its output doesn't go to a terminal, tr buffers its output. So it won't output anything until either it sees end-of-file on its stdin, or it has accumulated a buffer-full of data to output. So above, after the shell has output a\n (only 2 bytes), the read will block indefinitely because tr is waiting for the shell to send it more data.

In short, pipes aren't good for interacting with commands. Co-processes can only be used to interact with commands that don't buffer their output, or commands which can be told not to buffer their output; for example, by using stdbuf with some commands on recent GNU or FreeBSD systems.
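For instance, on such systems the tr deadlock above can be avoided by forcing line buffering (a sketch assuming GNU stdbuf):

stdbuf -oL tr a b |&    # tr's stdout is now line-buffered
echo a >&p
read var <&p            # gets "b" right away instead of blocking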

That's why expect or zpty use pseudo-terminals instead. expect is a tool designed for interacting with commands, and it does it well.

File descriptor handling is fiddly and hard to get right

Co-processes can be used to do some more complex plumbing than what simple shell pipes allow.

That other Unix.SE answer has an example of coproc usage.

Here's a simplified example: imagine you want a function that feeds a copy of a command's output to 3 other commands, and then has the output of those 3 commands concatenated, all using pipes.

For instance: feed the output of printf '%s\n' foo bar to tr a b, sed 's/./&&/g', and cut -c2- to obtain something like:

foo
bbr
ffoooo
bbaarr
oo
ar

First, it's not necessarily obvious, but there's a possibility of deadlock there, and it will start to happen after only a few kilobytes of data.

Then, depending on your shell, you'll run into a number of different problems that have to be addressed differently.

For instance, with zsh, you'd do it with:

f() (
  coproc tr a b
  exec {o1}<&p {i1}>&p      # save copies of the coproc fds before the next coproc replaces them
  coproc sed 's/./&&/g' {i1}>&- {o1}<&-     # close the copies this co-process doesn't need
  exec {o2}<&p {i2}>&p
  coproc cut -c2- {i1}>&- {o1}<&- {i2}>&- {o2}<&-
  tee /dev/fd/$i1 /dev/fd/$i2 >&p {o1}<&- {o2}<&- &
  exec cat /dev/fd/$o1 /dev/fd/$o2 - <&p {i1}>&- {i2}>&-
)
printf '%s\n' foo bar | f

Above, the co-process fds have the close-on-exec flag set, but not the ones that are duplicated from them (as in {o1}<&p). So, to avoid deadlocks, you’ll have to make sure they're closed in any processes that don't need them.

Similarly, we have to use a subshell and use exec cat at the end, to ensure there's no shell process lying around holding a pipe open.

With ksh (here ksh93), that would have to be:

f() (
  tr a b |&
  exec {o1}<&p {i1}>&p
  sed 's/./&&/g' |&
  exec {o2}<&p {i2}>&p
  cut -c2- |&
  exec {o3}<&p {i3}>&p
  eval 'tee "/dev/fd/$i1" "/dev/fd/$i2"' >&"$i3" {i1}>&"$i1" {i2}>&"$i2" &
  eval 'exec cat "/dev/fd/$o1" "/dev/fd/$o2" -' <&"$o3" {o1}<&"$o1" {o2}<&"$o2"
)
printf '%s\n' foo bar | f

(Note: That won’t work on systems where ksh uses socketpairs instead of pipes, and where /dev/fd/n works like on Linux.)

In ksh, fds above 2 are marked with the close-on-exec flag unless they're passed explicitly on the command line. That's why we don't have to close the unused file descriptors as with zsh, but it's also why we have to do {i1}>&$i1 and use eval for that new value of $i1 to be passed to tee and cat.

In bash this cannot be done, because you can't avoid the close-on-exec flag.

Above, it's relatively simple, because we use only simple external commands. It gets more complicated when you want to use shell constructs in there instead, and you start running into shell bugs.

Compare the above with the same using named pipes:

f() {
  mkfifo p{i,o}{1,2,3}
  tr a b < pi1 > po1 &
  sed 's/./&&/g' < pi2 > po2 &
  cut -c2- < pi3 > po3 &

  tee pi{1,2} > pi3 &
  cat po{1,2,3}
  rm -f p{i,o}{1,2,3}
}
printf '%s\n' foo bar | f

Conclusion

If you want to interact with a command, use expect, or zsh's zpty, or named pipes.

If you want to do some fancy plumbing with pipes, use named pipes.

Co-processes can do some of the above, but be prepared to do some serious head scratching for anything non-trivial.


Co-processes were first introduced in a shell scripting language with the ksh88 shell (1988), and later in zsh at some point before 1993.

The syntax to launch a co-process under ksh is command |&. From there, you can write to the command's standard input with print -p and read its standard output with read -p.

More than a couple of decades later, bash, which had been lacking this feature, finally introduced it in its 4.0 release. Unfortunately, an incompatible and more complex syntax was selected.

Under bash 4.0 and newer, you can launch a co-process with the coproc command, e.g.:

$ coproc awk '{print $2;fflush();}'

You can then pass something to the command's stdin this way:

$ echo one two three >&${COPROC[1]}

and read awk output with:

$ read -ru ${COPROC[0]} foo
$ echo $foo
two

Under ksh, that would have been:

$ awk '{print $2;fflush();}' |&
$ print -p "one two three"
$ read -p foo
$ echo $foo
two

Here is another good (and working) example: a simple server written in bash. Note that you need OpenBSD's netcat; the classic one won't work. Of course, you could use an inet socket instead of a unix one.

server.sh:

#!/usr/bin/env bash

SOCKET=server.sock
PIDFILE=server.pid

(
    exec </dev/null     # detach the server's stdio from the terminal
    exec >/dev/null
    exec 2>/dev/null
    coproc SERVER {
        exec nc -l -k -U $SOCKET
    }
    echo $SERVER_PID > $PIDFILE     # bash puts the coproc's PID in SERVER_PID
    {
        while read ; do             # answer every line received on the socket
            echo "pong $REPLY"
        done
    } <&${SERVER[0]} >&${SERVER[1]}
    rm -f $PIDFILE
    rm -f $SOCKET
) &
disown $!   # remove the server subshell from the shell's job table

client.sh:

#!/usr/bin/env bash

SOCKET=server.sock

coproc CLIENT {
    exec nc -U $SOCKET
}

{
    echo "$@"       # send the arguments as the request
    read            # wait for the server's reply
} <&${CLIENT[0]} >&${CLIENT[1]}

echo $REPLY

Usage:

$ ./server.sh
$ ./client.sh ping
pong ping
$ ./client.sh 12345
pong 12345
$ kill $(cat server.pid)
$