Elegantly get list of descendant processes

The following is somewhat simpler, and has the added advantage of ignoring numbers in the command names:

pstree -p $pid | grep -o '([0-9]\+)' | grep -o '[0-9]\+'

Or with Perl:

pstree -p $pid | perl -ne 'print "$1\n" while /\((\d+)\)/g'

We're looking for numbers within parentheses so that we don't, for example, give 2 as a child process when we run across gif2png(3012). But if the command name contains a parenthesized number, all bets are off. There's only so far text processing can take you.
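To see the effect of anchoring on the parentheses without needing a live process tree, the filter can be run over a canned line. This is only a minimal sketch: the sample line is made up, and `extract_pids` is a hypothetical helper name. It uses `grep -E` instead of the `\+` form, which is a GNU extension, and it assumes a grep that supports `-o`.

```shell
# extract_pids: print every number that appears inside parentheses on stdin.
extract_pids() {
    grep -oE '\([0-9]+\)' | grep -oE '[0-9]+'
}

# Made-up sample in the style of `pstree -p` output.
sample='bash(3000)---gif2png(3012)---sh(3020)'

# The 2 in "gif2png" is not inside parentheses, so it is not reported.
printf '%s\n' "$sample" | extract_pids
```

On a real system the same function would be used as `pstree -p "$pid" | extract_pids`.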

So I also think that process groups are the way to go. If you'd like to have a process run in its own process group, you can use the 'pgrphack' tool from the Debian package 'daemontools':

pgrphack my_command args

Or you could again turn to Perl:

perl -e 'setpgrp or die; exec { $ARGV[0] } @ARGV;' my_command args

The only caveat here is that process groups do not nest, so if some process is creating its own process groups, its subprocesses will no longer be in the group that you created.
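Once a command runs in its own group, its members can be found by group ID instead of by walking the tree. The following is a minimal sketch, not a definitive implementation: `pids_in_group` is a hypothetical helper that filters `pid pgid` pairs read from standard input, and the pairs in the demonstration are invented.

```shell
# pids_in_group PGID: read "pid pgid" pairs on stdin and print the PIDs
# whose process group ID equals PGID.
pids_in_group() {
    awk -v pgid="$1" '$2 == pgid { print $1 }'
}

# Demonstration on invented pairs: PIDs 100 and 101 share group 100.
printf '%s\n' '100 100' '101 100' '200 200' | pids_in_group 100
```

With a `ps` that accepts the POSIX `-e` and `-o` options, real input would come from `ps -e -o pid= -o pgid= | pids_in_group "$pgid"`. To signal every member at once, `kill` accepts a negative process group ID, as in `kill -- "-$pgid"`.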


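descendent_pids() {
    # Print the direct children of PID $1, then recurse into each of them.
    pids=$(pgrep -P "$1")
    # Skip the echo for leaf processes so no empty lines are printed.
    [ -n "$pids" ] && echo $pids
    for pid in $pids; do
        descendent_pids "$pid"
    done
}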

There is also the issue of correctness. Naively parsing the output of pstree is problematic for several reasons:

  • pstree -p displays thread IDs as well as PIDs (thread names are shown in curly braces)
  • a command name might contain curly braces or numbers in parentheses, which makes reliable parsing impossible
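The thread problem can be shown on canned input. The sketch below strips `{name}(id)` entries before extracting PIDs; the sample text is invented and `strip_threads` is a hypothetical helper. It is only a heuristic: a command name that itself contains braces or a parenthesized number still defeats it.

```shell
# strip_threads: delete pstree-style thread entries of the form {name}(id).
strip_threads() {
    sed 's/{[^}]*}([0-9]*)//g'
}

# Invented sample in the style of `pstree -p`: one process with two threads
# and one child process.
sample='app(400)-+-{app}(401)
         |-{app}(402)
         |-worker(410)'

# Only the process IDs 400 and 410 remain after the thread entries are removed.
printf '%s\n' "$sample" | strip_threads | grep -oE '\([0-9]+\)' | grep -oE '[0-9]+'
```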

If you have Python and the psutil package installed you can use this snippet to list all descendant processes:

pid=2235; python3 -c "import psutil
for c in psutil.Process($pid).children(True):
  print(c.pid)"

(For example, the psutil package is installed as a dependency of the tracer command, which is available on Fedora/CentOS.)

Alternatively, you can do a breadth-first traversal of the process tree in a Bourne shell:

ps=2235; while [ "$ps" ]; do echo $ps; ps=$(echo $ps | xargs -n1 pgrep -P); \
  done | tail -n +2 | tr " " "\n"

To include the starting PID itself in the output, omit the tail part.

Note that the above doesn't use recursion and also runs in ksh-88.

On Linux, one can eliminate the pgrep call and instead read the information from /proc:

ps=2235; while [ "$ps" ]; do echo $ps ; \
  ps=$(for p in $ps; do cat /proc/$p/task/$p/children; done); done \
  | tr " " "\n" | tail -n +2

This is more efficient because we save one fork/exec for each PID and pgrep does some additional work in each call.
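A different technique from the loops above, which also avoids launching a process per PID, is to take one snapshot of all `pid ppid` pairs and walk it breadth-first in awk. This is a minimal sketch under stated assumptions: `descendants_from_table` is a hypothetical helper that reads the pairs from standard input, and the table in the demonstration is invented.

```shell
# descendants_from_table ROOT: read "pid ppid" pairs on stdin and print all
# descendants of ROOT in breadth-first order, one PID per line.
descendants_from_table() {
    awk -v root="$1" '
        { kids[$2] = kids[$2] " " $1 }   # map each ppid to its child PIDs
        END {
            queue[1] = root; head = 1; tail = 1
            while (head <= tail) {
                n = split(kids[queue[head++]], c, " ")
                for (i = 1; i <= n; i++) {
                    print c[i]           # report the child ...
                    queue[++tail] = c[i] # ... and visit its children later
                }
            }
        }'
}

# Demonstration on an invented table: 10 has children 11 and 12, 11 has child 13.
printf '%s\n' '1 0' '10 1' '11 10' '12 10' '13 11' '20 1' | descendants_from_table 10
```

With a `ps` that accepts the POSIX `-e` and `-o` options, real input would come from `ps -e -o pid= -o ppid= | descendants_from_table 2235`. The snapshot is taken at a single moment, so processes started afterwards are not seen.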
