Collect exit codes of parallel background processes (sub shells)

Use wait with a PID, which will:

Wait until the child process specified by each process ID pid or job specification jobspec exits and return the exit status of the last command waited for.

You'll need to save the PID of each process as you go:

echo "x" & X=$!
echo "y" & Y=$!
echo "z" & Z=$!

You can also enable job control in the script with set -m and use a %n jobspec, but you almost certainly don't want to: job control has a lot of other side effects.

wait will return the same code as the process finished with. You can use wait $X at any (reasonable) later point to access the final code as $? or simply use it as true/false:

echo "x" & X=$!
echo "y" & Y=$!
...
wait $X
echo "job X returned $?"
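
As a minimal sketch of the true/false usage mentioned above, the return value of wait can drive an if directly. The commands false and true are only stand-ins for real jobs, and x_result and y_result are illustrative names:

```shell
#!/usr/bin/env bash
false & X=$!   # stand-in for a job that fails
true  & Y=$!   # stand-in for a job that succeeds

# wait returns the job's exit status, so it can be tested like any command.
if wait "$X"; then x_result="ok"; else x_result="failed"; fi
if wait "$Y"; then y_result="ok"; else y_result="failed"; fi

echo "X: $x_result, Y: $y_result"
```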

wait will pause until the command completes if it hasn't already.
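
For more than a couple of jobs, the same idea can be sketched with arrays. This assumes bash, and the exit codes 0, 3 and 7 are made up purely for illustration. The jobs here will usually have finished before wait is called, which is fine: bash remembers the status of its own children.

```shell
#!/usr/bin/env bash
# Start three background subshells with known (illustrative) exit codes.
pids=()
for code in 0 3 7; do
    ( exit "$code" ) &
    pids+=("$!")
done

# Wait on each PID in turn and record the status it finished with.
codes=()
for pid in "${pids[@]}"; do
    rc=0
    wait "$pid" || rc=$?
    codes+=("$rc")
done

echo "exit codes: ${codes[*]}"
```

The codes array ends up in the same order as pids, regardless of the order in which the jobs actually finished.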

If you want to avoid stalling like that, you can set a trap on SIGCHLD, count the number of terminations, and handle all the waits at once when they've all finished. You can probably get away with using wait alone almost all the time.


The answer by Alexander Mills, which uses handleJobs, gave me a great starting point, but it also gave me this error:

warning: run_pending_traps: bad value in trap_list[17]: 0x461010

This may be a bash race-condition problem.

Instead, I just store the PID of each child, then wait on each one and get its exit code specifically. I find this cleaner when subprocesses spawn further subprocesses in functions, and it avoids the risk of waiting on a parent process when I meant to wait for a child. It's also clearer what happens, because it doesn't use a trap.

#!/usr/bin/env bash

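# Returning the result via echo and calling this function inside $() does not work well:
# $() runs the function in a subshell, which cannot wait on the parent shell's children.
# The result is therefore passed back through the global EXIT_CODE.
function wait_and_get_exit_codes() {
    local children=("$@")
    local job code
    EXIT_CODE=0   # global: read by the caller after the function returns
    for job in "${children[@]}"; do
        echo "PID => ${job}"
        code=0
        wait "${job}" || code=$?
        if [[ "${code}" != "0" ]]; then
            echo "At least one test failed with exit code => ${code}"
            EXIT_CODE=1
        fi
    done
}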

DIRN=$(dirname "$0");

commands=(
    "{ echo 'a'; exit 1; }"
    "{ echo 'b'; exit 0; }"
    "{ echo 'c'; exit 2; }"
    )

children_pids=()
for i in "${!commands[@]}"; do          # iterate over the array indices
    (echo "${commands[$i]}" | bash) &   # run the command via bash in a subshell
    children_pids+=("$!")
    echo "command $i has been issued as a background job"
done
# wait  # optional: it's still valid to wait for all jobs to finish before processing any exit codes
#EXIT_CODE=0;  # exit code of overall script
wait_and_get_exit_codes "${children_pids[@]}"

echo "EXIT_CODE => $EXIT_CODE"
exit "$EXIT_CODE"
# end

If you had a good way to identify the commands, you could print their exit code to a tmp file and then access the specific file you're interested in:

#!/bin/bash

for i in $(seq 1 5); do
    ( sleep "$i" ; echo $? > "/tmp/cmd__${i}" ) &
done

wait

for i in $(seq 1 5); do # or even /tmp/cmd__*
    echo "process $i:"
    cat "/tmp/cmd__${i}"
done

Don't forget to remove the tmp files.
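
One way to make that cleanup automatic is to keep the files in a private directory and remove it from an EXIT trap. This is only a sketch: the tmpdir and results names and the cmd_N file naming are made up for illustration, and sh -c "exit N" stands in for the real commands.

```shell
#!/usr/bin/env bash
# mktemp -d creates a uniquely named scratch directory (name chosen by mktemp).
tmpdir=$(mktemp -d)
trap 'rm -rf "$tmpdir"' EXIT   # remove the directory when the script exits

for i in 1 2 3; do
    # Each subshell records the exit status of its command in its own file.
    ( sh -c "exit $i"; echo "$?" > "$tmpdir/cmd_$i" ) &
done

wait   # block until every background job has finished

results=()
for i in 1 2 3; do
    results+=("$(cat "$tmpdir/cmd_$i")")
done
echo "recorded: ${results[*]}"
```

Using a directory from mktemp also avoids clashes between two copies of the script running at the same time, which fixed names under /tmp would not.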