Can someone explain in detail what "set -m" does?

Quoting the bash documentation (from man bash):

JOB CONTROL
       Job  control  refers to  the  ability  to selectively  stop
       (suspend) the execution of  processes and continue (resume)
       their execution at a later point.  A user typically employs
       this facility via an interactive interface supplied jointly
       by the operating system kernel's terminal driver and bash.

Put simply, set -m (the default for interactive shells) enables built-ins such as fg and bg, which are disabled under set +m (the default for non-interactive shells).

The connection between job control and killing background processes on exit is not obvious to me, but I can confirm that there is one. If you quit the shell right after typing set -m; (sleep 10 ; touch control-on) & the file is still created, but after set +m; (sleep 10 ; touch control-off) & it is not.

I think the answer lies in the rest of the documentation for set -m:

-m      Monitor mode. [...] Background processes run in a
        separate process group and a line containing their exit
        status is printed upon their completion.

This means that background jobs started under set +m are not actual "background processes" ("Background processes are those whose process group ID differs from the terminal's"): they share the process group ID of the shell that started them instead of getting a process group of their own. That would explain what happens when the shell quits before some of its background jobs finish. If I understand correctly, on exit a signal is sent to the processes in the shell's process group (killing background jobs started under set +m), but not to processes in other process groups (leaving alone true background processes started under set -m).
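The process-group difference is easy to check directly. Here is a minimal sketch, assuming Linux with /proc mounted and bash as the interpreter; pgid_of is a helper made up for this illustration, not a standard command. It starts one sleep under set +m and one under set -m, then prints each job's process group ID next to the shell's own.

```shell
#!/usr/bin/env bash
# Linux-only sketch: read a process's process-group ID from /proc/<pid>/stat.
# pgid_of is a hypothetical helper written for this example.
pgid_of() {
  local stat
  stat=$(cat "/proc/$1/stat") || return 1
  # After the "(comm)" field come: state, ppid, pgrp, session, ...
  set -- ${stat##*') '}
  echo "$3"
}

shell_pgid=$(pgid_of "$$")

set +m                      # job control off (the non-interactive default)
sleep 30 &
off_pid=$!
off_pgid=$(pgid_of "$off_pid")

set -m                      # job control on
sleep 30 &
on_pid=$!
set +m                      # the job keeps the group it was started in
on_pgid=$(pgid_of "$on_pid")

echo "shell:            pgid=$shell_pgid"
echo "job under set +m: pid=$off_pid pgid=$off_pgid"
echo "job under set -m: pid=$on_pid pgid=$on_pgid"

# Clean up both background jobs.
kill "$off_pid" "$on_pid"
wait 2>/dev/null
```

The job started under set +m should report the shell's own process group, while the job started under set -m should be the leader of a new group (its pgid equals its pid).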

So, in your case, the startup.sh script presumably starts a background job. When this script is run non-interactively, such as over SSH as in the question you linked to, job control is disabled, the "background" job shares the process group of the remote shell, and it is thus killed as soon as that shell exits. Conversely, if job control is enabled in that shell, the background job gets its own process group and isn't killed when its parent shell exits.

I found this in a GitHub issue list, and I think it really answers your question.

It's not really a SSH problem, it's more the subtle behaviour around BASH non-interactive/interactive modes and signal propagation to process groups.

The following is based on https://stackoverflow.com/questions/14679178/why-does-ssh-wait-for-my-subshells-without-t-and-kill-them-with-t/14866774#14866774 and http://www.itp.uzh.ch/~dpotter/howto/daemonize, with some assumptions not fully validated, although tests of how this works seem to confirm them.

pty/tty = false

The bash shell launched connects to the stdout/stderr/stdin of the started process and is kept running until there is nothing attached to the sockets and its children have exited. A well-behaved daemon process will fork a child process and then exit without waiting for its children. In this mode no SIGHUP will be sent to the child process by SSH. I believe this will work correctly for most scripts executing a process that handles daemonizing itself and doesn't need to be backgrounded. Where init scripts use '&' to background a process, the main problem is likely to be whether the backgrounded process ever attempts to read from stdin, since that will trigger a SIGHUP if the session has been terminated.

pty/tty = true

If the init script backgrounds the process started, the parent BASH shell will return an exit code to the SSH connection, which will in turn look to exit immediately since it isn't waiting on a child process to terminate and isn't blocked on stdout/stderr/stdin. This will cause a SIGHUP to be sent to the parent bash shell's process group, which, since job control is disabled in non-interactive mode in bash, will include the child processes just launched. Where a daemon process explicitly starts a new process session when forking or in the forked process, neither it nor its children will receive the SIGHUP from the BASH parent process exiting. Note this is different from suspended jobs, which will see a SIGTERM. I suspect the problems around this only working sometimes have to do with a slight race condition. If you look at the standard approach to daemonizing - http://www.itp.uzh.ch/~dpotter/howto/daemonize, you'll see that in the code the new session is created by the forked process, which may not have run before the parent exits, thus resulting in the random success/failure behaviour mentioned above. A sleep statement will allow enough time for the forked process to have created a new session, which is why it works for some cases.
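To make the session point concrete, here is a rough sketch, assuming Linux with /proc mounted and the setsid(1) utility installed. It uses setsid(1) as a stand-in for the fork-then-setsid() C code in the linked howto, and sid_of is a helper made up for this illustration. Because the new session is created by the child rather than the parent, the script polls for it instead of assuming it already exists, which is the same timing window described above.

```shell
#!/usr/bin/env bash
# Linux-only sketch: read a process's session ID from /proc/<pid>/stat.
# sid_of is a hypothetical helper written for this example.
sid_of() {
  local stat
  stat=$(cat "/proc/$1/stat" 2>/dev/null) || return 1
  # After the "(comm)" field come: state, ppid, pgrp, session, ...
  set -- ${stat##*') '}
  echo "$4"
}

set +m                      # job control off, as in a non-interactive shell
shell_sid=$(sid_of "$$")

# With job control off the forked child is not a process-group leader,
# so setsid(1) can create the session and exec in place; $! stays valid.
setsid sleep 30 </dev/null >/dev/null 2>&1 &
child_pid=$!

# The child creates the session itself, so wait briefly for that to happen.
child_sid=
for _ in {1..50}; do
  child_sid=$(sid_of "$child_pid")
  [ "$child_sid" = "$child_pid" ] && break
  sleep 0.1
done

echo "shell session: $shell_sid"
echo "child session: $child_sid (pid $child_pid)"

# Clean up the detached process.
kill "$child_pid"
wait "$child_pid" 2>/dev/null
```

Once the child leads its own session, it is outside the shell's process group, so a signal aimed at that group no longer reaches it.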

pty/tty = true and job control is explicitly enabled in bash

SSH won't connect to the stdout/stderr/stdin of the bash shell or any launched child processes, so it will exit as soon as the parent bash shell has finished executing the requested commands. In this case, with job control explicitly enabled, any process the bash shell backgrounds with '&' is placed into a separate process group immediately and will not receive the SIGHUP signal when the parent process of the BASH session (the SSH connection in this case) exits.

What's needed to fix

I think the solutions just need to be explicitly mentioned in the run/sudo operations documentation as a special case when working with background processes/services. Basically either use 'pty=false', or where that is not possible, explicitly enable job control as the first command, and the behaviour will be correct.
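As a small illustration of "enable job control as the first command", the sketch below only builds the command string to hand to the remote shell. with_job_control is a made-up helper name, startup.sh is the script from the question, and the ssh line is left commented out because user@host is a placeholder.

```shell
#!/usr/bin/env bash
# Prefix a command string with `set -m` so that job control is enabled
# before anything else runs in the remote shell.
# with_job_control is a hypothetical helper written for this example.
with_job_control() {
  printf 'set -m; %s' "$1"
}

remote_cmd=$(with_job_control './startup.sh')
echo "$remote_cmd"

# Hypothetical invocation (user@host is a placeholder); -t requests a pty:
# ssh -t user@host "$remote_cmd"
```

The same string could be passed to whatever tool runs the remote command, as long as set -m ends up being the first thing the remote bash executes.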

From https://github.com/fabric/fabric/issues/395
