SSH connections running in the background don't exit if multiple connections have been started by the same shell

Foreground processes and terminal access control

To understand what is going on, you need to know a little about sharing terminals. What happens when two programs try to read from the same terminal at the same time? Each input byte goes randomly to one of the programs. (Not random as in the kernel uses an RNG to decide, just random as in unpredictable in practice.) The same thing happens when two programs read from a pipe, or any other file type which is a stream of bytes being moved from one place to another (socket, character device, …), rather than a byte array where any byte can be read multiple times (regular file, block device). For example, run a shell in a terminal, figure out the name of the terminal and run cat.

$ tty
/dev/pts/18
$ cat

Then from another terminal, run cat /dev/pts/18. Now type in the terminal, and watch as lines sometimes go to one of the cat processes and sometimes to the other. Lines are dispatched as a whole when the terminal is in cooked mode. If you put the terminal in raw mode then each byte would be dispatched independently.
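
The same effect can be reproduced without a terminal. The following is a minimal Python sketch, not the experiment above: a pipe stands in for the terminal, two threads stand in for the two cat processes, and `share_pipe` is a hypothetical helper name. Each byte written to the pipe ends up with exactly one reader, but which one is not predictable.

```python
import os
import threading


def share_pipe(lines):
    """Write `lines` to a pipe that two reader threads drain concurrently.

    Returns a pair of lists: the chunks received by each reader.
    """
    read_fd, write_fd = os.pipe()
    received = ([], [])

    def reader(bucket):
        while True:
            chunk = os.read(read_fd, 4096)
            if not chunk:  # EOF: the write end has been closed
                break
            bucket.append(chunk)

    threads = [threading.Thread(target=reader, args=(bucket,)) for bucket in received]
    for thread in threads:
        thread.start()
    for line in lines:
        os.write(write_fd, line)
    os.close(write_fd)
    for thread in threads:
        thread.join()
    os.close(read_fd)
    return received


if __name__ == "__main__":
    first, second = share_pipe([b"line %d\n" % i for i in range(100)])
    # The split between the two readers varies from run to run.
    print(len(b"".join(first)), "bytes went to reader 1")
    print(len(b"".join(second)), "bytes went to reader 2")
```

Unlike a terminal in cooked mode, a pipe does not preserve line boundaries, so a reader may receive several lines in one chunk.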

That's messy. Surely there should be a mechanism to decide that one program gets the terminal, and the others don't. Well, there is! It triggers in typical cases, but not in the scenario I set up above. That scenario is unusual because cat /dev/pts/18 wasn't started from /dev/pts/18. It's unusual to access a terminal from a program that wasn't started inside this terminal. In the usual case, you run a shell in a terminal, and you run programs from that shell. Then the rule is that the program in the foreground gets the terminal, and programs in the background don't. This is known as terminal access control. The way it works is:

  • Each process has a controlling terminal (or doesn't have one, typically because it doesn't have any open file descriptor that's a terminal).
  • When a process tries to access its controlling terminal, if the process is not in the foreground, then the kernel blocks it. (Conditions apply. Access to other terminals is not regulated.)
  • The shell decides who is the foreground process. (Foreground process group, actually.) It calls tcsetpgrp to let the kernel know who should be in the foreground.

This works in typical cases. Run a program in a shell, and that program gets to be the foreground process. Run a program in the background (with &), and the program doesn't get to be in the foreground. When the shell is displaying a prompt, the shell puts itself in the foreground. When you resume a suspended job with fg, the job gets to be in the foreground. With bg, it doesn't.
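
A process can check which side of this rule it is on by comparing its own process group with the terminal's foreground process group. Here is a minimal Python sketch of that check; `foreground_status` is a hypothetical helper name, and the sketch only queries the state that the shell sets with tcsetpgrp, it does not change it.

```python
import os


def foreground_status(fd):
    """Report whether the calling process is in the foreground of a terminal.

    Returns True or False if `fd` refers to the controlling terminal, and
    None if it does not (tcgetpgrp fails in that case).
    """
    try:
        foreground_pgrp = os.tcgetpgrp(fd)
    except OSError:
        return None
    return foreground_pgrp == os.getpgrp()


if __name__ == "__main__":
    # File descriptor 0 is standard input.
    print(foreground_status(0))
```

Saved to a file and run from an interactive shell, this should print True when started normally and False when started with &. It prints None when standard input is redirected away from the terminal.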

If a background process tries to read from the terminal, the kernel sends it a SIGTTIN signal. The default action of the signal is to suspend the process (like SIGSTOP). The parent of the process can know about this by calling waitpid with the WUNTRACED flag (or waitid with WSTOPPED); when a child process receives a signal that suspends it, the waitpid call in the parent returns and lets the parent know what the signal was. This is how the shell knows to print “Stopped (tty input)”. What it's telling you is that this job is suspended due to a SIGTTIN.
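
The parent's side of this can be sketched in Python, whose os.waitpid takes os.WUNTRACED to report stopped children. This is an illustration under assumptions, not how any particular shell is written: `wait_for_stop` is a hypothetical helper, and the child stops itself with SIGSTOP instead of being stopped by the kernel with SIGTTIN. SIGSTOP is used because it always stops the process, whereas SIGTTIN is discarded when the process group is orphaned, which can happen outside an interactive shell. A stop caused by SIGTTIN is reported to the parent in the same way.

```python
import os
import signal


def wait_for_stop(stop_signal=signal.SIGSTOP):
    """Fork a child that stops itself; return the stop signal seen by the parent.

    Returns None if the child terminated instead of stopping.
    """
    pid = os.fork()
    if pid == 0:
        # Child: suspend ourselves, then exit once we have been resumed.
        os.kill(os.getpid(), stop_signal)
        os._exit(0)

    # Parent: WUNTRACED makes waitpid return for stopped children as well
    # as for terminated ones.
    _, status = os.waitpid(pid, os.WUNTRACED)
    if not os.WIFSTOPPED(status):
        return None  # the child terminated and has already been reaped
    stopped_by = signal.Signals(os.WSTOPSIG(status))

    # Clean up: resume the child so that it can exit, then reap it.
    os.kill(pid, signal.SIGCONT)
    os.waitpid(pid, 0)
    return stopped_by


if __name__ == "__main__":
    print("child was stopped by", wait_for_stop())
```

A shell does the equivalent for every job, and maps the reported signal to a message such as “Stopped (tty input)” for SIGTTIN.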

Since the process is suspended, nothing will happen to it until it's resumed or killed (with a signal that the process doesn't catch, because if the process has set a signal handler, it won't run since the process is suspended). You can resume the process by sending it a SIGCONT, but that won't achieve anything if the process is reading from the terminal: it'll receive another SIGTTIN immediately. If you resume the process with fg, it goes to the foreground and so the read succeeds.

Now you understand what happens when you run cat in the background:

$ cat &
[1] + Stopped (tty input)        cat

The case of SSH

Now let's do the same thing with SSH.

$ ssh localhost sleep 999999 &
[1] + Stopped (tty input)        ssh localhost sleep 999999

Pressing Enter sometimes goes to the shell (which is in the foreground), and sometimes to the SSH process (at which point it gets stopped by SIGTTIN). Why? If ssh was reading from the terminal, it should receive SIGTTIN immediately, and if it wasn't then why does it receive SIGTTIN?

What's happening is that the SSH process calls the select system call to know when input is available on any of the files it's interested in (or if an output file is ready to receive more data). The input sources include at least the terminal and the network socket. Unlike read, select is not forbidden to background processes, and ssh doesn't receive a SIGTTIN when it calls select. The intent of select is to find out whether data is available, without disrupting anything. Ideally select would not change the system state at all, but in fact this isn't completely true. When select tells the SSH process that input is available on the terminal file descriptor, the kernel has to commit to sending input if the process calls read afterwards. (If it didn't, and the process called read, then there might be no input available at this point, so the return value from select would have been a lie.) So if the kernel decides to route some input to the SSH process, it decides by the time the select system call returns. Then SSH calls read, and at that point the kernel sees that a background process tried to read from the terminal and suspends it with SIGTTIN.
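
The two-step pattern described here, first ask whether input is available and then read it, looks roughly like this in Python. It is a minimal sketch and not OpenSSH's actual code: `poll_then_read` is a hypothetical helper, and the demonstration uses a pipe instead of a terminal so that it can run anywhere. With a terminal and a background process, the select call would still succeed, and the os.read call is where SIGTTIN would be delivered.

```python
import os
import select


def poll_then_read(fd, timeout=1.0):
    """Wait up to `timeout` seconds for input on `fd`, then read it.

    Returns the bytes read, or None if no input became available in time.
    """
    # Step 1: ask whether input is available. This consumes nothing.
    readable, _, _ = select.select([fd], [], [], timeout)
    if fd not in readable:
        return None
    # Step 2: actually take the input. On a terminal, this is the call
    # that is subject to terminal access control.
    return os.read(fd, 4096)


if __name__ == "__main__":
    read_fd, write_fd = os.pipe()
    print(poll_then_read(read_fd, timeout=0))  # nothing written yet
    os.write(write_fd, b"hello\n")
    print(poll_then_read(read_fd))
```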

Note that you don't need to launch multiple connections to the same server. One is enough. Having multiple connections merely increases the probability that the problem arises.

The solution: don't read from the terminal

If you need the SSH session to read from the terminal, run it in the foreground.

If you don't need the SSH session to read from the terminal, make sure that its input is not coming from the terminal. There are two ways to do this:

  • You can redirect the input:

    ssh … </dev/null
  • You can instruct SSH not to forward a terminal connection with -n or -f. (-n is equivalent to </dev/null; -f allows SSH itself to read from the terminal, e.g. to read a password, but the command itself won't have the terminal open.)

    ssh -n …
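
If the SSH client is started from a program and not typed at a shell prompt, the same redirection applies. Below is a minimal Python sketch, assuming the standard subprocess module is used to launch the client; `start_without_terminal_input` is a hypothetical helper name. It connects the child's standard input to /dev/null, like </dev/null does in the shell.

```python
import subprocess
import sys


def start_without_terminal_input(argv, **popen_kwargs):
    """Start `argv` in the background with standard input from /dev/null.

    Extra keyword arguments are passed through to subprocess.Popen.
    """
    return subprocess.Popen(argv, stdin=subprocess.DEVNULL, **popen_kwargs)


if __name__ == "__main__":
    # Stand-in for an SSH command: a child that reads all of its standard
    # input and reports how many characters it got (none, from /dev/null).
    child = start_without_terminal_input(
        [sys.executable, "-c", "import sys; print(len(sys.stdin.read()))"]
    )
    child.wait()
```

For the example from this answer, the call would be start_without_terminal_input(["ssh", "localhost", "sleep", "999999"]).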

Note that the disconnection between the terminal and SSH has to happen on the client. The sleep process running on the server will never read from the terminal, but SSH has no way to know that. If the client receives input on standard input, it must forward it to the server, which will make the data available in a buffer in case the application ever decides to read it (and if the application calls select, it'll be informed that data is available).

You may find help in the man page:

 -n      Redirects stdin from /dev/null (actually, prevents reading from stdin).  This must be used when ssh is run in the
         background.  A common trick is to use this to run X11 programs on a remote machine.  For example, ssh -n
         shadows.cs.hut.fi emacs & will start an emacs on shadows.cs.hut.fi, and the X11 connection will be automatically
         forwarded over an encrypted channel.  The ssh program will be put in the background.  (This does not work if ssh
         needs to ask for a password or passphrase; see also the -f option.)

If that still doesn't help, I'd try -T (disable pseudo-tty allocation), just on a whim.