What if 'kill -9' does not work?

kill -9 (SIGKILL) always works, provided you have the permission to kill the process. Basically either the process must be started by you and not be setuid or setgid, or you must be root. There is one exception: even root cannot send a fatal signal to PID 1 (the init process).

However kill -9 is not guaranteed to work immediately. All signals, including SIGKILL, are delivered asynchronously: the kernel may take its time to deliver them. Usually, delivering a signal takes at most a few microseconds, just the time it takes for the target to get a time slice. However, if the target has blocked the signal, the signal will be queued until the target unblocks it.

Normally, processes cannot block SIGKILL. But kernel code can, and processes execute kernel code when they call system calls. Kernel code blocks all signals when interrupting the system call would result in a badly formed data structure somewhere in the kernel, or more generally in some kernel invariant being violated. So if (due to a bug or misdesign) a system call blocks indefinitely, there may effectively be no way to kill the process. (But the process will be killed if it ever completes the system call.)

A process blocked in a system call is in uninterruptible sleep. The ps or top command will (on most unices) show it in state D (originally for “disk”, I think).

A classical case of long uninterruptible sleep is processes accessing files over NFS when the server is not responding; modern implementations tend not to impose uninterruptible sleep (e.g. under Linux, the intr mount option allows a signal to interrupt NFS file accesses).

You may sometimes see entries marked Z (or H under Linux, I don't know what the distinction is) in the ps or top output. These are technically not processes, they are zombie processes, which are nothing more than an entry in the process table, kept around so that the parent process can be notified of the death of its child. They will go away when the parent process pays attention (or dies).


Sometime process exists and cannot be killed due to:

  • being zombie. I.e. process which parent did not read the exit status. Such process does not consume any resources except PID entry. In top it is signaled Z
  • erroneous uninterruptible sleep. It should not happen but with a combination of buggy kernel code and/or buggy hardware it sometime does. The only method is to reboot or wait. In top it is signaled by D.

It sounds like you might have a zombie process. This is harmless: the only resource a zombie process consumes is an entry in the process table. It will go away when the parent process dies or reacts to the death of its child.

You can see if the process is a zombie by using top or the following command:

ps aux | awk '$8=="Z" {print $2}'

Tags:

Process

Kill