How do I quickly stop a process that is causing thrashing (due to excess memory allocation)?

Press Alt-SysRq-F to kill the process using the most memory:

  • The SysRq key is usually mapped to the Print Screen key.
  • If you're using a graphical desktop, you might need to press Ctrl-Alt-SysRq-F instead, in case Alt-SysRq alone triggers another action (e.g. a screenshot program).
  • If you're using a laptop you might need to press a function key too.
  • For more information read the wikipedia article.
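
If the key combination has no effect, the kernel may have the SysRq facility disabled or restricted; a quick check and a temporary fix (assuming a standard sysctl setup):

$ cat /proc/sys/kernel/sysrq       # 1 = all SysRq functions enabled, 0 = disabled, other values are a bitmask
$ sudo sysctl -w kernel.sysrq=1    # enable all SysRq functions until the next reboot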

I've made a script for this purpose - https://github.com/tobixen/thrash-protect

I've had this script running on production servers, workstations and laptops with good success. It does not kill processes but suspends them temporarily; there have since been several situations where I'm quite sure I'd have lost control due to thrashing if it weren't for this simple script. In the "worst" case the offending process will be slowed down a lot and eventually be killed by the kernel (OOM killer); in the "best" case the offending process will actually complete. Either way, the server or workstation stays relatively responsive, so it's easy to investigate the situation.
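
Suspending a process from user space typically comes down to SIGSTOP/SIGCONT; a bare-bones illustration of the idea, with $pid standing in for the PID of the offending process (the script itself adds the logic for deciding when and what to suspend):

$ kill -STOP "$pid"    # suspend: the process keeps its memory but stops running (and stops generating page faults)
$ kill -CONT "$pid"    # resume it once memory pressure has eased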

Of course, "buy more memory" and "don't use swap" are two alternative, more traditional answers to the question "how do I avoid thrashing?", but in general they tend not to work out so well: installing more memory may be non-trivial, a rogue process can eat up all memory no matter how much is installed, and one can run into thrashing problems even without swap when there isn't enough memory left for buffering/caching. I recommend thrash-protect plus lots of swap space.


  1. What is the quickest way to regain control of a Linux system that has become nonresponsive or extremely sluggish due to excessive swapping?

Already answered above: Alt-SysRq-F.

  2. Is there an effective way to prevent such swapping from occurring in the first place, for instance by limiting the amount of memory a process is allowed to try to allocate?

I'm answering this second part. Yes, ulimit still works well enough to limit a single process. You can:

  • set a soft limit for a process you know will likely go out of control
  • set a hard limit for all processes if you want extra insurance
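
A minimal sketch of the soft/hard distinction in a shell (the values are arbitrary examples):

$ ulimit -H -v $((2*2**20))    # hard cap: 2GB of virtual memory; can be lowered but not raised again without privileges
$ ulimit -S -v $((1*2**20))    # soft cap: 1GB; may be raised later, but never above the hard cap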

Also, as briefly mentioned:

You can use CGroups to limit resource usage and prevent such problems

Indeed, cgroups offer more advanced control, but are currently more complicated to configure in my opinion.

Old school ulimit

Once off

Here's a simple example:

$ bash
$ ulimit -S -v $((1*2**20))
$ r2(){ r2 $@$@;};r2 r2
bash: xmalloc: .././subst.c:3550: cannot allocate 134217729 bytes (946343936 bytes allocated)

It:

  • Sets a soft limit of 1GB overall memory use (ulimit takes the limit in kB, so 1*2**20 kB = 1GB).
  • Runs a recursive bash function call r2(){ r2 $@$@;};r2 r2 that will exponentially chew up CPU and RAM by infinitely doubling itself while requesting stack memory.

As you can see, it got stopped when trying to request more than 1GB.

Note, -v operates on virtual memory allocation (total, i.e. physical + swap).
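
To inspect the limits currently in effect for the shell (or for any other process via /proc/<pid>/limits):

$ ulimit -S -v                            # current soft limit on virtual memory, in kB
$ grep 'address space' /proc/self/limits  # the same limit as the kernel sees it, in bytes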

Permanent protection

To limit virtual memory allocation permanently, use the limits.conf item as (address space), which is the equivalent of ulimit's -v.

I do the following to protect against any single misbehaving process:

  • Set a hard address space limit for all processes.
  • address space limit = <physical memory> - 256MB.
  • Therefore, no single process with greedy memory use or an active loop and memory leak can consume ALL the physical memory.
  • The 256MB of headroom is reserved for essential processes such as ssh or a console.

One liner:

$ sudo bash -c "echo -e \"*\thard\tas\t$(($(grep -E 'MemTotal' /proc/meminfo | grep -oP '(?<=\s)\d+(?=\skB$)') - 256*2**10))\" > /etc/security/limits.d/mem.conf"
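
Broken out for readability, the same calculation looks like this (a sketch; the variable names are mine):

$ mem_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)    # total RAM in kB
$ limit_kb=$((mem_kb - 256*1024))                          # leave 256MB of headroom (in kB)
$ printf '*\thard\tas\t%s\n' "$limit_kb" | sudo tee /etc/security/limits.d/mem.conf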

To validate, this results in the following (e.g. on a 16GB system, after starting a new login session):

$ cat /etc/security/limits.d/mem.conf
*   hard    as      16135196
$ ulimit -H -v
16135196

Notes:

  • Only mitigates against a single process going overboard with memory use.
  • Won't prevent a multi-process workload with heavy memory pressure from causing thrashing (cgroups are then the answer).
  • Don't use the rss option in limits.conf. It's not respected by newer kernels.
  • It's conservative.
    • In theory, a process can speculatively request lots of memory but only actively use a subset (smaller working set/resident memory use).
    • The above hard limit will cause such processes to abort, even though they might otherwise have run fine, given that Linux allows the virtual memory address space to be overcommitted.
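
As an aside, that overcommit behaviour is itself tunable:

$ cat /proc/sys/vm/overcommit_memory    # 0 = heuristic overcommit (default), 1 = always overcommit, 2 = strict accounting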

Newer CGroups

Offers more control, but currently more complex to use:

  • Improves on the ulimit offering:
    • The memory cgroup can account for and limit physical memory separately from swap (memory.limit_in_bytes for RAM, memory.memsw.limit_in_bytes for RAM + swap, memory.max_usage_in_bytes for tracking peak usage).
    • Whereas ulimit -m and/or rss in limits.conf were meant to offer similar functionality, that hasn't worked since Linux kernel 2.4.30!
  • Need to enable some kernel cgroup flags in the bootloader: cgroup_enable=memory swapaccount=1 (see the sketch just after this list).
    • This didn't happen by default with Ubuntu 16.04.
    • Probably due to some performance implications of extra accounting overhead.
  • The cgroup/systemd stuff is relatively new and still changing a fair bit, so the flux upstream means Linux distro vendors haven't yet made it easy to use. Between 14.04 LTS and 16.04 LTS, the user-space tooling for cgroups changed.
    • cgm now seems to be the officially supported userspace tool.
    • systemd unit files don't yet seem to have any pre-defined "vendor/distro" defaults to prioritise important services like ssh.

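A sketch of how those flags are typically added (assuming GRUB; edit the file, then run sudo update-grub and reboot):

# /etc/default/grub: append the flags to the existing kernel command line
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash cgroup_enable=memory swapaccount=1"
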
E.g. to check current usage and stats:

$ echo $(($(cat /sys/fs/cgroup/memory/memory.max_usage_in_bytes) / 2**20)) MB
11389 MB
$ cat /sys/fs/cgroup/memory/memory.stat
...

E.g. to limit the memory of a single process:

$ cgm create memory mem_1G
$ cgm setvalue memory mem_1G memory.limit_in_bytes $((1*2**30))
$ cgm setvalue memory mem_1G memory.memsw.limit_in_bytes $((1*2**30))
$ bash
$ cgm movepid memory mem_1G $$
$ r2(){ r2 $@$@;};r2 r2
Killed

To see it in action chewing up RAM as a background process and then getting killed:

$ bash -c 'cgm movepid memory mem_1G $$; r2(){ r2 $@$@;};r2 r2' & while [ -e /proc/$! ]; do ps -p $! -o pcpu,pmem,rss h; sleep 1; done
[1] 3201
 0.0  0.0  2876
 102  0.2 44056
 103  0.5 85024
 103  1.0 166944
 ...
98.9  5.6 920552
99.1  4.3 718196
[1]+  Killed                  bash -c 'cgm movepid memory mem_1G $$; r2(){ r2 $@$@;};r2 r2'

Note the exponential (power of 2) growth in memory requests.
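
Where cgm isn't available, systemd-run can set up a similar memory-limited scope (a sketch; the property was named MemoryLimit= in systemd versions of that era, MemoryMax= in newer ones):

$ sudo systemd-run --scope -p MemoryLimit=1G bash    # start a shell inside a scope unit capped at 1GB
$ r2(){ r2 $@$@;};r2 r2                              # same memory hog; killed when the scope hits its limit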

In the future, let's hope to see "distro/vendors" pre-configure cgroup priorities and limits (via systemd units) for important things like SSH and the graphical stack, such that they never get starved of memory.
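
Until then, a hand-rolled systemd drop-in is one way to at least shield sshd from the OOM killer (a sketch; assumes the unit is named ssh.service, as on Ubuntu):

# /etc/systemd/system/ssh.service.d/override.conf  (apply with: sudo systemctl daemon-reload && sudo systemctl restart ssh)
[Service]
OOMScoreAdjust=-1000    # exclude sshd from the kernel OOM killer's candidate list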