Limit memory usage for a single Linux process

Another way to limit this is to use Linux's control groups. This is especially useful if you want to limit a process's (or group of processes') allocation of physical memory distinctly from virtual memory. For example:

cgcreate -g memory:myGroup
echo 500M > /sys/fs/cgroup/memory/myGroup/memory.limit_in_bytes
echo 5G > /sys/fs/cgroup/memory/myGroup/memory.memsw.limit_in_bytes

will create a control group named myGroup, cap the set of processes run under myGroup at 500 MB of physical memory with memory.limit_in_bytes, and cap them at 5 GB of physical and swap memory together with memory.memsw.limit_in_bytes. More information about these options is in the kernel's cgroup v1 memory controller documentation.

To run a process under the control group:

cgexec -g memory:myGroup pdftoppm

Note that on a modern Ubuntu distribution this example requires installing the cgroup-bin package (named cgroup-tools on newer releases) and editing /etc/default/grub to change GRUB_CMDLINE_LINUX_DEFAULT to:

GRUB_CMDLINE_LINUX_DEFAULT="cgroup_enable=memory swapaccount=1"

and then running sudo update-grub and rebooting to boot with the new kernel boot parameters.
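As a side note on the values written to the limit files above: the kernel reads the K, M and G suffixes as powers of 1024, so 500M means 500 MiB rather than 500,000,000 bytes. The following is a minimal bash sketch of that arithmetic; to_bytes is a hypothetical helper name used only for illustration and is not part of any cgroup tooling.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of libcgroup): convert a size such as
# 500M or 5G into bytes, treating K/M/G as powers of 1024.
to_bytes() {
    local value=$1 number suffix
    number=${value%[KkMmGg]}      # strip one trailing suffix letter, if any
    suffix=${value#"$number"}     # whatever was stripped (may be empty)
    case $number in
        ''|*[!0-9]*) echo "not a size: $value" >&2; return 1 ;;
    esac
    case $suffix in
        K|k) echo $(( number * 1024 )) ;;
        M|m) echo $(( number * 1024 * 1024 )) ;;
        G|g) echo $(( number * 1024 * 1024 * 1024 )) ;;
        *)   echo "$number" ;;
    esac
}

to_bytes 500M   # prints 524288000
to_bytes 5G     # prints 5368709120
```

After the echo commands above, reading memory.limit_in_bytes back with cat should show the limit as a plain byte count of this kind.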

If your process doesn't spawn children that consume most of the memory, you can use the setrlimit function. The more common user interface for it is the shell's ulimit command:

$ ulimit -Sv 500000     # Set ~500 MB limit (the value is in KiB)
$ pdftoppm ...

This will only limit the "virtual" memory of your process, which also counts (and therefore limits) memory that the invoked process shares with other processes and memory that is mapped but not reserved (for instance, Java's large heap). Still, virtual memory is the closest approximation for processes that grow really large, making these errors insignificant.
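Keep in mind that ulimit changes the limit of the current shell, so every command started from that shell afterwards inherits it. One way to confine the limit to a single command is to set it inside a subshell. The sketch below assumes bash; run_limited is a hypothetical wrapper name, and it only works for external commands because it uses exec.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: apply a soft virtual-memory limit (in KiB) to a
# single command without changing the limits of the calling shell.
run_limited() {
    local limit_kib=$1
    shift
    (
        # The parentheses start a subshell, so the limit set here
        # disappears when the command finishes.
        ulimit -Sv "$limit_kib" || exit 125
        exec "$@"
    )
}

# The limited command sees the new soft limit ...
run_limited 500000 bash -c 'ulimit -Sv'
# ... while the calling shell keeps whatever it had before.
ulimit -Sv
```

With this in place, the example above would become run_limited 500000 pdftoppm ... .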

If your program spawns children, and they are the ones that allocate memory, it becomes more complex, and you should write auxiliary scripts to run processes under your control. I wrote about why and how on my blog.

There are some problems with ulimit. Here's a useful read on the topic: Limiting time and memory consumption of a program in Linux, which led to the timeout tool, which lets you cage a process (and its forks) by time or memory consumption.

The timeout tool requires Perl 5+ and the /proc filesystem mounted. After that you copy the tool to e.g. /usr/local/bin like so:

curl | \
  sudo tee /usr/local/bin/timeout && sudo chmod 755 /usr/local/bin/timeout

After that, you can 'cage' your process by memory consumption as in your question like so:

timeout -m 500 pdftoppm Sample.pdf

Alternatively you could use -t <seconds> and -x <hertz> to respectively limit the process by time or CPU constraints.

The tool works by checking multiple times per second whether the spawned process has exceeded its set boundaries. This means there is a small window in which a process could exceed them before timeout notices and kills it.
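To make the polling idea concrete, here is a minimal bash sketch of the same technique. It is not the timeout tool's actual implementation (that tool is written in Perl); watch_mem is a hypothetical name, the sketch watches only the direct child and not its forks, and it compares the resident set size (VmRSS) from /proc against the limit.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a polling memory cage: start a command, then
# check its resident memory several times per second and kill it once
# the limit (in kB) is exceeded. Requires /proc to be mounted.
watch_mem() {
    local limit_kb=$1
    shift
    "$@" &
    local pid=$! rss
    while kill -0 "$pid" 2>/dev/null; do
        # VmRSS is reported in kB; the line is absent once the process exits.
        rss=$(awk '/^VmRSS:/ {print $2}' "/proc/$pid/status" 2>/dev/null)
        if [ -n "$rss" ] && [ "$rss" -gt "$limit_kb" ]; then
            kill -KILL "$pid" 2>/dev/null
            wait "$pid" 2>/dev/null
            return 137            # same status a shell reports for SIGKILL
        fi
        sleep 0.1                 # poll roughly ten times per second
    done
    wait "$pid"                   # pass through the command's own exit status
}
```

Between two polls the process is unobserved, which is exactly the window described above: an allocation spike shorter than the sleep interval can exceed the limit without being caught.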

A more correct approach would hence likely involve cgroups, but that is much more involved to set up, even if you use Docker or runC, which, among other things, offer a more user-friendly abstraction around cgroups.