Why would anyone choose not to use the lowlatency kernel?

The different configurations, “generic”, “lowlatency” (as configured in Ubuntu), and RT, are all about balancing throughput versus latency. Generic kernels favour throughput over latency, whereas the others favour latency over throughput. Thus users who need throughput more than they need low latency wouldn’t choose a low-latency kernel.

Compared to the generic configuration, the low-latency kernel changes the following settings:

  • IRQs are threaded by default, meaning that more IRQs (still not all IRQs) can be pre-empted, and they can also be prioritised and have their CPU affinity controlled;
  • pre-emption is enabled throughout the kernel (CONFIG_PREEMPT instead of CONFIG_PREEMPT_VOLUNTARY);
  • the latency debugging tools are enabled, so that the user can determine what kernel operations are blocking progress;
  • the timer frequency is set to 1000 Hz instead of 250 Hz.
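
To see which side of these settings a given kernel is on, its build configuration can be inspected. The following is only a rough sketch: it assumes the configuration is available as plain text (on many distributions it is installed under /boot as config-<kernel version>, but that location is an assumption here), it looks only at CONFIG_PREEMPT, CONFIG_PREEMPT_VOLUNTARY and CONFIG_HZ, and the two sample configurations are made up for illustration rather than copied from real Ubuntu kernels. Newer kernels may express the pre-emption model differently.

```python
def parse_kernel_config(text):
    """Return a dict of the CONFIG_* options that are set in kernel config text."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        # Unset options appear as comments ("# CONFIG_FOO is not set").
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            options[key] = value
    return options


def summarise(options):
    """Report the pre-emption model and timer frequency found in the options."""
    if options.get("CONFIG_PREEMPT") == "y":
        model = "full"
    elif options.get("CONFIG_PREEMPT_VOLUNTARY") == "y":
        model = "voluntary"
    else:
        model = "unknown"
    hz = options.get("CONFIG_HZ")
    return {"preemption": model, "timer_hz": int(hz) if hz else None}


# Hypothetical excerpts, written by hand to mirror the list above.
GENERIC_SAMPLE = """\
CONFIG_PREEMPT_VOLUNTARY=y
# CONFIG_PREEMPT is not set
CONFIG_HZ=250
"""

LOWLATENCY_SAMPLE = """\
# CONFIG_PREEMPT_VOLUNTARY is not set
CONFIG_PREEMPT=y
CONFIG_HZ=1000
"""

if __name__ == "__main__":
    for name, sample in (("generic", GENERIC_SAMPLE),
                         ("lowlatency", LOWLATENCY_SAMPLE)):
        print(name, summarise(parse_kernel_config(sample)))
```

To check a real system, the text of its configuration file can be passed to parse_kernel_config() instead of the samples.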

RT kernels add a number of patches to the mainline kernel, and a few more configuration tweaks. The purpose of most of those patches is to allow more opportunities for pre-emption, by removing or splitting up locks, and to reduce the amount of time the kernel spends handling uninterruptible tasks (notably, by improving the logging mechanisms and using them less). The goal of all this is to allow the kernel to meet deadlines, i.e. ensure that, when it is required to handle something, it isn’t busy doing something else; this isn’t the same as high throughput or low latency, but fixing latency issues helps.

The generic kernels, as configured by default in most distributions, are designed to be a “sensible” compromise: they try to ensure that no single task can monopolise the system for too long, and that tasks can switch reasonably frequently, but without compromising throughput — because the more time the kernel spends considering whether to switch tasks (inside or outside the kernel), or handling interrupts, the less time the system as a whole can spend “working”. That compromise isn’t good enough for latency-sensitive workloads such as real-time audio or video processing: for those, low-latency kernels provide lower latencies at the expense of some throughput. And for real-time requirements, the real-time kernels remove as many low-latency-blockers as possible, at the expense of more throughput.

Mainstream distributions of Linux are mostly installed on servers, where traditionally latency hasn’t been considered all that important (although if you do percentile performance analysis, and care about top percentile performance, you might disagree), so the default kernels are quite conservative. Desktop users should probably use the low-latency kernels, as suggested by the kernel’s own documentation. In fact, the more low-latency kernels are used, the more feedback there will be on their relevance, which helps get generally-applicable improvements into the default kernel configurations; the same goes for the RT kernels (many of the RT patches are intended, at some point, for the mainline kernel).

This presentation on the topic provides quite a lot of background.

Stephen Kitt explained the configurations and the balance between them nicely, in technical terms. I would like to offer just a small intuitive distinction:

  • You are on safari, riding through the terrain in a jeep. Your prey is running. When the prey is in the crosshairs, you pull the trigger and the rifle fires. The computation is simple: prey in crosshairs = hit, prey not in crosshairs = miss. You desperately need low latency. Afterwards you recover, reload the rifle and find another prey, with no need for extra speed and no need for regularity. Latency is all.

  • You are converting a video from that safari. It is long and it takes hours. You do not care when a particular frame is processed, or whether some frames take more time to process than others. You need to finish the process as fast as possible: better throughput means fewer hours, and nothing else matters.

  • You are receiving a telegraph message: just shorts, longs and spaces. Morse is easy to decipher, and you do not need to know exactly when each pulse started or ended, but you need a guarantee that you will not miss any of them. You need real time (it may be slow, since telegraph is not so fast, but it must be regular).

In these three examples you clearly select just one of latency, throughput or regularity, sacrificing the other two, for obvious reasons. And in only one of them do you really want low latency, if you cannot have all three at the same time.

Because there is a trade-off. Switching processes or entering/exiting interrupts takes time. For example, running the scheduler at 1000 Hz instead of 250 Hz means you’ll have a timer interrupt, and potentially switch processes, four times as often. This can allow a process to react faster because it gets a chance to execute more frequently. However, as a human you’ll probably not notice any difference (250 Hz means every 4 ms, which is already much faster than any human reaction time).

The total amount of processing power or I/O throughput is limited and calling the scheduler more often only means you’ll waste a part of it.
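
To put rough numbers on that, here is a small back-of-the-envelope sketch. The tick periods follow directly from the frequencies; the cost of handling one timer interrupt is a purely hypothetical figure (5 µs is assumed here just to have something to multiply), so only the ratio between the two results means anything.

```python
def tick_period_ms(hz):
    """Time between two timer interrupts, in milliseconds."""
    return 1000.0 / hz


def tick_overhead_fraction(hz, cost_per_tick_us):
    """Fraction of one CPU's time spent handling timer ticks.

    cost_per_tick_us is an assumed, illustrative cost per interrupt,
    not a measured value.
    """
    return hz * cost_per_tick_us / 1_000_000.0


if __name__ == "__main__":
    ASSUMED_COST_US = 5.0  # hypothetical cost of one timer interrupt
    for hz in (250, 1000):
        period = tick_period_ms(hz)
        overhead = tick_overhead_fraction(hz, ASSUMED_COST_US)
        print(f"{hz} Hz: one tick every {period} ms, "
              f"{overhead:.3%} of CPU time spent on ticks")
```

Whatever the real per-tick cost is, going from 250 Hz to 1000 Hz multiplies this particular overhead by four, while the best-case reaction time only drops from 4 ms to 1 ms.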