What is the difference between likely and unlikely calls in Kernel?

They are compiler hints for GCC. They're used in conditionals to tell the compiler if a branch is likely to be taken or not. It can help the compiler laying down the code in such a way that's optimal for the most frequent outcome.

They are used like this:

if (likely(some_condition)) {
  // the compiler will try and make the code layout optimal for the case
  // where some_condition is true, i.e. where this block is run
  most_likely_action();
} else {
  // this block is less frequently used
  corner_case();
}

It should be used with great care (i.e. based on actual branch profiling results). A wrong hint can degrade performance (obviously).

Some examples of how the code can be optimized are easily found by searching for GCC __builtin_expect. This blog post gcc optimisation: __builtin_expect for example details a disassembly with it.

The kind of optimizations that can be done is very processor-specific. The general idea is that often, processors will run code faster if it does not branch/jump all over the place. The more linear it is, and the more predictable the branches are, the faster it will run. (This is especially true for processors with deep pipelines for example.)

So the compiler will emit the code such that the most likely branch will not involve a jump if that's what the target CPU prefers, for instance.


Let's decompile to see what GCC 4.8 does with it

Without expect

#include "stdio.h"
#include "time.h"

int main() {
    /* Use time to prevent it from being optimized away. */
    int i = !time(NULL);
    if (i)
        printf("%d\n", i);
    puts("a");
    return 0;
}

Compile and decompile with GCC 4.8.2 x86_64 Linux:

gcc -c -O3 -std=gnu11 main.c
objdump -dr main.o

Output:

0000000000000000 <main>:
   0:       48 83 ec 08             sub    $0x8,%rsp
   4:       31 ff                   xor    %edi,%edi
   6:       e8 00 00 00 00          callq  b <main+0xb>
                    7: R_X86_64_PC32        time-0x4
   b:       48 85 c0                test   %rax,%rax
   e:       75 14                   jne    24 <main+0x24>
  10:       ba 01 00 00 00          mov    $0x1,%edx
  15:       be 00 00 00 00          mov    $0x0,%esi
                    16: R_X86_64_32 .rodata.str1.1
  1a:       bf 01 00 00 00          mov    $0x1,%edi
  1f:       e8 00 00 00 00          callq  24 <main+0x24>
                    20: R_X86_64_PC32       __printf_chk-0x4
  24:       bf 00 00 00 00          mov    $0x0,%edi
                    25: R_X86_64_32 .rodata.str1.1+0x4
  29:       e8 00 00 00 00          callq  2e <main+0x2e>
                    2a: R_X86_64_PC32       puts-0x4
  2e:       31 c0                   xor    %eax,%eax
  30:       48 83 c4 08             add    $0x8,%rsp
  34:       c3                      retq

The instruction order in memory was unchanged: first the printf and then puts and the retq return.

With expect

Now replace if (i) with:

if (__builtin_expect(i, 0))

and we get:

0000000000000000 <main>:
   0:       48 83 ec 08             sub    $0x8,%rsp
   4:       31 ff                   xor    %edi,%edi
   6:       e8 00 00 00 00          callq  b <main+0xb>
                    7: R_X86_64_PC32        time-0x4
   b:       48 85 c0                test   %rax,%rax
   e:       74 11                   je     21 <main+0x21>
  10:       bf 00 00 00 00          mov    $0x0,%edi
                    11: R_X86_64_32 .rodata.str1.1+0x4
  15:       e8 00 00 00 00          callq  1a <main+0x1a>
                    16: R_X86_64_PC32       puts-0x4
  1a:       31 c0                   xor    %eax,%eax
  1c:       48 83 c4 08             add    $0x8,%rsp
  20:       c3                      retq
  21:       ba 01 00 00 00          mov    $0x1,%edx
  26:       be 00 00 00 00          mov    $0x0,%esi
                    27: R_X86_64_32 .rodata.str1.1
  2b:       bf 01 00 00 00          mov    $0x1,%edi
  30:       e8 00 00 00 00          callq  35 <main+0x35>
                    31: R_X86_64_PC32       __printf_chk-0x4
  35:       eb d9                   jmp    10 <main+0x10>

The printf (compiled to __printf_chk) was moved to the very end of the function, after puts and the return to improve branch prediction as mentioned by other answers.

So it is basically the same as:

int i = !time(NULL);
if (i)
    goto printf;
puts:
puts("a");
return 0;
printf:
printf("%d\n", i);
goto puts;

This optimization was not done with -O0.

But good luck on writing an example that runs faster with __builtin_expect than without, CPUs are really smart those days. My naive attempts are here.

C++20 [[likely]] and [[unlikely]]

C++20 has standardized those C++ built-ins: https://stackoverflow.com/questions/51797959/how-to-use-c20s-likely-unlikely-attribute-in-if-else-statement They will likely (a pun!) do the same thing.

Tags:

Linux

Kernel