What is the difference between likely and unlikely calls in Kernel?
They are compiler hints for GCC. They're used in conditionals to tell the compiler if a branch is likely to be taken or not. It can help the compiler laying down the code in such a way that's optimal for the most frequent outcome.
They are used like this:
if (likely(some_condition)) {
// the compiler will try and make the code layout optimal for the case
// where some_condition is true, i.e. where this block is run
most_likely_action();
} else {
// this block is less frequently used
corner_case();
}
It should be used with great care (i.e. based on actual branch profiling results). A wrong hint can degrade performance (obviously).
Some examples of how the code can be optimized are easily found by searching for GCC __builtin_expect
. This blog post gcc optimisation: __builtin_expect for example details a disassembly with it.
The kind of optimizations that can be done is very processor-specific. The general idea is that often, processors will run code faster if it does not branch/jump all over the place. The more linear it is, and the more predictable the branches are, the faster it will run. (This is especially true for processors with deep pipelines for example.)
So the compiler will emit the code such that the most likely branch will not involve a jump if that's what the target CPU prefers, for instance.
Let's decompile to see what GCC 4.8 does with it
Without expect
#include "stdio.h"
#include "time.h"
int main() {
/* Use time to prevent it from being optimized away. */
int i = !time(NULL);
if (i)
printf("%d\n", i);
puts("a");
return 0;
}
Compile and decompile with GCC 4.8.2 x86_64 Linux:
gcc -c -O3 -std=gnu11 main.c
objdump -dr main.o
Output:
0000000000000000 <main>:
0: 48 83 ec 08 sub $0x8,%rsp
4: 31 ff xor %edi,%edi
6: e8 00 00 00 00 callq b <main+0xb>
7: R_X86_64_PC32 time-0x4
b: 48 85 c0 test %rax,%rax
e: 75 14 jne 24 <main+0x24>
10: ba 01 00 00 00 mov $0x1,%edx
15: be 00 00 00 00 mov $0x0,%esi
16: R_X86_64_32 .rodata.str1.1
1a: bf 01 00 00 00 mov $0x1,%edi
1f: e8 00 00 00 00 callq 24 <main+0x24>
20: R_X86_64_PC32 __printf_chk-0x4
24: bf 00 00 00 00 mov $0x0,%edi
25: R_X86_64_32 .rodata.str1.1+0x4
29: e8 00 00 00 00 callq 2e <main+0x2e>
2a: R_X86_64_PC32 puts-0x4
2e: 31 c0 xor %eax,%eax
30: 48 83 c4 08 add $0x8,%rsp
34: c3 retq
The instruction order in memory was unchanged: first the printf
and then puts
and the retq
return.
With expect
Now replace if (i)
with:
if (__builtin_expect(i, 0))
and we get:
0000000000000000 <main>:
0: 48 83 ec 08 sub $0x8,%rsp
4: 31 ff xor %edi,%edi
6: e8 00 00 00 00 callq b <main+0xb>
7: R_X86_64_PC32 time-0x4
b: 48 85 c0 test %rax,%rax
e: 74 11 je 21 <main+0x21>
10: bf 00 00 00 00 mov $0x0,%edi
11: R_X86_64_32 .rodata.str1.1+0x4
15: e8 00 00 00 00 callq 1a <main+0x1a>
16: R_X86_64_PC32 puts-0x4
1a: 31 c0 xor %eax,%eax
1c: 48 83 c4 08 add $0x8,%rsp
20: c3 retq
21: ba 01 00 00 00 mov $0x1,%edx
26: be 00 00 00 00 mov $0x0,%esi
27: R_X86_64_32 .rodata.str1.1
2b: bf 01 00 00 00 mov $0x1,%edi
30: e8 00 00 00 00 callq 35 <main+0x35>
31: R_X86_64_PC32 __printf_chk-0x4
35: eb d9 jmp 10 <main+0x10>
The printf
(compiled to __printf_chk
) was moved to the very end of the function, after puts
and the return to improve branch prediction as mentioned by other answers.
So it is basically the same as:
int i = !time(NULL);
if (i)
goto printf;
puts:
puts("a");
return 0;
printf:
printf("%d\n", i);
goto puts;
This optimization was not done with -O0
.
But good luck on writing an example that runs faster with __builtin_expect
than without, CPUs are really smart those days. My naive attempts are here.
C++20 [[likely]]
and [[unlikely]]
C++20 has standardized those C++ built-ins: https://stackoverflow.com/questions/51797959/how-to-use-c20s-likely-unlikely-attribute-in-if-else-statement They will likely (a pun!) do the same thing.