What is the purpose of the RBP register in x86_64 assembler?

Linux uses the System V ABI for x86-64 (AMD64) architecture; see System V ABI at OSDev Wiki for details.

Under this ABI the stack grows down: pushing a value decreases %rsp, so smaller addresses are "higher up" in the stack. A typical C function is compiled to

        pushq   %rbp        # Save address of previous stack frame
        movq    %rsp, %rbp  # Address of current stack frame
        subq    $16, %rsp   # Reserve 16 bytes for local variables

        # ... function ...

        movq    %rbp, %rsp  # \ equivalent to the
        popq    %rbp        # / 'leave' instruction
        ret

The amount of memory reserved for the local variables is always a multiple of 16 bytes, to keep the stack aligned to 16 bytes. If no stack space is needed for local variables, there is no subq $16, %rsp or similar instruction.
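As a rough illustration of that rounding, here is a minimal C sketch. The helper name reserve_with_frame_pointer is made up for this example, and a real compiler may reserve more (for spills or outgoing arguments) than the locals alone would suggest.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: the N in `subq $N, %rsp` for a function that keeps
 * its frame pointer, assuming only the local variables need stack space.
 * After `pushq %rbp` the stack is 16-byte aligned, so N is rounded up to
 * the next multiple of 16. */
size_t reserve_with_frame_pointer(size_t local_bytes)
{
    size_t reserved = (local_bytes + 15u) & ~(size_t)15u;
    assert(reserved % 16 == 0);
    return reserved;
}
```

For example, 1 byte of locals would reserve 16 bytes and 17 bytes would reserve 32; with no locals the result is 0, matching the case where no subq is emitted.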

(Note that the return address and the previous %rbp pushed to the stack are both 8 bytes in size, 16 bytes in total.)

While %rbp points to the current stack frame, %rsp points to the top of the stack. Because the compiler knows the difference between %rbp and %rsp at any point within the function, it is free to use either one as the base for the local variables.

A stack frame is just the local function's playground: the region of stack the current function uses.
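If you want to poke at a frame from C, the following sketch uses the GCC/Clang builtins __builtin_frame_address and __builtin_return_address. It assumes x86-64 with the prologue shown earlier, so that the slot at %rbp holds the caller's saved %rbp and the slot 8 bytes after it holds the return address. The function name is invented for this example.

```c
#include <stddef.h>

/* Hypothetical check, assuming the x86-64 frame layout described above.
 * noinline keeps the function from being merged into its caller, so it
 * has a frame of its own. */
__attribute__((noinline))
int frame_slot_holds_return_address(void)
{
    /* Value of %rbp inside this function (GCC/Clang builtin). */
    void **frame = __builtin_frame_address(0);
    /* Address this function will return to (GCC/Clang builtin). */
    void *ret = __builtin_return_address(0);

    /* frame[0] is the saved %rbp of the caller, frame[1] the return address. */
    return frame != NULL && frame[1] == ret;
}
```

Using __builtin_frame_address makes the compiler set up %rbp for this one function, so the check should not depend on -fno-omit-frame-pointer; treat it as an experiment rather than something to rely on in portable code.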

Current versions of GCC omit the frame pointer (and with it the explicit stack frame setup shown above) whenever optimizations are used. This makes sense, because for programs written in C, the stack frames are most useful for debugging, but not much else. (You can use e.g. -O2 -fno-omit-frame-pointer to keep stack frames while otherwise enabling optimizations, however.)

Although the same ABI applies to all binaries, no matter what language they are written in, certain other languages do need stack frames for "unwinding" (for example, to "throw exceptions" to an ancestor caller of the current function); that is, to "unwind" stack frames so that one or more functions can be aborted and control passed to some ancestor function, without leaving unneeded stuff on the stack.

When stack frames are omitted (-fomit-frame-pointer for GCC), the function implementation changes essentially to

        subq    $8, %rsp    # Re-align stack frame, and
                            # reserve memory for local variables

        # ... function ...

        addq    $8, %rsp
        ret

Because there is no stack frame (%rbp is used for other purposes, and its value is never pushed to stack), each function call pushes only the return address to the stack, which is an 8-byte quantity, so we need to subtract 8 from %rsp to keep it a multiple of 16. (In general, the value subtracted from and added to %rsp is an odd multiple of 8.)
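The "odd multiple of 8" rule can be sketched in C as follows. The helper name is hypothetical, and the sketch assumes a function that calls other functions and needs stack space only for its locals; a leaf function may skip the adjustment entirely.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: the N in `subq $N, %rsp` when the frame pointer is
 * omitted. On entry %rsp is 8 bytes past a 16-byte boundary, because `call`
 * pushed the 8-byte return address, so N must itself be 8 more than a
 * multiple of 16 to restore the alignment. */
size_t reserve_without_frame_pointer(size_t local_bytes)
{
    size_t reserved = ((local_bytes + 8u + 15u) & ~(size_t)15u) - 8u;
    assert(reserved % 16 == 8);
    return reserved;
}
```

With up to 8 bytes of locals this gives 8, matching the listing above; 9 to 24 bytes give 24, and so on.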

Function parameters are typically passed in registers. See the ABI link at the beginning of this answer for details, but in short, the first six integral and pointer arguments are passed in registers %rdi, %rsi, %rdx, %rcx, %r8, and %r9 (in that order), and the first eight floating-point arguments in the %xmm0 to %xmm7 registers. Any further arguments are passed on the stack.
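To make the ordering concrete, here is a small lookup sketch in C. Both function names are invented for this example, and it deliberately ignores the harder cases in the ABI (structs, variadic functions, long double), so treat it as a memory aid rather than a full classifier.

```c
#include <stddef.h>

/* Hypothetical lookup: register for the index-th (0-based) integer or
 * pointer argument, or NULL when the argument would go on the stack. */
const char *integer_arg_register(size_t index)
{
    static const char *const regs[] = {
        "%rdi", "%rsi", "%rdx", "%rcx", "%r8", "%r9"
    };
    return index < sizeof regs / sizeof regs[0] ? regs[index] : NULL;
}

/* Hypothetical lookup: register for the index-th (0-based) float or
 * double argument, or NULL when the argument would go on the stack. */
const char *sse_arg_register(size_t index)
{
    static const char *const regs[] = {
        "%xmm0", "%xmm1", "%xmm2", "%xmm3",
        "%xmm4", "%xmm5", "%xmm6", "%xmm7"
    };
    return index < sizeof regs / sizeof regs[0] ? regs[index] : NULL;
}
```

Integer and floating-point arguments are counted separately, so for a call like f(int a, double b, int c) you would look up integer indexes 0 and 1 for a and c, and floating-point index 0 for b.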

In some cases you'll see rep ret instead of ret. Don't be confused: the rep ret means the exact same thing as ret; the rep prefix, although normally used with string instructions (repeated instructions), does nothing when applied to the ret instruction. It's just that certain AMD processors' branch predictors don't like jumping to a ret instruction, and the recommended workaround is to use a rep ret there instead.

Finally, I've omitted the red zone above the top of the stack (the 128 bytes at addresses less than %rsp). This is because it is not really useful for typical functions: In the normal have-stack-frame case, you'll want your local stuff to be within the stack frame, to make debugging possible. In the omit-stack-frame case, stack alignment requirements already mean we need to subtract 8 from %rsp, so including the memory needed by the local variables in that subtraction costs nothing.


rbp is the frame pointer on x86_64. In your generated code, it gets a snapshot of the stack pointer (rsp) so that when adjustments are made to rsp (e.g. reserving space for local variables or pushing values onto the stack), local variables and function parameters are still accessible from a constant offset from rbp.

A lot of compilers offer frame pointer omission as an optimization option; this will make the generated assembly code access variables relative to rsp instead and free up rbp as another general purpose register for use in functions.

In the case of GCC, which I'm guessing you're using from the AT&T assembler syntax, that switch is -fomit-frame-pointer. Try compiling your code with that switch and see what assembly code you get. You will probably notice that when accessing values relative to rsp instead of rbp, the offset from the pointer varies throughout the function.
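If you don't have a suitable test case at hand, something like the following should do. The file name frame.c and the function are made up for this example; the two GCC invocations in the comment use only the switches mentioned above plus -S (emit assembly) and -O0 (no optimization).

```c
/* frame.c (hypothetical file name)
 *
 * Compare the generated assembly with and without a frame pointer:
 *
 *   gcc -S -O0 -fno-omit-frame-pointer frame.c -o with_fp.s
 *   gcc -S -O0 -fomit-frame-pointer    frame.c -o without_fp.s
 */

/* A function with a local array, so the compiler has to reserve stack
 * space and address it relative to rbp or rsp. */
int sum_of_squares(int n)
{
    int squares[8];
    int total = 0;

    for (int i = 0; i < 8 && i < n; i++) {
        squares[i] = i * i;
        total += squares[i];
    }
    return total;
}
```

In with_fp.s you would expect the locals to be addressed as negative offsets from %rbp; in without_fp.s they should appear as offsets from %rsp instead. The exact offsets depend on your GCC version.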