Function Calls in x86 Assembly

The following code shows a simple C program that makes two function calls. It first calls getenv to get the value of the environment variable named in argv[1], and then it prints this value with printf:

#include <stdio.h>
#include <stdlib.h>

int
main(int argc, char *argv[])
{
    printf("%s=%s\n", argv[1], getenv(argv[1]));

    return 0;
}

The following shows the corresponding assembly code (in Intel syntax), obtained by compiling the C program with gcc 5.4.0 and then disassembling it with objdump:

Contents of section .rodata:
400630 01000200 25733d25 730a00 ....%s=%s..
Contents of section .text:
0000000000400566 <main>:
400566: push rbp
400567: mov rbp,rsp
40056a: sub rsp,0x10
40056e: mov DWORD PTR [rbp-0x4],edi
400571: mov QWORD PTR [rbp-0x10],rsi
400575: mov rax,QWORD PTR [rbp-0x10]
400579: add rax,0x8
40057d: mov rax,QWORD PTR [rax]
400580: mov rdi,rax
400583: call 400430 <getenv@plt>
400588: mov rdx,rax
40058b: mov rax,QWORD PTR [rbp-0x10]
40058f: add rax,0x8
400593: mov rax,QWORD PTR [rax]
400596: mov rsi,rax
400599: mov edi,0x400634
40059e: mov eax,0x0
4005a3: call 400440 <printf@plt>
4005a8: mov eax,0x0
4005ad: leave
4005ae: ret

The compiler stores the string constant %s=%s\n used in the printf call separately from the code, in the .rodata (read-only data) section. The section dump starts at 0x400630 and the string begins 4 bytes in, at address 0x400634. You'll see this address used later in the code as a printf argument.

In principle, each function in an x86 Linux program has its own function frame (also called stack frame) on the stack, delimited by rbp (the base pointer) pointing to the base of that function frame and rsp pointing to the top. Function frames are used to store the function's stack-based data.

The following figure shows the function frames created for main and getenv when you run the program:

The first thing main does is run a prologue that sets up its function frame. This prologue starts by saving the contents of the rbp register on the stack and then copying rsp into rbp:

0000000000400566 <main>:
400566: push rbp
400567: mov rbp,rsp

After setting up a basic function frame, main decrements rsp by 0x10 bytes to reserve 16 bytes on the stack, room for two 8-byte slots that will hold its local variables:

40056a: sub rsp,0x10

On x86-64 Linux systems, the first six integer or pointer arguments to a function are passed in rdi, rsi, rdx, rcx, r8, and r9, respectively. What happens when there are more than six arguments, or when some arguments don't fit in a 64-bit register? In that case, the remaining arguments are pushed onto the stack in reverse order, as follows:

mov rdi, param1
mov rsi, param2
mov rdx, param3
mov rcx, param4
mov r8, param5
mov r9, param6
push param9
push param8
push param7

After reserving room on the stack, main copies argc (passed in rdi; only its lower 32 bits, edi, are needed for an int) into one of the local variables and argv (passed in rsi) into the other:

40056e: mov DWORD PTR [rbp-0x4],edi
400571: mov QWORD PTR [rbp-0x10],rsi

Preparing Arguments and Calling a Function

After the prologue, main loads argv[1] into rax. It first loads the address of argv[0] from the local variable at rbp-0x10, then adds 8 bytes (the size of a pointer), and finally dereferences the resulting pointer to get argv[1] (the instructions at 0x400575 through 0x40057d in the full listing). It then copies this pointer into rdi to serve as the argument for getenv:

400580: mov rdi,rax

Then it calls getenv:

400583: call 400430 <getenv@plt>

The call instruction automatically pushes the return address (the address of the instruction right after the call, here 0x400588) onto the stack, where getenv will find it when it returns.

Finally, getenv executes a ret instruction that pops the return address from the stack and returns there, restoring control to main.

Reading Return Values

getenv leaves its return value in rax. The main function copies this value into rdx to serve as the third argument of the printf call. Next, main loads argv[1] again in the same way as before and stores it in rsi as the second argument for printf:

400588: mov rdx,rax
40058b: mov rax,QWORD PTR [rbp-0x10]
40058f: add rax,0x8
400593: mov rax,QWORD PTR [rax]
400596: mov rsi,rax

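Before the call, main also loads the address of the format string (0x400634) into edi as the first argument and sets eax to 0, which tells a variadic function such as printf that no vector registers carry arguments (the instructions at 0x400599 and 0x40059e in the full listing). After preparing the arguments, main calls printf: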

4005a3: call 400440 <printf@plt>

Returning from a Function

After printf completes, main prepares its own return value (the exit status) by zeroing out the rax register (writing to eax also clears the upper half of rax):

4005a8: mov eax,0x0

Then it executes a leave instruction, which is x86's shorthand for:

mov rsp,rbp  ; epilogue: undoes what the prologue did
pop rbp

Finally, main executes a ret instruction, which pops the saved return address from the top of the stack and returns there, handing control back to whatever function called main.

Credits to the PBA book from No Starch Press.