Basics: Understanding the stack in assembly

In this post, I’d like to talk about the stack and how it works in assembly. We will also examine the stack with gdb. Understanding the stack is crucial for reverse engineering or writing certain types of exploits.

Before reading this, you should already have a basic idea of what processor registers are (at least you should know that you can store data there) and not be afraid of dealing with a few simple assembly instructions. If you want to follow along with the instructions, you should have a Linux system with gcc and gdb ready.

Theory

The stack is a simple last in, first out (LIFO) structure, meaning that what you put in last is what you get out first. You may visualize it like a stack of blocks in memory:

             ______
 ______     |______|     ______
|______|    |______|    |______|
|______|    |______|    |______|

initial      *push*       *pop*

Values are placed onto the stack via push and read (and removed) via pop. So when you see the instruction push eax in assembly, it means that the value currently stored in register eax is pushed on top of the stack. When you see pop eax, it means that the value currently on top of the stack is loaded into eax and removed from the stack (so the next pop would read the next value from the stack etc.).
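If you prefer code over pictures, here is a minimal C sketch of the same idea. It only models the LIFO behaviour and is not how the processor implements its stack; the names ToyStack, toy_push and toy_pop are made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>

#define STACK_CAPACITY 16

/* A toy LIFO stack of ints; 'count' is the number of stored values. */
typedef struct {
    int values[STACK_CAPACITY];
    size_t count;
} ToyStack;

/* Put a value on top of the stack. */
static void toy_push(ToyStack *s, int value) {
    assert(s->count < STACK_CAPACITY); /* no overflow handling in this sketch */
    s->values[s->count++] = value;
}

/* Read the value on top of the stack and remove it. */
static int toy_pop(ToyStack *s) {
    assert(s->count > 0); /* popping an empty stack is a bug in this sketch */
    return s->values[--s->count];
}
```

Starting with count set to 0, pushing 1, 2 and 3 and then popping three times hands the values back as 3, 2 and 1.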

To keep track of the stack, the system uses the base pointer ebp and the stack pointer esp, where esp points to the top of the stack and ebp points to its bottom. When you push to or pop from the stack, esp is adjusted accordingly: the memory address contained in esp is decreased when you push to the stack and increased when you pop from it. This may be surprising, as one would expect the stack to grow towards higher memory (so that the address in esp would increase when you put something on the stack and decrease when you take something off). But that is not the case.

A small but very important caveat is that the stack grows downwards in memory. So it starts at a high memory address and extends into lower memory regions. Therefore, ebp will usually be greater (in terms of memory regions) than esp.

         Stack top     Lower Memory

            /\
            :          extends here
            :
          ______    
esp ---> |______|      0x080483f0       
         |______|      0x080483f4    
ebp ---> |______|      0x080483f8

        Stack bottom   Higher Memory
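
The following C sketch models this downward growth. It is a toy model under simplifying assumptions: memory is just a 64-byte array, esp is an offset into that array rather than a real address, and ToyMachine, machine_init, machine_push and machine_pop are hypothetical names of my own.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MEM_SIZE 64u

/* A toy machine: a small byte array as memory and esp as an offset into it. */
typedef struct {
    uint8_t  mem[MEM_SIZE];
    uint32_t esp; /* offset of the current top of the stack */
} ToyMachine;

/* An empty stack starts at the highest address of our toy memory. */
static void machine_init(ToyMachine *m) {
    memset(m->mem, 0, sizeof m->mem);
    m->esp = MEM_SIZE;
}

/* push: first decrease esp by 4, then store the value at the new top. */
static void machine_push(ToyMachine *m, uint32_t value) {
    assert(m->esp >= 4); /* no room left in this sketch */
    m->esp -= 4;
    memcpy(&m->mem[m->esp], &value, sizeof value);
}

/* pop: load the value at the top, then increase esp by 4. */
static uint32_t machine_pop(ToyMachine *m) {
    uint32_t value;
    assert(m->esp + 4 <= MEM_SIZE); /* nothing to pop in this sketch */
    memcpy(&value, &m->mem[m->esp], sizeof value);
    m->esp += 4;
    return value;
}
```

After machine_init, esp is 64. Two pushes bring it down to 60 and then 56, and two pops bring it back up to 64.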

We can observe the stack when we analyze our first small C program:

int main()
{
   int a = 10;
   int b = 5;
   int c = 2;
   return 0;
}

This program does nothing more than declaring three local variables - variables we will observe in memory and on the stack soon. We save this program in a file called stack1.c and compile it with gcc -o stack1 -m32 stack1.c

The -o option specifies the output file and -m32 produces a 32-bit binary, which is necessary when you work on a 64-bit system and want to observe typical 32-bit assembly. Some things are different in 64-bit but we might get to that later.

The compilation leaves us with a file called stack1. To disassemble it, you can use a disassembler of your choice, like objdump, radare2 or Hopper, or simply use the disassembly capabilities of gdb.

Call your program with

gdb stack1

to fire up gdb:

$ gdb stack1
GNU gdb (Ubuntu 7.7.1-0ubuntu5~14.04.2) 7.7.1
Copyright (C) 2014 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from stack1...(no debugging symbols found)...done.

Then we set the syntax of the disassembly to Intel (set disassembly is an abbreviation of set disassembly-flavor):

(gdb) set disassembly intel

and show the gdb disassembly of the main function:

(gdb) disas main
Dump of assembler code for function main:
   0x080483ed <+0>:	push   ebp
   0x080483ee <+1>:	mov    ebp,esp
   0x080483f0 <+3>:	sub    esp,0x10
   0x080483f3 <+6>:	mov    DWORD PTR [ebp-0xc],0xa
   0x080483fa <+13>:	mov    DWORD PTR [ebp-0x8],0x5
   0x08048401 <+20>:	mov    DWORD PTR [ebp-0x4],0x2
   0x08048408 <+27>:	mov    eax,0x0
   0x0804840d <+32>:	leave  
   0x0804840e <+33>:	ret  

When you disassemble the whole file, e.g. with objdump, you will notice a lot of other stuff which has to do with the ELF file structure, but that is not important here.

In the left column you see a bunch of addresses, followed by their offset from the starting point of the function (<+X>), followed by the assembly translation of the bytes stored at these addresses. Since it would be hard to read the machine code that these bytes represent directly, a disassembler provides us with a human-readable translation of it.

If we look at lines <+0> to <+3>, we see a typical function prologue. It saves ebp by pushing it onto the stack (ebp is usually 0 here) and sets ebp to the value currently stored in esp with the mov instruction. Then it subtracts some value from esp to create a new stack frame for our function. (You can think of a stack frame as a space within memory that is reserved for our function’s stack.) So by the time this prologue has finished, we have pushed the old value of ebp onto the stack and set ebp to the bottom of our new stack frame, while esp points to its top. And we have also reserved some space for our local variables by subtracting from esp.

Let’s have a closer look at <+3>, the last line of the function prologue. We see that 16 bytes (0x10 in hex = 16 in decimal) are subtracted from the current stack pointer address, so our stack grows by 16 bytes to make room for our local variables. Why 16 bytes? An integer is, at least on a typical Intel 32-bit system, 4 bytes, so we need 12 bytes for our 3 local variables. So why does the compiler reserve 16 bytes? The other 4 bytes are usually reserved to keep the stack aligned to a multiple of 16 bytes, which is easier to process. So if we declared 4 local integer variables, the stack frame would still have a length of 16 bytes. But if we declared 5 integer variables, which means we needed 20 bytes, our stack frame would get a length of 32 (= 2*16) bytes. Try it out!
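
The rounding rule described above can be written down as a small C sketch. Note that it only models the simplified rule from this post (4 bytes per int, rounded up to the next multiple of 16); what gcc really reserves may vary with the compiler version and options. The function names round_up_16 and frame_bytes_for_ints are my own.

```c
#include <assert.h>

/* Round 'bytes' up to the next multiple of 16. */
static unsigned round_up_16(unsigned bytes) {
    return (bytes + 15u) & ~15u;
}

/* Bytes reserved for n 4-byte ints under the simplified rule from the text. */
static unsigned frame_bytes_for_ints(unsigned n) {
    return round_up_16(n * 4u);
}
```

With this rule, 3 and 4 ints both give 16 bytes, while 5 ints give 32 bytes.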

In lines <+6> to <+20> the local variables are written onto the stack. This is done by moving them to locations that are determined by their offset from ebp, our base pointer. So we place 10 in the memory location at ebp - 0xc, 5 at ebp - 0x8 and 2 at ebp - 0x4.

In the end, our stack looks like this (imagine every block to have a height of 4 bytes):

     [top]
 _____________    <---esp           
|      ?      |  (ebp - 0x10)
|_____________|  
|             |  (ebp - 0xC)
|_____10______|           
|             |  (ebp - 0x8)    
|______5______|  
|             |  (ebp - 0x4)  
|______2______|  
    [bottom]     <---ebp
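
The same layout can be sketched in C. This is only a model of the picture above: the 16-byte frame is an array of four 4-byte slots, slot 0 is the one esp points to, and slot_index and store_locals are hypothetical helper names. The mapping of a, b and c to the offsets follows the disassembly shown earlier.

```c
#include <assert.h>
#include <stdint.h>

#define FRAME_BYTES 0x10u

/* Index of the 4-byte slot at [ebp - offset] within a frame of FRAME_BYTES,
 * where slot 0 is the slot esp points to (the top of the stack). */
static unsigned slot_index(unsigned offset) {
    assert(offset >= 4 && offset <= FRAME_BYTES && offset % 4 == 0);
    return (FRAME_BYTES - offset) / 4;
}

/* Mimic the three mov instructions of main: store a, b, c via ebp offsets. */
static void store_locals(uint32_t frame[FRAME_BYTES / 4]) {
    frame[slot_index(0xc)] = 10; /* mov DWORD PTR [ebp-0xc],0xa  (a) */
    frame[slot_index(0x8)] = 5;  /* mov DWORD PTR [ebp-0x8],0x5  (b) */
    frame[slot_index(0x4)] = 2;  /* mov DWORD PTR [ebp-0x4],0x2  (c) */
}
```

The slot at ebp - 0x10 is never written, which is why it is marked with a question mark in the picture.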

As you notice, the variables on the stack are referenced by using the base register ebp. Why doesn’t the compiler use esp to reference the variables? Since esp always points to the top of the stack and might change when the stack grows, it is easier to use the fixed ebp, so you will always find your variables at the same relative offset from the bottom of the stack, where ebp points to (instead of recalculating offsets from its top, where esp points to). [However, when you start dealing with compiler optimizations, this might not always be true, as there are such things as frame pointer omission. If you want to read more about esp and ebp, have a look here.]

Another thing to note is the order in which the variables are placed on the stack. We declared a = 10, b = 5 and c = 2. As you notice, c is now at the bottom of the stack, followed by b and a, although c was declared last. The variables are placed on the stack in reverse order. (This layout is a choice of the compiler and can differ between compilers and versions.) One way to picture it: if we started to take them from the stack via a pop instruction, we would get them in the right order (remember that the stack is a LIFO structure, so when you access it, you get the value on the top first).

Finally, we put the return value 0 in the eax register <+27> and leave the function with the function epilogue in <+32> and <+33>.

So much for the theory. Let’s now see what all of this looks like when we use gdb to examine our program.

Practice

If you haven’t done so, call your program with

gdb stack1

We start at the very beginning of the main function and set our breakpoint to the first line:

(gdb) break *main
Breakpoint 1 at 0x80483ed

Then we start the execution with the command r or run:

(gdb) r
Starting program: /home/michael/Entwicklung/Disassemble/stack1

Breakpoint 1, 0x080483ed in main ()

As you can see, the debugger tells us the memory address in the instruction pointer, so we know where we are in the program’s execution (have a look at the disassembly above, you will find the address there). Note that gdb always stops before the instruction is executed, so *main+0 hasn’t been executed and we haven’t pushed ebp yet. We can start to inspect registers and memory. Let’s see what’s currently in ebp and esp:

(gdb) info registers
eax            0x1	1
ecx            0xc0c8ddda	-1060577830
edx            0xffffcd54	-12972
ebx            0xf7fac000	-134561792
esp            0xffffcd2c	0xffffcd2c
ebp            0x0	0x0
esi            0x0	0
edi            0x0	0
eip            0x80483ed	0x80483ed <main>
eflags         0x246	[ PF ZF IF ]
cs             0x23	35
ss             0x2b	43
ds             0x2b	43
es             0x2b	43
fs             0x0	0
gs             0x63	99

info registers shows us the content of all registers. We could also abbreviate it with i r or display the content of a specific register with i r [register]

So we see that ebp is 0. (This is due to the Application Binary Interface and Intel’s Binary Compatibility Standards - you can find more info here). eip points to the instruction we are about to execute.

Furthermore, esp contains the address 0xffffcd2c, an address that our application inherited from the OS. We carry on to the next instruction and inspect the registers again:

(gdb) nexti
0x080483ee in main ()
(gdb) i r
[...]
esp            0xffffcd28	0xffffcd28
ebp            0x0	0x0
[...]
eip            0x80483ee	0x80483ee <main+1>
[...]

We have now pushed ebp, which is 0, on the stack. This decreased esp by 4 bytes, as in hex ffffcd2c - ffffcd28 = 4. (When something is pushed on or popped off the stack, esp will always decrease or increase accordingly, so it always points at the top of the stack.)

To make sure, we would expect the memory content at 0xffffcd28 (reading 4 bytes towards higher memory) to be 0. Let’s verify:

(gdb) x/wx 0xffffcd28
0xffffcd28:	0x00000000

The command x/wx is equivalent to x/1wx and tells gdb to read 1 word (which is 4 bytes here) from the memory location given as argument. The first x is for “examine”, as in examining the memory. The number determines the amount, the next character ‘w’ says in what chunks we want the memory displayed (b stands for bytes, w for words etc., so 4b would mean 4 bytes and 4w would mean 4 words) and the last character ‘x’ tells gdb that we want the memory displayed as hexadecimal values (we could choose other formats here, like s for strings or i for machine instructions). Have a look in the gdb manual to get more familiar with this syntax. As expected, we find the memory at the location to which we pushed ebp to be 0. No surprise here. Let us continue.
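
To get a feeling for what a command like x/4wx displays, here is a rough C imitation of a single output row. It is only a sketch and not how gdb is implemented: format_words is a hypothetical helper, the words are passed in as an array instead of being read from a live process, and I assume one tab-separated row as in the transcripts in this post.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Format 'count' 32-bit words that start at address 'addr' as one row in the
 * style "0xADDR:<TAB>0xWORD<TAB>0xWORD...". Returns the number of characters
 * written (excluding the terminating NUL) or a negative value on error. */
static int format_words(char *out, size_t out_size, uint32_t addr,
                        const uint32_t *words, size_t count) {
    int used = snprintf(out, out_size, "0x%08" PRIx32 ":", addr);
    size_t i;
    for (i = 0; i < count && used >= 0 && (size_t)used < out_size; i++) {
        /* Each word is printed as 8 hex digits, like the 'x' format with 'w' units. */
        int n = snprintf(out + used, out_size - (size_t)used,
                         "\t0x%08" PRIx32, words[i]);
        if (n < 0) {
            return n;
        }
        used += n;
    }
    return used;
}
```

Called with the address 0xffffcd28 and a single word 0, it produces the same row as the x/wx command above.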

(gdb) nexti
0x080483f0 in main ()
(gdb) i r
[...]
esp            0xffffcd28	0xffffcd28
ebp            0xffffcd28	0xffffcd28
[...]

We moved the value of esp into ebp and they now both contain the same value and therefore point to the same memory address. On execution of the next instruction, we create our function specific stack frame:

(gdb) nexti
0x080483f3 in main ()
(gdb) i r
[...]
esp            0xffffcd18	0xffffcd18
ebp            0xffffcd28	0xffffcd28
[...]

We executed sub esp,0x10, which means we decreased esp by 16 bytes and thereby reserved 16 bytes for our stack frame. esp now points to the top of our stack and ebp to its bottom, while the top lies in a lower memory address than the bottom. So our stack just grew downwards in memory as explained before.
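
If you want to double-check the register arithmetic of the prologue, the three instructions can be replayed in a small C sketch. It only tracks the values of esp and ebp (the memory write of the push is not modeled), and the names Regs, do_push_ebp, do_mov_ebp_esp and do_sub_esp are made up for this illustration. The start values are the ones from the gdb session in this post.

```c
#include <assert.h>
#include <stdint.h>

/* Only the two registers we care about in the prologue. */
typedef struct {
    uint32_t esp;
    uint32_t ebp;
} Regs;

/* push ebp: esp moves down by 4 (the stored value itself is not modeled). */
static void do_push_ebp(Regs *r) {
    r->esp -= 4;
}

/* mov ebp,esp: ebp takes the current value of esp. */
static void do_mov_ebp_esp(Regs *r) {
    r->ebp = r->esp;
}

/* sub esp,imm: reserve 'imm' bytes by moving esp further down. */
static void do_sub_esp(Regs *r, uint32_t imm) {
    r->esp -= imm;
}
```

Starting from esp = 0xffffcd2c and ebp = 0, the three steps lead to esp = 0xffffcd18 and ebp = 0xffffcd28, so the two registers end up 16 bytes apart.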

In the next three steps we can see how our variables are placed on the stack. First, let’s have a look at the initial memory values on the stack:

(gdb) x/4wx 0xffffcd18
0xffffcd18:	0x0804841b	0xf7fac000	0x08048410	0x00000000

Then we continue to place our variables there:

(gdb) nexti
0x080483fa in main ()

(gdb) x/4wx 0xffffcd18
0xffffcd18:	0x0804841b	0x0000000a	0x08048410	0x00000000

(gdb) nexti
0x08048401 in main ()

(gdb) x/4wx 0xffffcd18
0xffffcd18:	0x0804841b	0x0000000a	0x00000005	0x00000000

(gdb) nexti
0x08048408 in main ()

(gdb) x/4wx 0xffffcd18
0xffffcd18:	0x0804841b	0x0000000a	0x00000005	0x00000002   

As you can see, the reserved slots get filled from lower to higher addresses, that is, from the top of the stack frame towards its bottom at ebp: first we write the value 0xa, overwriting 0xf7fac000, then we write 0x5, overwriting 0x08048410, and finally we write 0x2, overwriting 0x00000000. So our local variables are now placed on the stack, and the leftover value 0x0804841b still sits in the unused slot at the top, where esp points. If we examine our registers again, we notice that neither esp nor ebp changed while we wrote to the stack:

(gdb) i r
[...]
esp            0xffffcd18	0xffffcd18
ebp            0xffffcd28	0xffffcd28
[...]

This is the end of this small tutorial. We saw that our stack grew downwards in memory and that the local variables were written into the reserved stack frame at fixed offsets from ebp. We also observed how the stack frame was created in the beginning and examined the memory associated with it. I hope this little tutorial helped you to get a better grasp of this important memory concept.

Contact me on Twitter if you found a mistake or found this tutorial helpful.

If you want to read more about the stack, there are already many helpful resources on the net.