Injecting code into a 32-bit binary with ASLR

Porting the example from the book "Practical Binary Analysis" to a more general case


The book illustrates various methods to inject assembly code into an ELF file. It also provides a small tool (elfinject.c) to automate the process. In particular, we will use the "primary" method: overwriting an existing section header that is not essential for the correct execution of the ELF (in our case .note.ABI-tag), together with the corresponding entry in the program header table.

The objective is to inject a simple "hello world" into /bin/ls without breaking it (ls should keep working as expected). For more details, read the book.

The problem

The book already presents a tool, elfinject.c, which automates the task. Judging from the source code, it should also work on 32-bit ELF files. But, as in most articles, the example provided does not take ASLR into consideration. Furthermore, the assembly code is written for the x64 architecture.

The machine I am working with is a virtualized Ubuntu (not the one used for the book), with the default configuration and no security measures turned off.

michele@michele-VirtualBox:~/pba/code/chapter7$ uname -a
Linux michele-VirtualBox 4.15.0-111-generic #112-Ubuntu SMP Thu Jul 9 20:36:22 UTC 2020 i686 i686 i686 GNU/Linux

Obviously, just running the example does not work. The first problem we have to face is the wrong assembly code. The provided hello world payload, hello.s, is the following:


BITS 64

SECTION .text
global main

main:
push rax ; save all clobbered registers
push rcx ; (rcx and r11 destroyed by kernel)
push rdx
push rsi
push rdi
push r11

mov rax,1 ; sys_write
mov rdi,1 ; stdout
lea rsi,[rel $+hello-$] ; hello
mov rdx,[rel $+len-$] ; len
syscall

pop r11
pop rdi
pop rsi
pop rdx
pop rcx
pop rax

push 0x4049a0 ; jump to original entry point
ret

hello: db "hello world",33,10
len : dd 13

The solution

Here's my 32bit version of the hello world assembly file:


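BITS 32

global main

main:
push ecx ; save registers (eax is deliberately not saved)
push edx
push esi
push edi

mov ebx,1 ; stdout ARG0 for x86 32bit
mov ecx, [esp] ; value on top of the stack, used as runtime address of this code
lea ecx, [ecx + hello] ; ARG1: runtime address of the string

mov edx, [esp]
mov edx, [edx + len] ; ARG2: length, loaded from memory
mov eax,4 ; sys_write on 32-bit x86
int 0x80

mov eax, [esp]
sub eax, 0x411772 ; distance to the original entry point, found by debugging

pop edi ; restore registers only after [esp] has been used
pop esi
pop edx
pop ecx

jmp eax ; jump to original entry point

hello: db "hello world",33,10
len : dd 13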

Let's take a look at the new hello32.s. It starts with BITS 32, which is the obvious first change. The next change is in the saved registers:

push ecx
push edx
push esi
push edi

Indeed, 32-bit x86 has neither the 64-bit r** registers (rax, rcx, and so on) nor r8 to r15, so r11 does not exist. I do not save eax; the reason will become clear later.

The second important problem was the system call itself. 32-bit x86 Linux does not use the syscall instruction, so I had to use int 0x80. This also requires a different way to:

1. provide the arguments
2. choose the number of the system call (with respect to the x64 example).

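We cannot simply replace the 64-bit registers with their 32-bit halves, because the registers used for the arguments are different: the 32-bit convention passes them in ebx, ecx, edx (then esi, edi, ebp) instead of rdi, rsi, rdx. Furthermore, the number of the write system call is not 1 but 4.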
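To make the difference concrete, here is a small Python sketch that tabulates the two Linux conventions and shows which registers a write call needs on each. The helper name plan_write is made up for this illustration; only the register lists and syscall numbers are facts about the Linux ABI.

```python
# Linux system call conventions for the two architectures discussed here.
CONVENTIONS = {
    "i386": {
        "instruction": "int 0x80",
        "number": "eax",
        "args": ["ebx", "ecx", "edx", "esi", "edi", "ebp"],
    },
    "x86_64": {
        "instruction": "syscall",
        "number": "rax",
        "args": ["rdi", "rsi", "rdx", "r10", "r8", "r9"],
    },
}

# Number of the write system call on each architecture.
WRITE_NUMBER = {"i386": 4, "x86_64": 1}


def plan_write(arch, fd, buf, count):
    """Return (register assignments, instruction) for write(fd, buf, count).

    Hypothetical helper, only meant to visualize the calling convention.
    """
    conv = CONVENTIONS[arch]
    regs = {conv["number"]: WRITE_NUMBER[arch]}
    for reg, value in zip(conv["args"], (fd, buf, count)):
        regs[reg] = value
    return regs, conv["instruction"]


for arch in ("x86_64", "i386"):
    print(arch, plan_write(arch, 1, "hello", 13))
```

The i386 row matches what the 32-bit payload does: eax=4, ebx=1, ecx=address of the string, edx=length, followed by int 0x80.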

Another issue was the address of the strings. Simply porting the provided code did not work, because 32-bit mode has no equivalent of the RIP-relative addressing used by [rel ...], so the memory pointed to (for example, for the string hello) was not correctly referenced relative to the address at which the code runs. To reference it correctly, I had to use a little trick:

mov ecx, [esp]
lea ecx, [ecx + hello]

mov edx, [esp]
mov edx, [edx + len]

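This makes sure that at runtime the address of the string and of its length is correct, no matter how we modify the source. Here the value read from [esp] is used as the runtime address of the start of the injected code, while hello and len are the offsets of the labels within the flat binary.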
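The arithmetic behind this trick is just "runtime base plus label offset". A minimal Python sketch, with a made-up label offset and made-up load addresses (none of these numbers come from the actual binary):

```python
MASK32 = 0xFFFFFFFF  # 32-bit registers wrap around


def runtime_address(load_address, label_offset):
    """Address of a label once the flat binary is loaded at load_address."""
    return (load_address + label_offset) & MASK32


# Hypothetical offset of the hello label inside the assembled blob.
HELLO_OFFSET = 0x2C

# Two hypothetical load addresses, as if ASLR had moved the image.
for base in (0x008156F8, 0xB7F156F8):
    print(hex(base), "->", hex(runtime_address(base, HELLO_OFFSET)))
```

Whatever base the loader picks, the label is always found at the same distance from it, which is why the payload does not need to know its absolute address in advance.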

Another issue I ran into was ASLR. I will assume zero knowledge about it and about how to work around it. Reading the ELF header of ls with readelf shows it as a shared object:

michele@michele-VirtualBox:~/pba/code/chapter7$ readelf /bin/ls -h
ELF Header:
Magic: 7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
Class: ELF32
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: DYN (Shared object file)
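The same fields can be read without readelf. The following Python sketch parses the fixed-size ELF32 header with the struct module and returns the file type and the entry point. The function name read_elf32_header is made up for this example, and the demo header is synthetic: it is built from the magic bytes above, the DYN type, and the entry point value 0xfad that readelf reports for this binary, not read from a real file.

```python
import struct

# Layout of the ELF32 header: e_ident[16], e_type, e_machine, e_version,
# e_entry, e_phoff, e_shoff, e_flags, e_ehsize, e_phentsize, e_phnum,
# e_shentsize, e_shnum, e_shstrndx (52 bytes, little endian here).
ELF32_HEADER = struct.Struct("<16sHHIIIIIHHHHHH")
ET_NAMES = {1: "REL", 2: "EXEC", 3: "DYN", 4: "CORE"}


def read_elf32_header(data):
    """Return (type name, entry point) from the first bytes of an ELF32 file."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if data[4] != 1 or data[5] != 1:
        raise ValueError("this sketch only handles little-endian ELF32")
    fields = ELF32_HEADER.unpack_from(data)
    e_type, e_entry = fields[1], fields[4]
    return ET_NAMES.get(e_type, hex(e_type)), e_entry


# Synthetic header for the demo (all other fields set to plausible values).
ident = bytes.fromhex("7f454c46010101000000000000000000")
demo = ELF32_HEADER.pack(ident, 3, 3, 1, 0xFAD, 52, 0, 0, 52, 32, 9, 40, 0, 0)

kind, entry = read_elf32_header(demo)
print(kind, hex(entry))  # prints: DYN 0xfad
```

To inspect a real file, pass the first 52 bytes read in binary mode, for example open("ls_mod", "rb").read(52).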

After a bit of exploration, we can see that the addresses shown by readelf (e.g. the entry point) and the actual addresses in memory are different. This can be quite annoying for debugging.

Entry point address: 0xfad

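This address does not correspond to the real virtual address of the entry point: for a position-independent binary like this one, it is an offset from the base address chosen at load time. We can verify it with gdb: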

(gdb) b *0xfad
Breakpoint 1 at 0xfad
(gdb) r
Starting program: /bin/ls
Cannot insert breakpoint 1.
Cannot access memory at address 0xfad

Luckily for us, after having run the binary we can obtain the real address:

(gdb) info file
Symbols from "/bin/ls".
Native process:
Using the running image of child process 14081.
While running this, GDB does not access memory from...
Local exec file:
`/bin/ls', file type elf32-i386.
Entry point: 0x403f86

At this point, one could think that jumping to 0x403f86 at the end of our assembly code, with the same push and ret pair used by the original payload, would work:

push 0x403f86
ret

Unfortunately, this works only under gdb and crashes when run from the terminal. We can speculate that addresses are handled differently in these two contexts. So we want a relative jump from the position of our code to the original entry point, assuming that this distance is fixed.

We can find the distance by debugging, and then implement a relative jump with a method similar to the one used for the data addresses:

mov eax, [esp]
sub eax, 0x411772
jmp eax

An important thing to keep in mind is to do the pops only after the [esp] value has been used; otherwise [esp] no longer provides the correct address.
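The reasoning behind the relative jump can be checked with a few lines of Python. The entry point 0x403f86 and the distance 0x411772 are the values from the text; the address of the injected code is derived here as their sum purely for illustration, and the ASLR slides are hypothetical.

```python
MASK32 = 0xFFFFFFFF  # 32-bit registers wrap around

ORIGINAL_ENTRY = 0x403F86  # entry point reported by gdb "info file"
DISTANCE = 0x411772        # constant used in "sub eax, 0x411772"

# Derived for illustration only: where the injected code would sit
# when the original entry point is at ORIGINAL_ENTRY.
INJECTED_ADDR = (ORIGINAL_ENTRY + DISTANCE) & MASK32


def jump_target(injected_runtime_addr, distance=DISTANCE):
    """Mimic 'mov eax, <runtime address>' followed by 'sub eax, distance'."""
    return (injected_runtime_addr - distance) & MASK32


# Move the whole image by a few hypothetical, page-aligned ASLR slides.
for slide in (0x0, 0x1000, 0x7FE000):
    target = jump_target(INJECTED_ADDR + slide)
    assert target == ORIGINAL_ENTRY + slide
    print(hex(slide), "->", hex(target))
```

Because the injected code and the original entry point belong to the same image, they move together, so subtracting a fixed distance lands on the entry point whatever the slide is. A hardcoded absolute address is only right for one particular slide.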

After all these modifications, we are finally able to inject the code and run the result from the terminal. After having copied /bin/ls to ls_mod, we assemble the source file with the -f bin flag and inject it.

nasm -f bin hello32.s -o hello32.bin
./elfinject ls_mod hello32.bin ".injected" 0x00415180 0

michele@michele-VirtualBox:~/pba/code/chapter7$ ./ls_mod
hello world!
elfinject heapoverflow.c hello-ctor.s hello-got.s new_headers shell.c
elfinject.c hello32.bin hello_fixed2_32.bin hello.s original_headers
encrypted hello.bin hello_fixed32.bin ls_mod shell_asm.bin
heapcheck.c hello-ctor.bin hello-got.bin Makefile shell_asm.s

As we can see, running our injected binary prints both the hello world message and the normal output of ls.

More resources - what is going on?

What is happening is that, due to ASLR, the addresses are randomized on each run.

Debugging the binary with gdb disables ASLR. This is why we always get the same "original" entry point with gdb, and also why the injection works there without a relative jump. ASLR can be re-enabled in gdb with set disable-randomization off.
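Outside gdb, the system-wide ASLR setting on Linux is exposed in /proc/sys/kernel/randomize_va_space. The following Python sketch interprets that value; the function names are made up for this example, and the descriptions paraphrase the kernel documentation.

```python
# Meaning of the values of /proc/sys/kernel/randomize_va_space.
ASLR_MODES = {
    0: "disabled",
    1: "stack, mmap base and vDSO randomized",
    2: "as 1, plus heap (brk) randomized",
}


def describe_aslr(raw):
    """Translate the raw content of randomize_va_space into a description."""
    value = int(raw.strip())
    return ASLR_MODES.get(value, "unknown value %d" % value)


def current_aslr(path="/proc/sys/kernel/randomize_va_space"):
    """Describe the setting of the running system, if the file is readable."""
    try:
        with open(path) as f:
            return describe_aslr(f.read())
    except OSError:
        return "unavailable (file could not be read)"


print(describe_aslr("2\n"))
```

Calling current_aslr() on the machine where ls_mod runs tells whether the addresses can be expected to change between runs.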

The binary is listed as a shared object because it was compiled as a PIE (Position-Independent Executable), which is what allows ASLR to randomize its load address; PIE binaries have the ELF type DYN.