UIUCTF20 was a really fun Animal Crossing themed CTF that ran from July 17-19 2020. While I have not played the game before, I somewhat knew what it was about from watching Youtubers play it, and also from memes about the turnip stock market.

Our team PPP came in third place, which went above my expectations as most people playing were relatively new and I am quite happy with the result. Now, on to the writeup!

### Accounting Accidents

You can download the binary here (sha256sum: fcdc3991bed89b8aa5b509c4b6e205967a1e0b9e3fd041547492ebb75202130f). This writeup is targeted at beginners to pwn, where I will explain concepts more thoroughly and avoid making leaps in logic.

I did not look at the flavortext before solving this challenge, taking it as a hint only when I get stuck. I will take a similar approach for this writeup, where we will discover together what the binary is doing.

Let’s first check the security attributes of the binary:

Interesting. It’s unlikely to be a ROP challenge since there is a canary, and NX means no executing shellcode on the stack. Let’s open up the binary in Ghidra and analyse it.

Next run file:

Awesome, it is not stripped. This means that debugging symbols like function names are still left in the binary, which should make reversing it a little easier.

### Running

When you run the binary, there is a bunch of text that is printed out character by character. Isabelle then asks us for the name of an item, and then the cost of 4 other items. Afterwards, she puts the item through her accounting software, and outputs the results. It categorizes things into left and right. No idea what that means for now.

We get a leak for the .text section (which is honestly pointless since there is no PIE (position independent execution)), and if we look at the source afterwards the address actually corresponds to the print_flag function that basically does what it says. I also tried fuzzing it with large inputs to see if it causes a segfault from buffer overflows, followed by testing “%x” inputs to check for format string vulnerabilies. These are quick and simple ways of testing for any obvious vulnerabilities before diving in to reversing. None of those worked, so let’s get to reversing!

### Reversing

Opening the file in Ghidra, we get the following main function:

The print statements pretty up matches to what we saw when running the program. We also see a read in line 37, but it reads in 8 bytes into a buffer of 8 bytes, so there is no overflow for this. fancy_print is what probably prints the buffer character-by-character, so I did not bother reversing it. I thought it was annoying at first, but then I remembered the Animal Crossing theme, so afterwards I thought it was cute.

The most interesting part is in line 50, where we see a function pointer being called: (**(code **)(local_13c + 0x20))(local_13c);. local_13c is probably a struct of some sort, and has a function pointer at offset 0x20. It passes itself as an argument to the function.

Let’s take a look at insert:

We see the functions newNode, getBalance, leftRotate, and rightRotate being used. From our foundational CS classes we know that rotations is a term used to describe the process of balancing binary trees, and newNode confirms this suspicion. Furthermore, the left and right we saw previously makes more sense in the context of trees.

Let’s look at newNode:

We see that puVar is returned from malloc, and afterwards we are setting values to array indices of it. This pattern strongly indicates that puVar1 is a struct with fields which are a multiple of 4 bytes (since Ghidra detected that puVar1 is a undefined4 *), and that we are setting the fields in the struct.

0x80487a6 looks like a pointer. If we go to the address in the disassembly, we end up precisely at the print_edges function, which looks like follows:

That was what we saw previously at the end when running the program. Also, recall that line 50 in main was calling a function at offset 0x20, while the pointer is being set at offset 8 * 4 = 32 = 0x20 as well, so we can confirm that this function pointer is what is being called at the end. If we could overwrite the function pointer, we have EIP control!

### Finding the Vulnerability

Now back to the disassembly of newNode. We see a fgets being performed in line 25. It is writing to puVar1 + 4, and since puVar1 is a pointer to a 4 byte type, this is a 4 * 4 = 16 bytes offset. So we can read up to 0x15 bytes to &puVar[4]. However, we only need 0x10 bytes to reach &puVar[8], and then we have another 5 bytes overflow which is more than enough to overwrite the function pointer!

Now that we have a buffer overflow vulnerability in hand and we can control EIP, let’s see if there are already good existing targets to jump too. We recall that we have the print_flag function to use. Awesome!

### Developing the Exploit

We see that the vulnerable code path only gets executed in newNode when param_2 is NULL. If we trace the code from the caller in main, this corresponds to the third argument for insert, and it is only in line 23 of main where it is set to NULL: local_13c = insert(uVar1,0x19,0);. This means we only get to overwrite the function pointer of one node that has the second parameter (which turns out to be the price) as 0x19, or 25.

Also, knowing that the data structure is most likely that of a tree, the disassembly of setting uVar1 again and again makes sense now. uVar1 is basically the root of the tree, and when we insert a node into a tree, we are getting the new root back each time. Then, the function pointer of the last node to be the root is called.

I tried to see if we can get lucky by just exploiting the overflow and seeing if the node that has the overflow ended up as the eventual root, but it did not work, as expected.

I then began annotating the insert function, to try to figure out the structure of the tree and see whether that gives us any insights:

The tree is being sorted by price. getBalance returns the difference in height between the left and right child. leftRotate and rightRotate was a bit involved and I did not bother reversing them, as it is probably what it says it does.

While doing so, I also reversed the structure of the node, with the offsets and datatypes given below:

The fields are as we would expect.

### Possibly an AVL tree?

If we look at the code, we see that a rotation (either left or right) is done when the absolute difference between the heights of the subtrees exceeds 1. This brings back memories from our intro CS class, which introduced the AVL tree that has the property that the “heights of the two child subtrees of any node differ by at most one; if at any time they differ by more than one, rebalancing is done to restore this property”. So we are most likely looking at an AVL tree!

Let’s confirm our suspicions, and see the output of the program when we insert nodes of a particular value. Before we do that, let’s reverse print_edges so we know what exactly is being printed:

So the left section just keeps traversing down and printing the left children, while the right section does the opposite. Cool!

Initially, the program inserts nodes with prices [10, 20, 30, 40, 50, 25] into the tree. Let’s run the program again, and use inputs [1, 2, 3, 4] for the prices, which are inserted after.

I then inserted the same nodes in order to VisuAlgo (click on the AVL tree button on the header), which is a great website created by Dr. Steven Halim for visualizing algorithms and data structures. By the way, if you are interested in competitive programming, I would highly recommend his book Competitive Programming 4 which was just released a few days ago on July 19. I found the 3rd edition of the book invaluable when I did competitive programming in high school, and the knowledge basically tided me through most of the algorithm classes in college. You can find more information about his book here (no, I am not receiving commission for this, nor was I asked to do this, I just really liked his book).

We see that the above picture has 2 and 1 when taking only left subtrees, and 30, 40, and 50 when only taking the right subtrees, just as what was being output by the binary. Nice!

So now, our final challenge is to insert four nodes such that the node with our payload that has value 0x19=25 ends up at the root.

The tree initially before any of our custom nodes are inserted looks like the following:

For 25 to be the root, intuition tells us that we need to make the left part of the tree heavier and eventually force node 25 to bubble up. The way we’ll do this, however, is not so intuitive. We first want to force 25 to make a rotation upwards. This can be done by giving it more children. We insert nodes with values 26 and 27:

“WHATTT?!”, I hear you say. Did we just make things worse, since node 25 is in an even lower position now?

This is the unintuitive part, because what we were really doing is to give some nodes to the right subtree of 20. Now, we insert node 21, which forces the node at 20 to undergo a rotation:

Now we have node 25 at a rather sweet spot. Finally, let’s cause a rotation at 30 by causing one of its child to increase in height. We can grow the height of the subtree rooted at 20 by either adding something less than 10, or something between 21 or 25. Let’s go with adding 22:

We did it!

### Full Exploit

The full exploit script is given below:

Running the exploit script against the challenge server:

And just like that we get our flag!

### About The Banner Picture

This picture was taken during my hike up Mt. Tallac at Lake Tahoe last summer (2019). You can see a glimpse of the lake in the background. The hike was around 10 miles, with an elevation gain of around 3.3k ft.

### Closing

This is my first CTF writeup, and I tried to be as descriptive as possible and explain my thought process every step of the way. It look a lot longer than I expected to write it up, so I hope that it was helpful to you!