Hello from Part 6 of this series on cracking CMU’s Bomblab with Angr. If you are new here, I would recommend starting with part 1 here.

### Phase 6

Let’s disassemble Phase 6:

It looks like it is using the same read_six_numbers function from Phase 2, and the six numbers are at the bottom of the stack again. One small difference is that before read_six_numbers is called, r13 is set to rsp in line 9. Since r13 is a callee saved register, we need to initialize it to rsp.

With this done, that means we can just re-use the rest of the code for Phase 2 and it will just work, right? Rightttttt?

Well, no, unfortunately not. I tried it and the script ran for 1 minute… 2 minutes… 5 minutes… 10 minutes…, and I did not have a good feeling about this as my laptop was becoming something of a portable heater at this point. From the disassembly you can see that there are a lot of jumps and comparisons, way more than in previous phases, and this is leading to state explosion. We need to find a better way.

### Mitigating State Explosion by Splitting Up Analysis

We mitigate the exponential growth in states by decreasing the depth of the search tree of the simulation manager. This can be done by splitting up the function into multiple distinct parts, and then searching through each of them in order. This way, we will search through multiple trees of smaller depths, which could make the problem tractable. Of course, the assumption here is that the function can indeed be split into distinct parts. Let’s try to verify this assumption by seeing how the code jumps around. I decided to do this with Radare, which allows us to see this in a more graphical manner. We could manually eyeball it with GDB’s output as well, but that is just too painful.

Let’s open up Radare, and seek to our phase_6 function with s:

Now output the disassembly with pdf, which stands for Print Disassemble Function:

You can see the lines on the left going up and down. Those represents the jumps that can be taken. We also see that there are some obvious blocks in the structure. My original intuition was that the address at 0x00401158 seemed like a pretty good place to split the blocks, since it just finished a bunch of complicated logic in the first half and it looked like a natural transition point. Let’s see how this looks like in code:

I called the two blocks that we divide the function into as block 1 and block 2. block_1_end is where we split the two blocks apart. block_2_end is simply the end of the function.

An important point to highlight is that instead of just performing simulation.explore and simulation.found like previously, now we need to find all the states that can reach block_1_end. If we just did simulation.found instead, it would stop at the first valid solution that it found, which may not include the states that are actually necessary for us to get to the final solution. Therefore, we keep exploring while simulation.active is true, which means that there are still available states to explore.

This means that block_1_states will hold all the possible states that can reach block_1_end, and we can then use each of them as a launching point to reach the end of the function:

In the above, we begin afresh from each of the states that we found in block 1, and then try to see if it leads to a solution. This time round we only need to grab the first valid solution, like usual.

I tried running the script:

What?! There is only 1 state found after block 1? That seems like we did not reduce the depth of the search tree by much, since we did not reduce the number of times it branched. Indeed, I kept waiting and waiting and…there was no additional output.

For this to work, we need more states to be found after block 1, the more the merrier (but not so much that now you have state explosion in block 1).

The next address I tried to set block_1_end to was 0x00401181, which is right after another distinct mini logic block:

We found 10 states after block 1. This is what we like to see! And indeed, after a few minutes of waiting, we get our well-deserved result: 4 3 2 1 6 5.

### Full Solution Script

Let’s try it on the actual binary:

It worked! We’ve come a long way and you should be proud of yourselves. If any of you actually solved this before (wink wink 15-213 students), you would remember how long this phase took, where you would slowly realise that you were dealing with a linked list and then you probably had to graph out the pointers in the linked list to figure out what their relationships were, but Angr solved it without us having to be aware of what is happening at all!

Thanks for reading thus far, and I hope to see you again for the last and final phase - the Secret Phase, which you can find here.