Difficulty understanding logic in disassembled binary bomb phase 3

The function makes a modified copy of a string from static storage, into a malloced buffer.


This looks weird. The malloc size is dependent on strlen+1, but the memcpy size is a compile-time constant? Your decompilation apparently shows that address was a string literal so it seems that's fine.

Probably that missed optimization happened because of a custom string_length() function that was maybe only defined in another .c (and the bomb was compiled without link-time optimization for cross-file inlining). So size_t len = string_length("some string literal"); is not a compile-time constant and the compiler emitted a call to it instead of being able to use the known constant length of the string.

But probably they used strcpy in the source and the compiler did inline that as a rep movs. Since it's apparently copying from a string literal, the length is a compile-time constant and it can optimize away that part of the work that strcpy normally has to do. Normally if you've already calculated the length it's better to use memcpy instead of making strcpy calculate it again on the fly, but in this case it actually helped the compiler make better code for that part than if they'd passed the return value of string_length to a memcpy, again because string_length couldn't inline and optimize away.


   <+0>:     push   %edi // push value in edi to stack
   <+1>:     push   %esi // push value of esi to stack
   <+2>:     sub    $0x14,%esp // grow stack by 0x14 (move stack ptr -0x14 bytes)

Comments like that are redundant; the instruction itself already says that. This is saving two call-preserved registers so the function can use them internally and restore them later.

Your comment on the sub is better; yes, grow the stack is the higher level semantic meaning here. This function reserves some space for locals (and for function args to be stored with mov instead of pushed).


The rep movsd copies 0x13 * 4 bytes, incrementing ESI and EDI to point past the end of the copied region. So another movsd instruction would copy another 4 bytes contiguous with the previous copy.

The code actually copies another 2, but instead of using movsw, it uses a movzw word load and a mov store. This makes a total of 78 bytes copied.

  ...
      # at this point EAX = malloc return value which I'll call buf
<+28>:    mov    $0x804a388,%esi            # copy src = a string literal in .rodata?
<+33>:    mov    $0x13,%ecx
<+38>:    mov    %eax,%edi                  # copy dst = buf
<+40>:    rep movsl %ds:(%esi),%es:(%edi)   # memcpy 76 bytes and advance ESI, EDI

<+42>:    movzwl (%esi),%edx
<+45>:    mov    %dx,(%edi)        # copy another 2 bytes (not moving ESI or EDI)
 # final effect: 78-byte memcpy

On some (but not all) CPUs it would have been efficient to just use rep movsb or rep movsw with appropriate counts, but that's not what the compiler chose in this case. movzx aka AT&T movz is a good way to do narrow loads without partial-register penalties. That's why compilers do it, so they can write a full register even though they're only going to read the low 8 or 16 bits of that reg with a store instruction.

After that copy of a string literal into buf, we have a byte load/store that copies a character with buf. Remember at this point EAX is still pointing at buf, the malloc return value. So it's making a modified copy of the string literal.

<+48>:    movzbl 0x11(%eax),%edx
<+52>:    mov    %dl,0x10(%eax)             # buf[16] = buf[17]

Perhaps if the source hadn't defeated constant-propagation, with high enough optimization level the compiler might have just put the final string into .rodata where you could find it, trivializing this bomb phase. :P

Then it stores pointers as stack args for string compare.

<+55>:    mov    %eax,0x4(%esp)               # 2nd arg slot = EAX = buf
<+59>:    mov    0x20(%esp),%eax              #  function arg = user input?
<+63>:    mov    %eax,(%esp)                  # first arg slot = our incoming stack arg
<+66>:    call   0x80490ca <strings_not_equal>

How to "cheat": looking at the runtime result with GDB

Some bomb labs only let you run the bomb online, on a test server, which would record explosions. You couldn't run it under GDB, only use static disassembly (like objdump -drwC -Mintel). So the test server could record how many failed attempts you had. e.g. like CS 3330 at cs.virginia.edu that I found with google, where full credit requires less than 20 explosions.

Using GDB to examine memory / registers part way through a function makes this vastly easier than only working from static analysis, in fact trivializing this function where the single input is only checked at the very end. e.g. just look at what other arg is being passed to strings_not_equal. (Especially if you use GDB's jump or set $pc = ... commands to skip past the bomb explosion checks.)

Set a breakpoint or single-step to just before the call to strings_not_equal. Use p (char*)$eax to treat EAX as a char* and show you the (0-terminated) C string starting at that address. At that point EAX holds the address of the buffer, as you can see from the store to the stack.

Copy/paste that string result and you're done.

Other phases with multiple numeric inputs typically aren't this easy to cheese with a debugger and do require at least some math, but linked-list phases that requires you to have a sequence of numbers in the right order for list traversal also become trivial if you know how to use a debugger to set registers to make compares succeed as you get to them.


rep movsl copies 32-bit longwords from address %esi to address %edi, incrementing both by 4 each time, a number of times equal to %ecx. Think of it as memcpy(edi, esi, ecx*4).

See https://felixcloutier.com/x86/movs:movsb:movsw:movsd:movsq (it's movsd in Intel notation).

So this is copying 19*4=76 bytes.