How to disassemble, modify and then reassemble a Linux executable?

I don't think there is any reliable way to do this. Machine code formats are very complicated, more complicated than assembly files. It isn't really possible to take a compiled binary (say, in ELF format) and produce a source assembly program which will compile to the same (or similar-enough) binary. To gain an understanding of the differences, compare the output of GCC compiling direct to assembler (gcc -S) versus the output of objdump on the executable (objdump -D).

There are two major complications I can think of. Firstly, the machine code itself is not a 1-to-1 correspondence with assembly code, because of things like pointer offsets.

For example, consider the C code to Hello world:

int main()
{
    printf("Hello, world!\n");
    return 0;
}

This compiles to the x86 assembly code:

.LC0:
    .string "hello"
    .text
<snip>
    movl    $.LC0, %eax
    movl    %eax, (%esp)
    call    printf

Where .LCO is a named constant, and printf is a symbol in a shared library symbol table. Compare to the output of objdump:

80483cd:       b8 b0 84 04 08          mov    $0x80484b0,%eax
80483d2:       89 04 24                mov    %eax,(%esp)
80483d5:       e8 1a ff ff ff          call   80482f4 <printf@plt>

Firstly, the constant .LC0 is now just some random offset in memory somewhere -- it would be difficult to create an assembly source file which contains this constant in the correct place, since the assembler and linker are free to choose locations for these constants.

Secondly, I'm not entirely sure about this (and it depends on things like position independent code), but I believe the reference to printf is not actually encoded at the pointer address in that code there at all, but the ELF headers contain a lookup table which dynamically replaces its address at runtime. Therefore, the disassembled code doesn't quite correspond to the source assembly code.

In summary, source assembly has symbols while compiled machine code has addresses which are difficult to reverse.

The second major complication is that an assembly source file can't contain all of the information that was present in the original ELF file headers, like which libraries to dynamically link against, and other metadata placed there by the original compiler. It would be difficult to reconstruct this.

Like I said, it's possible that a special tool can manipulate all of this information, but it is unlikely that one can simply produce assembly code which can be reassembled back to the executable.

If you are interested in modifying just a small section of the executable, I recommend a much more subtle approach than recompiling the whole application. Use objdump to get the assembly code for the function(s) you are interested in. Convert it to "source assembly syntax" by hand (and here, I wish there was a tool that actually produced disassembly in the same syntax as the input), and modify it as you wish. When you are done, recompile just those function(s) and use objdump to figure out the machine code for your modified program. Then, use a hex editor to manually paste the new machine code over the top of the corresponding part of the original program, taking care that your new code is precisely the same number of bytes as the old code (or all the offsets would be wrong). If the new code is shorter, you can pad it out using NOP instructions. If it is longer, you may be in trouble, and might have to create new functions and call them instead.


I do this with hexdump and a text editor. You have to be really comfortable with the machine code and the file format storing it, and flexible with what counts as "disassemble, modify, and then reassemble".

If you can get away with making just "spot changes" (rewriting bytes, but not adding nor removing bytes), it'll be easy (relatively speaking).

You really don't want to displace any existing instructions, because then you'd have to manually adjust any effected relative offset within the machine code, for jumps/branches/loads/stores relative to the program counter, both in hardcoded immediate values and ones computed through registers.

You should always be able to get away with not removing bytes. Adding bytes might be necessary for more complex modifications, and gets a lot harder.

Step 0 (preparation)

After you've actually disassembled the file properly with objdump -D or whatever you normally use first to actually understand it and find the spots you need to change, you'll need to take note of the following things to help you locate the correct bytes to modify:

  1. The "address" (offset from the start of the file) of the bytes you need to change.
  2. The raw value of those bytes as they currently are (the --show-raw-insn option to objdump is really helpful here).

You'll also need to check if hexdump -R works on your system. If not, then for the rest of these steps, use the xxd command or similar instead of hexdump in all of the steps below (consult the documentation for whatever tool you use, I only explain hexdump for now in this answer because that is the one I am familiar with).

Step 1

Dump the raw hexadecimal representation of the binary file with hexdump -Cv.

Step 2

Open the hexdumped file and find the bytes at the address you're looking to change.

Quick crash course in hexdump -Cv output:

  1. The left-most column is the addresses of the bytes (relative to the start of the binary file itself, just like objdump provides).
  2. The right-most column (surrounded by | characters) is just "human readable" representation of the bytes - the ASCII character matching each byte is written there, with a . standing in for all bytes which don't map to an ASCII printable character.
  3. The important stuff is in between - each byte as two hex digits separated by spaces, 16 bytes per line.

Beware: Unlike objdump -D, which gives you the address of each instruction and shows the raw hex of the instruction based on how it's documented as being encoded, hexdump -Cv dumps each byte exactly in the order it appears in the file. This can be a little confusing as first on machines where the instruction bytes are in opposite order due to endianness differences, which can also be disorienting when you're expecting a specific byte as a specific address.

Step 3

Modify the bytes that need to change - you obviously need to figure out the raw machine instruction encoding (not the assembly mnemonics) and manually write in the correct bytes.

Note: You don't need to change the human-readable representation in the right-most column. hexdump will ignore it when you "un-dump" it.

Step 4

"Un-dump" the modified hexdump file using hexdump -R.

Step 5 (sanity check)

objdump your newly unhexdumped file and verify that the disassembly that you changed looks correct. diff it against the objdump of the original.

Seriously, don't skip this step. I make a mistake more often than not when manually editing the machine code and this is how I catch most of them.

Example

Here's a real-life worked example from when I modified an ARMv8 (little endian) binary recently. (I know, the question is tagged x86, but I don't have an x86 example handy, and the fundamental principles are the same, just the instructions are different.)

In my situation I needed to disable a specific "you shouldn't be doing this" hand-holding check: in my example binary, in objdump --show-raw-insn -d output the line I cared about looked like this (one instruction before and after given for context):

     f40:   aa1503e3    mov x3, x21
     f44:   97fffeeb    bl  af0 <error@plt>
     f48:   f94013f7    ldr x23, [sp, #32]

As you can see, our program is "helpfully" exiting by jumping into an error function (which terminates the program). Unacceptable. So we're going to turn that instruction into a no-op. So we're looking for the bytes 0x97fffeeb at the address/file-offset 0xf44.

Here is the hexdump -Cv line containing that offset.

00000f40  e3 03 15 aa eb fe ff 97  f7 13 40 f9 e8 02 40 39  |..........@...@9|

Notice how the relevant bytes are actually flipped (little endian encoding in the architecture applies to machine instructions like to anything else) and how this slightly unintuitively relates to what byte is at what byte offset:

00000f40  -- -- -- -- eb fe ff 97  -- -- -- -- -- -- -- --  |..........@...@9|
                      ^
                      This is offset f44, holding the least significant byte
                      So the *instruction as a whole* is at the expected offset,
                      just the bytes are flipped around. Of course, whether the
                      order matches or not will vary with the architecture.

Anyway, I know from looking at other disassembly that 0xd503201f disassembles to nop so that seems like a good candidate for my no-op instruction. I modified the line in the hexdumped file accordingly:

00000f40  e3 03 15 aa 1f 20 03 d5  f7 13 40 f9 e8 02 40 39  |..........@...@9|

Converted back into binary with hexdump -R, disassembled the new binary with objdump --show-raw-insn -d and verified that the change was correct:

     f40:   aa1503e3    mov x3, x21
     f44:   d503201f    nop
     f48:   f94013f7    ldr x23, [sp, #32]

Then I ran the binary and got the behavior I wanted - the relevant check no longer caused the program to abort.

Machine code modification successful.

!!! Warning !!!

Or was I successful? Did you spot what I missed in this example?

I am sure you did - since you're asking about how to manually modify the machine code of a program, you presumably know what you're doing. But for the benefit of any readers who might be reading to learn, I'll elaborate:

I only changed the last instruction in the error-case branch! The jump into the function that exits the program. But as you can see, register x3 was being modified by the mov just above! In fact, a total of four (4) registers were modified as part of the preamble to call error, and one register was. Here's the full machine code for that branch, starting from the conditional jump over the if block and ending where the jump goes to if the conditional if isn't taken:

     f2c:   350000e8    cbnz    w8, f48
     f30:   b0000002    adrp    x2, 1000
     f34:   91128442    add x2, x2, #0x4a1
     f38:   320003e0    orr w0, wzr, #0x1
     f3c:   2a1f03e1    mov w1, wzr
     f40:   aa1503e3    mov x3, x21
     f44:   97fffeeb    bl  af0 <error@plt>
     f48:   f94013f7    ldr x23, [sp, #32]

All of the code after the branch was generated by the compiler on the assumption that the program state was as it was before the conditional jump! But by just making the final jump to the error function code a no-op, I created a code path where we reach that code with inconsistent/incorrect program state!

In my case, this actually seemed to not cause any problems. So I got lucky. Very lucky: only after I already ran my modified binary (which, incidentally, was a security-critical binary: it had the capability to setuid, setgid, and change SELinux context!) did I realize that I forgot to actually trace the code paths of whether those register changes effected the code paths that came later!

That could've been catastrophic - any one of those registers might've been used in later code with the assumption that it contained a previous value that now got overwritten! And I'm the kind of person that people know for meticulous careful thought about code and as a pedant and stickler for always being conscientious of computer security.

What if I was calling a function where the arguments spilled from the registers onto the stack (as is very common on, for example, x86)? What if there was actually multiple conditional instructions in the instruction set that preceded the conditional jump (as is common on, for example, older ARM versions)? I would've been in an even more recklessly inconsistent state after having done that simplest-seeming change!

So this my cautionary reminder: Manually twiddling with binaries is literally stripping away every safety between you and what the machine and operating system will permit. Literally all the advances that we have made in our tools to automatically catch mistakes our programs, gone.

So how do we fix this more properly? Read on.

Removing Code

To effectively/logically "remove" more than one instruction, you can replace the first instruction you want to "delete" with an unconditional jump to the first instruction at the end of the "deleted" instructions. For this ARMv8 binary, that looked like this:

     f2c:   14000007    b   f48
     f30:   b0000002    adrp    x2, 1000
     f34:   91128442    add x2, x2, #0x4a1
     f38:   320003e0    orr w0, wzr, #0x1
     f3c:   2a1f03e1    mov w1, wzr
     f40:   aa1503e3    mov x3, x21
     f44:   97fffeeb    bl  af0 <error@plt>
     f48:   f94013f7    ldr x23, [sp, #32]

Basically, you "kill" the code (turn it into "dead code"). Sidenote: You can do something similar with literal strings embedded in the binary: so long as you want to replace it with a smaller string, you can almost always get away with overwriting the string (including the terminating null byte if it's a "C-string") and if necessary overwriting the hard-coded size of the string in the machine code that uses it.

You can also replace all unwanted instructions with no-ops. In other words, we can turn the unwanted code into what's called a "no-op sled":

     f2c:   d503201f    nop
     f30:   d503201f    nop
     f34:   d503201f    nop
     f38:   d503201f    nop
     f3c:   d503201f    nop
     f40:   d503201f    nop
     f44:   d503201f    nop
     f48:   f94013f7    ldr x23, [sp, #32]

I would expect that that's just wasting CPU cycles relative to jumping over them, but it is simpler and thus safer against mistakes, because you don't have to manually figure out how to encode the jump instruction including figuring out the offset/address to use in it - you don't have to think as much for a no-op sled.

To be clear, error is easy: I messed up two (2) times when manually encoding that unconditional branch instruction. And it's not always our fault: the first time was because the documentation I had was outdated/wrong and said one bit was ignored in the encoding, when it actually wasn't, so I set it to zero on my first try.

Adding Code

You could theoretically use this technique to add machine instructions too, but it's more complex, and I've never had to do it, so I don't have a worked example at this time.

From a machine code perspective it's sorta easy: pick one instruction at the spot you want to add code, and convert it into a jump instruction to the new code that you need add (don't forget to add the instruction(s) you thus replaced into the new code, unless you didn't need that for your added logic, and to jump back to the instruction you want to come back to at the end of the addition). Basically, you're "splicing" the new code in.

But you have to find a spot to actually put that new code, and this is the hard part.

If you're really lucky, you can just append the new machine code at the end of the file, and it'll "just work": the new code will get loaded along with the rest into the same expected machine instructions, into your address space space that falls into a memory page properly marked executable.

In my experience hexdump -R ignores not just the right-most column but the left-most column too - so you could literally just put zero addresses for all manually added lines and it'll work out.

If you're less lucky, after adding the code you'll have to actually adjust some header values within the same file: if the loader for your operating system expects the binary to contain metadata describing the size of the executable section (for historical reasons often called the "text" section) you'll have to find and adjust that. In the old days binaries were just raw machine code - nowadays the machine code is wrapped in a bunch of metadata (for example ELF on Linux and some others).

If you're still a little lucky, you might have some "dead" spot in the file which does get properly loaded up as part of the binary at the same relative offsets as the rest of the code that's already in the file (and that dead spot can fit your code and is properly aligned if your CPU requires word-alignment for CPU instructions). Then you can overwrite it.

If you're really unlucky you can't just append code and there is no dead space you can fill with your machine code. At that point, you basically have to be intimately familiar with the executable format and hope that you can figure out something within those constraints that is humanly feasible to pull off manually within a reasonable amount fo time and with a reasonable chance of not messing it up.


@mgiuca has correctly addressed this answer from a technical point of view. In fact, disassemblying an executable program into an easy-to-recompile assembly source is not an easy task.

To add some bits to the discussion, there are a couple of techniques/tools which could be interesting to explore, although they are technically complex.

  1. Static/Dynamic instrumentation. This technique entails analyzing the executable format, insert/delete/replace specific assembly instructions for a given purpose, fix all references to variables/functions in the executable, and the emit a new modified executable. Some tools which I know of are: PIN, Hijacker, PEBIL, DynamoRIO. Consider that configuring such tools to a purpose different from what they were designed for could be tricky, and requires understanding of both executable formats and instruction sets.
  2. Full executable decompilation. This technique tries to reconstruct a full assembly source from an executable. You might want to give a glance to the Online Disassembler, which tries to do the job. You lose anyhow information about different source modules and possibly functions/variable names.
  3. Retargetable decompilation. This technique tries to extract more information from the executable, looking at compiler fingerprints (i.e., patterns of code generated by known compilers) and other deterministic stuff. The main goal is to reconstruct higher-level source code, like C source, from an executable. This is sometimes able to regain information about functions/variables names. Consider that compiling sources with -g often offers better results. You might want to give the Retargetable Decompiler a try.

Most of this comes from vulnerbility assessment and execution analysis research fields. They are complex techniques and often the tools cannot be used immediately out of the box. Nevertheless, they provide invaluable help when trying to reverse engineer some software.