Does recompiling a program produce a bit-for-bit identical binary?

  1. Compile the same program with the same settings on the same machine:

    Although the definitive answer is "it depends", it is reasonable to expect that most compilers will be deterministic most of the time, and that the binaries produced should be identical. Indeed, some version control systems depend on this. Still, there are always exceptions; it is quite possible that some compiler somewhere will decide to insert a timestamp or some such (iirc, Delphi does, for example). Or the build process itself might do that; I've seen makefiles for C programs which set a preprocessor macro to the current timestamp. (I guess that would count as being a different compiler setting, though.)

    Also, be aware that if you statically link the binary, then you are effectively incorporating the state of all relevant libraries on your machine, and any change in any one of those will also affect your binary. So it is not just compiler settings which are relevant.

  2. Compile the same program on a different machine with a different CPU:

    Here, all bets are off. Most modern compilers can do target-specific optimizations; if this option is enabled, the binaries are likely to differ unless the CPUs are similar (and even then they may differ). Also, see the note above about static linking: the build environment goes far beyond the compiler settings. Unless you have very strict configuration control, it's extremely likely that something differs between the two machines.
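
To check whether two builds actually came out bit-for-bit identical, comparing a cryptographic hash of each output file is usually enough. Here is a minimal Python sketch; the function names sha256_of and builds_identical are made up for illustration and are not part of any build tool:

```python
import hashlib


def sha256_of(path, chunk_size=1 << 16):
    """Return the SHA-256 hex digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large binaries are not loaded into memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def builds_identical(path_a, path_b):
    """True if both files have the same SHA-256 digest (same bytes, for practical purposes)."""
    return sha256_of(path_a) == sha256_of(path_b)
```

You would call it as builds_identical("build1/a.out", "build2/a.out"), where both paths are placeholders for the outputs of two separate builds.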


  • -frandom-seed=123 controls some GCC internal randomness. man gcc says:

    This option provides a seed that GCC uses in place of random numbers in generating certain symbol names that have to be different in every compiled file. It is also used to place unique stamps in coverage data files and the object files that produce them. You can use the -frandom-seed option to produce reproducibly identical object files.

  • __FILE__: it expands to the source path as passed to the compiler, so build from a fixed folder (e.g. /tmp/build)

  • for __DATE__, __TIME__, __TIMESTAMP__:
    • libfaketime : https://github.com/wolfcw/libfaketime
    • override those macros with -D
    • -Wdate-time or -Werror=date-time: warn or fail if any of __TIME__, __DATE__ or __TIMESTAMP__ is used. The Linux kernel 4.4 uses it by default.
  • use the D flag (deterministic mode) with ar, or use https://github.com/nh2/ar-timestamp-wiper/tree/master to wipe the timestamps from existing archives
  • -fno-guess-branch-probability: older manual versions say it is a source of non-determinism, but not anymore. Not sure if this is covered by -frandom-seed or not.
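
If you are not building with GCC, or just want a quick audit before turning on -Werror=date-time, a rough scan of the source for the three time-dependent macros gives a similar picture. This is only a sketch: find_time_macros is a made-up helper, and unlike the compiler it does not skip comments or string literals, so it can report false positives.

```python
import re
from typing import List, Tuple

# The predefined macros whose expansion depends on when the build runs.
_TIME_MACROS = re.compile(r"\b(__DATE__|__TIME__|__TIMESTAMP__)\b")


def find_time_macros(source: str) -> List[Tuple[int, str]]:
    """Return (line number, macro name) for each textual use of a time macro."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in _TIME_MACROS.finditer(line):
            hits.append((lineno, match.group(1)))
    return hits
```

Run it over the text of each source file; an empty list means none of the three macros appears textually.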

The Debian Reproducible Builds project attempts to make Debian packages reproducible byte for byte, and recently got a Linux Foundation grant. That includes more than just compilation, but it should be of interest.

Buildroot has a BR2_REPRODUCIBLE option which may give some ideas on the package level, but it is far from complete at this point.

Related threads:

  • https://stackoverflow.com/questions/14653874/deterministic-binary-output-with-g
  • https://www.quora.com/What-can-be-the-possible-reasons-for-the-object-code-of-an-unchanged-C-file-to-change-on-recompilation

What you are asking is "is the output deterministic?" If you compiled the program once and then immediately compiled it again, you would probably end up with the same output file. However, if anything changed, even something small, and especially in a component the compiled program uses, then the output of the compiler might also change.
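
When two outputs do differ, knowing where they first diverge often hints at the cause (an embedded timestamp tends to show up as a few differing bytes at a fixed spot). A small Python sketch, with first_difference being a hypothetical helper name:

```python
from typing import Optional


def first_difference(a: bytes, b: bytes) -> Optional[int]:
    """Return the offset of the first differing byte, or None if a == b."""
    for offset, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return offset
    # One input is a prefix of the other: they differ where the shorter one ends.
    if len(a) != len(b):
        return min(len(a), len(b))
    return None
```

Feed it the raw bytes of the two files, e.g. read with open(path, "rb").read(), and inspect the region around the returned offset with a hex viewer.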