Can I rely on these GitHub repository files?

Compilation is not a directly verifiable deterministic process across compiler versions, library versions, operating systems, or a number of other different variables. The only way to verify is to perform a diff at the assembly level. There are lots of tools that can do this but you still need to put the manual work in.


Polynomial tells you what may happen, and how to solve it. Here I will illustrate it:

I ran both binaries through strings and diffed them. That enough shows some completely harmless differences, in particular, the compiler used:

GCC: (Debian 6.3.0-18) 6.3.0 20170516                         | GCC: (GNU) 8.2.1 20181105 (Red Hat 8.2.1-5)
                                                              > GCC: (GNU) 8.3.1 20190223 (Red Hat 8.3.1-2)
                                                              > gcc 8.2.1 20181105

Some of the private names used are also different:

_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEaSEOS4_@ | _ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEaSERKS4_

And some sections seem to be shuffled, so the diff cannot match them exactly.

Even on the same computer, without optimisation and -O3 shows different files:

_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEE6appendE | _ZNSt7__cxx1115basic_stringbufIcSt11char_traitsIcESaIcEED2Ev

Even shuffling of internal data:

Diccionario creado!                                           <
MENU                                                          <
1. Generador de Diccionarios                                  <
0. Salir                                                      <
/***                                                          <
*    $$|  |$$ |$$|                                            <
*    $$|  |$$ |$$|                                              *    $$|  |$$ |$$|                                  
*    $$|  |$$ |$$|     $$| |$$  |$$$$$$|  |$$$$$$|              *    $$|  |$$ |$$|     $$| |$$  |$$$$$$|  |$$$$$$|  
*    $$$$$$$$ |$$|     $$| |$$ |$$ __ $$|  ____$$|              *    $$$$$$$$ |$$|     $$| |$$ |$$ __ $$|  ____$$|  
*    $$|  |$$ |$$|     $$| |$$ |$$|  |$$| $$$$$$$|              *    $$|  |$$ |$$|     $$| |$$ |$$|  |$$| $$$$$$$|  
*    $$|  |$$ |$$|___  $$|_|$$ |$$|  |$$| $$___$$|              *    $$|  |$$ |$$|___  $$|_|$$ |$$|  |$$| $$___$$|  
*    $$|  |$$ |$$$$$$$| $$$$$  |$$|  |$$| $$$$$$$|              *    $$|  |$$ |$$$$$$$| $$$$$  |$$|  |$$| $$$$$$$|  
*    ----------------------------------------------             *    ---------------------------------------------- 
                                                              > -------------------
                                                              > Diccionario creado!
                                                              > MENU
                                                              > 1. Generador de Diccionarios
                                                              > 0. Salir
                                                              > /*** 
                                                              > *    $$|  |$$ |$$| 

This proves that differing binary files raises many false positives, and doesn't tell you anything about is safety.

In this case, I'd use the version compiled by myself because you have no way to know what version is uploaded, as the author may have forgotten to recompile before the last tweaks.


If the software is exactly the same at source level, then the question boils down to whether you can trust your compiler, system libraries and various utilities which are used during compilation. If you installed your toolchain from a trusted source and you trust your computer wasn't compromised meanwhile, then there's no reason to suspect that the binary file that you generated will be malicious, even if it differs from the "reference" build.