Why does zipping the same content twice give two files with different SHA1 hashes?

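By default, gzip saves the original file name and time stamp in the header of the compressed file, so the output can differ even when the content is identical: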

%> gzip -help 2>&1 | grep -e '-n'
 -N --name            save or restore original file name and time stamp
 -n --no-name         don't save original file name or time stamp

%> gzip -V
Apple gzip 272

Using -n option:

%> tar cvf - foo/ | gzip -n > foo.tgz; shasum foo.tgz # sha1sum on Ubuntu

you will consistently get the same hash.

Try the above without -n and you should see a different hash each time.
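To see where the difference comes from, here is a minimal sketch in Python (3.8 or later) rather than the gzip command line. It assumes that passing mtime=0 to gzip.compress is a fair stand-in for gzip -n, and that two different mtime values are a fair stand-in for two runs made at different times. The helper names sha1 and header_mtime are mine, not part of any tool.

```python
import gzip
import hashlib
import struct

data = b"same content\n"


def sha1(blob: bytes) -> str:
    """Hex SHA-1 of a byte string, comparable to what shasum prints."""
    return hashlib.sha1(blob).hexdigest()


def header_mtime(blob: bytes) -> int:
    """Read the MTIME field: bytes 4..7 of the gzip header, little-endian."""
    return struct.unpack("<I", blob[4:8])[0]


# mtime=0 stands in for `gzip -n`: no time stamp is recorded.
no_stamp_a = gzip.compress(data, mtime=0)
no_stamp_b = gzip.compress(data, mtime=0)

# Two different time stamps stand in for two runs at different times.
stamped_a = gzip.compress(data, mtime=1_000_000_000)
stamped_b = gzip.compress(data, mtime=1_000_000_001)

print("no stamp, same hash:", sha1(no_stamp_a) == sha1(no_stamp_b))
print("stamped, same hash: ", sha1(stamped_a) == sha1(stamped_b))
print("stored time stamps: ", header_mtime(stamped_a), header_mtime(stamped_b))
```

The compressed payload is the same in both stamped outputs; only the four time stamp bytes in the header differ, and that is enough to change the hash.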


According to Wikipedia http://en.wikipedia.org/wiki/Zip_(file_format), each entry in a zip file has header fields for the file's last modification time and last modification date. So any zip file checked into git will appear to git to have changed when the zip is rebuilt from the same content with newer time stamps. And it seems that there is no flag to tell zip not to set those headers.
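If you can build the archive from a script instead of the zip command, one possible workaround is to pin the time stamp yourself. The sketch below uses Python's standard zipfile module; build_zip and FIXED_DATE are names I made up, and the fixed date, entry order and permissions are my assumptions about what needs to be held constant, not something the zip tool or Ant provides.

```python
import io
import zipfile
from typing import Mapping

# Arbitrary fixed date; 1980-01-01 is the earliest the zip format can store.
FIXED_DATE = (1980, 1, 1, 0, 0, 0)


def build_zip(files: Mapping[str, bytes]) -> bytes:
    """Build a zip in memory with pinned time stamps and a fixed entry order."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name in sorted(files):  # sort so insertion order does not matter
            info = zipfile.ZipInfo(filename=name, date_time=FIXED_DATE)
            info.compress_type = zipfile.ZIP_DEFLATED
            info.external_attr = 0o644 << 16  # fixed Unix permissions
            zf.writestr(info, files[name])
    return buf.getvalue()


first = build_zip({"foo/a.txt": b"hello\n", "foo/b.txt": b"world\n"})
second = build_zip({"foo/b.txt": b"world\n", "foo/a.txt": b"hello\n"})
print("identical bytes:", first == second)
```

Both calls are made on the same machine with the same Python and zlib, which is the only case this sketch claims to cover.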

I am resorting to just using tar. It seems to produce the same bytes when run multiple times on the same input, as long as the files themselves (including their modification times) have not changed in between.
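One caveat: tar also records a modification time for each file, so the result only stays stable while the input files are untouched. A small sketch with Python's standard tarfile module shows this; build_tar is a hypothetical helper, and the in-memory file stands in for a real file on disk.

```python
import io
import tarfile


def build_tar(name: str, payload: bytes, mtime: int) -> bytes:
    """Build a one-file tar archive in memory with an explicit mtime."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tf:
        info = tarfile.TarInfo(name=name)
        info.size = len(payload)
        info.mtime = mtime  # stored in the member's header
        tf.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


same_a = build_tar("foo/a.txt", b"hello\n", mtime=1_000_000_000)
same_b = build_tar("foo/a.txt", b"hello\n", mtime=1_000_000_000)
touched = build_tar("foo/a.txt", b"hello\n", mtime=1_000_000_001)

print("unchanged file, same bytes:", same_a == same_b)
print("touched file, same bytes:  ", same_a == touched)
```

So if a build step regenerates the files before archiving them, the tar output changes too, even with identical content.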

Tags: Git, Ant, Zip, Gzip, Sha