Are zip files vulnerable to corruption?

Yes, which is why a good backup scheme verifies that each newly created backup file matches the content of the source, and makes multiple copies on different media, each one verified.
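To make the verification step concrete, here is a minimal sketch in Python. It assumes the backup is a ZIP whose entry names are the file paths relative to the source directory; the function names `sha256_of` and `verify_backup` are made up for this illustration and do not come from any backup tool.

```python
import hashlib
import zipfile
import zlib
from pathlib import Path


def sha256_of(stream, chunk_size=1 << 20):
    """Hash a binary stream in chunks so large files do not fill memory."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def verify_backup(source_dir, archive_path):
    """Return the relative paths that are missing, unreadable, or different in the archive."""
    source_dir = Path(source_dir)
    mismatches = []
    with zipfile.ZipFile(archive_path) as archive:
        stored_names = set(archive.namelist())
        for path in sorted(source_dir.rglob("*")):
            if not path.is_file():
                continue
            # Assumption: entries were stored under their path relative to source_dir.
            name = path.relative_to(source_dir).as_posix()
            if name not in stored_names:
                mismatches.append(name)
                continue
            try:
                with path.open("rb") as src, archive.open(name) as stored:
                    same = sha256_of(src) == sha256_of(stored)
            except (zipfile.BadZipFile, zlib.error):
                # A damaged entry fails its CRC check or cannot be decompressed.
                same = False
            if not same:
                mismatches.append(name)
    return mismatches
```

An empty list means every source file was read back from the archive with identical content. Run the same check against each copy on each medium.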

Good backup includes verification and redundancy. That's why most backup schemes recommend multiple copies, with at least one copy offsite, whether in the cloud or physically transported offsite. That guards against the small chance of bit rot.

The open-source 7-Zip package, one of the many programs that can create and open ZIP files, includes recovery instructions, but you will notice that their language about your chances of recovery is guarded.

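Your chance of recovery also depends on where the corruption is. If it's in the central directory (the index stored at the end of the ZIP file), normal tools may refuse to open the archive at all. The format does store each entry's name and sizes a second time, in a local header in front of that entry's data, so the directory information effectively exists in two places.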

ZIP and 7Z files should not be used to back up Linux and UNIX files because, unlike on Windows, the ownership and group data for each individual file is generally not preserved when the archive is created on Linux or UNIX. That's why Linux and UNIX backups archive first to a TAR file, which preserves that data, and then compress the TAR file.
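As a rough illustration of the TAR-then-compress approach, the sketch below uses Python's standard `tarfile` module to write a gzip-compressed TAR and read back the owner and group recorded for each member. The function name `archive_with_ownership` is hypothetical and the example archives only a single path.

```python
import os
import tarfile


def archive_with_ownership(source_path, archive_path):
    """Write source_path into a gzip-compressed TAR and return the stored metadata."""
    # "w:gz" builds the TAR and compresses it in one pass.
    with tarfile.open(archive_path, "w:gz") as tar:
        tar.add(source_path, arcname=os.path.basename(source_path))
    # Read the archive back to show what was actually recorded.
    with tarfile.open(archive_path, "r:gz") as tar:
        return [
            (m.name, m.uid, m.gid, m.uname, m.gname, oct(m.mode))
            for m in tar.getmembers()
        ]
```

Each returned tuple carries the numeric user and group IDs along with their names and the permission bits. By comparison, Python's own `zipfile.ZipInfo` has no owner or group fields at all.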


In general, if a compressed data stream is corrupted, the decompressor cannot recover, so all data after the point of corruption is likely to be lost.

ZIP compresses each file individually, so if a ZIP file is corrupted, chances are that only one file will be affected. ZIP files also have a central directory. If this is corrupted, it may not be possible to extract the files using normal unzip tools, but it should still be possible to recover them using ZIP recovery tools that search for the individual file headers (traditionally on DOS this was done with a program called PKZIPFIX; I'm not sure if there are more modern alternatives).
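To show what "searching for the individual file headers" can look like, here is a simplified sketch in Python. It is not how PKZIPFIX or any particular recovery tool works. It assumes the local headers carry real sizes (no data descriptor), and it handles only unencrypted entries that are stored or deflated. The name `salvage_entries` is made up for this example.

```python
import struct
import zlib

# Fixed 30-byte part of a ZIP local file header (little-endian).
LOCAL_HEADER = struct.Struct("<4s5H3L2H")
SIGNATURE = b"PK\x03\x04"


def salvage_entries(raw):
    """Yield (name, data) for each entry that still decodes and passes its CRC."""
    pos = raw.find(SIGNATURE)
    while pos != -1:
        resume = pos + 4  # where to keep searching if this candidate is unusable
        if pos + LOCAL_HEADER.size <= len(raw):
            (_, _, flags, method, _, _, crc, csize, _,
             name_len, extra_len) = LOCAL_HEADER.unpack_from(raw, pos)
            name_start = pos + LOCAL_HEADER.size
            data_start = name_start + name_len + extra_len
            payload = raw[data_start:data_start + csize]
            # Skip encrypted entries (bit 0) and entries whose sizes live in a
            # trailing data descriptor (bit 3); this sketch cannot handle them.
            if not flags & 0x09 and len(payload) == csize:
                data = None
                try:
                    if method == 0:  # stored without compression
                        data = bytes(payload)
                    elif method == 8:  # raw deflate stream
                        data = zlib.decompressobj(-15).decompress(payload)
                except zlib.error:
                    data = None
                if data is not None and zlib.crc32(data) == crc:
                    encoding = "utf-8" if flags & 0x800 else "cp437"
                    name_bytes = bytes(raw[name_start:name_start + name_len])
                    yield name_bytes.decode(encoding, errors="replace"), data
                    resume = data_start + csize
        pos = raw.find(SIGNATURE, resume)
```

Because every entry is found and checked independently, a missing central directory costs nothing here, and damage inside one entry's data loses only that entry.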

Note that many other archive formats use "solid" compression (either all the time or as an option). In a solid archive the files are combined into a single data stream before compression, so any corruption will likely destroy all files after the one that is directly affected.
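The difference can be simulated with plain zlib streams. This is only a toy comparison, not a real archive format: "per-file" means one zlib stream per file, "solid" means one zlib stream over the concatenated files, and corruption is simulated by inverting a single byte. All function names are invented for this sketch, and the survivor counts are obtained by comparing against the originals, which a real recovery tool would not have.

```python
import zlib


def flip_byte(blob, index):
    """Return a copy of blob with one byte inverted, simulating corruption."""
    damaged = bytearray(blob)
    damaged[index] ^= 0xFF
    return bytes(damaged)


def decode_prefix(stream):
    """Decompress one byte at a time and keep whatever came out before an error."""
    decoder = zlib.decompressobj()
    out = bytearray()
    try:
        for i in range(len(stream)):
            out += decoder.decompress(stream[i:i + 1])
    except zlib.error:
        pass
    return bytes(out)


def intact_count(files, recovered):
    """Count the leading files whose bytes appear unchanged in the recovered data."""
    count, offset = 0, 0
    for data in files:
        if recovered[offset:offset + len(data)] != data:
            break
        count += 1
        offset += len(data)
    return count


def per_file_survivors(files, damaged_index):
    """Compress each file separately, damage one stream, count files that survive."""
    streams = [zlib.compress(data) for data in files]
    target = streams[damaged_index]
    streams[damaged_index] = flip_byte(target, len(target) // 2)
    survivors = 0
    for original, stream in zip(files, streams):
        try:
            if zlib.decompress(stream) == original:
                survivors += 1
        except zlib.error:
            pass
    return survivors


def solid_survivors(files, damaged_position):
    """Compress all files as one stream, damage it, count files that survive."""
    stream = flip_byte(zlib.compress(b"".join(files)), damaged_position)
    return intact_count(files, decode_prefix(stream))
```

With per-file streams, the damage stays inside the one stream that was hit. In the solid case, damage near the start of the stream leaves nothing usable, and damage further in leaves at most the files that were fully decoded before that point.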