How to corrupt an archive file in a controlled way?

I haven't done much fuzz testing either, but here are two ideas:

Write some zeroes into the middle of the file. Use dd with conv=notrunc so the rest of the file is left in place. This overwrites a single byte (bs=1 count=1) at byte offset N:

dd if=/dev/zero of=file_to_fuzz.zip bs=1 count=1 seek=N conv=notrunc

Using /dev/urandom as a source is also an option.
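As a rough sketch of the /dev/urandom variant, the following wraps the same dd invocation in a small helper. The function name corrupt_bytes, the scratch file, and the offset and length are made up for illustration; point it at a copy of your archive instead.

```shell
# Hypothetical helper: overwrite COUNT bytes of FILE, starting at byte OFFSET,
# with random data. conv=notrunc keeps the rest of the file and its size intact.
corrupt_bytes() {
    dd if=/dev/urandom of="$1" bs=1 seek="$2" count="${3:-1}" conv=notrunc 2>/dev/null
}

# Demo on a scratch file (1 KiB of zeros) rather than a real archive.
demo=$(mktemp)
head -c 1024 /dev/zero > "$demo"
corrupt_bytes "$demo" 512 16

# Show the damaged region: 16 bytes starting at offset 512.
od -A d -t x1 -j 512 -N 16 "$demo"
```

Because the source is random, an overwritten byte can occasionally equal the original, so check with cmp if you need a guaranteed change.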

Alternatively, punch multiple-of-4k holes with fallocate --punch-hole. You could even use fallocate --collapse-range to cut out a page without leaving a zero-filled hole. (This changes the file size.)

A download resumed at the wrong place would match the --collapse-range scenario. An incomplete torrent would match the punch-hole scenario: whether the client uses a sparse file or pre-allocated extents, any region that hasn't been written yet reads as zeros.
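Both scenarios can be sketched on a scratch file as below. fallocate needs filesystem support (and --collapse-range is pickier than --punch-hole), so this sketch falls back to dd and head/tail, which give the same resulting bytes without actually deallocating anything. The file names and the 4 KiB offsets are assumptions chosen for the demo.

```shell
f=$(mktemp)
# 16 KiB of 'A' bytes, so zeroed regions are easy to spot.
head -c 16384 /dev/zero | tr '\0' 'A' > "$f"

# "Incomplete torrent": zero the second 4 KiB block; the size stays 16384.
fallocate --punch-hole --offset 4096 --length 4096 "$f" 2>/dev/null ||
    dd if=/dev/zero of="$f" bs=4096 seek=1 count=1 conv=notrunc 2>/dev/null

# "Download resumed at the wrong place": drop the third 4 KiB block from a
# copy, so everything after it shifts down and the file shrinks by 4096 bytes.
g=$(mktemp)
cp "$f" "$g"
fallocate --collapse-range --offset 8192 --length 4096 "$g" 2>/dev/null ||
    { head -c 8192 "$f"; tail -c +12289 "$f"; } > "$g"

ls -l "$f" "$g"
```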

Bad RAM (in the system you downloaded the file from) can cause corruption, and optical drives can also corrupt files (their ECC isn't always strong enough to recover perfectly from scratches or fading of the dye).

DVD sectors are 2048 bytes (grouped into ECC blocks of 16 sectors), but single-byte or even single-bit errors can happen. Some drives will probably give you the bad uncorrectable data instead of a read error for the sector, especially if you read in raw mode, or whatever it's called.


The other answers seem mostly concerned with hardware errors. Let me list some software-caused corruptions:

  • LF replaced with CRLF.
  • CR removed (even if not followed by LF).
  • Extra NUL bytes inserted.
  • Extra Unicode "Byte Order Mark" inserted.
  • Character set converted from UTF-8 to Latin-1 or vice versa.
  • DOS EOF character (0x1A) deleted, even when not at end of file.

These changes are fairly harmless in text files, but generally fatal for binary files.
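Two of the corruptions listed above (CR removal and EOF-character deletion) are easy to reproduce with tr, which is byte-oriented and so works on binary input. This is only a sketch: the 9-byte payload is invented for the demo, and in practice you would feed in a copy of a real archive.

```shell
src=$(mktemp); nocr=$(mktemp); noeof=$(mktemp)

# Tiny stand-in for binary data: "PK", 0x03 0x04, CR, LF, 0x1A, 0x01, CR.
printf 'PK\003\004\r\n\032\001\r' > "$src"

# Strip every CR (0x0D), whether or not an LF follows it.
tr -d '\r' < "$src" > "$nocr"

# Strip every DOS EOF character (0x1A, octal 032), wherever it occurs.
tr -d '\032' < "$src" > "$noeof"

# Compare the hex dumps to see which bytes disappeared.
od -An -tx1 "$src" "$nocr" "$noeof"
```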


Use dd to truncate the file, or try a binary editor like hexer to edit and introduce some corruptions.

Example of truncating a file using dd:

Create a 5 MB file

# dd if=/dev/zero of=foo bs=1M count=5
5+0 records in
5+0 records out
5242880 bytes (5.2 MB) copied, 0.0243189 s, 216 MB/s
# ls -l foo
-rw-r--r-- 1 root root 5242880 Aug 12 20:13 foo
#

Truncate 10 bytes off the end

# dd if=foo of=foo-corrupted bs=1 count=5242870
5242870+0 records in
5242870+0 records out
5242870 bytes (5.2 MB) copied, 23.7826 s, 220 kB/s
# ls -l foo foo-corrupted
-rw-r--r-- 1 root root 5242880 Aug 12 20:13 foo
-rw-r--r-- 1 root root 5242870 Aug 12 20:14 foo-corrupted
#
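Copying with bs=1 issues one read and one write per byte, which is why the run above takes over 20 seconds. A quicker sketch of the same truncation uses head -c to copy everything except the last 10 bytes; the scratch file names here are placeholders. With GNU coreutils, truncate -s -10 on a copy should give the same result in place.

```shell
orig=$(mktemp); cut=$(mktemp)

# Same 5 MiB test file as above, built from zeros.
head -c 5242880 /dev/zero > "$orig"

# Copy all but the final 10 bytes into the "corrupted" file.
size=$(wc -c < "$orig")
head -c "$((size - 10))" "$orig" > "$cut"

ls -l "$orig" "$cut"
```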

Hexer man page

HEXER(1)                              General Commands Manual                             HEXER(1)

NAME
   hexer - binary file editor

SYNOPSIS
   hexer [options] [file [...]]

DESCRIPTION
   hexer  is  a  multi-buffer  editor  for  viewing  and  manipulating binary files.  It can't
   (shouldn't) be used for editing block devices, because it tries to load the whole file into
   a  buffer (it should work for diskettes).  The most important features of hexer are:  multi
   buffers, multi level undo, command line editing with completion, binary regular expressions
   (see  below).   The  user  interface  is  kept similar to vi, so if you know how to use vi,
   you'll get started easily.