Generate md5 checksum for all files in a directory

You can pass md5sum multiple filenames or bash expansions:

$ md5sum * > checklist.chk  # generates a list of checksums for any file that matches *
$ md5sum -c checklist.chk   # runs through the list to check them
cron: OK
database.sqlite3: OK
fabfile.py: OK
fabfile.pyc: OK
manage.py: OK
nginx.conf: OK
uwsgi.ini: OK
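
To see what a failed check looks like, here is a minimal sketch that builds a throwaway directory, records the checksums, changes one file and verifies again. The directory and file names are made up for illustration, and `--quiet` assumes GNU md5sum.

```shell
# Work in a throwaway directory (hypothetical file names, for illustration only).
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'alpha\n' > a.txt
printf 'beta\n'  > b.txt

# Record the checksums, then modify one of the files.
md5sum a.txt b.txt > checklist.chk
printf 'changed\n' > b.txt

# --quiet (GNU md5sum) prints only the files that fail verification;
# the exit status is non-zero when at least one file fails.
report=$(md5sum -c --quiet checklist.chk 2>/dev/null) && status=0 || status=$?
printf '%s\n' "$report"
echo "exit status: $status"
```

Only b.txt is reported here, which makes this form handy in scripts that should react to a mismatch.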

If you want to get fancy, you can use tools like find to drill down, filter the files, and work recursively:

find . -type f ! -name checklist.chk -exec md5sum {} + > checklist.chk
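
As a hedged sketch of the "filter" part: the variant below hashes only `*.py` files in a tree. The directory layout and the `*.py` pattern are assumptions made up for the example; because of the name filter, the checklist file itself is never hashed.

```shell
# Throwaway tree with hypothetical files, for illustration only.
tree=$(mktemp -d)
mkdir -p "$tree/app/sub"
printf 'print(1)\n' > "$tree/app/manage.py"
printf 'print(2)\n' > "$tree/app/sub/fabfile.py"
printf 'notes\n'    > "$tree/app/README"

cd "$tree/app" || exit 1
# -name '*.py' filters by file name; "-exec ... {} +" hands many files to a
# single md5sum call. Sorting by path keeps the list stable between runs.
find . -type f -name '*.py' -exec md5sum {} + | sort -k 2 > checklist.chk

cat checklist.chk
md5sum -c checklist.chk
```

The paths are stored relative to the directory find was started in, so the list has to be verified from that same directory.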

A great checksum creation/verification program is rhash. It can even create SFV-compatible files and verify them.

It supports md4, md5, sha1, sha512, crc32 and many others.

Moreover, it can work recursively (-r option), like md5deep or sha1deep.

Last but not least you can format the output of the checksum file; for example:

rhash --md5 -p '%m,%p\n' -r /home/

prints CSV-style lines (checksum, then full path) for every file under the /home directory, recursively. Redirect the output to a file to keep it.
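
If rhash is not installed, a roughly similar CSV can be produced with md5sum and sed. This is a swapped-in technique, not rhash itself, and only a sketch: it assumes GNU md5sum's output format and file names without commas, newlines or backslashes. The sample tree is hypothetical.

```shell
# Hypothetical sample tree, for illustration only.
root=$(mktemp -d)
mkdir -p "$root/docs"
printf 'one\n' > "$root/a.txt"
printf 'two\n' > "$root/docs/b.txt"

# md5sum prints "<32 hex digits><space><space or *><path>";
# sed replaces that two-character separator with a comma.
# The CSV is written next to the tree, not inside it, so it is not hashed.
find "$root" -type f -exec md5sum {} + |
  sed 's/^\([0-9a-f]\{32\}\) [ *]/\1,/' > "$root.csv"

cat "$root.csv"
```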

I also find the -e option extremely useful: it renames files by inserting the crc32 sum into the name.

You can replace "md5sum" with "rhash" in PhoenixNL72's examples.


Here are two more extensive examples:

  1. Create an md5 file in each directory which doesn't already have one, with absolute paths:

    find "$PWD" -type d | sort | while IFS= read -r dir; do if [ -f "${dir}/@md5Sum.md5" ]; then echo "Skipped ${dir}: @md5Sum.md5 already present"; else echo "Processing ${dir}"; md5sum "${dir}"/* > "${dir}/@md5Sum.md5"; chmod a=r "${dir}/@md5Sum.md5"; fi; done
    
  2. Create an md5 file in each folder which doesn't already have one: no paths, only filenames:

    find "$PWD" -type d | sort | while IFS= read -r dir; do cd "${dir}" || continue; if [ -f @md5Sum.md5 ]; then echo "Skipped ${dir}: @md5Sum.md5 already present"; else echo "Processing ${dir}"; md5sum * > @md5Sum.md5; chmod a=r @md5Sum.md5; fi; done
    

The only difference between 1 and 2 is how the files are recorded in the resulting md5 file: with absolute paths in 1, with bare filenames in 2.

The commands do the following:

  1. Build a list of all directories under the current folder (the whole tree).
  2. Sort the folder list.
  3. Check in each directory if the file @md5Sum.md5 exists. Output Skipped if it exists, output Processing if it doesn't exist.
  4. If the @md5Sum.md5 file doesn't exist, md5sum will generate one with the checksums of all the files in the folder.
  5. Set the generated @md5Sum.md5 file to read only.

The output of this entire script can be redirected to a file (.....;done > test.log) or piped to another program (like grep). The output will only tell you which directories were skipped and which have been processed.

After a successful run, you will end up with an @md5Sum.md5 file in each subdirectory of your current directory.

I named the file @md5Sum.md5 so it'll get listed at the top of the directory in a samba share.

All @md5Sum.md5 files can be verified with the following command:

find "$PWD" -name @md5Sum.md5 | sort | while IFS= read -r file; do cd "${file%/*}" || continue; md5sum -c @md5Sum.md5; done > checklog.txt

Afterwards you can run grep -v OK on checklog.txt to get a list of all files that differ.
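
A small sketch of that filtering step, using a made-up log (the paths in it are hypothetical). Anchoring the pattern to `: OK` at the end of the line is a slightly stricter variant: it avoids dropping a failed file whose name merely contains "OK".

```shell
# Made-up log in the format that "md5sum -c" prints, for illustration only.
log=$(mktemp)
cat > "$log" <<'EOF'
/srv/share/docs/a.txt: OK
/srv/share/docs/BOOKS.txt: FAILED
/srv/share/img/c.png: FAILED open or read
/srv/share/img/d.png: OK
EOF

# Keep only the lines that do not end in ": OK".
problems=$(grep -v ': OK$' "$log")
printf '%s\n' "$problems"
```

With a plain `grep -v OK`, the BOOKS.txt line would be filtered out too, because the file name itself contains "OK".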

To regenerate an @md5Sum.md5 in a specific directory, when you changed or added files for instance, either delete the @md5Sum.md5 file or rename it and run the generate command again.
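
For example, regenerating the file for one directory could look like the sketch below. The helper name `regen_md5` and the sample directory are hypothetical; the function follows the filenames-only variant (example 2) and assumes the directory contains only regular files.

```shell
# Hypothetical helper: rebuild @md5Sum.md5 for a single directory.
# The body runs in a subshell, so the caller's working directory is untouched.
regen_md5() (
  cd "$1" || exit 1
  rm -f @md5Sum.md5                 # drop the old read-only list
  md5sum -- * > @md5Sum.md5         # file names only, as in example 2
  chmod a=r @md5Sum.md5             # make the new list read-only again
)

# Demo on a throwaway directory (made-up files, for illustration only).
demo=$(mktemp -d)
printf 'v1\n' > "$demo/data.txt"
regen_md5 "$demo"
printf 'v2\n'  > "$demo/data.txt"   # a file changes...
printf 'new\n' > "$demo/extra.txt"  # ...and another one is added
regen_md5 "$demo"
cat "$demo/@md5Sum.md5"
```

Because the old list is removed before the glob is expanded, the new list never contains an entry for @md5Sum.md5 itself.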
