Comparing the contents of two directories

You can use the diff command just as you would use it for files:

diff <directory1> <directory2>

To include subdirectories and the files inside them, add the -r option:

diff -r <directory1> <directory2>

Another good way to do this comparison is to combine find with md5sum and then run diff on the results.

Example

Use find to list all the files in the directory, calculate the MD5 hash of each file, and write the result, sorted by file name, to a file:

find /dir1/ -type f -exec md5sum {} + | sort -k 2 > dir1.txt

Do the same for the other directory:

find /dir2/ -type f -exec md5sum {} + | sort -k 2 > dir2.txt

Then compare the two resulting files with diff:

diff -u dir1.txt dir2.txt

Or as a single command using process substitution:

diff <(find /dir1/ -type f -exec md5sum {} + | sort -k 2) <(find /dir2/ -type f -exec md5sum {} + | sort -k 2)

If you want to see only the changes:

diff <(find /dir1/ -type f -exec md5sum {} + | sort -k 2 | cut -f1 -d" ") <(find /dir2/ -type f -exec md5sum {} + | sort -k 2 | cut -f1 -d" ")

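The cut command keeps only the hash (the first field) for diff to compare. Otherwise diff would print every line, because the directory paths differ even when the hashes are the same.

That output does not tell you which file changed, though. To keep the file name, strip only the directory part of each path with sed, so that each line holds the hash and the bare file name. You can try something like: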

diff <(find /dir1/ -type f -exec md5sum {} + | sort -k 2 | sed 's/ .*\// /') <(find /dir2/ -type f -exec md5sum {} + | sort -k 2 | sed 's/ .*\// /')
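
The sed variant above keeps only the bare file name, so two files with the same name in different subdirectories become hard to tell apart. One possible alternative, shown here as a minimal sketch rather than as part of the original recipe, is to run find from inside each directory so that both listings carry identical relative paths. The helper name dir_checksums is hypothetical, and LC_ALL=C is an added assumption to keep the sort order the same regardless of locale.

```shell
# Hypothetical helper: print "hash  ./relative/path" for every file under
# the directory given as $1, sorted by path.
dir_checksums() {
  # The subshell keeps the cd from affecting the calling shell.
  # LC_ALL=C gives the same sort order whatever the locale is.
  (cd "$1" && find . -type f -exec md5sum {} + | LC_ALL=C sort -k 2)
}

# Usage in bash, with /dir1 and /dir2 as placeholder paths:
#   diff <(dir_checksums /dir1) <(dir_checksums /dir2)
```

Because the paths are relative, diff prints only the lines for files that are changed, missing, or extra, and each line still shows the path inside the directory.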

This strategy is very useful when the two directories to be compared are not on the same machine and you need to make sure that the files are identical in both directories.
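
For the two-machine case, a rough sketch of one way to do it is to write the hash list on the first machine, copy it over, and let md5sum -c re-check it on the second. The function names and the idea of a "manifest" file are illustrative assumptions, not part of the original recipe. Note that this check reports changed and missing files only; files that exist only in the second directory are not detected.

```shell
# Hypothetical helpers; the names are illustrative.

# First machine: write "hash  ./relative/path" lines for a directory.
# Keep the manifest file outside the directory being hashed.
write_manifest() {   # usage: write_manifest <directory> <manifest-file>
  (cd "$1" && find . -type f -exec md5sum {} + | LC_ALL=C sort -k 2) > "$2"
}

# Second machine, after copying the manifest over: re-hash and compare.
# The manifest path must be absolute because of the cd.
# --quiet hides the "OK" lines, so only failures are printed.
verify_manifest() {  # usage: verify_manifest <directory> <manifest-file>
  (cd "$1" && md5sum -c --quiet "$2")
}
```

verify_manifest returns a non-zero exit status when any file fails the check, so it can be used directly in an if statement.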

Another good way to do the job is Git’s diff command. It may cause problems when files have different permissions: every file is then listed in the output.

git diff --no-index dir1/ dir2/

If you are not using bash (process substitution is not available in every shell), you can use diff with the --brief and --recursive options:

$ diff -rq dir1 dir2
Only in dir2: file2
Only in dir1: file1

The diff man page documents both options:

-q, --brief
report only when files differ

-r, --recursive
recursively compare any subdirectories found
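
diff also reports its result through the exit status: 0 when no differences were found, 1 when there are differences, and 2 when something went wrong, such as a missing directory. A minimal sketch of using this in a script follows; compare_dirs is a hypothetical helper name, and dir1 and dir2 are placeholders.

```shell
# Hypothetical helper: print one word describing how two directories compare.
compare_dirs() {
  # Discard the listing; only the exit status is used here.
  diff -rq "$1" "$2" > /dev/null 2>&1
  case $? in
    0) echo "identical" ;;
    1) echo "different" ;;
    *) echo "error" ;;      # e.g. one of the directories does not exist
  esac
}

# Usage with placeholder directory names:
#   compare_dirs dir1 dir2
```

This works in any POSIX shell, since it does not rely on process substitution.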
