Recursively compare two directories with diff -r without output on broken links

For version 3.3 or later of diff, you should use the --no-dereference option, as described in Pete Harlan's answer.

Unfortunately, older versions of diff don't support ignoring symlinks:

Some files are neither directories nor regular files: they are unusual files like symbolic links, device special files, named pipes, and sockets. Currently, diff treats symbolic links like regular files; it treats other special files like regular files if they are specified at the top level, but simply reports their presence when comparing directories. This means that patch cannot represent changes to such files. For example, if you change which file a symbolic link points to, diff outputs the difference between the two files, instead of the change to the symbolic link.

diff should optionally report changes to special files specially, and patch should be extended to understand these extensions.

If all you want is to verify an rsync (and presumably fix what's missing), then you could just run the rsync command a second time. If you don't want to do that, then checksumming both directories and comparing the results may be sufficient.
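One way to sketch the checksumming idea, assuming GNU coreutils' sha256sum is available. The function names checksum_tree and same_tree are made up for this illustration. Because find's -type f test does not follow links, symlinks (broken or not) never reach the checksum step.

```shell
#!/bin/bash
# Print "checksum  ./relative/path" for every regular file under directory $1.
# -type f matches regular files only, so symlinks are skipped.
checksum_tree() {
    ( cd "$1" && find . -type f -exec sha256sum {} + | sort -k 2 )
}

# Hypothetical helper: succeeds only if both trees contain the same
# regular files with the same content.
same_tree() {
    [ "$(checksum_tree "$1")" = "$(checksum_tree "$2")" ]
}
```

Calling same_tree a b && echo "trees match" gives a yes/no answer. To see which files differ, save the output of checksum_tree for each directory and diff the two listings.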

If you really want to do this with diff, then you can use find to skip the symlinks, and run diff on each file individually. Pass your directories a and b in as arguments:

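#!/bin/bash
# Usage: script a b   (directory names without a trailing slash)
# Walk the regular files in $1; -type f skips symlinks and directories
find "$1" -type f | while IFS= read -r f
do
    # Compare against the same relative path in $2; -q suppresses details of differences
    diff -q "$f" "$2/${f#"$1"/}"
done

or as a one-liner:

find a -type f | while IFS= read -r f; do diff -q "$f" "b/${f#a/}"; done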

This will identify files that differ in content, or files which are in a but not in b.

Note that:

  • Since symlinks are skipped entirely, this won't notice symlinks that are missing from b. If you need that, you would need a second find pass to identify all the symlinks in a and then explicitly check for their existence in b.
  • Extra files in b will not be identified, since the list is constructed from the contents of a. This probably isn't a problem for your rsync scenario.
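The second find pass mentioned in the first note could look roughly like this. It is only a sketch: the function name missing_symlinks is invented here, directory names are assumed to have no trailing slash, and file names containing newlines are not handled.

```shell
#!/bin/bash
# List symlinks under $1 whose name does not exist at the same relative path under $2.
missing_symlinks() {
    find "$1" -type l | while IFS= read -r link
    do
        rel=${link#"$1"/}
        # -e follows links, so a broken link in $2 fails it; -L catches that case.
        [ -e "$2/$rel" ] || [ -L "$2/$rel" ] || echo "Symlink missing in $2: $rel"
    done
}
```

Run it as missing_symlinks a b. It only checks that the name exists in b, not that the link points to the same target.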

Since version 3.3, GNU diff can be told not to dereference symlinks; it then compares the paths they point to instead.

Install GNU diffutils >= 3.3 and use the --no-dereference option; there is no short option for that.

diff prints nothing if the links are equal; otherwise it reports:

Symbolic links /tmp/noderef/a/symlink and /tmp/noderef/b/symlink differ
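A small sketch of the invocation, assuming GNU diff 3.3 or later is the diff on your PATH. The wrapper name compare_no_deref is only for illustration; the flags are the ones described above.

```shell
#!/bin/bash
# Recursively compare $1 and $2, comparing symlinks by the path they
# store rather than by the file they point to (GNU diff >= 3.3).
compare_no_deref() {
    diff -r --no-dereference "$1" "$2"
}
```

With two trees that each hold a broken link named symlink pointing at different paths, compare_no_deref a b should print a single "Symbolic links ... differ" line like the one above and exit with status 1. Links that store the same path produce no output, whether or not they are broken.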


You can use a newer version of diff

The diff in GNU diffutils 3.3 includes a --no-dereference option that allows you to compare the symlinks themselves rather than their targets. It reports if they differ, is quiet if they agree and doesn't care whether they're broken.

I don't know when the option was added; it isn't present in 2.8.1.
