Is cmp faster than diff -q?

Prompted by @josten, I ran a comparison of the two. The code is on GitHub. In short (see footnote 1):

[Plots: user+sys time; real time]

The User+Sys time taken by cmp -s seemed to be a tad more than that of diff in most cases. However, the Real time taken was pretty much arbitrary: cmp was ahead on some runs, diff on others.

Summary:

Any difference in performance is pure coincidence. Use whatever you wish.

1. The images are 1920x450, so do open them in a tab to see them in their full glory.


Using files similar to Anthon's, but larger (100M lines, differing only on the last one):

yes abcd | head -n 100000000 >aa
sed '$ s/d/e/' aa >ab

I get indistinguishable timings for diff -q and cmp -s:

/tmp% time diff -q aa ab
Files aa and ab differ
diff -q aa ab  0.04s user 0.33s system 99% cpu 0.370 total
/tmp% time cmp -s aa ab
cmp -s aa ab  0.04s user 0.36s system 99% cpu 0.403 total

cmp is slower than cmp -s. Presumably counting the line numbers is a significant burden.

/tmp% time cmp aa ab
aa ab differ: char 499999999, line 100000000
cmp aa ab  0.84s user 0.36s system 97% cpu 1.225 total

This is on Debian wheezy amd64, all running from RAM (on tmpfs).

cmp -s has the advantage of being supported by all POSIX platforms and by BusyBox.
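
As a minimal sketch of how cmp -s is typically used in a portable script: it prints nothing and reports the result only through its exit status (0 for identical files, 1 for differing files, greater than 1 on error). The helper name same_content and the temporary file names are made up for illustration, and mktemp is assumed to be available even though it is not part of POSIX.

```shell
#!/bin/sh
# Illustrative only: same_content is a hypothetical helper name.
same_content() {
    # cmp -s is silent; exit status 0 = identical, 1 = different, >1 = error
    cmp -s "$1" "$2"
}

tmpdir=$(mktemp -d)    # assumption: mktemp exists on this system (not POSIX)
printf 'abcd\nabcd\n' >"$tmpdir/a"
printf 'abcd\nabcd\n' >"$tmpdir/b"
printf 'abcd\nabce\n' >"$tmpdir/c"

if same_content "$tmpdir/a" "$tmpdir/b"; then r1=same; else r1=differ; fi
if same_content "$tmpdir/a" "$tmpdir/c"; then r2=same; else r2=differ; fi

echo "a vs b: $r1"
echo "a vs c: $r2"

rm -rf "$tmpdir"
```

Because only the exit status is used, the same test works unchanged wherever cmp -s is available.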


No, diff -q seems to be faster and you can easily test that:

$ wc x1 x2
 10000000  10000000  50000000 x1
 10000000  10000000  50000000 x2
 20000000  20000000 100000000 total

Two files with 10 million lines of 4 chars each.

$ cat x1 x2 > /dev/null
$ diff x1 x2
9999999c9999999
< abcd
---
> abce

They differ only in the second-to-last line.

$ time diff -q x1 x2
Files x1 and x2 differ

real    0m0.043s
user    0m0.012s
sys     0m0.031s

$ time cmp x1 x2
x1 x2 differ: byte 49999994, line 9999999

real    0m0.085s
user    0m0.048s
sys     0m0.036s

diff -q is almost twice as fast in real time, and it remains faster across repeated runs.
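
For anyone who wants to repeat this kind of measurement, here is a rough sketch of a repeated-run setup. It is not the script used for the numbers above: the helper name repeat, the workdir variable, and the smaller 100,000-line stand-in files are all assumptions made for illustration, and mktemp is assumed to be available.

```shell
#!/bin/sh
# Hypothetical helper: run a command n times, discarding its output.
repeat() {
    n=$1; shift
    i=0
    while [ "$i" -lt "$n" ]; do
        # Differing files make diff and cmp exit with status 1; ignore that.
        "$@" >/dev/null 2>&1 || :
        i=$((i + 1))
    done
}

# Small stand-in files (100,000 lines of "abcd"), differing only in the
# second-to-last line. Raise the counts to match the sizes discussed above.
workdir=$(mktemp -d)    # assumption: mktemp exists on this system
yes abcd | head -n 100000 >"$workdir/x1"
sed '99999 s/d/e/' "$workdir/x1" >"$workdir/x2"

# Run each candidate several times so the files are in the page cache.
repeat 5 diff -q "$workdir/x1" "$workdir/x2"
repeat 5 cmp -s "$workdir/x1" "$workdir/x2"
repeat 5 cmp "$workdir/x1" "$workdir/x2"
```

In bash, where time is a shell keyword, each call can be timed directly, for example: time repeat 20 diff -q "$workdir/x1" "$workdir/x2". Timing many runs together smooths out the run-to-run noise that a single 0.04s measurement is prone to.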