uniq command not working properly?

You need to use sort before uniq:

find . -type f -exec md5sum {} ';' | sort | uniq -w 33

uniq only removes adjacent repeated lines. It does not reorder the lines to look for repeats; sort does that part.
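
A minimal sketch of the difference, using made-up three-line input instead of real md5sum output (the variable names are only for illustration):

```shell
# Hypothetical input: the two "a" lines are not adjacent.
unsorted=$(printf 'a\nb\na\n' | uniq)        # uniq alone finds no adjacent repeats
sorted=$(printf 'a\nb\na\n' | sort | uniq)   # sort first brings the repeats together

printf '%s\n' "$unsorted"   # three lines: a, b, a
echo ---
printf '%s\n' "$sorted"     # two lines: a, b
```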

This is documented in man uniq:

Note: 'uniq' does not detect repeated lines unless they are adjacent. You may want to sort the input first, or use 'sort -u' without 'uniq'.


The input for uniq needs to be sorted. So for the example case,

find . -type f -exec md5sum '{}' ';' | sort | uniq -w 33

would work. The -w (--check-chars=N) option makes uniq compare only the first N characters of each line. With N = 33 that covers the 32-character MD5 checksum plus the space after it, so lines count as duplicates whenever their checksums match. This option works for this case, but the possibilities for telling uniq which parts of the line are relevant are limited. For example, there is no option to compare columns 3 and 5 while ignoring column 4.
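
To see what -w does in isolation, here is a small sketch with invented data: a 3-character key stands in for the checksum, so -w 3 plays the role of -w 33. The file names and keys are assumptions made up for the demonstration.

```shell
# Made-up "checksum  filename" lines; "aaa" appears twice.
input=$(printf 'aaa  x.txt\nbbb  y.txt\naaa  z.txt\n')

# Sort so equal keys become adjacent, then compare only the first 3 characters.
result=$(printf '%s\n' "$input" | sort | uniq -w 3)

printf '%s\n' "$result"
# aaa  x.txt
# bbb  y.txt
```

Only the first line of each run with the same leading characters is kept, which is why z.txt disappears.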

The sort command has its own option for unique output lines (-u), and lines are treated as duplicates when the keys used for sorting compare equal. This means we can use the powerful key syntax of sort to define which part of the line has to be unique.

For the example,

find . -type f -exec md5sum '{}' ';' | sort -k 1,1 -u

gives the same set of checksums (which of several identical files gets listed may differ), but the sort approach is more flexible for other uses.
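
As a sketch of that flexibility, the case uniq cannot handle (unique on columns 3 and 5, ignoring column 4) might look like this with sort. The five-column data is invented for the example, and the awk step is only there to show which keys survive.

```shell
# Made-up whitespace-separated data; the first two lines share columns 3 and 5.
data=$(printf '1 2 A 9 X\n3 4 A 8 X\n5 6 B 7 X\n')

# -k 3,3 and -k 5,5 define the sort keys; -u keeps one line per distinct key pair.
unique=$(printf '%s\n' "$data" | sort -k 3,3 -k 5,5 -u)

# Print just the key columns of the surviving lines.
keys=$(printf '%s\n' "$unique" | awk '{print $3, $5}')
printf '%s\n' "$keys"
# A X
# B X
```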