How to compare two directories and delete duplicate files

Using fdupes:

fdupes --delete dir1 dir2

fdupes does not compare file names or file types. It compares file size and contents (which implicitly includes file type), so identical files are matched even when their names differ.
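To illustrate what matching on contents rather than names means, here is a minimal sketch that uses only find and cmp. It is not how fdupes is implemented, and the function name list_dupes is made up for this example. It assumes file names contain no newlines, and it compares every pair of files, so it is only practical for small directories.

```shell
#!/bin/sh
# list_dupes DIR1 DIR2 (hypothetical helper, not part of fdupes):
# print every file under DIR2 whose contents are byte-for-byte identical
# to some file under DIR1. File names are ignored. Nothing is deleted.
list_dupes() {
    find "$2" -type f | while IFS= read -r candidate; do
        find "$1" -type f | while IFS= read -r original; do
            # cmp -s prints nothing and exits 0 only if the files are identical
            if cmp -s -- "$original" "$candidate"; then
                printf '%s\n' "$candidate"
                break   # one match is enough for this candidate
            fi
        done
    done
}
```

Calling list_dupes dir1 dir2 prints the files in dir2 that duplicate something in dir1, which can be reviewed before removing anything.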

Example:

$ mkdir dir1 dir2

$ touch dir{1,2}/{a,b,c}

$ tree
.
|-- dir1
|   |-- a
|   |-- b
|   `-- c
`-- dir2
    |-- a
    |-- b
    `-- c

2 directories, 6 files

$ fdupes --delete dir1 dir2
[1] dir1/a
[2] dir1/b
[3] dir1/c
[4] dir2/a
[5] dir2/b
[6] dir2/c

Set 1 of 1, preserve files [1 - 6, all]: 1

   [+] dir1/a
   [-] dir1/b
   [-] dir1/c
   [-] dir2/a
   [-] dir2/b
   [-] dir2/c

$ tree
.
|-- dir1
|   `-- a
`-- dir2

2 directories, 1 file
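In the example above all six files are empty and therefore identical, so keeping one deletes the other five. Before running --delete on real data it may help to preview the duplicate sets first. The sketch below assumes fdupes is installed; the wrapper name preview_dupes is made up for this example.

```shell
#!/bin/sh
# preview_dupes DIR... (hypothetical wrapper): list duplicate sets only.
preview_dupes() {
    if ! command -v fdupes >/dev/null 2>&1; then
        echo "fdupes is not installed" >&2
        return 127
    fi
    # -r recurses into subdirectories. Without --delete, fdupes only prints
    # each set of identical files and removes nothing.
    fdupes -r "$@"
}
```

Running preview_dupes dir1 dir2 prints the sets so they can be checked before the same directories are passed to fdupes --delete.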

As an example, I'll use two directories, p1 and p2.

First, I save the file names from the p1 and p2 directories to two output files:

find /root/p1 -type f | awk -F "/" '{print $NF}' > /var/tmp/P1_file.txt

find /root/p2 -type f | awk -F "/" '{print $NF}' > /var/tmp/P2_file.txt

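Next, I find the file names common to both directories and delete them from one of the two. The command below deletes the duplicates in /root/p1 and keeps the files in /root/p2. Note that it matches on file name only, not on contents, and that it assumes the files sit directly in /root/p1, because the lists contain bare file names without any subdirectory.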

awk 'NR==FNR {a[$0]; next} ($0 in a)' /var/tmp/P1_file.txt /var/tmp/P2_file.txt | while IFS= read -r name; do rm -vf -- "/root/p1/$name"; done

Tested, and it worked fine.
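The same name-based cleanup can also be sketched without temporary list files and with a dry run by default. This is only an illustration under a few assumptions: the function name remove_same_name and the DRY_RUN variable are made up, only regular files directly inside the first directory are considered, and hidden files are skipped by the glob. Like the commands above, it compares names, not contents.

```shell
#!/bin/sh
# remove_same_name SRC KEEP (hypothetical helper):
# for each regular file directly inside SRC that has a same-named regular
# file in KEEP, report it, or delete it from SRC when DRY_RUN=0.
remove_same_name() {
    src=$1
    keep=$2
    for path in "$src"/*; do
        [ -f "$path" ] || continue      # skip directories and unmatched globs
        name=${path##*/}                # strip the directory part
        if [ -f "$keep/$name" ]; then
            if [ "${DRY_RUN:-1}" = 1 ]; then
                printf 'would remove %s\n' "$path"
            else
                rm -v -- "$path"
            fi
        fi
    done
}
```

remove_same_name /root/p1 /root/p2 only prints what would be removed. Setting DRY_RUN=0 before the call performs the deletion. Adding a cmp -s check on the two paths would restrict it to files whose contents also match.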


I suggest using dircmp, which is available on many Unix systems.

See:

man dircmp

The -d option is probably the most appropriate one here:

dircmp -d dir1 dir2

This compares the contents of same-named files in dir1 and dir2 and displays diff-like output.
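dircmp may not be installed everywhere. As a rough substitute, the sketch below lists the same-named files that are identical in both trees. It assumes a diff that supports -r and -s (GNU diffutils does) and relies on the English wording of its messages, hence LC_ALL=C. The function name identical_files is made up for this example.

```shell
#!/bin/sh
# identical_files DIR1 DIR2 (hypothetical helper):
# print one line per pair of same-named files whose contents are identical.
identical_files() {
    # -r walks both trees, -s reports files that are the same.
    # The grep keeps only those reports and drops the diff output.
    LC_ALL=C diff -rs -- "$1" "$2" | grep '^Files .* are identical$'
}
```

Each reported pair is a candidate for deletion from one of the two directories. Nothing is removed by the function itself.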