du wrongly reports empty directory

I can reproduce this if the files are hard links:

~ mkdir foo bar
~ dd if=/dev/urandom of=bar/file1 count=1k bs=1k
1024+0 records in
1024+0 records out
1048576 bytes (1.0 MB, 1.0 MiB) copied, 0.00985276 s, 106 MB/s
~ ln bar/file1 foo/file1
~ du -sh --apparent-size foo bar
1.1M    foo
4.0K    bar

This is expected behaviour. From the GNU du docs:

If two or more hard links point to the same file, only one of the hard links is counted. The file argument order affects which links are counted, and changing the argument order may change the numbers and entries that du outputs.

If you really need the size of a hard-linked file counted every time it appears, try the -l option:

-l
--count-links
Count the size of all files, even if they have appeared already (as a hard link).

~ du -sh --apparent-size foo bar -l
1.1M    foo
1.1M    bar
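
If you want to confirm that hard links really are the cause, here is a rough sketch that checks the link count and lists every name sharing an inode. The scratch directory is hypothetical, and -samefile is an extension found in GNU and BSD find rather than a POSIX option, so check man find on your system first.

```shell
#!/bin/sh
# Sketch: show a file's link count and list every path that shares its inode.
# The scratch directory below is hypothetical; foo and bar mirror the names above.
workdir=$(mktemp -d)
mkdir "$workdir/foo" "$workdir/bar"
printf 'hello\n' > "$workdir/bar/file1"
ln "$workdir/bar/file1" "$workdir/foo/file1"

# The second column of ls -l is the link count (2 here: one name per directory).
link_count=$(ls -l "$workdir/bar/file1" | awk '{print $2}')
echo "link count: $link_count"

# -samefile matches every path that refers to the same inode.
# -xdev keeps find on one filesystem, since hard links cannot cross filesystems.
names=$(find "$workdir" -xdev -samefile "$workdir/bar/file1" | sort)
echo "$names"

rm -rf "$workdir"
```

On real data you would point find at the top of the filesystem that holds the file, which can take a while on a large disk.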

Notice how the link count is 3 for the two files "Lightroom 5 Catalog Linux.lrcat" and "zbackup.bat" in Lightroom_catalog_from_win_backup.

This means that each of these two files has additional names (hard links) somewhere else on the same filesystem. When you run du on a directory or a set of files, a file with several hard links is only counted once, under the first name that du encounters.

Example:

$ ls -l
total 41024
-rw-r--r--  2 kk  wheel  10485760 Dec 17 09:07 file1
-rw-r--r--  2 kk  wheel  10485760 Dec 17 09:07 file2

$ du -h file1
10.0M   file1

$ du -h file2
10.0M   file2

$ du -h .
10.0M   .

This behaviour is explicitly mandated by the POSIX standard for the du utility:

A file that occurs multiple times under one file operand and that has a link count greater than 1 shall be counted and written for only one entry.

Some du implementations have non-standard options to disable this behaviour. For GNU du, this is done with the -l option.
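
To make the "counted only once" rule concrete, here is a minimal sketch that sums file sizes once per inode and compares that with summing every name. It is only an illustration of the idea, not how any particular du is implemented: it adds up byte sizes rather than disk blocks, and it assumes a single filesystem and owner/group names without spaces. The scratch directory and file names are hypothetical.

```shell
#!/bin/sh
# Sketch: count each inode once versus counting every name.
workdir=$(mktemp -d)
printf '0123456789' > "$workdir/file1"   # 10 bytes
ln "$workdir/file1" "$workdir/file2"     # second name for the same inode
printf 'abcde' > "$workdir/file3"        # 5 bytes, separate inode

# ls -li columns: inode, mode, link count, owner, group, size, ...
# Inode numbers are only unique per filesystem, hence -xdev.
unique_bytes=$(find "$workdir" -xdev -type f -exec ls -li {} + |
    awk '!seen[$1]++ { total += $6 } END { print total + 0 }')

# Same pipeline without the per-inode filter: every name is counted.
all_bytes=$(find "$workdir" -xdev -type f -exec ls -li {} + |
    awk '{ total += $6 } END { print total + 0 }')

echo "counting each inode once: $unique_bytes bytes"
echo "counting every name:      $all_bytes bytes"

rm -rf "$workdir"
```

The first figure corresponds to the default behaviour described above, the second to what an option like GNU's -l gives you.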


du is almost certainly working correctly. Within a single invocation, du counts each file only once, regardless of how many hard links refer to it. Your two directories probably contain the same set of hard-linked files.

The man page for GNU du documents -l, --count-links to switch off this standard behaviour (see man du to check whether your implementation includes it). Alternatively, run du twice, once for each directory.
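
A quick sketch of the "run du twice" approach, using only the POSIX -s and -k options. The scratch directory is hypothetical, and the exact numbers depend on your filesystem's block size.

```shell
#!/bin/sh
# Sketch: one du invocation versus one invocation per directory.
workdir=$(mktemp -d)
mkdir "$workdir/foo" "$workdir/bar"
# 1 MiB of random data, so it is neither sparse nor compressible.
dd if=/dev/urandom of="$workdir/bar/file1" bs=1024 count=1024 2>/dev/null
ln "$workdir/bar/file1" "$workdir/foo/file1"

# Single invocation: the shared file is attributed to only one of the two.
du -sk "$workdir/foo" "$workdir/bar"

# Separate invocations: each run starts fresh, so both include the file.
foo_kb=$(du -sk "$workdir/foo" | awk '{print $1}')
bar_kb=$(du -sk "$workdir/bar" | awk '{print $1}')
echo "foo: ${foo_kb}K  bar: ${bar_kb}K"

rm -rf "$workdir"
```

Keep in mind that adding the two separate figures together overstates the space actually used on disk, because the shared data exists only once.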
