Is there a filesystem that stores only one copy of a file's data and makes all other copies just references to it?

This feature is called deduplication. None of the popular Linux filesystems (ext*) supports it, but ZFS apparently supports it partially. There is also a table of filesystems that lists deduplication among other features, but none of the filesystems offering it appears to be a popular choice. It is a planned feature for Btrfs, though.

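I would guess that periodically scanning your filesystem and replacing duplicates with hard links is the best you can do at the moment. Note that hard links do not give you copy-on-write: all the links share the same data, so modifying the file through one name changes it under every other name too.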
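The periodic hard-link pass could look roughly like the sketch below. It is only an illustration, not an existing tool: the function names `file_digest` and `hardlink_duplicates` are made up for this example, it assumes all files live on the same filesystem (hard links cannot cross filesystems), and it ignores differences in ownership, permissions and timestamps between duplicates, which a real tool would have to consider.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def hardlink_duplicates(root):
    """Replace duplicate regular files under root with hard links.

    Returns the number of files that were replaced by a link.
    """
    by_content = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks and anything that is not a regular file.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            key = (os.path.getsize(path), file_digest(path))
            by_content[key].append(path)

    replaced = 0
    for paths in by_content.values():
        paths.sort()
        original = paths[0]
        for dup in paths[1:]:
            # Already the same inode: nothing to do.
            if os.path.samefile(original, dup):
                continue
            # Create the link under a temporary name, then rename it
            # over the duplicate so the path never disappears.
            tmp = dup + ".dedup-tmp"
            os.link(original, tmp)
            os.replace(tmp, dup)
            replaced += 1
    return replaced
```

Calling `hardlink_duplicates("/some/directory")` (a placeholder path) from a cron job would give the "periodic check" behaviour. Keep in mind the caveat above: after the pass, editing one of the linked files in place edits all of them.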


The primary keyword to look for is "copy on write" (CoW). Btrfs has a clone operation that does exactly what you want, and cp --reflink will use it, provided your system has a modern enough kernel and coreutils 7.5 or later (source: wiki). Also, bedup is a tool that will merge duplicates across an entire volume. CoW is also the driving feature underneath Btrfs's snapshotting technology, IIRC.
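For anyone who wants the clone operation from a script rather than from cp, here is a minimal sketch. It assumes Linux and uses the FICLONE ioctl; the request number is hard-coded by hand and is the value used on common architectures such as x86 and ARM, so treat it as an assumption. The function name `reflink_copy` is made up for this example. If the filesystem cannot clone, it falls back to an ordinary copy.

```python
import fcntl
import shutil

# Assumption: Linux FICLONE ioctl request number on common
# architectures (x86, ARM), defined here by hand.
FICLONE = 0x40049409


def reflink_copy(src, dst):
    """Copy src to dst, sharing data blocks if the filesystem allows it.

    Returns True if a reflink (clone) was made, False if the function
    fell back to a regular full copy.
    """
    with open(src, "rb") as fsrc, open(dst, "wb") as fdst:
        try:
            # Ask the kernel to make dst share src's data blocks.
            fcntl.ioctl(fdst.fileno(), FICLONE, fsrc.fileno())
            return True
        except OSError:
            # No clone support here, or src and dst are on different
            # filesystems: fall through to a plain copy.
            pass
    shutil.copyfile(src, dst)
    return False
```

Unlike a hard link, the result is an independent file either way: writing to the copy never changes the original, and on a CoW filesystem only the modified blocks end up taking extra space.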


There is also S3QL, an online file system designed for backups, which has strong deduplication capabilities.