How to verify a file copy is reflink/CoW?

Good question. Looks like there aren't currently any easy high-level ways to tell.

One problem is that two files may share only part of their data via Copy-on-Write. File data is stored in physical extents, and some or all of those physical extents may be shared between CoW files.

There is nothing analogous to an inode which, when compared between files, would tell you that the files share the same physical extents. (Edit: see my other answer).

The low-level answer is that you can ask the kernel which physical extents are used for a file using the FS_IOC_FIEMAP ioctl, which is documented in Documentation/filesystems/fiemap.txt. In principle, if two files report exactly the same physical extents, then they must be sharing the same underlying storage.
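As a rough illustration of the ioctl approach, here is a minimal Python sketch. The struct layouts and constants are copied from linux/fiemap.h as I understand them, and the request number 0xC020660B is what _IOWR('f', 11, struct fiemap) works out to on x86 and ARM; other architectures may encode it differently. The function names (fiemap, parse_extents, same_physical_extents) are my own, not part of any library. Treat it as a starting point rather than a tested tool.

```python
import fcntl
import struct

# Assumed values from <linux/fs.h> and <linux/fiemap.h> (x86/ARM encoding).
FS_IOC_FIEMAP = 0xC020660B        # _IOWR('f', 11, struct fiemap)
FIEMAP_FLAG_SYNC = 0x1            # flush dirty data before mapping
FIEMAP_EXTENT_LAST = 0x1          # last extent in the file
FIEMAP_EXTENT_SHARED = 0x2000     # extent is shared with some other file
MAX_OFFSET = 0xFFFFFFFFFFFFFFFF

HEADER = struct.Struct("=QQIIII")    # struct fiemap, without the extent array
EXTENT = struct.Struct("=QQQ2QI3I")  # struct fiemap_extent


def parse_extents(buf):
    """Decode a buffer filled in by FS_IOC_FIEMAP.

    Returns a list of (logical, physical, length, flags) tuples.
    """
    mapped = HEADER.unpack_from(buf, 0)[3]  # fm_mapped_extents
    extents = []
    for i in range(mapped):
        f = EXTENT.unpack_from(buf, HEADER.size + i * EXTENT.size)
        extents.append((f[0], f[1], f[2], f[5]))
    return extents


def fiemap(path, batch=128):
    """Return every extent of `path`, querying the kernel in batches."""
    extents, start = [], 0
    with open(path, "rb") as fh:
        while True:
            buf = bytearray(HEADER.size + batch * EXTENT.size)
            HEADER.pack_into(buf, 0, start, MAX_OFFSET - start,
                             FIEMAP_FLAG_SYNC, 0, batch, 0)
            fcntl.ioctl(fh.fileno(), FS_IOC_FIEMAP, buf)  # fills buf in place
            got = parse_extents(buf)
            extents.extend(got)
            if not got or got[-1][3] & FIEMAP_EXTENT_LAST:
                return extents
            start = got[-1][0] + got[-1][2]  # continue after the last extent


def same_physical_extents(a, b):
    """True if both extent lists map the same logical ranges to the same
    physical ranges. Flags are ignored; empty lists prove nothing."""
    def strip(exts):
        return [(lo, ph, ln) for lo, ph, ln, _flags in exts]
    return bool(a) and strip(a) == strip(b)
```

You would then call something like same_physical_extents(fiemap("a"), fiemap("b")). Note that the FIEMAP_EXTENT_SHARED flag only says an extent is shared with something, not with which file, so comparing the physical offsets is still needed. Small files stored inline in metadata may not report a meaningful physical offset at all, and not every filesystem supports the ioctl.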

Few tools expose this information at a higher level. I found some Go code here. Apparently the filefrag utility shows the extents with -v. In addition, btrfs-debug-tree shows this information.

I would exercise caution, however. These tools may have seen little use in the wild for this purpose, so you could hit bugs that give you wrong answers. Be wary of relying on this data when deciding on operations which could cause data corruption.

Some related questions:

  • How to find out if a file on btrfs is copy-on-write?
  • How to find data copies of a given file in Btrfs filesystem?

Further to my previous answer, I have just released fienode, which computes a SHA1 hash of the physical extents of a file and can be used to find some (identical) reflink copies. Beware though, there are caveats (see the documentation). In my testing, BTRFS changed some, but not all, of the physical extents of a reflink copy I made, without provocation or warning, causing the value to change.
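To give a feel for the idea (this is not fienode's actual code or output format, just a sketch of the general technique), hashing the physical extents could look something like the following. The function name extent_fingerprint and the packing of each extent as two little-endian 64-bit integers are assumptions of mine; the extent list is assumed to come from something like FS_IOC_FIEMAP.

```python
import hashlib
import struct


def extent_fingerprint(extents):
    """SHA-1 over the (physical offset, length) pairs of a file, in file order.

    `extents` is an iterable of (physical, length) integer pairs. Two files
    whose extents are all shared should produce the same digest.
    """
    h = hashlib.sha1()
    for physical, length in extents:
        # Hypothetical encoding: two little-endian unsigned 64-bit values.
        h.update(struct.pack("<QQ", physical, length))
    return h.hexdigest()


# Usage: identical extent lists give identical fingerprints.
original = [(4096, 8192), (65536, 4096)]
reflink_copy = [(4096, 8192), (65536, 4096)]
print(extent_fingerprint(original) == extent_fingerprint(reflink_copy))
```

The weakness is the one described above: if the filesystem relocates even a single extent of one copy, the fingerprints no longer match, even though most of the data may still be shared.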