Detailed sparse file information on Linux

There is a similar question on SO. The currently accepted answer by @ephemient suggests using the fiemap ioctl, which is documented in linux/Documentation/filesystems/fiemap.txt. Quoting from that file:

The fiemap ioctl is an efficient method for userspace to get file extent mappings. Instead of block-by-block mapping (such as bmap), fiemap returns a list of extents.

Sounds like this is the kind of information you're looking for. Support by filesystems is again optional:

File systems wishing to support fiemap must implement a ->fiemap callback on their inode_operations structure.
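To give an idea of what calling fiemap from userspace involves, here is a rough Python sketch. The ioctl number, the flag values and the struct layouts are my own transcription of linux/fs.h and linux/fiemap.h, so treat them as assumptions and check them against your kernel headers. The helper names fiemap and parse_extents are made up for this example, and only the first max_extents extents are fetched.

```python
import fcntl
import struct

# Assumed values, transcribed from the kernel headers:
FS_IOC_FIEMAP = 0xC020660B   # _IOWR('f', 11, struct fiemap)
FIEMAP_FLAG_SYNC = 0x1       # flush the file before mapping it
FIEMAP_EXTENT_LAST = 0x1     # set on the last extent of the file

# struct fiemap header: fm_start, fm_length, fm_flags,
# fm_mapped_extents, fm_extent_count, fm_reserved
HEADER = struct.Struct("=QQIIII")
# struct fiemap_extent: fe_logical, fe_physical, fe_length,
# two reserved u64, fe_flags, three reserved u32
EXTENT = struct.Struct("=QQQQQIIII")


def parse_extents(buf):
    """Decode a buffer filled in by the ioctl into a list of
    (logical_offset, physical_offset, length, flags) tuples."""
    mapped = HEADER.unpack_from(buf, 0)[3]
    extents = []
    for i in range(mapped):
        fields = EXTENT.unpack_from(buf, HEADER.size + i * EXTENT.size)
        extents.append((fields[0], fields[1], fields[2], fields[5]))
    return extents


def fiemap(fd, max_extents=128):
    """Ask the kernel for up to max_extents extents of the open file fd."""
    buf = bytearray(HEADER.size + max_extents * EXTENT.size)
    # Map the whole file: start at 0, length of all ones.
    HEADER.pack_into(buf, 0, 0, 0xFFFFFFFFFFFFFFFF,
                     FIEMAP_FLAG_SYNC, 0, max_extents, 0)
    fcntl.ioctl(fd, FS_IOC_FIEMAP, buf)  # the buffer is modified in place
    return parse_extents(buf)
```

Holes do not show up as entries of their own: they are the gaps between the logical ranges of consecutive extents. If the filesystem has no ->fiemap callback, the ioctl call should fail with an OSError.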

Support for the SEEK_DATA and SEEK_HOLE arguments to lseek, which you mentioned from Solaris, was added in Linux 3.1 according to the man page, so you could use those as well. The fiemap ioctl is older (it was merged in Linux 2.6.28), so it may be more portable across Linux versions for now, whereas lseek may be more portable across operating systems, provided Solaris implements the same semantics.
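As a minimal sketch of the lseek approach, the following Python function walks a file by alternating SEEK_DATA and SEEK_HOLE. The name sparse_layout is made up for this example. It relies on os.SEEK_DATA and os.SEEK_HOLE, which Python exposes only on platforms that support them, and on lseek failing with ENXIO when there is no data at or after the given offset.

```python
import errno
import os


def sparse_layout(path):
    """Return a list of (kind, offset, length) tuples covering the file,
    where kind is either "DATA" or "HOLE"."""
    segments = []
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        offset = 0
        while offset < size:
            try:
                # Next offset >= offset that contains data.
                data = os.lseek(fd, offset, os.SEEK_DATA)
            except OSError as exc:
                if exc.errno != errno.ENXIO:
                    raise
                # No more data: the rest of the file is one hole.
                data = size
            if data > offset:
                segments.append(("HOLE", offset, data - offset))
            if data >= size:
                break
            # Next hole after the data; end of file counts as a hole.
            hole = os.lseek(fd, data, os.SEEK_HOLE)
            segments.append(("DATA", data, hole - data))
            offset = hole
    finally:
        os.close(fd)
    return segments
```

On a filesystem that does not track holes, the whole file is simply reported as a single DATA segment, and the reported boundaries follow the filesystem's allocation granularity rather than the exact byte ranges that were written.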

There is a collection of Python programs called sparseutils that use SEEK_HOLE and SEEK_DATA to determine which sections of a file are holes and which are data. Usage is straightforward: mksparse generates a sparse file according to a given layout.

 $ echo hole,data,hole | mksparse --hole-size 4096 --data-size 4096 example
 $ du -sh example
 4.0K   example

The sparsemap program can be used to print the layout to stdout:

 $ sparsemap example
 HOLE 4096
 DATA 4096
 HOLE 4096