Proper way to deal with corrupt XFS filesystems

If xfs_repair fails with an error message suggesting that you mount the filesystem to replay the log, and you still receive the same error after trying to mount it, you may need to perform a forced repair (using the -L flag with xfs_repair). This option should be a last resort.

As an example, I'll use a case where I had a corrupt root partition on my CentOS 7 install. Every attempt to mount the partition failed with the error message below:

mount: mount /dev/mapper/centos-root on /mnt/centos-root failed: Structure needs cleaning
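The sequence described above (mount to replay the log, unmount, then try a normal repair) can be sketched as a small shell helper. This is only a sketch under assumptions: the run wrapper, the DRY_RUN variable and the function name try_normal_repair are hypothetical names made up for illustration, and by default the commands are only printed rather than executed.

```shell
# Hypothetical wrapper: with DRY_RUN=1 (the default here) commands are only printed.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: attempt the non-destructive path before considering -L.
try_normal_repair() {
    dev=$1
    mnt=$2
    # A successful mount replays the log; unmount again because
    # xfs_repair needs the filesystem to be unmounted.
    if run mount "$dev" "$mnt"; then
        run umount "$mnt"
    else
        echo "mount failed, the log could not be replayed" >&2
    fi
    # A plain repair, without -L, so the log is not zeroed.
    run xfs_repair "$dev"
}
```

Calling try_normal_repair /dev/mapper/centos-root /mnt/centos-root prints the three commands; setting DRY_RUN=0 would actually run them. Only if the plain repair still refuses because of the log does the forced repair come into play.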

Unfortunately, forcing a repair involves zeroing out (destroying) the log before the repair is attempted. Any metadata changes still sitting in the log are discarded, so you can end up with more corrupt data than you initially anticipated. However, the xfs tools let us see what kind of damage a forced repair may cause before making any permanent changes.

Using xfs_metadump and xfs_mdrestore, you can create a metadata image of the affected partition and perform the forced repair on the image rather than on the partition itself. The benefit is that you can see the damage a forced repair would cause before performing it on the partition.

To do this, you'll need a decent-sized USB drive or external hard drive. Start by mounting the USB drive. Mine was located at /dev/sdb1; yours may be named differently.

mkdir -p /mnt/usb
mount /dev/sdb1 /mnt/usb
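Before writing anything to the drive, it can help to confirm that the mount actually succeeded, so the dump does not silently land on the rescue environment's own filesystem. A minimal sketch, assuming the mountpoint and df utilities are available in your rescue environment; the function name usb_ready is hypothetical:

```shell
# Hypothetical helper: succeed only if the directory is a mounted, writable target.
usb_ready() {
    dir=$1
    # mountpoint exits 0 only when dir is an active mount point.
    if ! mountpoint -q "$dir"; then
        echo "$dir is not a mount point" >&2
        return 1
    fi
    if [ ! -w "$dir" ]; then
        echo "$dir is not writable" >&2
        return 1
    fi
    # Show the free space so you can judge whether the images will fit.
    df -h "$dir"
}
```

For example, usb_ready /mnt/usb prints the free space on the drive, or returns a non-zero status with a message if /mnt/usb is not mounted.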

Once mounted, run xfs_metadump to copy the partition's metadata to the USB drive. Again, your affected partition may be different; in this case, I had a corrupt root partition located at /dev/mapper/centos-root:

xfs_metadump /dev/mapper/centos-root /mnt/usb/centos-root.metadump

Next, restore the metadata into an image so that we can perform a repair on it and measure the damage.

xfs_mdrestore /mnt/usb/centos-root.metadump /mnt/usb/centos-root.img

I found that xfs_mdrestore is not available in the regular rescue mode; instead, you'll need to boot into rescue mode from a live CentOS CD.
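The dump and restore steps can be wrapped together so the file names stay consistent. This is a sketch, not a definitive script: the run wrapper, DRY_RUN and make_metadata_image are hypothetical names, and the commands are only printed unless DRY_RUN=0 is set.

```shell
# Hypothetical wrapper: with DRY_RUN=1 (the default here) commands are only printed.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: dump the metadata of dev into outdir, then restore it to an image.
make_metadata_image() {
    dev=$1
    outdir=$2
    name=$(basename "$dev")
    # Note: per the xfs_metadump man page, file names are obfuscated by
    # default and -o disables that; the plain form from the text is used here.
    run xfs_metadump "$dev" "$outdir/$name.metadump" &&
        run xfs_mdrestore "$outdir/$name.metadump" "$outdir/$name.img"
}
```

With the defaults, make_metadata_image /dev/mapper/centos-root /mnt/usb prints the same two commands shown above. The restore only runs if the dump succeeded.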

Finally, we can perform the repair on the image:

xfs_repair -L /mnt/usb/centos-root.img

After the repair has completed and you've assessed the output and potential damage, you can decide whether to perform the repair against the partition.
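Assessing the output is easier if it is saved rather than left to scroll past on the console. A minimal sketch of the image repair with its report written to a file, assuming a writable location for the log; XFS_REPAIR and repair_image_logged are hypothetical names, and the variable exists only so the call can be rehearsed with a stand-in command:

```shell
# Assumption: XFS_REPAIR may be overridden with a stand-in command for a rehearsal.
XFS_REPAIR=${XFS_REPAIR:-xfs_repair}

# Hypothetical helper: forced repair of an image file, with the report saved to a log.
repair_image_logged() {
    img=$1
    log=$2
    # Capture both stdout and stderr so nothing from the report is lost.
    "$XFS_REPAIR" -L "$img" > "$log" 2>&1
    status=$?
    # Show the report on screen as well, then pass on the exit status.
    cat "$log"
    return "$status"
}
```

For example, repair_image_logged /mnt/usb/centos-root.img /mnt/usb/centos-root.repair.log leaves a copy of the report on the USB drive for later review. Since the metadata image contains no file data, the report tells you about structural damage, not about the contents of individual files.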

To run the repair against the partition, simply run:

xfs_repair -L /dev/mapper/centos-root

Don't forget to check the other partitions for corruption as well. After the repairs, reboot; the system should now boot successfully.
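Checking the other partitions can be done without changing anything by using the no-modify mode of xfs_repair (-n), which according to its man page exits with status 1 when corruption is detected and 0 otherwise. A sketch under assumptions: the device names in the usage note, XFS_REPAIR and check_xfs_devices are hypothetical, and the filesystems must be unmounted.

```shell
# Assumption: XFS_REPAIR may be overridden with a stand-in command for a rehearsal.
XFS_REPAIR=${XFS_REPAIR:-xfs_repair}

# Hypothetical helper: run a no-modify check on each device and summarize the result.
check_xfs_devices() {
    bad=0
    for dev in "$@"; do
        # -n only inspects the filesystem; nothing is written to the device.
        if "$XFS_REPAIR" -n "$dev" >/dev/null 2>&1; then
            echo "$dev: no corruption reported"
        else
            echo "$dev: needs attention"
            bad=1
        fi
    done
    return "$bad"
}
```

For example, check_xfs_devices /dev/sda1 /dev/mapper/centos-home prints one line per device and returns non-zero if any of them needs attention. Run xfs_repair -n by hand on a flagged device to see the full report.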

Remember that the -L flag should be used only as a last resort, when no other repair option is left.

I found that these online articles helped:

  • https://web.archive.org/web/20140920034637/http://geekblood.com/2014/08/13/filesystem-corruption-xfs-and-rhelv7/
  • https://web.archive.org/web/20160319163101/http://oss.sgi.com/archives/xfs/2015-01/msg00503.html
  • http://dhoytt.com/blog/2015/07/26/xfs-filesystem-repair-gets-web-server-back/