ZFS - Is RAIDZ-1 really that bad?

Solution 1:

Before we go into specifics, consider your use case. Are you storing photos, MP3s and DVD rips? If so, you might not care whether you permanently lose a single block from the array. On the other hand, if it's important data, this might be a disaster.

The claim that RAIDZ-1 is "not good enough for real world failures" comes from the fact that you are likely to hit a latent media error on one of your surviving disks when reconstruction time comes. The same logic applies to RAID5.

ZFS mitigates this failure to some extent. If a RAID5 device can't be reconstructed, you are pretty much out of luck; copy your (remaining) data off and rebuild from scratch. ZFS, on the other hand, will reconstruct everything but the bad chunk, and let the administrator "clear" the errors. You'll lose a file or a portion of a file, but you won't lose the entire array. And, of course, ZFS's checksumming means that you will be reliably informed that there's an error. Without checksums, I believe it's possible (although unlikely) for multiple errors to result in a rebuild that apparently succeeds but gives you back bad data.

Since ZFS is a "Rampant Layering Violation," it also knows which areas don't have data on them, and can skip them in the rebuild. So if your array is half empty, you're roughly half as likely to have a rebuild error.

You can reduce the likelihood of these kinds of rebuild errors on any RAID level by doing regular "zpool scrubs" or "mdadm checks" of your array. There are similar commands/processes for other RAIDs; e.g., LSI/Dell PERC RAID cards call this "patrol read." These read everything on the array, which may help the disk drives find failing sectors and reassign them before the damage becomes permanent. If a sector is already permanently unreadable, the RAID system (ZFS/md/RAID card/whatever) can rebuild the data from parity.
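For illustration, here is a minimal sketch of kicking off those checks from a script, e.g. from a monthly cron job. The helper names are made up for this example, and the pool name `tank` and array name `md0` are placeholders. It assumes the usual Linux md interface of writing `check` to the array's `sync_action` file in sysfs, and both actions need root.

```python
import subprocess


def zfs_scrub_command(pool):
    """Build the argv for starting a scrub on a ZFS pool."""
    return ["zpool", "scrub", pool]


def md_sync_action_path(md_device):
    """Return the sysfs file that controls checks on a Linux md array."""
    return f"/sys/block/{md_device}/md/sync_action"


def start_zfs_scrub(pool):
    """Start a scrub; it runs in the background, see `zpool status`."""
    subprocess.run(zfs_scrub_command(pool), check=True)


def start_md_check(md_device):
    """Ask the md driver to read and verify the whole array."""
    with open(md_sync_action_path(md_device), "w") as f:
        f.write("check\n")


# Example usage (placeholder names, run as root):
#   start_zfs_scrub("tank")
#   start_md_check("md0")
```

Many distributions already ship a scheduled job that does this for md arrays, so check what is installed before adding your own.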

Even if you use RAIDZ2 or RAID6, regular scrubs are important.

One final note - RAID of any sort is not a substitute for backups - it won't protect you against accidental deletion, ransomware, etc. Regular ZFS snapshots can be part of a backup strategy, though.

Solution 2:

There is a little bit of a misconception at work here. A lot of the advice you're seeing is based on an assumption which may not be true: specifically, the unrecoverable bit error rate of your drive.

A cheap 'home user' disk has an unrecoverable error rate of 1 per 10^14 bits read.


This is at a level where you're talking about a significant likelihood of an unrecoverable error during a RAID rebuild, and so you shouldn't do it. (A quick and dirty calculation suggests that a RAID-5 set of 5x 2TB disks will actually have around a 60% chance of this.)

However this isn't true for more expensive drives: http://www.seagate.com/gb/en/internal-hard-drives/enterprise-hard-drives/hdd/enterprise-performance-15k-hdd/#specs

1 per 10^16 is 100x better - meaning 5x 2TB has a <1% chance of a failed rebuild. (Probably less, because for enterprise usage, 600GB spindles are generally more useful.)
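As a rough sketch of where numbers like these come from, the estimate can be reproduced in a few lines. This assumes every bit read fails independently at the quoted rate, which is a simplification since real media errors tend to cluster, and that a rebuild reads the four surviving disks in full. The function name and parameters are my own, not taken from any tool.

```python
import math


def rebuild_failure_probability(surviving_disks, disk_bytes, ure_per_bit,
                                fill_fraction=1.0):
    """Chance of hitting at least one unrecoverable read error in a rebuild.

    fill_fraction lets you model a rebuild that only reads allocated data
    (as ZFS does); 1.0 means every surviving disk is read in full.
    """
    bits_read = surviving_disks * disk_bytes * 8 * fill_fraction
    # P(at least one error) = 1 - (1 - p)^n, computed via log1p/expm1
    # so that very small p does not get lost to floating-point rounding.
    return -math.expm1(bits_read * math.log1p(-ure_per_bit))


TB = 10**12

# 5x 2TB RAID-5 with one disk failed: 4 surviving disks are read in full.
consumer = rebuild_failure_probability(4, 2 * TB, 1e-14)
enterprise = rebuild_failure_probability(4, 2 * TB, 1e-16)

print(f"1 per 10^14: {consumer:.1%}")
print(f"1 per 10^16: {enterprise:.2%}")
```

Under these assumptions the consumer figure comes out at a bit under 50% and the enterprise figure well under 1%, which is the same ballpark as the quick estimates above. Passing `fill_fraction=0.5` shows how much a half-empty pool helps when only allocated data is read.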

So personally - I think both RAID-5 and RAID-4 are still eminently usable, for all the reasons RAID-0 is still fairly common. Don't forget - the problem with RAID-6 is its hefty write penalty. You can partially mitigate this with lots of caching, but you've still got some pain built in, especially when you're working with slow drives in the first place.

And more fundamentally - NEVER EVER trust your RAID to give you full resilience. You'll lose data more often to an 'oops' than to a drive failure, so you NEED a decent backup strategy if you care about your data anyway.

Solution 3:

Hmmm, some bad information here. For 4 disks, there's really nothing wrong with XFS. I tend to avoid ZFS RAIDZ for performance and expandability reasons (low read/write performance, can't be expanded). Use ZFS mirrors if you can. However, with 4 disks and nowhere to place your OS, you'll either lose a lot of capacity or have to play odd partitioning games to fit your OS and data onto the same four disks.

I'd probably not recommend ZFS for your use case. There's nothing wrong with XFS here.