What are the main reasons to avoid RAID5 with SSDs?

Solution 1:

Your reasoning is correct, though you're missing the scale of the problem.

Enterprise SSDs are being made with higher-endurance MLC cells and can tolerate very high write rates. SLC still blows high-endurance MLC out of the water, but in most cases the lifetime write endurance of HE-MLC exceeds the expected operational lifetime of an SSD.

These days, endurance is being listed as "Lifetime Writes" on spec-sheets.

As an example, the spec sheet for the Seagate 600 Pro SSD line lists roughly the following:

Model   Endurance
100GB       220TB
200GB       520TB
400GB      1080TB

Given a 5 year operational life, to reach the listed endurance for that 100GB drive, you need to write 123GB to that drive per day. That may be too little for you, which is why there are even higher endurance drives on the market. Stec, an OEM provider for certain top-tier vendors, has drives listed for "10x full-drive writes for 5 years". These are all eMLC devices.
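To make the arithmetic behind that 123GB figure explicit, here is a minimal sketch. It assumes 365-day years and 1TB = 1024GB, which is the combination that reproduces the number above; with decimal units (1TB = 1000GB) the result is about 121GB per day. The function name and defaults are mine, not from any spec sheet.

```python
def daily_writes_gb(endurance_tb, years=5, gb_per_tb=1024):
    """GB you would have to write per day to use up the rated endurance in `years`."""
    days = years * 365  # assumption: ignore leap days
    return endurance_tb * gb_per_tb / days


# Capacity (GB) and lifetime writes (TB) from the table above.
drives = [(100, 220), (200, 520), (400, 1080)]

for capacity_gb, endurance_tb in drives:
    per_day = daily_writes_gb(endurance_tb)
    full_drive_writes = per_day / capacity_gb
    print(f"{capacity_gb}GB drive: {per_day:.0f} GB/day "
          f"({full_drive_writes:.2f} full-drive writes per day)")
```

If your expected daily write volume is well below these numbers, the rated endurance outlasts the 5 year window.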

Yes, RAID 5 does incur write amplification. However, it doesn't matter under most use-cases.
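For anyone wondering where that amplification comes from: a small write to one block in a RAID 5 stripe also has to update the parity block, so one logical write becomes two device writes (plus two reads). The sketch below shows the standard XOR read-modify-write on toy byte strings. The function names are illustrative only, and real controllers work on much larger chunks and often batch full-stripe writes.

```python
def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def small_write(old_data: bytes, old_parity: bytes, new_data: bytes):
    """Read-modify-write: new parity = old parity XOR old data XOR new data.

    Returns the two blocks that must be written to the devices.
    """
    new_parity = xor_blocks(xor_blocks(old_parity, old_data), new_data)
    return new_data, new_parity


# Toy stripe: three data blocks plus one parity block.
d0, d1, d2 = b"\x01\x02", b"\x10\x20", b"\x0f\xf0"
parity = xor_blocks(xor_blocks(d0, d1), d2)

# Overwrite d1 only; the parity block has to be rewritten too.
new_d1, new_parity = small_write(d1, parity, b"\xaa\x55")

# Sanity check: the shortcut matches recomputing parity over the whole stripe.
assert new_parity == xor_blocks(xor_blocks(d0, new_d1), d2)
```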


There is another issue here, as well. SSDs can take writes (and reads) so fast that the I/O bottleneck moves to the RAID controller. This was already the case with spinning metal drives, but is put into stark light when SSDs are involved. Parity computation is expensive, and you'll be hard pressed to get the full I/O performance of the drives out of an R5 LUN created with SSDs.

Solution 2:

I found 2 research papers about this topic:

  1. Parity update increases write workload and space utilization

    Introduction

    [...] The results from our analytical model show that RAID5 is less reliable than striping with a small number of devices because of write amplification.

    Conclusion

    [...] Different factors such as the number of devices and the amount of data are explored, and the results imply that RAID5 is not universally beneficial in improving the reliability of SSD based systems

    Source: Don’t Let RAID Raid the Lifetime of Your SSD Array
    (Published 02/2012)

  2. Equal aging of all SSDs imposes risk of simultaneous failure (RAID1 & RAID6 affected too!)

    Abstract

    [...] Redundancy solutions such as RAID can potentially be used to protect against the high Bit Error Rate (BER) of aging SSDs. Unfortunately, such solutions wear out redundant devices at similar rates, inducing correlated failures as arrays age in unison. [...]

    5. Simulation Results

    [...] Conventional RAID-5 causes all SSDs age in lock-step fashion, and conventional RAID-4 does so with the data devices; as a result, the probability of data loss on an SSD failure climbs to almost 1 for both solutions as the array ages, and periodically resets to almost zero whenever all SSDs are replaced simultaneously. [...]

    Source: Differential RAID: Rethinking RAID for SSD Reliability
    (Published 03/2012)

    To protect against this, the paper proposes a new RAID level called Diff-RAID, which automatically performs age-driven shuffling on device replacements.

    You can protect against this yourself by manually checking the SSD wear-out indicator and proactively replacing drives with spare disks, so that at no time do multiple disks have the same critical age.
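    A rough sketch of that manual check is below. It assumes you have already collected a "percent of rated endurance used" value per drive from whatever wear indicator your drives expose; the drive names, the thresholds and the function itself are hypothetical, not taken from the paper or any tool.

```python
def drives_to_replace_early(wear_pct, critical=50.0, min_gap=10.0):
    """Flag drives that are past `critical` wear and within `min_gap`
    percentage points of a more worn drive that stays in the array."""
    flagged = []
    last_kept = None
    # Walk from most worn to least worn.
    for name, wear in sorted(wear_pct.items(), key=lambda kv: kv[1], reverse=True):
        if wear < critical:
            break  # remaining drives are still young enough to ignore
        if last_kept is not None and last_kept - wear < min_gap:
            flagged.append(name)  # too close in age to a kept drive
        else:
            last_kept = wear
    return flagged


# Hypothetical readings: three drives aged in lock-step, one replaced recently.
wear = {"ssd0": 81.0, "ssd1": 80.0, "ssd2": 79.0, "ssd3": 40.0}
print(drives_to_replace_early(wear))
```

    Swapping the flagged drives for spares one at a time, with a rebuild in between, spreads the ages out so the array no longer wears out in unison.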


Solution 3:

Parity RAID will thrash your $300 desktop SATA SSD. It will not even put a dent in a $3000 enterprise grade SSD.

It's all about what you're shopping for and what your use case is. SSD is a much more mature technology than it used to be. On the high end, their MTBF and max writes are approaching the same sort of reliability as mechanical HDDs.

One reason that you may not want to use parity RAID on SSD is that you can quickly saturate a backplane or controller bus with a large many-member SSD RAID group. There are diminishing returns very quickly with the read speed of high end SSDs and the bus/backplane bandwidth of current RAID controllers. Not to mention that if these are hosting data that is dished out over the network, it's entirely possible that your network interfaces will be the bottleneck before the disk IO is when you're talking about large SSD RAIDs.

Basically, write lifetime isn't that big of a deal unless you're building your "server" from Newegg, but there are some other reasons why you may be wasting money putting SSDs into large parity RAID sets.

Tags:

Ssd

Raid5