Software vs hardware RAID performance and cache usage

Solution 1:

In short: if using a low-end RAID card (without cache), do yourself a favor and switch to software RAID. If using a mid-to-high-end card (with BBU or NVRAM), then hardware is often (but not always! see below) a good choice.

Long answer: when computing power was limited, hardware RAID cards had the significant advantage of offloading parity/syndrome calculation for the RAID schemes that involve it (RAID 3/4/5, RAID6, etc.).

However, with ever-increasing CPU performance, this advantage has basically disappeared: even my laptop's ancient CPU (Core i5 M 520, Westmere generation) has XOR performance of over 4 GB/s and RAID-6 syndrome performance of over 3 GB/s on a single execution core.
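
To make concrete what "parity calculation" means here, the following is a minimal Python sketch of single-parity (RAID 5 style) XOR. The block contents and the helper name `xor_blocks` are made up for illustration; real implementations work on whole chunks with SIMD-optimized kernels, not byte-by-byte Python.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three hypothetical data blocks, one per data disk in a stripe.
data = [b"AAAA", b"BBBB", b"CCCC"]

# The parity block is simply the XOR of all data blocks.
parity = xor_blocks(data)

# Simulate losing the second disk: XOR-ing the survivors with the
# parity block yields the missing block again.
rebuilt = xor_blocks([data[0], data[2], parity])
```

The same operation serves both for writing parity and for rebuilding a lost block, which is why it is cheap enough for a modern CPU to do in software.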

The advantage that hardware RAID maintains today is the presence of a power-loss protected DRAM cache, in the form of BBU or NVRAM. This protected cache gives very low latency for random write access (and for reads that hit it) and basically transforms random writes into sequential writes. A RAID controller without such a cache is nearly useless. Moreover, some low-end RAID controllers not only come without a cache, but also forcibly disable the disk's private DRAM cache, leading to slower performance than with no RAID card at all. Examples are DELL's PERC H200 and H300 cards: unless newer firmware has changed that, they totally disable the disk's private cache (and it cannot be re-enabled while the disks are connected to the RAID controller). Do yourself a favor and never, ever buy such controllers. While even higher-end controllers often disable the disk's private cache, they at least have their own protected cache - making the HDD's (but not the SSD's!) private cache somewhat redundant.

This is not the end, though. Even capable controllers (the ones with BBU or NVRAM cache) can give inconsistent results when used with SSDs, basically because SSDs really need a fast private cache for efficient FLASH page programming/erasing. And while some (most?) controllers let you re-enable the disk's private cache (eg: PERC H700/710/710P let the user re-enable it), if that private cache is not power-loss protected you risk losing data in case of power loss. The exact behavior really is controller and firmware dependent (eg: on a DELL S6/i with 256 MB WB cache and the disk's cache enabled, I had no losses during multiple planned power-loss tests), which creates uncertainty and much concern.

Open source software RAIDs, on the other hand, are much more controllable beasts - their software is not enclosed inside proprietary firmware, and they have well-defined metadata patterns and behaviors. Software RAID makes the (right) assumption that the disk's private DRAM cache is not protected, but at the same time that it is critical for acceptable performance - so it typically does not disable the cache, but rather uses ATA FLUSH / FUA commands to be certain that critical data lands on stable storage. As software RAID often runs from the SATA ports attached to the chipset southbridge (SB), its bandwidth is very good and driver support is excellent.

However, if used with mechanical HDDs, synchronized, random write access patterns (eg: databases, virtual machines) will greatly suffer compared to a hardware RAID controller with WB cache. On the other hand, when used with enterprise SSDs (ie: with a powerloss-protected write cache), software RAID often excels and gives results even higher than what is achievable with hardware RAID cards. That said, you have to remember that consumer SSDs (read: with non-protected writeback cache), while very good at reading and async writing, deliver very low IOPS in synchronized write workloads.
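
The gap between async and synchronized writes can be observed with a small Python probe like the one below. It is only a rough sketch: the function names, block size and write count are arbitrary choices, it issues sequential appends rather than random writes, and a dedicated I/O benchmark tool will give far more reliable numbers. It assumes that `os.fsync` makes the OS push data to stable storage, which is how synchronous workloads typically behave.

```python
import os
import tempfile
import time

def timed_writes(path, count=200, block=4096, sync=False):
    """Write `count` blocks to `path`; fsync after each block when sync=True."""
    payload = b"\0" * block
    start = time.perf_counter()
    with open(path, "wb") as f:
        for _ in range(count):
            f.write(payload)
            if sync:
                f.flush()             # push Python's buffer to the OS
                os.fsync(f.fileno())  # ask the OS to reach stable storage
    return time.perf_counter() - start

def compare(directory=None):
    """Return (buffered_seconds, synced_seconds) for the same write pattern."""
    with tempfile.TemporaryDirectory(dir=directory) as d:
        path = os.path.join(d, "probe.bin")
        buffered = timed_writes(path, sync=False)
        synced = timed_writes(path, sync=True)
    return buffered, synced
```

Pointing `compare(directory=...)` at a directory on the array under test gives a first impression of how much each synchronous write costs on that device.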

Also consider that software RAIDs are not all created equal. Windows software RAID has a bad reputation, performance wise, and even Storage Spaces seems not too different. Linux MD Raid is exceptionally fast and versatile, but the Linux I/O stack is composed of multiple independent pieces that you need to carefully understand to extract maximum performance. ZFS parity RAID (RAIDZ) is extremely advanced but, if not correctly configured, can give you very poor IOPS; mirroring+striping, on the other hand, performs quite well. Either way, ZFS needs a fast SLOG device for synchronous write handling (ZIL).

Bottom line:

  1. if your workloads are not sensitive to synchronized random writes, you don't need a RAID card
  2. if you need a RAID card, do not buy a RAID controller without WB cache
  3. if you plan to use SSDs, software RAID is preferred, but keep in mind that for high synchronized random write loads you need a powerloss-protected SSD (ie: Intel S4600, Samsung PM/SM863, etc). For pure performance the best choice is probably Linux MD Raid, but nowadays I generally use striped ZFS mirrors. If you cannot afford to lose half the space to mirrors and you need ZFS's advanced features, go with RAIDZ but think carefully about your VDEV setup.
  4. if you, even when using SSDs, really need a hardware RAID card, use SSDs with power-loss-protected caches (Micron M500/550/600 have partial protection - not really sufficient but better than nothing - while Intel DC and S series have full power loss protection, and the same can be said for enterprise Samsung SSDs)
  5. if you need RAID6 and you will use normal, mechanical HDDs, consider buying a fast RAID card with 512 MB (or more) of WB cache. RAID6 has a high write performance penalty, and a properly-sized WB cache can at least provide fast intermediate storage for small synchronous writes (eg: filesystem journal).
  6. if you need RAID6 with HDDs but you can't / don't want to buy a hardware RAID card, think carefully about your software RAID setup. For example, a possible solution with Linux MD Raid is to use two arrays: a small RAID10 array for journal writes / DB logs, and a RAID6 array for raw storage (as fileserver). On the other hand, software RAID5/6 with SSDs is very fast, so you probably don't need a RAID card for an all-SSD setup.
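
The RAID6 write penalty mentioned in points 5 and 6 can be put into rough numbers. The sketch below uses the common rule-of-thumb penalties for small random writes (2 back-end I/Os per write for RAID10, 4 for RAID5, 6 for RAID6) and a hypothetical 150 IOPS per mechanical disk; both are assumptions for illustration, not measurements, and they ignore any caching.

```python
# Rule-of-thumb back-end I/Os needed per small random write (assumption).
WRITE_PENALTY = {"raid0": 1, "raid10": 2, "raid5": 4, "raid6": 6}

def random_write_iops(level, disks, iops_per_disk):
    """Estimate uncached random-write IOPS of an array."""
    return disks * iops_per_disk / WRITE_PENALTY[level]

# Eight hypothetical HDDs at 150 IOPS each.
raid6_estimate = random_write_iops("raid6", 8, 150)
raid10_estimate = random_write_iops("raid10", 8, 150)
```

With these assumed figures the RAID10 array sustains three times the random-write rate of the RAID6 array, which is the reasoning behind putting journals and DB logs on a small RAID10 array and bulk data on RAID6.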

Solution 2:

You'll want a battery or flash-backed cache solution for any hardware controller you purchase. Most regret not doing so.

But to answer your question, most controllers have configurable cache ratios... so 100% read cache and 0% write cache negates the need for BBU protection. Your write performance will just suck.

I can't address your software RAID question because it depends. Linux MD RAID is different than Windows Software RAID, which is different than something like ZFS. Solutions like ZFS can perform better than hardware because they leverage the server's RAM and CPU resources.


Solution 3:

The RAID controller you have your eye on is a cheap one and is basically a fakeraid. It even depends on your mainboard to provide some functions, such as memory, and not many mainboards support it, with the result that you can't load the driver.

About HW vs SW-RAID itself: I'm not using HW-RAID anymore unless it is, for example, a box with an EMC logo on it. For everything else I switched back to SW-RAID many moons ago for a few very simple reasons.

  1. You need additional hardware and need to match the components. You also need to match the firmware and keep that in sync. A lot of disks will not work correctly and you will see spikes in your IO latency for no clear reason.

  2. Additional hardware is expensive, so for a small solution you can put that additional $1000 (decent controller with two/three disks) to better use. Invest it in more disks and standard controllers, ECC memory, a faster CPU. And maybe an on-site spare disk, if you plan to run the system for longer than the warranty period or don't want to pay the express fees for overnight shipping.

  3. Upgrading is a pain as you need to keep track of OS-patches and firmware for both disk and controller. It may result in a situation where upgrading/updating isn't possible anymore.

  4. On-disk formats. Plenty of vendors use an in-house layout to store data that is tied to a specific revision of your hardware and firmware combination. This may result in a situation where a replacement part makes it impossible for you to access your data.

  5. It is a SPOF and a bottleneck. Having only one controller behind only one PCI bridge doesn't give you the performance and redundancy you really need. It also means there is no migration path for moving data to another disk set outside the controller's reach.

Most of these points have been taken care of by newer generations of SW-RAID software or solutions like ZFS and Btrfs. Keep in mind that in the end you want to protect your data, not end up with quickly accessible but redundant garbage.


Solution 4:

I have spent the last year (off and on through 2014-2015) testing several parallel CentOS 6.6 RAID 1 (mirrored) configurations using 2 LSI 9300 HBAs versus 2 LSI 9361-8i RAID controllers, with systems built on the following: 2U Supermicro CSE-826BAC4-R920LPB chassis, an ASUS Z9PE-D16 motherboard, 2 Intel Xeon E5-2687W v2 Eight-Core 3.4 GHz Processors, mirrored Seagate ST6000NM0014 6TB SAS 12Gb/s drives, 512 GB RAM. Note this is a fully SAS3 (12Gbps) compliant configuration.

I have scoured articles written about tuning software, and I have used Linux software RAID for over 10 years. When running basic I/O tests (dd oflag=direct on 5k to 100G files, hdparm -t, etc.), software RAID seems to stack up favorably against hardware RAID. The software RAID was mirrored through separate HBAs. I have gone as far as testing with the standard CentOS 6 kernel, kernel-lt and kernel-ml configurations. I have also tried various mdadm, file system, disk subsystem, and OS tunings suggested by a variety of online articles about Linux software RAID. Despite tuning, testing, tuning and testing, when running a real-world transaction processing system (with a MySQL or Oracle database), I have found that running a hardware RAID controller results in a 50 times increase in performance. I attribute this to the hardware RAID's optimized cache control.

For many, many months I was unconvinced that hardware RAID could be so much better; however, after exhaustive research on Linux software RAID, testing and tuning, those were my results.


Solution 5:

Most of the writers here simply ignore the "write hole". It is the basis for crying out for the battery backup units of hardware RAIDs versus the absence of such for software RAIDs. Well, the Linux software RAID implementation, for example, either supports bitmaps of write operations or does a full "parity" re-calculation after an unclean shutdown. ZFS always strives for full-stripe writes to avoid this inconsistency or having to postpone its re-checking. So, in summary, a smart-enough software RAID nowadays is often good enough to be used instead of a "who knows what's inside" so-called "hardware RAID".
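
For readers unfamiliar with the write hole, here is a toy Python simulation of it on a single-parity stripe. The one-byte blocks and function names are invented for illustration, and this is a simplified model rather than how any particular RAID implementation stores data: a crash lets new data reach one disk while the parity update is lost, and a later rebuild of a different disk then silently returns wrong data unless parity was re-calculated first.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def rebuild_disk1(stripe, parity):
    """Reconstruct disk 1 from the two other data disks and the parity."""
    return xor_blocks([stripe[0], stripe[2], parity])

# A consistent stripe: three data blocks plus their parity.
stripe = [b"\x01", b"\x02", b"\x04"]
parity = xor_blocks(stripe)

# Crash mid-write: disk 0 receives new data, the parity update is lost.
stripe[0] = b"\x08"

# Disk 1 fails later; rebuilding it from the stale parity gives garbage.
corrupt = rebuild_disk1(stripe, parity)

# A resync after the unclean shutdown (done while disk 1 was still
# readable) re-calculates parity and closes the hole.
resynced_parity = xor_blocks(stripe)
correct = rebuild_disk1(stripe, resynced_parity)
```

Here `corrupt` differs from the original contents of disk 1 while `correct` matches them, which is why a resync or write bitmap after an unclean shutdown matters.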

As for the cache part of the question, it really doesn't matter so much, because the OS's own write cache can be much bigger than what a "hardware" adapter has.