How "scrambled" is the data on a RAID5 disk?

RAID 5 stripes the data across the disks, but the blocks used for striping are typically pretty big. At the very least they will be whole sectors, but normally they will be much larger than that. For example, mdadm defaults to half-megabyte chunks. Even one sector is big enough that you are likely to find recognisable chunks of text, and with typical chunk sizes it is quite likely that entire recognisable files will be present on the individual drives from the array.
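To make the striping concrete, here is a rough sketch of how a logical byte offset on the array maps to a member disk. It assumes the left-symmetric layout (the mdadm default for RAID 5) and ignores the superblock and data offset at the start of each member, so treat the numbers as illustrative; the function name `locate` is made up for this example.

```python
def locate(logical_offset, n_disks, chunk_size=512 * 1024):
    """Map an array byte offset to (disk index, byte offset on that disk).

    Assumes a left-symmetric RAID 5 layout and no per-disk data offset.
    """
    data_disks = n_disks - 1                    # one chunk per stripe holds parity
    chunk_index = logical_offset // chunk_size  # which data chunk of the array
    offset_in_chunk = logical_offset % chunk_size
    stripe = chunk_index // data_disks          # which row of chunks
    data_index = chunk_index % data_disks       # position within that row

    # Parity rotates backwards from the last disk, one disk per stripe.
    parity_disk = data_disks - (stripe % n_disks)
    # Data chunks start on the disk after parity and wrap around.
    disk = (parity_disk + 1 + data_index) % n_disks

    return disk, stripe * chunk_size + offset_in_chunk


# Every byte of the first 512 KiB of the array lands on disk 0, in order.
print(locate(0, 4), locate(512 * 1024 - 1, 4))
# The next byte moves to disk 1.
print(locate(512 * 1024, 4))
```

The point is that within one chunk the mapping is just an offset: nothing is interleaved or scrambled at any finer granularity.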


In the interests of actually testing this, I pointed a copy of Foremost at a disk that was formerly part of a RAID-6 array (made available thanks to Seagate). The array had a chunk size of 512KB, so any file of 512KB or less is theoretically present intact. The data on the array is from nearly 25 years of computer use, including disk images of every computer I've owned.

The amount of data that I recovered was, frankly, scary. Word documents containing high-school homework assignments. Data files from games I'd uninstalled decades ago. DLL files from a hundred different versions of WINE. Images attached to unread Usenet posts. Ten thousand cached web pages. Adding a custom extraction rule found three SSL private keys and an SSH key.
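For anyone curious what that kind of key hunt looks like, below is a minimal Python sketch of the same idea. It is not the Foremost rule I used and not Foremost's config syntax, just a plain scan of a raw image for PEM-armoured private keys. The names `find_pem_keys` and `scan_image`, the 16 KiB cap on key length, and the window sizes are all assumptions made for the example.

```python
import re

# Matches e.g. "RSA PRIVATE KEY", "OPENSSH PRIVATE KEY" or plain "PRIVATE KEY".
# The body is capped at 16 KiB so a stray BEGIN line cannot match across the disk.
PEM_KEY = re.compile(
    rb"-----BEGIN ([A-Z0-9 ]*PRIVATE KEY)-----.{1,16384}?-----END \1-----",
    re.DOTALL,
)


def find_pem_keys(data):
    """Return (offset, key type) for every complete PEM private key in data."""
    return [(m.start(), m.group(1).decode("ascii")) for m in PEM_KEY.finditer(data)]


def scan_image(path, window=64 * 1024 * 1024, overlap=32 * 1024):
    """Scan a raw disk image in overlapping windows so keys that straddle
    a window boundary are still found; duplicates are removed at the end."""
    hits = set()
    pos = 0
    with open(path, "rb") as f:
        while True:
            f.seek(pos)
            buf = f.read(window)
            if not buf:
                break
            hits.update((pos + off, kind) for off, kind in find_pem_keys(buf))
            if len(buf) < window:
                break
            pos += window - overlap
    return sorted(hits)
```

A key is only a few kilobytes, so it almost always sits inside a single chunk and survives intact on one member disk.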

Another thing to note is that you don't always need to extract the entire file to get compromising information. For example, the first 512KB of a PDF can give you the table of contents, the first 512KB of a BMP can give you a caption at the bottom of the image (BMP stores its rows bottom-up), and the first 512KB of a JPEG can give you a thumbnail. MPEG and MP3 files are designed to be streamable, so even a chunk from the middle of one can give someone useful data.
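As a rough illustration of the BMP case, this sketch estimates how many rows from the bottom of the image fit in the first 512KB. It assumes an uncompressed, bottom-up bitmap with the common 54-byte header (14-byte file header plus 40-byte BITMAPINFOHEADER) and no palette; `bmp_rows_in_prefix` is a name invented for this example.

```python
def bmp_rows_in_prefix(width, bits_per_pixel, prefix_bytes=512 * 1024, header_bytes=54):
    """Number of complete pixel rows contained in the first prefix_bytes of a BMP.

    Rows are padded to a multiple of 4 bytes; in a bottom-up bitmap the
    first rows in the file are the bottom rows of the picture.
    """
    row_bytes = ((width * bits_per_pixel + 31) // 32) * 4
    return max(0, (prefix_bytes - header_bytes) // row_bytes)


# A 1024-pixel-wide, 24-bit image: 3072 bytes per row.
print(bmp_rows_in_prefix(1024, 24))  # 170
```

So for a 1024x768 screenshot, one chunk holds roughly the bottom fifth of the picture, which is often where a caption or status bar sits.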

How scrambled is data on a RAID 5 disk? Not scrambled enough.


Sounds like people may be confusing drive sector size (typically 512B to 4KB) with RAID 5 stripe size (typically 16KB to 128KB, sometimes larger). The RAID stripe size is the logical writable size for the array, so each part of the stripe on each drive will contain that much data. If an entire file fits into the stripe size, it will likely all be visible as a contiguous block on the removed drive.
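A back-of-the-envelope way to put a number on "likely": the sketch below assumes the file is stored contiguously, starts on a filesystem block boundary, and that every block-aligned start position within a chunk is equally probable. Those are simplifying assumptions, and the function name and the 4KB block size are made up for the example.

```python
from fractions import Fraction


def fraction_within_one_chunk(file_size, chunk_size=512 * 1024, block_size=4096):
    """Fraction of block-aligned start positions at which a contiguous file
    of file_size bytes (at least 1) lies entirely inside a single chunk."""
    if file_size > chunk_size:
        return Fraction(0)
    starts = chunk_size // block_size                    # possible start blocks per chunk
    fitting = (chunk_size - file_size) // block_size + 1  # starts that do not overrun
    return Fraction(fitting, starts)


# A 64KB file in 512KB chunks: 113 of 128 start positions keep it in one chunk.
print(fraction_within_one_chunk(64 * 1024))  # 113/128
```

Under these assumptions a 64KB file ends up whole on a single drive about 88% of the time, and smaller files do even better.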
