Windows Spanned Disks (LDM) restoration with Linux?

Here's the (much easier) answer, now that ldmtool exists.

ldmtool reads LDM (aka Windows Dynamic Disks) metadata and, among other things, creates device-mapper entries for the corresponding drives, partitions, and RAID arrays, letting you then access and mount them just like any other block device in Linux.

The program does have a few limitations, mostly stemming from the fact that it does not modify LDM metadata at all. So you cannot create LDM disks in Linux (use Windows for that), and you should not mount RAID volumes with missing disks in read-write mode. (ldmtool won't update the metadata to record that this happened, and the next time Windows assembles the RAID array, problems will ensue, as the drives will not all be in sync.)

Here are the steps to follow:

  1. To install ldmtool on Debian and Ubuntu systems, type apt-get install ldmtool. It should be similarly easy on most other recent Linux distributions.
  2. Run ldmtool create all.
  3. You should now have a bunch of new entries in /dev/mapper. Locate the right one (in my case, a RAID1 array, so /dev/mapper/ldm_vol_VOLNAMEHERE-Dg0_Volume2), and just mount it with something like mount -t ntfs /dev/mapper/ldm_vol_VOLNAMEHERE-Dg0_Volume2 /mnt.
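
For reference, a typical session might look like this (the volume name and mount point are made up; yours will differ):

# ldmtool scan                    # list block devices carrying LDM metadata
# ldmtool create all              # map every volume of every disk group
# ls /dev/mapper                  # the new ldm_vol_* entries show up here
# mkdir -p /mnt/winraid
# mount -t ntfs -o ro /dev/mapper/ldm_vol_VOLNAMEHERE-Dg0_Volume2 /mnt/winraid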

To have this done automatically at boot time, you will likely need to insert a call to ldmtool create all at the right point in the boot sequence, before the contents of /etc/fstab are mounted. A good way of doing the call would be:

[ -x /usr/bin/ldmtool ] && ldmtool create all >/dev/null || true

How to get this snippet to run at the right time during boot will vary a lot depending on the distribution you are using. On Ubuntu 13.10, I inserted said line in /etc/init/mountall.conf, right before the exec mountall ... call at the end of the script section. And I can now mount my Windows LDM RAID1 partition in /etc/fstab. Enjoy!
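
For the record, the corresponding /etc/fstab entry could look something like this (the mount point is made up; the device name must match what ldmtool created under /dev/mapper):

/dev/mapper/ldm_vol_VOLNAMEHERE-Dg0_Volume2  /mnt/winraid  ntfs  ro  0  0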


All right, I will reply to my own question to spare others the same pain.

0. WARNING

If you are doing a recovery, ALWAYS COPY YOUR DATA and work on the copy. Do NOT alter the original 'broken' data. That said, keep reading.

1. What your partition looks like

Install The Sleuth Kit and testdisk. Hopefully there will be packages for your distro :)

# mmls -t gpt LUN01
GUID Partition Table (EFI)
Offset Sector: 0
Units are in 512-byte sectors

    Slot    Start        End          Length       Description
00:  Meta    0000000000   0000000000   0000000001   Safety Table
01:  -----   0000000000   0000000033   0000000034   Unallocated
02:  Meta    0000000001   0000000001   0000000001   GPT Header
03:  Meta    0000000002   0000000033   0000000032   Partition Table
04:  00      0000000034   0000002081   0000002048   LDM metadata partition
05:  01      0000002082   0000262177   0000260096   Microsoft reserved partition
06:  02      0000262178   1048576966   1048314789   LDM data partition
07:  -----   1048576967   1048576999   0000000033   Unallocated

Note: testdisk will give you the same info with less detail:

# testdisk /list LUN01

2. Extract the disks' metadata

All the information about the disk order, data size, and other cryptic attributes of the volume is found in the LDM metadata partition. W2k8 has not changed much since this document [2], although some sizes differ and some attributes are new (and obviously undocumented)...

# dd if=LUN01 skip=33 count=2048 |xxd -a > lun01.metadata
# less lun01.metadata 

At line 0002410 you should see the name of the server. Reassuring, isn't it? But we are after the disk order and the disk IDs. Scroll down.

2.1. Disk Order

At line 0003210 you should see 'Disk1' followed by a long string.

0003200: 5642 4c4b 0000 001c 0000 0006 0000 0001  VBLK............
0003210: 0000 0034 0000 003a 0102 0544 6973 6b31  ...4...:...Disk1
0003220: 2437 3965 3830 3239 332d 3665 6231 2d31  $79e80293-6eb1-1
0003230: 3164 662d 3838 6463 2d30 3032 3662 3938  1df-88dc-0026b98
0003240: 3335 6462 3300 0000 0040 0000 0000 0000  35db3....@......
0003250: 0048 0000 0000 0000 0000 0000 0000 0000  .H..............

This means that the first disk of this volume is identified by the following unique ID (UID): 79e80293-6eb1-11df-88dc-0026b9835db3 (the $ in the dump is just a separator between the disk name and the UID). But at this point, we don't know which physical disk has this UID! So move on to the Disk2 entry, take note of its UID, and so on for all the disks you had in your volume. Note: In my experience only the first 8 characters change, the rest stays the same; indeed, W2k8 seems to increment the ID by 6.

E.g.:

Windows Disk1 UID : 79e80293-6eb1-11df-88dc-0026b9835db3
Windows Disk2 UID : 79e80299-...
Windows Disk3 UID : 79e8029f-...

2.2. Find Disk UID

Go to line 00e8200 (lun01.metadata). You should find 'PRIVHEAD'.

00e8200: 5052 4956 4845 4144 0000 2c41 0002 000c  PRIVHEAD..,A....
00e8210: 01cc 6d37 2a3f c84e 0000 0000 0000 0007  ..m7*?.N........
00e8220: 0000 0000 0000 07ff 0000 0000 0000 0740  ...............@
00e8230: 3739 6538 3032 3939 2d36 6562 312d 3131  79e80299-6eb1-11
00e8240: 6466 2d38 3864 632d 3030 3236 6239 3833  df-88dc-0026b983
00e8250: 3564 6233 0000 0000 0000 0000 0000 0000  5db3............
00e8260: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00e8270: 3162 3737 6461 3230 2d63 3731 372d 3131  1b77da20-c717-11
00e8280: 6430 2d61 3562 652d 3030 6130 6339 3164  d0-a5be-00a0c91d
00e8290: 6237 3363 0000 0000 0000 0000 0000 0000  b73c............
00e82a0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00e82b0: 3839 3164 3065 3866 2d64 3932 392d 3131  891d0e8f-d929-11
00e82c0: 6530 2d61 3861 372d 3030 3236 6239 3833  e0-a8a7-0026b983
00e82d0: 3564 6235 0000 0000 0000 0000 0000 0000  5db5............
00e82e0: 0000 0000 0000 0000 0000 0000 0000 0000  ................

What we are after is the UID of this particular disk. We see:

  - Disk Id : 79e80299-6eb1-11df-88dc-0026b9835db3
  - Host Id : 1b77da20-c717-11d0-a5be-00a0c91db73c
  - Disk Group Id : 891d0e8f-d929-11e0-a8a7-0026b9835db5

So the disk with UID 79e80299-... is Windows Disk2, even though for us it was physical disk 1; just match each UID against the disk order you found above. Note: There is no logical order here. Windows decides the disk order, not you, so there is NO human logic to it; don't expect your first physical disk to be Disk1.

Since the order follows no human logic, I recommend going through the LDM metadata of all your disks and extracting their UIDs. (You can use the following command to extract just the PRIVHEAD info: dd if=LUNXX skip=1890 count=1 |xxd -a)

E.g.:

(Windows) Disk1 : 79e80293-... == Physical disk 2
(Windows) Disk2 : 79e80299-... == Physical disk 1
(Windows) Disk3 : 79e8029f-... == Physical disk 3
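
To speed this up, a small loop can pull the PRIVHEAD block out of every image in one go. A sketch: the LUN file names are this example's, and skip=1890 assumes all LUNs share the same partition layout; adjust both to your setup.

for lun in LUN01 LUN02 LUN03; do
    echo "== $lun =="
    # PRIVHEAD sits at sector 1890 in this example's layout; the disk UID
    # appears a few lines into the dump
    dd if="$lun" skip=1890 count=1 2>/dev/null | xxd -a | head -n 16
done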

I am sure that somewhere in the LDM metadata you can find the type of volume (spanned, RAID0, RAIDX) and the associated stripe sizes. However, I haven't dug into it; I used a 'try and retry' method to find my data. So if you know how your configuration was set up before the drama, you will save yourself a lot of time.

3. Find the NTFS filesystem and your data

Now we are interested in the big chunk of data we want to restore. In my case it's ~512GB of data, so we won't convert the whole thing to ASCII. I haven't really researched how Windows finds the beginning of its NTFS partition, but what I found is that it starts with the signature .R.NTFS as rendered by xxd (the bytes eb 52 90, an x86 jump instruction, followed by the NTFS OEM ID). Let's find this and work out the offset we will have to apply later to see our NTFS filesystem.

06:  02      0000262178   1048576966   1048314789   LDM data partition

In this example, the data starts at sector 262178 and is 1048314789 sectors long.

We found above that Disk1 (of the volume group) is actually the 2nd physical disk. We will extract some of its data to find where the NTFS partition starts.

# dd if=LUN02 skip=262178 count=4096 |xxd -a > lun02.DATASTART-4k
# less lun02.DATASTART-4k

0000000: 0000 0000 0000 0000 0000 0000 0000 0000  ................
*
00fbc00: eb52 904e 5446 5320 2020 2000 0208 0000  .R.NTFS    .....
00fbc10: 0000 0000 00f8 0000 3f00 ff00 0008 0400  ........?.......
00fbc20: 0000 0000 8000 8000 ffaf d770 0200 0000  ...........p....

Here we can see that NTFS starts at 00fbc00. Knowing that, we can start extracting our data from sector 262178 plus 0xfbc00 bytes. Let's do a bit of hexadecimal-to-decimal conversion, with a bytes-to-sectors conversion as well.

0xfbc00 bytes = 1031168 bytes = 1031168/512 sectors = 2014 sectors

So our NTFS partition starts at 262178 + 2014 = 264192 sectors. This value is an offset we will reuse later on all disks; let's call it the NTFS offset. Obviously the total size shrinks by the same amount, so the new size is: 1048314789 - 2014 = 1048312775 sectors.
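
If you would rather not eyeball hex dumps, the same numbers can be recovered with a couple of one-liners. A sketch using this example's values; LUN02 is the image of the first Windows disk:

# dd if=LUN02 skip=262178 count=4096 2>/dev/null | grep -oba 'NTFS    ' | head -n 1
1031171:NTFS
# echo $(( (1031171 - 3) / 512 ))   # boot sector starts 3 bytes before 'NTFS' (eb 52 90)
2014
# echo $(( 262178 + 2014 ))         # absolute start of the NTFS filesystem, in sectors
264192
# echo $(( 1048314789 - 2014 ))     # remaining size, in sectors
1048312775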

4. Try to mount/see the data

From now on, either it will work out of the box because your NTFS partition is healthy, or it won't, because you're doing this to recover some data. Either way, the process is the same. All of the following is based on [1] (see Links at the bottom).

A spanned volume fills one disk after another, whereas a striped volume (RAID0) spreads chunks of data across all the disks (i.e. a file is scattered over many disks). In my case, I didn't know whether it was spanned or striped. The easiest way to tell, provided your volume is not full, is to check whether you have a lot of zeroes at the end of all your disks: if so, it's striped, because a spanned volume fills the first disk completely before touching the second. I am not 100% sure of that, but it's what I observed. So dd a bunch of sectors from the end of the LDM data partition of each disk, as in the sketch below.
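
A quick way to peek at the tail of each disk. A sketch: the LUN names are this example's, the end sector 1048576966 comes from the mmls output above, and xxd -a collapses runs of zeroes into a single '*':

for lun in LUN01 LUN02 LUN03; do
    echo "== $lun =="
    # dump the last 2048 sectors (1 MiB) of the LDM data partition
    dd if="$lun" skip=$(( 1048576966 - 2047 )) count=2048 2>/dev/null | xxd -a | tail -n 5
done

If the tail of every disk is one big '*' (all zeroes), the volume was probably striped.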

4.0 Preparations to access your data

First attach your dd image or your device to a loopback device, using the NTFS offset and the size we calculated above. Note that losetup takes the offset and size in bytes, not sectors:

offset = 264192 * 512     = 135266304 bytes
size   = 1048312775 * 512 = 536736140800 bytes

# losetup /dev/loop2 DDFILE_OR_DEVICE -o 135266304 --sizelimit 536736140800
# blockdev --getsize /dev/loop2
1048312775 <---- total size in sectors, same number as before

Note: you can add '-r' to attach the loop device in read-only mode.

Do the above for all the physical disks that are part of your volume, and display the result with losetup -a.

Note: If you don't have enough loop devices, you can easily create more with:

# mknod -m0660 /dev/loopNUMBER b 7 NUMBER && chown root.disk /dev/loopNUMBER
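
With the offset and size in hand, attaching all three images is just a loop. A sketch: the file names are this example's, and LUN0n deliberately lands on /dev/loopn to keep the mapping easy to remember:

offset=135266304          # NTFS offset in bytes
size=536736140800         # NTFS size in bytes
for n in 1 2 3; do
    losetup -r -o "$offset" --sizelimit "$size" "/dev/loop$n" "LUN0$n"
done
losetup -a                # check the result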

Check your alignment by opening the first Windows disk of the group (in this example, physical disk 2) and verifying that the first line shows .R.NTFS. If it does not, either your alignment is wrong (verify your calculations above and try again), or you are not looking at the first Windows disk.

E.g.:

First disk of the volume has been mounted on /dev/loop2 
# xxd /dev/loop2 |head
0000000: eb52 904e 5446 5320 2020 2000 0208 0000  .R.NTFS    ..... 
0000010: 0000 0000 00f8 0000 3f00 ff00 0008 0400  ........?.......

All good. Let's move to the annoying part :)

4.1 Spanned

Spanned disks are essentially a chain of disks: you fill the first, then move on to the second, and so on. Create a table file that looks like this, e.g.:

# Offset into   Size of this    Raid type       Device          Start sector
# volume        device                                          of device
0               1048312775  linear          /dev/loop2       0
1048312775      1048312775  linear          /dev/loop1       0
2096625550      1048312775  linear          /dev/loop3       0

Notes:

  - Remember to use the correct disk order (found earlier), e.g. physical disk 2 followed by physical disk 1 and physical disk 3.
  - 2096625550 = 2 * 1048312775; if you have a fourth disk, its offset will be 3 * 1048312775.
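
With many disks, the table is easier to generate than to type. A sketch: it assumes all loop devices have the same size and that the list below is your Windows disk order:

size=1048312775           # size of each device, in sectors
offset=0
for dev in /dev/loop2 /dev/loop1 /dev/loop3; do
    echo "$offset $size linear $dev 0"
    offset=$(( offset + size ))
done > /path/myconfigfile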

4.2 Striped

The problem with striped mode (aka RAID0) is that you must know your stripe size. Apparently the default is 64k (in my case it was 128k, but I don't know whether the Windows sysadmin had tuned it :). Anyway, if you don't know it, you just have to try all the standard values and see which one gives you a viable NTFS filesystem; see the sketch at the end of this section.

Create a file like the following for 3 disks with a 128k chunk size. Note that dmsetup expects the chunk size in 512-byte sectors, so 128k = 256 sectors:

                       .---+--> 3 stripes of 128k (256-sector) chunks
0 3144937728  striped  3  256      /dev/loop2 0 /dev/loop1 0 /dev/loop3 0
   `---> total size of the volume      `----------+-----------+---> disk order

/!\ : The size of the volume is not exactly the size we calculated before. dmsetup needs a volume size divisible by the chunk size (aka stripe size) AND by the number of disks in the volume. In our case we have 3 disks of 1048312775 sectors, so the 'natural' size is 1048312775 * 3 = 3144938325 sectors, but due to the above constraint we round it down to a multiple of 3 * 256:

# echo "3144938325/(3*256)*(3*256)" | bc
3144937728

  So 3144937728 sectors is the size of your volume in a striped scenario with
  3 disks and 128k chunks (aka stripes).
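
If you don't know the stripe size, a brute-force loop over the usual suspects saves some typing. A sketch: the chunk sizes are in 512-byte sectors (so 128 means 64k), and the device order and total size are this example's:

disks="/dev/loop2 0 /dev/loop1 0 /dev/loop3 0"
for chunk in 64 128 256 512; do
    # round the size down to a multiple of (number of disks * chunk size)
    size=$(( 3144938325 / (3 * chunk) * (3 * chunk) ))
    echo "0 $size striped 3 $chunk $disks" | dmsetup create ldmtry
    if mount -t ntfs -o ro /dev/mapper/ldmtry /mnt 2>/dev/null; then
        echo "stripe size: $chunk sectors"; break
    fi
    dmsetup remove ldmtry
done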

4.3 Mount it.

Now let's aggregate everything together with dmsetup:

# dmsetup create myldm /path/myconfigfile
# dmsetup ls
myldm       (253, 1)

# mount -t ntfs -o ro /dev/mapper/myldm /mnt 

If it does not mount, you can use testdisk:

# testdisk /dev/mapper/myldm
--> Analyse
----> Quick search
------> You should see the volume name (if any). If not, the filesystem seems compromised :)
--------> Press 'P' to list files and 'c' to copy them

5. Conclusion

The above worked for me; your mileage may vary, and there may be a better and easier way to do it. If so, share it so nobody else has to go through this hassle :) Also, it may look hard, but it is not. As long as you have copied your data somewhere, just try and retry until you can see something. It took me 3 days to figure out how to put all the bits together; hopefully the above will save you those 3 days.

Note: All the examples above have been made up. There may be some inconsistencies between them despite my thoroughness ;)

Good luck.

6. Links

  • [1] : http://www.kernel.org/doc/Documentation/filesystems/ntfs.txt
  • [2] : http://russon.org/ntfs/ldm/technical/index.html
  • [3] : http://svnweb.freebsd.org/base/stable/9/sys/geom/part/g_part_ldm.c
  • [4] : http://ntfs.com/ldm.htm
  • [5] : http://sourceforge.net/projects/linux-ntfs/files/LDM%20Documentation/