Linux RAID disappears after reboot

This recipe worked for me after I hit the same issue. I looked all over the net trying to find an answer and finally came across this question, but it still didn't help, so here is what ended up working.

The problem, as I see it, has several parts.

  1. mdadm reassigns the device file from /dev/md0 to something like /dev/md127 on the next reboot, so you cannot just use the device file in fstab. I ended up using the UUID of the created filesystem instead.

  2. Almost all of the RAID setup tutorials on the web show the RAID device being created from the whole-disk device files, like this:

    mdadm --create --verbose /dev/md0 --level=0 --raid-devices=4 /dev/sda /dev/sdb /dev/sdc /dev/sdd
    

    Instead I used the partition device files, like this:

    mdadm --create --verbose /dev/md0 --level=0 --raid-devices=4 /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1
    

    The second form requires creating a partition on each disk first, with gdisk or fdisk. I used gdisk and assigned each partition type fd00, which is a Linux RAID partition.

  3. There's lots of talk about needing to update /etc/mdadm/mdadm.conf. This is wrong. I purposely deleted that file; it's not needed (see below).

That's really all there is to it. The full recipe follows...


Partition each drive with one partition of type fd00, Linux RAID:

root@teamelchan:~# gdisk /dev/sda
Command (? for help): n
Partition number (1-128, default 1):
First sector (2048-3907029134, default = 2048) or {+-}size{KMGTP}:
Last sector (2048-3907029134, default = 3907029134) or {+-}size{KMGTP}:
Current type is 'Linux filesystem'
Hex code or GUID (L to show codes, Enter = 8300): fd00
Changed type of partition to 'Linux RAID'

Command (? for help): p
Disk /dev/sda: 3907029168 sectors, 1.8 TiB
Model: ST2000DM001-1ER1
Sector size (logical/physical): 512/4096 bytes
Disk identifier (GUID): F81E265F-2D02-864D-AF62-CEA1471CFF39
Partition table holds up to 128 entries
Main partition table begins at sector 2 and ends at sector 33
First usable sector is 2048, last usable sector is 3907029134
Partitions will be aligned on 2048-sector boundaries
Total free space is 0 sectors (0 bytes)

Number  Start (sector)    End (sector)  Size       Code  Name
   1            2048      3907029134    1.8 TiB    FD00  Linux RAID

Command (? for help): w

Final checks complete. About to write GPT data. THIS WILL OVERWRITE EXISTING
PARTITIONS!!

Do you want to proceed? (Y/N): y
OK; writing new GUID partition table (GPT) to /dev/sda.
The operation has completed successfully.
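If you prefer to script the partitioning instead of answering gdisk's prompts, sgdisk (from the same gdisk package) can create the same fd00 partition non-interactively. A minimal sketch, assuming the same four blank disks (the --zap-all step is destructive):

for d in /dev/sda /dev/sdb /dev/sdc /dev/sdd; do
    sgdisk --zap-all "$d"                        # wipe any existing partition table on the disk
    sgdisk --new=1:0:0 --typecode=1:fd00 "$d"    # one partition spanning the disk, type Linux RAID
done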

Now you should see both the disk devices and the partition devices in /dev:

root@teamelchan:~# ls /dev/sd[a-d]*
/dev/sda /dev/sda1 /dev/sdb /dev/sdb1 /dev/sdc /dev/sdc1 /dev/sdd /dev/sdd1
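lsblk gives a quick sanity check that each disk now carries exactly one partition; an equivalent view, not part of the original session:

lsblk /dev/sd[a-d]    # each disk should show a single partition underneath it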

Now create the RAID of your choice with mdadm, using the partition device files, not the disk devices:

root@teamelchan:~# mdadm --create --verbose /dev/md0 --level=0 --raid-devices=4 /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1
mdadm: chunk size defaults to 512K
mdadm: /dev/sda1 appears to contain an ext2fs file system
size=471724032K mtime=Sun Nov 18 19:42:02 2018
mdadm: /dev/sda1 appears to be part of a raid array:
level=raid0 devices=4 ctime=Thu Nov 22 04:00:11 2018
mdadm: /dev/sdb1 appears to be part of a raid array:
level=raid0 devices=4 ctime=Thu Nov 22 04:00:11 2018
mdadm: /dev/sdc1 appears to be part of a raid array:
level=raid0 devices=4 ctime=Thu Nov 22 04:00:11 2018
mdadm: /dev/sdd1 appears to contain an ext2fs file system
size=2930265540K mtime=Sun Nov 18 23:58:02 2018
mdadm: /dev/sdd1 appears to be part of a raid array:
level=raid0 devices=4 ctime=Thu Nov 22 04:00:11 2018
Continue creating array? y
mdadm: Defaulting to version 1.2 metadata
mdadm: array /dev/md0 started.
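Optionally, verify the array before putting a filesystem on it. These are just the standard checks, not part of the original recipe:

cat /proc/mdstat          # md0 should appear as an active raid0 with 4 devices
mdadm --detail /dev/md0   # shows chunk size, member devices, and array state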

Now check in /dev/disk to see if there's any UUID associated with your new /dev/md0 RAID. There should be none.

root@teamelchan:~# ls -l /dev/disk/by-uuid
total 0
lrwxrwxrwx 1 root root 10 Nov 22 04:24 4777-FB10 -> ../../sdf1
lrwxrwxrwx 1 root root 10 Nov 22 04:24 D616BDCE16BDAFBB -> ../../sde1
lrwxrwxrwx 1 root root 10 Nov 22 04:24 e79571b6-eb75-11e8-acb0-e0d55e117fa5 -> ../../sdf2

Make the new filesystem, and after that you should have a UUID associated with /dev/md0:

root@teamelchan:~# mkfs.ext4 -F /dev/md0
mke2fs 1.44.1 (24-Mar-2018)
Creating filesystem with 2685945088 4k blocks and 335745024 inodes
Filesystem UUID: 7bd945b4-ded9-4ef0-a075-be4c7ea246fb
Superblock backups stored on blocks:
32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
4096000, 7962624, 11239424, 20480000, 23887872, 71663616, 78675968,
102400000, 214990848, 512000000, 550731776, 644972544, 1934917632,
2560000000

Allocating group tables: done
Writing inode tables: done
Creating journal (262144 blocks): done
Writing superblocks and filesystem accounting information: done

Voila, there it is.

root@teamelchan:~# ls -l /dev/disk/by-uuid
total 0
lrwxrwxrwx 1 root root 10 Nov 22 04:24 4777-FB10 -> ../../sdf1
lrwxrwxrwx 1 root root 9 Nov 22 04:43 7bd945b4-ded9-4ef0-a075-be4c7ea246fb -> ../../md0
lrwxrwxrwx 1 root root 10 Nov 22 04:24 D616BDCE16BDAFBB -> ../../sde1
lrwxrwxrwx 1 root root 10 Nov 22 04:24 e79571b6-eb75-11e8-acb0-e0d55e117fa5 -> ../../sdf2
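If you prefer, blkid reports the same UUID directly; an equivalent check:

blkid /dev/md0    # prints the UUID and TYPE of the new filesystem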

Modify your /etc/fstab and add the mount for your new RAID. Be sure to use the UUID, not the device file.

root@teamelchan:~# cat /etc/fstab
UUID=e79571b6-eb75-11e8-acb0-e0d55e117fa5 / ext4 defaults 0 0
UUID=4777-FB10 /boot/efi vfat defaults 0 0
/swap.img none swap sw 0 0
UUID=7bd945b4-ded9-4ef0-a075-be4c7ea246fb /md0/tweets ext4 auto 0 0
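Before rebooting, it's worth creating the mount point and letting mount read the new fstab entry, so any typo shows up now rather than at boot. A minimal check, assuming the same /md0/tweets path:

mkdir -p /md0/tweets    # mount point used in the fstab entry above
mount -a                # mounts everything in fstab; complains here if the entry is wrong
df -h /md0/tweets       # confirm the new filesystem is mounted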

Look: there is no /etc/mdadm/mdadm.conf. It's not needed.

root@teamelchan:~# ls -l /etc/mdadm
total 0

Reboot

root@teamelchan:~# reboot
Connection to 192.168.0.131 closed by remote host.
Connection to 192.168.0.131 closed.

The RAID is mounted, but mdadm has renamed the device file from md0 to md127.

Good thing we used the UUID and not the actual device file.

root@teamelchan:~# df /md0/tweets
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/md127 10658016696 73660 10120737636 1% /md0/tweets

Look, md0 is gone from /dev:

root@teamelchan:~# ls /dev/md*
/dev/md127

/dev/md:
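If you want to convince yourself that md127 really is the same array, mdadm --detail shows the array UUID and member partitions; a quick check:

mdadm --detail /dev/md127    # the array UUID and member partitions match the array created above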

That's it. Now I'm enjoying my 10-terabyte RAID0 that runs at over 600 MB/sec:

root@teamelchan:~# hdparm -tT /dev/md127

/dev/md127:
Timing cached reads: 26176 MB in 1.99 seconds = 13137.47 MB/sec
Timing buffered disk reads: 1878 MB in 3.00 seconds = 625.13 MB/sec

Your /proc/mdstat indicates that none of the RAID personalities (e.g. RAID1, RAID5, etc.) have been loaded, so no attempt is even made to activate a RAID set.

Failed to start mdadm.service: Unit mdadm.service is masked.

This message indicates mdadm.service has been disabled in the strongest possible way: no explicit attempt will be made to start the service, and even if something else depends on this service, it won't be started.
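If you want to inspect or clear the mask by hand, systemctl can do it; a sketch (the dpkg-reconfigure step below normally takes care of this for you):

systemctl status mdadm.service    # the Loaded: line shows the unit is masked
systemctl unmask mdadm.service    # removes the mask so the unit can start again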

As in the question linked by roaima, try running these commands as root:

dpkg-reconfigure mdadm    # Choose "all" disks to start at boot
update-initramfs -u       # Updates the existing initramfs

The first command reconfigures the mdadm package. It should detect all the RAID sets and let you choose which ones to auto-activate at boot; usually "all" is a good answer. This should also take care of mdadm.service being masked, if I've understood correctly.

Once that is done, the second command rebuilds your initramfs, so that the updated configuration ends up in the initramfs too and the scripts run in the earliest phases of boot know there is a RAID set that should be activated.
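To see what got recorded for your arrays, and to confirm after the next reboot that the RAID personality is loaded, these two read-only checks are enough:

mdadm --detail --scan    # prints an ARRAY line for each running array
cat /proc/mdstat         # after reboot, should list the RAID personality and the active array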
