title | excerpt | updated
---|---|---
How to configure and rebuild software RAID | Find out how to verify the state of your software RAID and rebuild it after a disk replacement | 2023-08-21
Redundant Array of Independent Disks (RAID) is a technology that mitigates data loss on a server by replicating data across two or more disks.
The default RAID level for OVHcloud server installations is RAID 1, which doubles the space taken up by your data, effectively halving the usable disk space.
This guide explains how to configure your server’s RAID array in the event that it needs to be rebuilt due to corruption or disk failure.
- A dedicated server with a software RAID configuration
- Administrative (sudo) access to the server via SSH
In a command-line session, run the following command to determine the current RAID status:
cat /proc/mdstat
Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10]
md2 : active raid1 sda2[1] sdb2[0]
931954688 blocks super 1.2 [2/2] [UU]
bitmap: 2/7 pages [8KB], 65536KB chunk
md4 : active raid1 sda4[0] sdb4[1]
1020767232 blocks super 1.2 [2/2] [UU]
bitmap: 0/8 pages [0KB], 65536KB chunk
unused devices: <none>
This command shows us that we have two RAID arrays currently set up, with md4 being the larger one. The md4 array consists of two partitions, sda4 and sdb4. The [UU] means that all the disks are working normally; an underscore (_) in place of a U would indicate a failed disk.
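If an array is resynchronizing, you can follow its progress in real time instead of re-running the command by hand. A minimal sketch using the watch utility, which is available on most Linux distributions:

watch -n 2 cat /proc/mdstat

This refreshes the output every two seconds; press Ctrl+C to exit.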
Although this command returns our RAID volumes, it doesn't tell us the size of the partitions themselves. We can find this information with the following command:
fdisk -l
Disk /dev/sdb: 1.8 TiB, 2000398934016 bytes, 3907029168 sectors
Disk model: HGST HUS724020AL
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: F92B6C5B-2518-4B2D-8FF9-A311DED5845F
Device Start End Sectors Size Type
/dev/sdb1 2048 4095 2048 1M BIOS boot
/dev/sdb2 4096 1864177663 1864173568 888.9G Linux RAID
/dev/sdb3 1864177664 1865226239 1048576 512M Linux filesystem
/dev/sdb4 1865226240 3907024895 2041798656 973.6G Linux RAID
Disk /dev/sda: 1.8 TiB, 2000398934016 bytes, 3907029168 sectors
Disk model: HGST HUS724020AL
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: 2E1DCCBA-8808-4D2B-BA33-9FEC3B96ADA8
Device Start End Sectors Size Type
/dev/sda1 2048 4095 2048 1M BIOS boot
/dev/sda2 4096 1864177663 1864173568 888.9G Linux RAID
/dev/sda3 1864177664 1865226239 1048576 512M Linux filesystem
/dev/sda4 1865226240 3907024895 2041798656 973.6G Linux RAID
/dev/sda5 3907025072 3907029134 4063 2M Linux filesystem
Disk /dev/md4: 973.5 GiB, 1045265645568 bytes, 2041534464 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk /dev/md2: 888.8 GiB, 954321600512 bytes, 1863909376 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
The fdisk -l command also allows you to identify your partition table type. This is important information when it comes to rebuilding your RAID in case of a disk failure.
For GPT partitions, the command will return: Disklabel type: gpt.
Disk /dev/sdb: 1.8 TiB, 2000398934016 bytes, 3907029168 sectors
Disk model: HGST HUS724020AL
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: F92B6C5B-2518-4B2D-8FF9-A311DED5845F
For MBR partitions, the command will return: Disklabel type: dos.
Disk /dev/sda: 2.5 GiB, 2621440000 bytes, 5120000 sectors
Disk model: QEMU HARDDISK
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x150f6797
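If you just want the partition table type without the full fdisk output, lsblk can print it directly. A short sketch, assuming a reasonably recent version of util-linux:

lsblk -ndo PTTYPE /dev/sda /dev/sdb

This prints one line per disk, reading gpt or dos.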
We can see that /dev/md2 has 888.8 GiB of space and /dev/md4 has 973.5 GiB. Running the mount command also shows us how the arrays are mounted:
mount
sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime)
proc on /proc type proc (rw,nosuid,nodev,noexec,relatime)
udev on /dev type devtmpfs (rw,nosuid,relatime,size=16315920k,nr_inodes=4078980,mode=755)
devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000)
tmpfs on /run type tmpfs (rw,nosuid,noexec,relatime,size=3266556k,mode=755)
/dev/md2 on / type ext4 (rw,relatime)
securityfs on /sys/kernel/security type securityfs (rw,nosuid,nodev,noexec,relatime)
tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev)
tmpfs on /run/lock type tmpfs (rw,nosuid,nodev,noexec,relatime,size=5120k)
tmpfs on /sys/fs/cgroup type tmpfs (ro,nosuid,nodev,noexec,mode=755)
cgroup2 on /sys/fs/cgroup/unified type cgroup2 (rw,nosuid,nodev,noexec,relatime,nsdelegate)
cgroup on /sys/fs/cgroup/systemd type cgroup (rw,nosuid,nodev,noexec,relatime,xattr,name=systemd)
pstore on /sys/fs/pstore type pstore (rw,nosuid,nodev,noexec,relatime)
bpf on /sys/fs/bpf type bpf (rw,nosuid,nodev,noexec,relatime,mode=700)
cgroup on /sys/fs/cgroup/pids type cgroup (rw,nosuid,nodev,noexec,relatime,pids)
cgroup on /sys/fs/cgroup/memory type cgroup (rw,nosuid,nodev,noexec,relatime,memory)
cgroup on /sys/fs/cgroup/perf_event type cgroup (rw,nosuid,nodev,noexec,relatime,perf_event)
cgroup on /sys/fs/cgroup/rdma type cgroup (rw,nosuid,nodev,noexec,relatime,rdma)
cgroup on /sys/fs/cgroup/net_cls,net_prio type cgroup (rw,nosuid,nodev,noexec,relatime,net_cls,net_prio)
cgroup on /sys/fs/cgroup/cpu,cpuacct type cgroup (rw,nosuid,nodev,noexec,relatime,cpu,cpuacct)
cgroup on /sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,relatime,freezer)
cgroup on /sys/fs/cgroup/blkio type cgroup (rw,nosuid,nodev,noexec,relatime,blkio)
cgroup on /sys/fs/cgroup/devices type cgroup (rw,nosuid,nodev,noexec,relatime,devices)
cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,nosuid,nodev,noexec,relatime,cpuset)
debugfs on /sys/kernel/debug type debugfs (rw,relatime)
hugetlbfs on /dev/hugepages type hugetlbfs (rw,relatime,pagesize=2M)
mqueue on /dev/mqueue type mqueue (rw,relatime)
systemd-1 on /proc/sys/fs/binfmt_misc type autofs (rw,relatime,fd=45,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=10340)
/dev/md4 on /home type ext3 (rw,relatime)
tmpfs on /run/user/1000 type tmpfs (rw,nosuid,nodev,relatime,size=3266552k,mode=700,uid=1000,gid=1000)
Alternatively, the lsblk command offers a different view of the partitions:
lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 1.8T 0 disk
├─sda1 8:1 0 1M 0 part
├─sda2 8:2 0 888.9G 0 part
│ └─md2 9:2 0 888.8G 0 raid1 /
├─sda3 8:3 0 512M 0 part [SWAP]
├─sda4 8:4 0 973.6G 0 part
│ └─md4 9:4 0 973.5G 0 raid1 /home
└─sda5 8:5 0 2M 0 part
sdb 8:16 0 1.8T 0 disk
├─sdb1 8:17 0 1M 0 part
├─sdb2 8:18 0 888.9G 0 part
│ └─md2 9:2 0 888.8G 0 raid1 /
├─sdb3 8:19 0 512M 0 part [SWAP]
└─sdb4 8:20 0 973.6G 0 part
└─md4 9:4 0 973.5G 0 raid1 /home
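If you only need to know which device backs a given mount point, the findmnt utility (also part of util-linux) gives a focused answer. For example:

findmnt /home

In our setup, this reports /dev/md4 as the source device for /home, along with its filesystem type and mount options.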
As the partitions are mounted by default, to remove a disk from the RAID we first need to unmount the relevant partition, then simulate a failure, and finally remove it from the array. We will remove /dev/sda4 from the RAID. First, unmount the partition with the following command:
umount /dev/md4
Warning
Please note that if you are connected as the user root, you may get the following message when you try to unmount the partition (in our case, where our md4 partition is mounted in /home):
umount: /home: target is busy
In this case, you must log out as the user root, connect as a local user (in our case, debian), and use the following command:
debian@ns000000:/$ sudo umount /dev/md4
If you do not have a local user, you need to create one.
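As a sketch, on a Debian-based system a local user with sudo rights could be created as follows (the username debian is only an example):

# create the user and set a password interactively
adduser debian
# grant sudo privileges by adding the user to the sudo group
usermod -aG sudo debian

On other distributions the tooling may differ, for example useradd together with the wheel group.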
Running the mount command again will provide us with the following output:
sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime)
proc on /proc type proc (rw,nosuid,nodev,noexec,relatime)
udev on /dev type devtmpfs (rw,nosuid,relatime,size=16315920k,nr_inodes=4078980,mode=755)
devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000)
tmpfs on /run type tmpfs (rw,nosuid,noexec,relatime,size=3266556k,mode=755)
/dev/md2 on / type ext4 (rw,relatime)
securityfs on /sys/kernel/security type securityfs (rw,nosuid,nodev,noexec,relatime)
tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev)
tmpfs on /run/lock type tmpfs (rw,nosuid,nodev,noexec,relatime,size=5120k)
tmpfs on /sys/fs/cgroup type tmpfs (ro,nosuid,nodev,noexec,mode=755)
cgroup2 on /sys/fs/cgroup/unified type cgroup2 (rw,nosuid,nodev,noexec,relatime,nsdelegate)
cgroup on /sys/fs/cgroup/systemd type cgroup (rw,nosuid,nodev,noexec,relatime,xattr,name=systemd)
pstore on /sys/fs/pstore type pstore (rw,nosuid,nodev,noexec,relatime)
bpf on /sys/fs/bpf type bpf (rw,nosuid,nodev,noexec,relatime,mode=700)
cgroup on /sys/fs/cgroup/pids type cgroup (rw,nosuid,nodev,noexec,relatime,pids)
cgroup on /sys/fs/cgroup/memory type cgroup (rw,nosuid,nodev,noexec,relatime,memory)
cgroup on /sys/fs/cgroup/perf_event type cgroup (rw,nosuid,nodev,noexec,relatime,perf_event)
cgroup on /sys/fs/cgroup/rdma type cgroup (rw,nosuid,nodev,noexec,relatime,rdma)
cgroup on /sys/fs/cgroup/net_cls,net_prio type cgroup (rw,nosuid,nodev,noexec,relatime,net_cls,net_prio)
cgroup on /sys/fs/cgroup/cpu,cpuacct type cgroup (rw,nosuid,nodev,noexec,relatime,cpu,cpuacct)
cgroup on /sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,relatime,freezer)
cgroup on /sys/fs/cgroup/blkio type cgroup (rw,nosuid,nodev,noexec,relatime,blkio)
cgroup on /sys/fs/cgroup/devices type cgroup (rw,nosuid,nodev,noexec,relatime,devices)
cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,nosuid,nodev,noexec,relatime,cpuset)
debugfs on /sys/kernel/debug type debugfs (rw,relatime)
hugetlbfs on /dev/hugepages type hugetlbfs (rw,relatime,pagesize=2M)
mqueue on /dev/mqueue type mqueue (rw,relatime)
systemd-1 on /proc/sys/fs/binfmt_misc type autofs (rw,relatime,fd=45,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=10340)
tmpfs on /run/user/1000 type tmpfs (rw,nosuid,nodev,relatime,size=3266552k,mode=700,uid=1000,gid=1000)
As we can see, the entry for /dev/md4 is no longer mounted. However, the RAID is still active, so we need to simulate a failure to remove the disk. We can do this with the following command:
sudo mdadm --fail /dev/md4 /dev/sda4
We have now simulated a failure of the disk in the array. The next step is to remove the partition from the RAID array with the following command:
sudo mdadm --remove /dev/md4 /dev/sda4
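As a side note, mdadm's manage mode lets you chain both operations in a single invocation. The following sketch is equivalent to the two commands above:

sudo mdadm /dev/md4 --fail /dev/sda4 --remove /dev/sda4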
You can verify that the partition has been removed with the following command:
cat /proc/mdstat
Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10]
md2 : active raid1 sda2[1] sdb2[0]
931954688 blocks super 1.2 [2/2] [UU]
bitmap: 4/7 pages [16KB], 65536KB chunk
md4 : active raid1 sdb4[1]
1020767232 blocks super 1.2 [2/1] [_U]
bitmap: 0/8 pages [0KB], 65536KB chunk
unused devices: <none>
You can also inspect the details of the degraded array with the following command:
mdadm --detail /dev/md4
/dev/md4:
Version : 1.2
Creation Time : Tue Jan 24 15:35:02 2023
Raid Level : raid1
Array Size : 1020767232 (973.48 GiB 1045.27 GB)
Used Dev Size : 1020767232 (973.48 GiB 1045.27 GB)
Raid Devices : 2
Total Devices : 1
Persistence : Superblock is persistent
Intent Bitmap : Internal
Update Time : Tue Jan 24 16:28:03 2023
State : clean, degraded
Active Devices : 1
Working Devices : 1
Failed Devices : 0
Spare Devices : 0
Consistency Policy : bitmap
Name : md4
UUID : 7b5c1d80:0a7ab4c2:e769b5e5:9c6eaa0f
Events : 21
Number Major Minor RaidDevice State
- 0 0 0 removed
1 8 20 1 active sync /dev/sdb4
Once the disk has been replaced, we need to copy the partition table from a healthy disk (in this example, sdb) to the new one (sda). The exact command depends on the partition table type identified earlier.
For GPT partitions
sgdisk -R /dev/sda /dev/sdb
The command should be in this format: sgdisk -R /dev/newdisk /dev/healthydisk
Once this is done, the next step is to randomize the GUID of the new disk to prevent GUID conflicts with other disks:
sgdisk -G /dev/sda
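To double-check the result before rebuilding, you can print the new disk's partition table:

sgdisk -p /dev/sda

The partition layout should match the output of sgdisk -p /dev/sdb, while the disk and partition GUIDs will differ after the randomization step.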
For MBR partitions
Copy the partition table from the healthy disk to the new one with the following command:
sfdisk -d /dev/sdb | sfdisk /dev/sda
The command should be in this format: sfdisk -d /dev/healthydisk | sfdisk /dev/newdisk
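Here too, it is worth confirming that the layout was copied correctly before rebuilding, for example with the fdisk -l command used earlier:

fdisk -l /dev/sda

The partition list should now mirror that of /dev/sdb.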
We can now rebuild the RAID array. The following code snippet shows how to add the newly created /dev/sda4 partition back into the /dev/md4 array and verify the state of the array:
mdadm --add /dev/md4 /dev/sda4
cat /proc/mdstat
Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10]
md2 : active raid1 sda2[1] sdb2[0]
931954688 blocks super 1.2 [2/2] [UU]
bitmap: 1/7 pages [4KB], 65536KB chunk
md4 : active raid1 sda4[0] sdb4[1]
1020767232 blocks super 1.2 [2/2] [UU]
bitmap: 0/8 pages [0KB], 65536KB chunk
unused devices: <none>
We can verify the RAID details with the following command:
mdadm --detail /dev/md4
/dev/md4:
Version : 1.2
Creation Time : Tue Jan 24 15:35:02 2023
Raid Level : raid1
Array Size : 1020767232 (973.48 GiB 1045.27 GB)
Used Dev Size : 1020767232 (973.48 GiB 1045.27 GB)
Raid Devices : 2
Total Devices : 2
Persistence : Superblock is persistent
Intent Bitmap : Internal
Update Time : Tue Jan 24 17:02:55 2023
State : clean
Active Devices : 2
Working Devices : 2
Failed Devices : 0
Spare Devices : 0
Rebuild Status : 21% complete
UUID : 7f39d062:9f16a016:a4d2adc2:26fd5302
Events : 0.95
Number Major Minor RaidDevice State
0 8 2 0 spare rebuilding /dev/sda4
1 8 18 1 active sync /dev/sdb4
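Resynchronization runs in the background and can take several hours on large disks. If you need to block until it has finished, for example in a script, mdadm provides a wait mode:

mdadm --wait /dev/md4

This command returns once the recovery shown above has completed.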
The RAID has now been rebuilt, but we still need to mount the partition (/dev/md4 in this example) with the following command:
mount /dev/md4 /home
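You can confirm that the partition is mounted again with a quick check:

df -h /home

This should show /dev/md4 mounted on /home with its available space.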