title | excerpt | updated |
---|---|---|
Managing hardware RAID (EN) | Find out how to verify the state of your hardware RAID and the health of your hard drives | 2025-03-19 |
On a server with a hardware RAID configuration, the RAID array is managed by a physical component called a RAID controller.
- a dedicated server with a hardware RAID configuration
- administrative (sudo) access to the server via SSH
Warning
It is not advisable to reconfigure your RAID controller using MegaCli and lsiutil if you're unfamiliar with these tools, as you could risk losing your data. Please make a backup before making any changes.
Prior to verifying your RAID state, verify that you have a MegaRaid controller:
lspci | grep -i lsi | grep -i megaraid
03:00.0 RAID bus controller: LSI Logic / Symbios Logic MegaRAID SAS 2108 [Liberator] (rev 05)
This confirms the server has a MegaRaid controller installed.
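The check above can be wrapped in a small helper for scripting. This is only a sketch: the function name `detect_raid_controller` is hypothetical, and it assumes the controller appears in the `lspci` output with "LSI" in its description, as in the sample line.

```shell
# Classify the RAID controller from `lspci` output read on stdin.
# Prints "megaraid", "lsi" (an LSI card that is not a MegaRAID) or "unknown".
detect_raid_controller() {
    awk '
        # A MegaRAID line always wins, whatever the order of the lines.
        tolower($0) ~ /lsi/ && tolower($0) ~ /megaraid/ { found = "megaraid" }
        # Any other LSI line is only kept if nothing was found before.
        tolower($0) ~ /lsi/ && tolower($0) !~ /megaraid/ && found == "" { found = "lsi" }
        END { print (found == "" ? "unknown" : found) }
    '
}

# Usage on the server:
#   lspci | detect_raid_controller
```

Matching is case-insensitive because `lspci` reports the name as "MegaRAID".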
To gather and list available RAID arrays, you can use the MegaCli command:
MegaCli -LDInfo -Lall -aALL  # Or: storcli /c0 /vall show
Adapter 0 - Virtual Drive Information:
Virtual Drive: 0 (Target Id: 0)
Name :
RAID Level : Primary-1, Secondary-0, RAID Level Qualifier-0
Size : 36.321 GB
Sector Size : 512
Mirror Data : 36.321 GB
State : Optimal
Strip Size : 64 KB
Number Of Drives : 2
Span Depth : 1
Default Cache Policy: WriteBack, ReadAdaptive, Cached, Write Cache OK if Bad BBU
Current Cache Policy: WriteBack, ReadAdaptive, Cached, Write Cache OK if Bad BBU
Default Access Policy: Read/Write
Current Access Policy: Read/Write
Disk Cache Policy : Disk's Default
Encryption Type : None
Bad Blocks Exist: No
Is VD Cached: No
Virtual Drive: 1 (Target Id: 1)
Name :
RAID Level : Primary-1, Secondary-0, RAID Level Qualifier-0
Size : 2.727 TB
Sector Size : 512
Mirror Data : 2.727 TB
State : Optimal
Strip Size : 64 KB
Number Of Drives : 2
Span Depth : 1
Default Cache Policy: WriteBack, ReadAdaptive, Cached, Write Cache OK if Bad BBU
Current Cache Policy: WriteBack, ReadAdaptive, Cached, Write Cache OK if Bad BBU
Default Access Policy: Read/Write
Current Access Policy: Read/Write
Disk Cache Policy : Disk's Default
Encryption Type : None
Bad Blocks Exist: No
Is VD Cached: Yes
Cache Cade Type : Read Only
Exit Code: 0x00
We can see two virtual drives which are composed of two physical hard drives each, so a total of four physical disks. In this case, the RAID status is "Optimal", which means the RAID is functioning correctly.
If the RAID status is "Degraded", we recommend that you verify the state of the hard drives as well.
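For monitoring scripts, the state check can be automated. The following is a minimal sketch, assuming the `MegaCli -LDInfo` output has the layout shown above; the function name `check_virtual_drives` is hypothetical.

```shell
# Read `MegaCli -LDInfo -Lall -aALL` output on stdin and print every
# virtual drive whose State is not "Optimal".
# Returns 1 if at least one such drive is found, 0 otherwise.
check_virtual_drives() {
    awk '
        /^[ \t]*Virtual Drive:/ { vd = $3 }
        /^[ \t]*State[ \t]*:/ {
            state = $0
            sub(/^[ \t]*State[ \t]*:[ \t]*/, "", state)   # drop the label
            sub(/[ \t\r]+$/, "", state)                   # drop trailing blanks
            if (state != "Optimal") {
                print "Virtual Drive " vd ": " state
                bad = 1
            }
        }
        END { exit (bad ? 1 : 0) }
    '
}

# Usage on the server:
#   MegaCli -LDInfo -Lall -aALL | check_virtual_drives || echo "RAID needs attention"
```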
First, you must list the device Id for each drive in order to fully test them with smartmontools:
MegaCli -PDList -aAll | egrep 'Slot\ Number|Device\ Id|Inquiry\ Data|Raw|Firmware\ state' | sed 's/Slot/\nSlot/g'  # Or: storcli /c0 /eall /sall show
Slot Number: 0
Device Id: 4
Raw Size: 279.460 GB [0x22eec130 Sectors]
Firmware state: Online, Spun Up
Inquiry Data: BTWL3450062J300PGN INTEL SSDSC2BB300G4 D2010355
Slot Number: 1
Device Id: 5
Raw Size: 279.460 GB [0x22eec130 Sectors]
Firmware state: Online, Spun Up
Inquiry Data: BTWL345003X6300PGN INTEL SSDSC2BB300G4 D2010355
Slot Number: 2
Device Id: 7
Raw Size: 2.728 TB [0x15d50a3b0 Sectors]
Firmware state: Online, Spun Up
Inquiry Data: PN2234P8K2PKDYHGST HUS724030ALA640 MF8OAA70
Slot Number: 3
Device Id: 6
Raw Size: 2.728 TB [0x15d50a3b0 Sectors]
Firmware state: Online, Spun Up
Inquiry Data: PN2234P8JYP59YHGST HUS724030ALA640 MF8OAA70
With smartmontools' smartctl command, we will test each hard drive like this:
smartctl -d megaraid,N -a /dev/sdX
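Replace N with the Device Id of the drive, as listed above, and /dev/sdX with the block device of its RAID array. In this example, /dev/sda is the first RAID array and /dev/sdb is the second.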
[!primary]
In some situations, you may receive this output:
/dev/sda [megaraid_disk_00] [SAT]: Device open changed type from 'megaraid' to 'sat'
You must then replace megaraid with sat+megaraid:
smartctl -d sat+megaraid,N -a /dev/sdX
Warning
If one of your hard drives is showing SMART errors, you should perform a full backup of your data as soon as possible and contact our support team. Our support team will need the slot number and device ID in order to identify the faulty disk.
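To avoid typing one smartctl command per drive, the Device Id values can be extracted from the MegaCli drive list. The sketch below only prints the commands so that they can be reviewed before being run. The function name `smartctl_commands` is hypothetical, and the block device passed as an argument is an assumption that must be adapted to the RAID array each drive belongs to.

```shell
# Read `MegaCli -PDList -aAll` output on stdin and print one smartctl
# command per physical drive, annotated with its slot number.
# $1: block device of the RAID array (for example /dev/sda).
smartctl_commands() {
    dev="$1"
    awk -v dev="$dev" '
        /^Slot Number:/ { slot = $3 }
        /^Device Id:/   { printf "smartctl -d megaraid,%s -a %s  # slot %s\n", $3, dev, slot }
    '
}

# Usage on the server:
#   MegaCli -PDList -aAll | smartctl_commands /dev/sda
```

If a drive reports the "changed type" message described above, replace megaraid with sat+megaraid in the printed command.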
To make sure your RAID controller is working correctly, you can list all of its information with the following command:
MegaCli -AdpAllInfo -aALL
The most important section of the output is the error counter:
Error Counters
================
Memory Correctable Errors : 0
Memory Uncorrectable Errors : 0
If either counter is greater than zero, create a backup of your data and contact the support team with the full output. The support team will then schedule an intervention to replace the RAID controller.
For a succinct output of only the error counters, pipe the command through grep:
MegaCli -AdpAllInfo -aALL | grep "Errors"
Memory Correctable Errors : 0
Memory Uncorrectable Errors : 0
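The counters can also be checked from a script, for example in a cron job. This is a minimal sketch assuming the two counter lines keep the format shown above; the function name `check_memory_errors` is hypothetical.

```shell
# Read `MegaCli -AdpAllInfo -aALL` output on stdin and sum the
# "Memory Correctable Errors" and "Memory Uncorrectable Errors" counters.
# Prints the total and returns 1 when it is greater than zero.
check_memory_errors() {
    awk -F: '
        /Memory (Correctable|Uncorrectable) Errors/ { total += $2 }
        END { print total + 0; exit (total > 0 ? 1 : 0) }
    '
}

# Usage on the server:
#   MegaCli -AdpAllInfo -aALL | check_memory_errors || echo "controller memory errors found"
```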
If you had one or more hard drives replaced, the RAID will re-synchronise automatically. You can use this command to see which hard drives are currently rebuilding:
MegaCli -PDList -aAll | egrep 'Slot\ Number|Device\ Id|Inquiry\ Data|Raw|Firmware\ state' | sed 's/Slot/\nSlot/g'  # Or: storcli /c0 /eall /sall show
Slot Number: 0
Device Id: 4
Raw Size: 279.460 GB [0x22eec130 Sectors]
Firmware state: Online, Spun Up
Inquiry Data: BTWL3450062J300PGN INTEL SSDSC2BB300G4 D2010355
Slot Number: 1
Device Id: 5
Raw Size: 279.460 GB [0x22eec130 Sectors]
Firmware state: Online, Spun Up
Inquiry Data: BTWL345003X6300PGN INTEL SSDSC2BB300G4 D2010355
Slot Number: 2
Device Id: 7
Raw Size: 2.728 TB [0x15d50a3b0 Sectors]
Firmware state: Online, Spun Up
Inquiry Data: PN2234P8K2PKDYHGST HUS724030ALA640 MF8OAA70
Slot Number: 3
Device Id: 6
Raw Size: 2.728 TB [0x15d50a3b0 Sectors]
Firmware state: Rebuild
Inquiry Data: PN2234P8JYP59YHGST HUS724030ALA640 MF8OAA70
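On servers with many drives, the rebuilding ones can be picked out of this listing automatically. The sketch below assumes the "Slot Number" and "Firmware state" lines look like the ones above; the function name `rebuilding_slots` is hypothetical.

```shell
# Read `MegaCli -PDList -aAll` output on stdin and print the slot number
# of every drive whose firmware state contains "Rebuild".
rebuilding_slots() {
    awk '
        /^Slot Number:/                  { slot = $3 }
        /^Firmware state:/ && /Rebuild/  { print slot }
    '
}

# Usage on the server:
#   MegaCli -PDList -aAll | rebuilding_slots
```

With the sample output above, the only slot printed is 3.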
To monitor the progress of the rebuild operation, you can use this command:
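MegaCli -PDRbld -ShowProg -PhysDrv [EncID:SlotID] -aALL  # Or: storcli /c0/eEncID/sSlotID show rebuild
Replace EncID and SlotID with the enclosure ID and slot number of the rebuilding drive. The slot number appears in the output above; the enclosure ID is listed as "Enclosure Device ID" in the unfiltered MegaCli -PDList -aAll output.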
[!primary]
CacheCade is a module from LSI that improves the random read performance of hard drives by using an SSD as a front-end caching device.
To verify the CacheCade's configuration, use the following command:
MegaCli -CfgCacheCadeDsply -a0  # Or: storcli /c0 /dall show cachecade
To see which RAID array is associated with the CacheCade:
MegaCli -CfgCacheCadeDsply -a0 | grep "Associated LDs"
To receive a full list of status parameters for the BBU (battery backup unit), use this command:
MegaCli -AdpBbuCmd -aALL
The most important value to check is whether Battery State is Optimal. If there are indicators of a failing battery, create a backup of your data and provide the output of this command to the support team when creating the ticket.
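The battery state can be extracted for use in a script. This sketch assumes the output contains a line of the form "Battery State: Optimal", which may differ between MegaCli and firmware versions; the function name `battery_state` is hypothetical.

```shell
# Read `MegaCli -AdpBbuCmd -aALL` output on stdin and print the value of
# the first "Battery State" line (nothing is printed if there is none).
battery_state() {
    awk -F: '
        /^[ \t]*Battery State[ \t]*:/ {
            state = $2
            gsub(/^[ \t]+|[ \t\r]+$/, "", state)   # trim surrounding blanks
            print state
            exit
        }
    '
}

# Usage on the server:
#   [ "$(MegaCli -AdpBbuCmd -aALL | battery_state)" = "Optimal" ] || echo "check the BBU"
```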
Warning
The LSI RAID controller card is deprecated and no longer available for new servers. It is gradually being replaced by MegaRaid controllers.
Prior to verifying the RAID state, ensure that an LSI RAID controller card is installed with the following command:
lspci | grep -i lsi | grep -vi megaraid
01:00.0 Serial Attached SCSI controller: LSI Logic / Symbios Logic SAS2004 PCI-Express Fusion-MPT SAS-2 [Spitfire] (rev 03)
This confirms the presence of an LSI RAID controller.
[!primary]
The grep -vi megaraid filter removes MegaRaid RAID controller cards from the lspci output, as MegaRaid cards are made by LSI Corporation as well. The -i flag is needed because lspci reports the name as "MegaRAID".
To gather and list available RAID arrays, you can use the lsiutil command:
Warning
Caution: the values (1,0 21) may differ depending on the lsiutil version. Be very careful when running this type of command.
lsiutil -p1 -a 1,0 21
LSI Logic MPT Configuration Utility, Version 1.63-OVH (27a4f9f54c)
1 MPT Port found
Port Name Chip Vendor/Type/Rev MPT Rev Firmware Rev IOC
1. ioc0 LSI Logic SAS2004 03 200 13000000 0
RAID actions menu, select an option: [1-99 or e/p/w or 0 to quit] 1
Volume 0 is DevHandle 011e, Bus 1 Target 0, Type RAID1 (Mirroring)
Volume Name:
Volume WWID: 0aaf504551c8efe5
Volume State: optimal, enabled, background init complete
Volume Settings: write caching disabled, auto configure hot swap enabled
Volume draws from Hot Spare Pools: 0
Volume Size 1906394 MB, 2 Members
Primary is PhysDisk 1 (DevHandle 0009, Bus 0 Target 0)
Secondary is PhysDisk 0 (DevHandle 000a, Bus 0 Target 1)
RAID actions menu, select an option: [1-99 or e/p/w or 0 to quit] 0
In the example above, we can see one virtual drive, which is composed of two physical hard drives. In this case, the RAID status is "Optimal", which means the RAID is functioning correctly.
If the RAID status is "Degraded", we recommend that you verify the state of the hard drives as well.
[!primary]
In the case of a newly provisioned server, you may see this message: [In Progress: data scrub]. This message is not an error. Rather, it's an automated process generated by the controller's firmware in order to lower uncorrectable errors as much as possible.
To take a look at the hard drive's state from the RAID controller, you can use this command:
lsiutil -p1 -a 2,0 21
LSI Logic MPT Configuration Utility, Version 1.63-OVH (27a4f9f54c)
1 MPT Port found
Port Name Chip Vendor/Type/Rev MPT Rev Firmware Rev IOC
1. ioc0 LSI Logic SAS2004 03 200 13000000 0
RAID actions menu, select an option: [1-99 or e/p/w or 0 to quit] 2
PhysDisk 0 is DevHandle 000a, Bus 0 Target 1
PhysDisk State: optimal
PhysDisk Size 1906394 MB, Inquiry Data: ATA HGST HUS724020AL AA70
Path 0 is DevHandle 000a, Bus 0 Target 1, online, primary
Path 1 is DevHandle 000a, invalid
PhysDisk 1 is DevHandle 0009, Bus 0 Target 0
PhysDisk State: optimal
PhysDisk Size 1906394 MB, Inquiry Data: ATA HGST HUS724020AL AA70
Path 0 is DevHandle 0009, Bus 0 Target 0, online, primary
Path 1 is DevHandle 0009, invalid
RAID actions menu, select an option: [1-99 or e/p/w or 0 to quit] 0
In this case both drives show as "Optimal".
Since the LSI card uses sg-map, we must identify the /dev/sgX devices (X being the device number, for example /dev/sg1) that correspond to the hard drives in order to test them with smartmontools.
Here's how to list them:
cat /proc/scsi/scsi | grep Vendor
Vendor: LSI Model: Logical Volume Rev: 3000
Vendor: ATA Model: HGST HUS724020AL Rev: AA70
Vendor: ATA Model: HGST HUS724020AL Rev: AA70
Each line represents an sg device, numbered from 0 in the order in which the devices are listed:
Vendor: LSI Model: Logical Volume Rev: 3000 => /dev/sg0
Vendor: ATA Model: HGST HUS724020AL Rev: AA70 => /dev/sg1
Vendor: ATA Model: HGST HUS724020AL Rev: AA70 => /dev/sg2
In order to list the right devices within one command, use the following:
cat /proc/scsi/scsi | grep Vendor | nl -v 0 | sed 's/^/\/dev\/sg/' | grep -v LSI | cut -d ' ' -f1,6 | sed 's/sg\ /sg/' | sed 's/\/dev\/sg.\ /\/dev\/sg/'
/dev/sg1
/dev/sg2
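The same mapping can be written more readably with a single awk program. This is only a sketch: it relies on the same assumption as the text above, namely that sg devices are numbered from 0 in the order of the Vendor lines, and the function name `list_sg_disks` is hypothetical.

```shell
# Read the contents of /proc/scsi/scsi on stdin and print the /dev/sgN
# path of every device whose vendor is not LSI (the physical drives).
list_sg_disks() {
    awk '
        BEGIN { n = 0 }
        /Vendor:/ {
            # The LSI logical volume still consumes an sg number,
            # so the counter is incremented for every Vendor line.
            if ($0 !~ /Vendor: *LSI/) print "/dev/sg" n
            n++
        }
    '
}

# Usage on the server:
#   list_sg_disks < /proc/scsi/scsi
```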
With smartmontools' smartctl command, we will test each hard drive, as shown below:
smartctl -a /dev/sgX
Replace X with the sg device number returned by the command above.
Warning
If one of your hard drives is showing SMART errors, you should perform a full backup of your data as soon as possible and contact our support team.
If you had one or more hard drives replaced, the RAID will re-synchronise automatically. To see if the RAID is in re-sync and monitor the resync progression, use this command:
Warning
Caution: the values (3,0 21) may differ depending on the lsiutil version. Be very careful when running this type of command.
lsiutil -p1 -a 3,0 21
LSI Logic MPT Configuration Utility, Version 1.63-OVH (27a4f9f54c)
1 MPT Port found
Port Name Chip Vendor/Type/Rev MPT Rev Firmware Rev IOC
1. ioc0 LSI Logic SAS2004 03 200 13000000 0
RAID actions menu, select an option: [1-99 or e/p/w or 0 to quit] 3
Volume 0 is DevHandle 011e, Bus 1 Target 0, Type RAID1 (Mirroring)
Volume 0 State: degraded, enabled, resync in progress
Resync Progress: total blocks 624943104, blocks remaining 484024888, 77%
RAID actions menu, select an option: [1-99 or e/p/w or 0 to quit] 0
Warning
The percentage value shown in the command result is NOT the completion percentage. It is the remaining percentage.
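Since the controller reports the remaining share, the completion percentage has to be derived from the block counts: (total blocks - blocks remaining) / total blocks. In the example above this gives (624943104 - 484024888) / 624943104, or roughly 23% complete. The sketch below performs this calculation, assuming the "Resync Progress" line keeps the format shown; the function name `resync_completion` is hypothetical.

```shell
# Read lsiutil output on stdin and print the resync completion percentage
# computed from the "Resync Progress" line.
resync_completion() {
    awk '
        /Resync Progress:/ {
            total = $5; remaining = $8
            gsub(/,/, "", total)        # strip the trailing commas
            gsub(/,/, "", remaining)
            if (total + 0 > 0)
                printf "%.0f%% complete\n", (total - remaining) * 100 / total
        }
    '
}

# Usage on the server:
#   lsiutil -p1 -a 3,0 21 | resync_completion
```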
[!alert]
The 3ware RAID controller card is deprecated. We highly recommend that you contact the OVHcloud Support team to schedule an intervention to replace it with a MegaRaid controller, as 3ware RAID controllers have proven to be rather unstable. This type of intervention requires a reinstallation of your server. Be sure to back up your data first.
Configuring MegaRAID for RAID Level 0