This is a short introduction on how to work with Linux RAID and LVM.
Please note: If the RAID array is to include SSDs (Solid State Disks), there is also some general information on using SSDs under Linux.
Linux RAID provides disk redundancy, which increases the fault tolerance of storage systems and helps to avoid data loss if a disk drive fails.
LVM is a concept where several physical disks or even complete RAID arrays can be combined to provide one or more disk volumes. The advantage of LVM is that at any time, even during normal operation, LVM volumes can be changed in size, or disks can be added, removed, or replaced.
From the point of view of an LVM, a physical disk can be either a single disk drive, or a RAID array of disks. Using a RAID array of disks significantly increases the fault tolerance of the LVM volume.
Installing Linux on a RAID1 array and/or LVM can be cumbersome, depending on the Linux distribution. It may therefore make sense to first prepare at least one partition on the RAID1 array for booting and system installation, and then select that partition during installation instead of letting the installer create the partitions.
For example, at the time of this writing, openSUSE's YaST tool is unable to create a RAID1 array with a missing drive.
The mdadm tool is used to manage Linux RAID arrays.
The index number of a RAID device, as in md0, has to be unique. If md0 already exists, a different, unused number has to be chosen for a new array.
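As a small sketch of how an unused index could be found, the helper below reads text in the format of /proc/mdstat and prints the lowest md number that does not appear in it. The function name next_free_md is made up for this example, and the check only covers arrays that are listed in /proc/mdstat, so the result is a suggestion rather than a guarantee.

```shell
#!/bin/sh
# Hypothetical helper: print the lowest "mdN" name that is not listed in
# /proc/mdstat-style text read from stdin.
next_free_md() {
    awk '
        /^md[0-9]+ :/ { used[substr($1, 3) + 0] = 1 }   # remember every index in use
        END { n = 0; while (n in used) n++; print "md" n }
    '
}

# Sample input; on a real system use: next_free_md < /proc/mdstat
next_free_md <<'EOF'
Personalities : [raid1]
md2 : active raid1 sdd1[0] sdb1[2]
md0 : active raid1 sdc1[0] sda1[2]
unused devices: <none>
EOF
```

With the sample input, md0 and md2 are taken, so the helper prints md1.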
- Prepare the partitions to be mirrored, e.g. on /dev/sdx
- Create the array /dev/md0 by specifying the 2 partitions /dev/sdx1 and /dev/sdy1 to be mirrored:
mdadm --create --verbose /dev/md0 --level=1 --raid-devices=2 /dev/sdx1 /dev/sdy1
Assume there is an existing non-RAID drive /dev/sdx, and a single new drive /dev/sdy is available which should become part of the new RAID1 array:
- Partition the new drive /dev/sdy
- Create the array with /dev/sdy1, and declare the other RAID component as missing:
mdadm --create --verbose /dev/md0 --level=1 --raid-devices=2 missing /dev/sdy1
- Copy all data from /dev/sdx to the RAID array or LVM volume, so /dev/sdx no longer contains any precious data
- Add /dev/sdx1 to the RAID array /dev/md0 to replace the missing drive:
mdadm --manage /dev/md0 --add /dev/sdx1
- Create a partition table on /dev/md0
- Format the partition /dev/md0p1 as usual
- Add an entry to /etc/fstab so the partition can be mounted automatically
Assume an existing RAID1 array /dev/md0 with the members /dev/sda1 and /dev/sdb1, where the partitions (and thus the RAID array) should grow.
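- Mark /dev/sda1 as failed and remove it from the array:
mdadm --fail /dev/md0 /dev/sda1
mdadm --remove /dev/md0 /dev/sda1
- Grow the partition /dev/sda1, then add the grown /dev/sda1 to the existing RAID:
mdadm --add /dev/md0 /dev/sda1
- Wait until the resync has finished, then repeat the steps for /dev/sdb:
  - Remove /dev/sdb1 from /dev/md0
  - Copy the partition table from /dev/sda to /dev/sdb
  - Add /dev/sdb1 to /dev/md0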
If the array has a write-intent bitmap, it is strongly recommended that you remove the bitmap before increasing the size of the array. Failure to observe this precaution can lead to the destruction of the array if the existing bitmap is insufficiently large, especially if the increased array size necessitates a change to the bitmap's chunksize.
mdadm --grow /dev/mdX --bitmap none
mdadm --grow /dev/mdX --size max
mdadm --wait /dev/mdX
mdadm --grow /dev/mdX --bitmap internal
If there is a partition on the RAID array /dev/md0, this partition also needs to be grown, using a tool like parted.
Finally the file system on the partition needs to be extended. First make sure the file system is consistent, then extend it. For ext file systems:
fsck /dev/md0
resize2fs /dev/md0
If the RAID array is to become part of an LVM volume group, an LVM physical volume has to be created from the RAID array.
Some actions on a RAID array can only be taken after the RAID array has been stopped:
mdadm --stop /dev/md0
If the RAID array can't be stopped, a partition on the device may have to be unmounted first, or, if the RAID array is part of an LVM, the logical volume has to be deactivated first.
In some cases a RAID array needs to be renamed, for example if the hostname is part of the array name and has been changed.
In the example below the array is assembled as /dev/md125, and its old internal name is mediaplayer:2:
~ # mdadm --detail /dev/md125
/dev/md125:
Version : 1.0
Creation Time : Sun Oct 28 15:08:56 2012
Raid Level : raid1
Array Size : 5244916 (5.00 GiB 5.37 GB)
Used Dev Size : 5244916 (5.00 GiB 5.37 GB)
Raid Devices : 2
Total Devices : 2
Persistence : Superblock is persistent
Intent Bitmap : Internal
Update Time : Thu Nov 3 23:50:33 2016
State : clean
Active Devices : 2
Working Devices : 2
Failed Devices : 0
Spare Devices : 0
Name : mediaplayer:2
UUID : d35131f9:adf82215:10b667d6:9fab3ed3
Events : 187
Number Major Minor RaidDevice State
0 8 98 0 active sync /dev/sdg2
1 8 50 1 active sync /dev/sdd2
We want to change the internal name to pc-martin:5 and assemble the array as /dev/md/5, also known as /dev/md5.
To do this, the array first has to be stopped, and then re-assembled with a new name:
~ # mdadm --stop /dev/md125
mdadm: stopped /dev/md125
~ # mdadm --assemble /dev/md/5 --name=pc-martin:5 --update=name /dev/sdg2 /dev/sdd2
mdadm: /dev/md/5 has been started with 2 drives.
It is important that both parameters, --name=pc-martin:5 and --update=name, are given in the assemble command above.
~ # mdadm --detail /dev/md5
/dev/md5:
Version : 1.0
Creation Time : Sun Oct 28 15:08:56 2012
Raid Level : raid1
Array Size : 5244916 (5.00 GiB 5.37 GB)
Used Dev Size : 5244916 (5.00 GiB 5.37 GB)
Raid Devices : 2
Total Devices : 2
Persistence : Superblock is persistent
Intent Bitmap : Internal
Update Time : Thu Nov 3 23:50:33 2016
State : clean
Active Devices : 2
Working Devices : 2
Failed Devices : 0
Spare Devices : 0
Name : pc-martin:5 (local to host pc-martin)
UUID : d35131f9:adf82215:10b667d6:9fab3ed3
Events : 187
Number Major Minor RaidDevice State
0 8 98 0 active sync /dev/sdg2
1 8 50 1 active sync /dev/sdd2
Finally an appropriate line has to be added to or replaced in /etc/mdadm.conf:
~ # mdadm --detail --brief /dev/md5
ARRAY /dev/md5 metadata=1.0 name=pc-martin:5 UUID=d35131f9:adf82215:10b667d6:9fab3ed3
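Replacing the line by hand is easy to get wrong, so here is a minimal sketch of how it could be scripted. The function name update_array_line is hypothetical, and the example deliberately works on a temporary file. The path of the real configuration file differs between distributions (some use /etc/mdadm/mdadm.conf), so it is passed as a parameter. The array is identified by its UUID, which stays the same when the array is renamed.

```shell
#!/bin/sh
# Hypothetical helper: replace (or add) the ARRAY line for one array in an
# mdadm.conf-style file. The array is identified by the UUID in the new line.
#   $1 = config file   $2 = new ARRAY line, e.g. from "mdadm --detail --brief"
update_array_line() {
    cfg=$1
    new_line=$2
    # Extract the "UUID=..." token from the new line.
    uuid_tok=$(printf '%s\n' "$new_line" | sed -n 's/.*\(UUID=[0-9a-fA-F:]*\).*/\1/p')
    if [ -z "$uuid_tok" ]; then
        echo "no UUID found in: $new_line" >&2
        return 1
    fi
    tmp_file=$(mktemp) || return 1
    # Keep all lines that do not mention this UUID. grep exits non-zero when
    # no line is left over, which is fine here.
    grep -v -F "$uuid_tok" "$cfg" > "$tmp_file" || true
    # Append the new line and write the result back.
    printf '%s\n' "$new_line" >> "$tmp_file"
    cat "$tmp_file" > "$cfg"
    rm -f "$tmp_file"
}

# Demonstration on a temporary file instead of the real configuration.
demo_conf=$(mktemp)
echo 'ARRAY /dev/md125 metadata=1.0 name=mediaplayer:2 UUID=d35131f9:adf82215:10b667d6:9fab3ed3' > "$demo_conf"
update_array_line "$demo_conf" \
    'ARRAY /dev/md5 metadata=1.0 name=pc-martin:5 UUID=d35131f9:adf82215:10b667d6:9fab3ed3'
cat "$demo_conf"
rm -f "$demo_conf"
```

On a real system it is advisable to keep a backup copy of the configuration file before changing it.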
In many cases it is important to know the type of a partition table on a disk.
If the partition table type is gpt, the tools gdisk and sgdisk are appropriate to work with the partition table.
If the partition table type is shown as dos or ms-dos, the tools fdisk or sfdisk have to be used.
The parted tool and its graphical frontend gparted support both gpt and dos partition tables.
They can be used to create partitions as well as to determine the partition table type.
The parted tool can be used to display a disk's current partition table type, e.g.:
~ # parted /dev/sda print
Model: ATA OCZ-VERTEX3 (scsi)
Disk /dev/sda: 240GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags:
Number  Start   End    Size    File system     Name        Flags
 1      1049kB  223GB  223GB   linux-swap(v1)  Linux RAID  raid
 2      223GB   240GB  17.2GB  linux-swap(v1)  Linux swap
In the example above, the partition table type is gpt.
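If the table type is needed in a script, it can be picked out of the parted output. The following is only a sketch: the function name partition_table_type is hypothetical, and it assumes that parted prints its output in English, as in the example above.

```shell
#!/bin/sh
# Hypothetical helper: print the partition table type found in the output of
# "parted <disk> print", read from stdin.
partition_table_type() {
    sed -n 's/^Partition Table: *//p'
}

# Sample input taken from the example above; on a real system use e.g.:
#   parted /dev/sda print | partition_table_type
partition_table_type <<'EOF'
Model: ATA OCZ-VERTEX3 (scsi)
Disk /dev/sda: 240GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags:
EOF
```

For the sample input this prints gpt.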
- Partitions can be created using the parted tool
- Partitions for a RAID array get the type linux raid (type 0xFD)
- TODO: Add some example commands
Please note that the order of the device parameters differs between the two commands below. In both cases /dev/sda is the existing disk with a valid partition table, and /dev/sdb is the new disk to which the partition table is to be copied.
Copy an msdos-type partition table from /dev/sda to /dev/sdb:
sfdisk -d /dev/sda | sfdisk /dev/sdb
Copy a gpt-type partition table from /dev/sda to /dev/sdb:
sgdisk -R /dev/sdb /dev/sda
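Because the parameter order is easy to mix up, a small wrapper can make source and target explicit. The sketch below only prints the command instead of running it, so the result can be checked before anything is written to a disk. The function name partition_copy_command is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) the command that copies the partition
# table from an existing disk to a new disk.
#   $1 = table type (gpt, msdos, dos)   $2 = source disk   $3 = target disk
partition_copy_command() {
    case "$1" in
        gpt)
            # sgdisk takes the TARGET as the argument of -R, the source comes last.
            echo "sgdisk -R $3 $2" ;;
        msdos|dos)
            # sfdisk dumps the source table and feeds it to the target.
            echo "sfdisk -d $2 | sfdisk $3" ;;
        *)
            echo "unsupported partition table type: $1" >&2
            return 1 ;;
    esac
}

partition_copy_command gpt /dev/sda /dev/sdb
partition_copy_command msdos /dev/sda /dev/sdb
```

The two calls print the same commands as shown above, with /dev/sda as the source and /dev/sdb as the target.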
LVM distinguishes between different layers:
- A physical volume (pv) can be a single disk drive, or a whole RAID array
- A volume group (vg) consists of one or more physical volumes
- A logical volume (lv) is allocated in a volume group
For each layer there are different tools available:
- pvdisplay, pvcreate etc. can be used to manage a physical LVM volume (pv)
- vgdisplay, vgcreate etc. can be used to manage an LVM volume group (vg) consisting of one or more physical volume(s)
- lvdisplay, lvcreate etc. can be used to manage a logical LVM volume (lv) allocated in a volume group
Each physical disk or RAID array to be used with LVM has to be registered as a physical volume (pv) first.
Assuming /dev/md1 is an existing RAID array used as a physical volume, which is to be replaced by a newly installed RAID array /dev/md2:
pvcreate /dev/sda # Use a whole physical disk drive
pvcreate /dev/md2 # Use a whole RAID array
pvdisplay can be used to display existing physical volumes. In the example below /dev/md0 is a pure RAID1 array from which the system boots, so it is not listed. The output shows an older physical volume /dev/md1 which already belongs to the volume group vg-data, and a newly created /dev/md2 which doesn't belong to a volume group yet:
pvdisplay
--- Physical volume ---
PV Name               /dev/md1
VG Name               vg-data
PV Size               465.76 GiB / not usable 1.87 MiB
Allocatable           yes (but full)
PE Size               4.00 MiB
Total PE              119234
Free PE               0
Allocated PE          119234
PV UUID               NEfhlq-t3cH-ThOM-YGGV-uHWp-GzfL-23Agaj
/dev/md2 is a new physical volume of "953.74 GiB" size:
--- NEW Physical volume ---
PV Name               /dev/md2
VG Name
PV Size               953.74 GiB
Allocatable           NO
PE Size               0
Total PE              0
Free PE               0
Allocated PE          0
PV UUID               kgj8w4-bU0c-5RAL-ETkY-Rnfk-arbh-TyNd8Q
vgcreate data /dev/md0
lvcreate -l 100%FREE -n data data
Add the new physical volume /dev/md2 to the existing volume group vg-data:
vgextend vg-data /dev/md2
Move all data from the old physical volume /dev/md1, which is to be removed, to other free space in the volume group, e.g. on /dev/md2. This may take quite some time to complete, depending on the disk sizes:
pvmove /dev/md1
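Remove the old physical volume /dev/md1 (which could be a whole RAID1 array) from the LVM. If the physical volume still belongs to a volume group, it has to be removed from that group first, e.g. with vgreduce vg-data /dev/md1.
The following command wipes the label on the device so that LVM no longer recognizes it as a physical volume: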
pvremove /dev/md1
If the size of a logical volume is to be increased to the maximum size of the volume group, e.g. after the volume group has been enlarged by a new physical volume, the following command can be used:
lvresize -r -l 100%VG /dev/vg-data/lv-data
The parameter -r makes sure that the size of the file system in the logical volume is also adjusted accordingly.
If the logical volume is to be shrunk, however, the file system has to be shrunk first, before the logical volume itself is shrunk using the lvresize command, otherwise data may be lost. See man lvresize for details.
lvchange -an /dev/…. # deactivate a logical volume
lvchange -ay /dev/…. # activate a logical volume
The command
cat /proc/mdstat
can be used to monitor the state of all RAID arrays in a system. For example:
~ # cat /proc/mdstat
Personalities : [raid1]
md2 : active raid1 sdd1[0] sdb1[2]
      1000072192 blocks super 1.2 [2/2] [UU]
      bitmap: 0/8 pages [0KB], 65536KB chunk
md0 : active raid1 sdc1[0] sda1[2]
      217508864 blocks super 1.2 [2/2] [UU]
      bitmap: 2/2 pages [8KB], 65536KB chunk
unused devices: <none>
In the example output above, md0 is a RAID1 array from which the system boots, and md2 is a RAID array which is used as a physical volume in a volume group providing a logical volume used as data partition. The original RAID array for LVM was md1, but this has been replaced by a newer, larger array named md2, as described above.
The most important thing here is that both arrays are labelled [UU], which indicates that the arrays are healthy. If the status code is [U_] or [_U], an array drive is faulty or missing.
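A check like this can also be automated. The following sketch lists all arrays whose status field contains an underscore. The function name degraded_arrays is hypothetical, and the sample input is an invented degraded variant of the output above, not real output.

```shell
#!/bin/sh
# Hypothetical helper: list md arrays whose status field (e.g. [U_]) shows a
# missing or faulty member. Reads /proc/mdstat-style text from stdin.
degraded_arrays() {
    awk '
        /^md[0-9]+ :/ { name = $1 }            # remember the current array name
        match($0, /\[[U_]+\]/) {               # status field such as [UU] or [U_]
            if (index(substr($0, RSTART, RLENGTH), "_")) print name
        }
    '
}

# Invented sample with one degraded array; on a real system use:
#   degraded_arrays < /proc/mdstat
degraded_arrays <<'EOF'
Personalities : [raid1]
md2 : active raid1 sdd1[0] sdb1[2]
      1000072192 blocks super 1.2 [2/2] [UU]
md0 : active raid1 sdc1[0]
      217508864 blocks super 1.2 [2/1] [U_]
unused devices: <none>
EOF
```

For the sample input only md0 is printed. An empty result means that no array in the input shows a missing member.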
After a disk has been removed from a RAID array or from an LVM volume, the old signatures may still be present on the disk, so if the disk is re-used, the old metadata may show up again.
To fix this the metadata can be removed from the disk before it is re-used.
The safest way is to boot a live system like partedmagic from a USB stick, with only the old disk connected.
The mdadm program refuses to remove the RAID metadata from the partition if the RAID array is still running, and the RAID array can only be stopped if it isn't in use, e.g. by an LVM volume group.
So if the live system has recognized an old volume group vgdata based on a RAID device /dev/md1 which consisted of the partition /dev/sda1, then the following actions need to be taken:
mdadm --zero-superblock /dev/sda1
wipefs --all /dev/sda
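Putting the conditions described above together, one workable order of the clean-up steps could look like the sketch below. It only prints the commands and does not run them. The function name wipe_steps is hypothetical, and the first two steps (deactivating the volume group with vgchange and stopping the array) are an assumption derived from the description above.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) the clean-up commands in a workable order.
#   $1 = volume group   $2 = RAID array   $3 = member partition   $4 = whole disk
wipe_steps() {
    echo "vgchange -an $1"               # assumption: deactivate the volume group first
    echo "mdadm --stop $2"               # assumption: then stop the RAID array
    echo "mdadm --zero-superblock $3"    # remove the RAID metadata from the partition
    echo "wipefs --all $4"               # remove the remaining signatures from the disk
}

wipe_steps vgdata /dev/md1 /dev/sda1 /dev/sda
```

Since these commands destroy metadata, it is worth double-checking the device names in the printed list before running anything.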
— Martin Burnicki martin.burnicki@burnicki.net, last updated 2022-11-30