Linux RAID and LVM Setup
This is a short introduction to working with Linux RAID and LVM.
Please note: If the RAID array is to include SSDs (Solid State Disks), see also the general information on using SSDs under Linux.
Linux RAID provides disk redundancy, which increases the fault tolerance of storage systems and avoids data loss if a disk drive fails.
LVM is a concept where several physical disks or even complete RAID arrays can be combined to provide one or more disk volumes. The advantage of LVM is that at any time, even during normal operation, LVM volumes can be changed in size, or disks can be added, removed, or replaced.
From the point of view of LVM, a physical disk can be either a single disk drive, or a RAID array of disks. Using a RAID array of disks significantly increases the fault tolerance of the LVM volume.
- Linux can be booted from a RAID1 array, but not from an LVM volume, so the boot partition should be located on a RAID1 array, not on an LVM volume.
- LVM volumes can easily be increased in size by adding new disks, or by replacing existing disks with newer, larger ones, so they are well suited for large data storage partitions.
Linux installation on a RAID1 array and/or LVM can be cumbersome, depending on the Linux distribution. It may therefore make sense to prepare at least a partition on the RAID1 array for booting and system installation first, and then select these partitions during installation instead of letting the installer create them.
For example, at the time of this writing, openSUSE's YaST tool is unable to create a RAID1 array with a missing drive.
Working With a RAID Array
The mdadm tool is used to manage Linux RAID arrays.
The index number of a RAID device, as in md0, has to be unique. If md0 already exists, a different, unused number has to be chosen for a new array.
Setting Up a RAID1 Array From 2 New Disks
- Prepare a new partition on each of the two disks /dev/sdx and /dev/sdy
- Create a new RAID1 device /dev/md0 by specifying the 2 partitions /dev/sdx1 and /dev/sdy1 to be mirrored:
mdadm --create --verbose /dev/md0 --level=1 --raid-devices=2 /dev/sdx1 /dev/sdy1
Converting an Existing Disk Drive Into a New RAID1 Array
Assume there is an existing non-RAID drive /dev/sdx, and a single new drive /dev/sdy is available which should become part of the new RAID1 array:
- Prepare a new partition on the new disk /dev/sdy
- Create a new RAID1 array using only the new partition /dev/sdy1, and declare the other RAID component as missing:
mdadm --create --verbose /dev/md0 --level=1 --raid-devices=2 missing /dev/sdy1
- Partition the new RAID array, set up a new LVM volume on it, or make it a physical LVM volume and add it to an existing LVM volume group
- Copy all data from /dev/sdx to the RAID array or LVM volume, so /dev/sdx no longer contains any precious data
- Add /dev/sdx1 to the RAID array /dev/md0 to replace the missing drive:
mdadm --manage /dev/md0 --add /dev/sdx1
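After adding the partition, the array starts to resynchronize onto the old drive. If desired, the progress can be watched or waited for, e.g. with the commands below (the array name /dev/md0 matches the example above):
cat /proc/mdstat        # show the current resync progress
mdadm --wait /dev/md0   # block until the resync has finished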
Using A New RAID Array Standalone
- Create a new partition on the RAID device /dev/md0
- Format and mount the new partition /dev/md0p1 as usual
- Add the mounting information to /etc/fstab so the partition can be mounted automatically
- Done
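The individual steps could look like this (a minimal sketch; the ext4 file system and the mount point /mnt/data are just example assumptions):
parted -s /dev/md0 mklabel gpt                 # create a new gpt partition table on the RAID device
parted -s /dev/md0 mkpart data ext4 0% 100%    # create one partition spanning the whole array
mkfs.ext4 /dev/md0p1                           # format the new partition
mkdir -p /mnt/data
mount /dev/md0p1 /mnt/data                     # mount it
echo '/dev/md0p1 /mnt/data ext4 defaults 0 2' >> /etc/fstab   # mount automatically at boot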
Growing an Existing RAID Array
Assuming an existing RAID1 array /dev/md0 with /dev/sda1 and /dev/sdb1, where the partitions (and thus the RAID array) should grow.
- Fail and remove /dev/sda1 from the array:
mdadm --fail /dev/md0 /dev/sda1
mdadm --remove /dev/md0 /dev/sda1
- Resize or re-create partition /dev/sda1, then add the grown /dev/sda1 to the existing RAID:
mdadm --add /dev/md0 /dev/sda1
- Wait until the resync is complete
- Then update the partition on /dev/sdb (example commands are shown after this list):
  - Remove /dev/sdb1 from /dev/md0
  - Copy the partition table from /dev/sda to /dev/sdb
  - Add the new partition /dev/sdb1 to /dev/md0
  - Wait until the resync is complete
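The steps for /dev/sdb could look like this (a sketch built from the command forms shown elsewhere on this page; use sfdisk for a dos-type or sgdisk for a gpt-type partition table):
mdadm --manage /dev/md0 --fail /dev/sdb1
mdadm --manage /dev/md0 --remove /dev/sdb1
sfdisk -d /dev/sda | sfdisk /dev/sdb    # dos-type partition table
sgdisk -R /dev/sdb /dev/sda             # gpt-type partition table
mdadm --manage /dev/md0 --add /dev/sdb1
cat /proc/mdstat                        # check the resync progress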
If the array has a write-intent bitmap, it is strongly recommended that you remove the bitmap before increasing the size of the array. Failure to observe this precaution can lead to the destruction of the array if the existing bitmap is insufficiently large, especially if the increased array size necessitates a change to the bitmap's chunksize.
mdadm --grow /dev/mdX --bitmap none
mdadm --grow /dev/mdX --size max
mdadm --wait /dev/mdX
mdadm --grow /dev/mdX --bitmap internal
If there is a partition on the RAID array /dev/md0 then this partition also needs to be grown, using a tool like parted.
Finally the file system on the partition needs to be extended. First make sure the file system is consistent, then extend it. For ext file systems:
fsck /dev/md0
resize2fs /dev/md0
Setting up a new LVM Volume From a RAID1 Array
If the RAID array is to become part of an LVM volume group, an LVM physical volume has to be created from the RAID array.
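For example (a sketch; /dev/md0 is just a placeholder for the RAID array to be used):
pvcreate /dev/md0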
Stopping a RAID Array
Some actions on a RAID array can only be taken after the RAID array has been stopped:
mdadm --stop /dev/md0
If the RAID array can't be stopped, a partition on the device may have to be unmounted first, or, if the RAID array is part of an LVM volume group, the logical volume has to be deactivated first.
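For example, the preparation before stopping could look like this (a sketch; the mount point and the logical volume name are just assumptions):
umount /mnt/data                     # unmount a partition on the array, if one is mounted
lvchange -an /dev/vg-data/lv-data    # deactivate a logical volume on the array, if it is part of an LVM
mdadm --stop /dev/md0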
Renaming a RAID Array
In some cases a RAID array needs to be renamed, for example if the hostname is part of the array name and has been changed.
In the example below the array is assembled as /dev/md125, and its old internal name is mediaplayer:2:
~ # mdadm --detail /dev/md125
/dev/md125:
Version : 1.0
Creation Time : Sun Oct 28 15:08:56 2012
Raid Level : raid1
Array Size : 5244916 (5.00 GiB 5.37 GB)
Used Dev Size : 5244916 (5.00 GiB 5.37 GB)
Raid Devices : 2
Total Devices : 2
Persistence : Superblock is persistent
Intent Bitmap : Internal
Update Time : Thu Nov 3 23:50:33 2016
State : clean
Active Devices : 2
Working Devices : 2
Failed Devices : 0
Spare Devices : 0
Name : mediaplayer:2
UUID : d35131f9:adf82215:10b667d6:9fab3ed3
Events : 187
Number Major Minor RaidDevice State
0 8 98 0 active sync /dev/sdg2
1 8 50 1 active sync /dev/sdd2
We want to change the internal name to pc-martin:5 and assemble it as /dev/md/5 AKA /dev/md5.
To do this, the array first has to be stopped, and then re-assembled with a new name:
~ # mdadm --stop /dev/md125
mdadm: stopped /dev/md125
~ # mdadm --assemble /dev/md/5 --name=pc-martin:5 --update=name /dev/sdg2 /dev/sdd2
mdadm: /dev/md/5 has been started with 2 drives.
It is important that both parameters --name=pc-martin:5 and --update=name are given in the assemble command above.
~ # mdadm --detail /dev/md5
/dev/md5:
Version : 1.0
Creation Time : Sun Oct 28 15:08:56 2012
Raid Level : raid1
Array Size : 5244916 (5.00 GiB 5.37 GB)
Used Dev Size : 5244916 (5.00 GiB 5.37 GB)
Raid Devices : 2
Total Devices : 2
Persistence : Superblock is persistent
Intent Bitmap : Internal
Update Time : Thu Nov 3 23:50:33 2016
State : clean
Active Devices : 2
Working Devices : 2
Failed Devices : 0
Spare Devices : 0
Name : pc-martin:5 (local to host pc-martin)
UUID : d35131f9:adf82215:10b667d6:9fab3ed3
Events : 187
Number Major Minor RaidDevice State
0 8 98 0 active sync /dev/sdg2
1 8 50 1 active sync /dev/sdd2
Finally an appropriate line has to be added to or replaced in /etc/mdadm.conf:
~ # mdadm --detail --brief /dev/md5
ARRAY /dev/md5 metadata=1.0 name=pc-martin:5 UUID=d35131f9:adf82215:10b667d6:9fab3ed3
Detailed Commands for Disk Partitions and RAID
Partition Table Types
In many cases it is important to know the type of a partition table on a disk.
- gpt is a newer partition table type which can be used for large disks, and is supported by UEFI boot
- dos (sometimes displayed as msdos) is a legacy partition table type which doesn't support very large disks, and is not supported for UEFI boot
If the partition table type is gpt, the tools gdisk and sgdisk are appropriate to work with the partition table.
If the partition table type is shown as dos or msdos, the tools fdisk or sfdisk have to be used.
The parted tool and its graphical frontend gparted support both gpt and dos partition tables.
They can be used to create partitions as well as to determine the partition table type.
Determining the Partition Table Type
The parted tool can be used to display a disk's current partition table type, e.g.:
~ # parted /dev/sda print
Model: ATA OCZ-VERTEX3 (scsi)
Disk /dev/sda: 240GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags:

Number  Start   End    Size    File system     Name        Flags
 1      1049kB  223GB  223GB   linux-swap(v1)  Linux RAID  raid
 2      223GB   240GB  17.2GB  linux-swap(v1)  Linux swap
In the example above, the partition table type is gpt.
Preparing a new RAID Partition on a new Disk
- Determine the type of the existing partition table, or create a new partition table, e.g. with the parted tool
- Create a partition on the new disk drive
- Don't format the partition, but set the partition type to linux raid (type 0xFD)
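For a gpt disk, the commands could look like this (a sketch; /dev/sdz is a placeholder for the new disk, and the whole disk is used for a single RAID partition):
parted -s /dev/sdz mklabel gpt                  # create a new gpt partition table
sgdisk --new=1:0:0 --typecode=1:FD00 /dev/sdz   # create partition 1 spanning the whole disk, type Linux RAID
sgdisk --print /dev/sdz                         # verify the result
On a dos-type disk the partition can be created with fdisk or sfdisk instead, setting the partition type to 0xFD (Linux raid autodetect).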
Copying a Partition Table to a new Disk
Please note that the sequence of the device parameters differs between the two commands below. In both cases /dev/sda is the existing disk with a valid partition table, and /dev/sdb is the new disk to which the partition table is to be copied.
Copy an msdos-type partition table from /dev/sda to /dev/sdb:
sfdisk -d /dev/sda | sfdisk /dev/sdb
Copy a gpt-type partition table from /dev/sda to /dev/sdb:
sgdisk -R /dev/sdb /dev/sda
Working With LVM
LVM distinguishes between different layers:
- A Physical Volume (pv) can be a single disk drive, or a whole RAID array
- Physical Volumes are combined to implement a Volume Group (vg)
- A Volume Group can contain one or more Logical Volumes (lv)
- A Logical Volume can be used like a disk partition, i.e. it can be formatted and mounted
For each layer there are different tools available:
- pvdisplay, pvcreate etc. can be used to manage a physical LVM volume (pv)
- vgdisplay, vgcreate etc. can be used to manage an LVM volume group (vg) consisting of one or more physical volumes
- lvdisplay, lvcreate etc. can be used to manage a logical LVM volume (lv) allocated in a volume group
Setting up a new LVM Volume
- Create a physical volume, a volume group, and a logical volume as described in the detailed LVM commands below
- Format and mount the new logical volume like a normal partition
- Add the mounting information to /etc/fstab so the partition can be mounted automatically
- Done.
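Put together, the whole sequence could look like this (a minimal sketch; the device /dev/md2, the names vg-data and lv-data, the ext4 file system and the mount point /data are just example assumptions):
pvcreate /dev/md2                             # register the RAID array as a physical volume
vgcreate vg-data /dev/md2                     # create a volume group on it
lvcreate -l 100%FREE -n lv-data vg-data       # create a logical volume using all free space
mkfs.ext4 /dev/vg-data/lv-data                # format the logical volume
mkdir -p /data
mount /dev/vg-data/lv-data /data              # mount it
echo '/dev/vg-data/lv-data /data ext4 defaults 0 2' >> /etc/fstab   # mount automatically at boot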
Replacing a Physical Volume
- Add the new physical volume to the volume group
- Move all data away from the old physical volume to some other free space in the volume group. This may take quite some time to complete, depending on the disk sizes. Of course the new physical volume must provide enough free space to take up all data from the old physical volume which is to be removed.
- Remove the old physical volume from the volume group and wipe its LVM label
- Adjust the size of the logical volume if required
- Done.
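Tied together with the detailed commands below, a replacement could look like this (a sketch; the volume group vg-data, the old physical volume /dev/md1, the new physical volume /dev/md2 and the logical volume lv-data are example names):
vgextend vg-data /dev/md2                     # add the new physical volume to the volume group
pvmove /dev/md1                               # move all data off the old physical volume
vgreduce vg-data /dev/md1                     # remove the old physical volume from the volume group
pvremove /dev/md1                             # wipe the LVM label from the old device
lvresize -r -l 100%VG /dev/vg-data/lv-data    # optionally grow the logical volume and its file system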
Detailed Commands for LVM
Creating a new Physical Volume
Each physical disk or RAID array to be used with LVM has to be registered as a Physical Volume (pv) first.
Assuming /dev/md1 is an existing RAID array used as a physical volume which is to be replaced by a newly installed RAID array /dev/md2:
pvcreate /dev/sda    # Use a whole physical disk drive
pvcreate /dev/md2    # Use a whole RAID array
pvdisplay can be used to display existing physical volumes. In the example below /dev/md0 is a pure RAID1 array from which the system boots, so it is not listed. The output shows an older volume /dev/md1 which already belongs to the volume group vg-data, and a newly created /dev/md2 which doesn't belong to a volume group yet:
pvdisplay
  --- Physical volume ---
  PV Name               /dev/md1
  VG Name               vg-data
  PV Size               465.76 GiB / not usable 1.87 MiB
  Allocatable           yes (but full)
  PE Size               4.00 MiB
  Total PE              119234
  Free PE               0
  Allocated PE          119234
  PV UUID               NEfhlq-t3cH-ThOM-YGGV-uHWp-GzfL-23Agaj
/dev/md2 is a new physical volume of 953.74 GiB size:
  --- NEW Physical volume ---
  PV Name               /dev/md2
  VG Name
  PV Size               953.74 GiB
  Allocatable           NO
  PE Size               0
  Total PE              0
  Free PE               0
  Allocated PE          0
  PV UUID               kgj8w4-bU0c-5RAL-ETkY-Rnfk-arbh-TyNd8Q
Creating a Volume Group
Create a new volume group named data using the physical volume /dev/md0:
vgcreate data /dev/md0
Creating a new Logical Volume
Create a new logical volume named data which uses all free space in the volume group data:
lvcreate -l 100%FREE -n data data
Adding a new Physical Volume to an Existing Volume Group
Add the new physical volume /dev/md2 to the existing volume group vg-data:
vgextend vg-data /dev/md2
Moving all Data Away From a Physical Volume
Move all data from the old physical volume /dev/md1 which is to be removed to some other free space in the volume group, e.g. on /dev/md2. This may take quite some time to complete, depending on the disk sizes:
pvmove /dev/md1
Removing a Physical Volume From a Volume Group
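Once all data has been moved away, the now unused physical volume can be removed from the volume group. The command below is a sketch using the example names from this page:
vgreduce vg-data /dev/md1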
Removing a Physical Volume From LVM
Remove an old physical volume (which could be a whole RAID1 array) /dev/md1 from the LVM. This wipes the label on the device so that LVM will no longer recognize it as a physical volume:
pvremove /dev/md1
Adjusting the Size Of a Logical Volume
If the size of a logical volume is to be increased to the maximum size of the volume group, e.g. after the volume group has been enlarged by a new physical volume, then the following command can be used:
lvresize -r -l 100%VG /dev/vg-data/lv-data
The parameter -r takes care that the size of the file system on the logical volume is also adjusted accordingly.
If the logical volume is to be shrunk, however, then the underlying file system needs to be shrunk first, before the logical volume is shrunk using the lvresize command, otherwise data may be lost. See man lvresize for details.
Activating or Deactivating a Logical Volume
lvchange -an /dev/….    # deactivate a logical volume
lvchange -ay /dev/….    # activate a logical volume
Checking the RAID / LVM Status
The command
cat /proc/mdstat
can be used to monitor the state of all RAID arrays in a system. For example:
~ # cat /proc/mdstat
Personalities : [raid1]
md2 : active raid1 sdd1[0] sdb1[2]
      1000072192 blocks super 1.2 [2/2] [UU]
      bitmap: 0/8 pages [0KB], 65536KB chunk

md0 : active raid1 sdc1[0] sda1[2]
      217508864 blocks super 1.2 [2/2] [UU]
      bitmap: 2/2 pages [8KB], 65536KB chunk

unused devices: <none>
In the example output above md0 is a RAID1 array from which the system boots, and md2 is a RAID array which is used as a physical volume in a volume group providing a logical volume used as a data partition. The original RAID array for LVM was md1, but it has been replaced by a newer, larger array named md2, as described above.
The most important thing here is that both arrays are labelled [UU], which indicates that the array is healthy. If the status is [U_] or [_U], one of the array's drives is faulty or missing.
Removing/Wiping RAID and LVM Metadata
After a disk has been removed from a RAID array or from an LVM volume, old signatures may still be present on the disk, so if the disk is re-used the old metadata may appear again.
To avoid this, the metadata should be removed from the disk before it is re-used.
The safest way is to boot a live system like partedmagic from a USB stick, with only the old disk connected.
The mdadm program refuses to remove the RAID metadata from a partition if the RAID array is still running, and the RAID array can only be stopped if it isn't in use, e.g. by an LVM volume group.
So if the live system has recognized an old logical volume vgdata based on a RAID device /dev/md1 which consisted of the partition /dev/sda1, then the following actions need to be taken:
- Deactivate the LVM logical volume so the RAID can also be stopped
- Remove the metadata by using these commands:
mdadm --zero-superblock /dev/sda1
wipefs --all /dev/sda
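The deactivation and stopping mentioned in the first step could look like this (a sketch; the names vgdata and /dev/md1 are taken from the example above):
vgchange -an vgdata      # deactivate all logical volumes in the old volume group
mdadm --stop /dev/md1    # stop the RAID array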
Related Bug Reports
- Novell Bug 862076 - use_lvmetad = 1 in lvm.conf triggers systemd to get into emergency target on boot
https://bugzilla.novell.com/show_bug.cgi?id=862076
— Martin Burnicki martin.burnicki@burnicki.net, last updated 2022-11-30