Linux RAID and LVM Setup

This is a short introduction to working with Linux RAID and LVM.

Please note: If the RAID array is to include SSDs (Solid State Disks), there is also some general information on using SSDs under Linux.

Linux RAID provides redundancy of disks which increases the fault tolerance of storage systems and avoids data loss in case a disk drive fails.

LVM is a concept where several physical disks or even complete RAID arrays can be combined to provide one or more disk volumes. The advantage of LVM is that at any time, even during normal operation, LVM volumes can be changed in size, or disks can be added, removed, or replaced.

From the point of view of an LVM, a physical disk can be either a single disk drive, or a RAID array of disks. Using a RAID array of disks significantly increases the fault tolerance of the LVM volume.

  • Linux can be booted from a RAID1 array, but not from an LVM volume, so the boot partition should be located on a RAID1 array, not on an LVM volume.
  • LVM volumes can easily be increased in size by adding new disks, or replacing existing disks by newer, larger ones, so they can well be used for large data storage partitions.

Linux installation on a RAID1 and/or LVM can be cumbersome, depending on the Linux distribution. It may therefore make sense to first prepare at least a partition on the RAID1 array for booting and system installation, and then select these partitions during installation instead of letting the installer create them.

For example, at the time of this writing, openSUSE's YaST tool is unable to create a RAID1 array with a missing drive.

The mdadm tool is used to manage Linux RAID arrays. The index number of a RAID device, as in md0, has to be unique. If md0 already exists, a different, unused number has to be chosen for a new array.
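
To check which index numbers are already in use, the existing RAID arrays can be listed, e.g.:

cat /proc/mdstat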


A new RAID1 array consisting of two partitions, e.g. /dev/sdx1 and /dev/sdy1, can be created like this:

mdadm --create --verbose /dev/md0 --level=1 --raid-devices=2 /dev/sdx1 /dev/sdy1


Assume there is an existing non-RAID drive /dev/sdx, and a single new drive /dev/sdy is available which should become part of the new RAID1 array:

  • Prepare a new partition /dev/sdy1 on the new disk /dev/sdy
  • Create a new RAID1 array using only the new partition /dev/sdy1, and declare the other RAID component as missing:
mdadm --create --verbose /dev/md0 --level=1 --raid-devices=2 missing /dev/sdy1
  • The second component, e.g. a partition /dev/sdx1 created on the old drive after its data has been copied to the array, can be added later:
mdadm --manage /dev/md0 --add /dev/sdx1
  • Create a new partition on the RAID device /dev/md0
  • Format and mount the new partition /dev/md0p1 as usual
  • Add the mounting information to /etc/fstab so the partition can be mounted automatically (see the example after this list)
  • Done
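
A minimal example of such an /etc/fstab entry, assuming /dev/md0p1 is formatted with ext4 and mounted at /data (file system and mount point are just assumptions here):

/dev/md0p1  /data  ext4  defaults  0  2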

Assume there is an existing RAID1 array /dev/md0 consisting of /dev/sda1 and /dev/sdb1, where the partitions (and thus the RAID array) are to be grown:

  • Fail and remove /dev/sda1 from the array:
mdadm --fail /dev/md0 /dev/sda1
mdadm --remove /dev/md0 /dev/sda1
  • Resize or re-create partition /dev/sda1, then add the grown /dev/sda1 to the existing RAID:
mdadm --add /dev/md0 /dev/sda1
  • Wait until the resync is complete
  • Then update the partition on /dev/sdb (see the example commands after this list):
    • Remove /dev/sdb1 from /dev/md0
    • Copy the partition table from /dev/sda to /dev/sdb
    • Add the new partition /dev/sdb1 to /dev/md0
  • Wait until the resync is complete
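
A possible command sequence for these steps, assuming a gpt partition table (for dos-type tables sfdisk can be used instead, as described further below):

mdadm --fail /dev/md0 /dev/sdb1
mdadm --remove /dev/md0 /dev/sdb1
sgdisk -R /dev/sdb /dev/sda     # copy the partition table from sda to sdb
sgdisk -G /dev/sdb              # give the copied partition table new random GUIDs
mdadm --add /dev/md0 /dev/sdb1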

If the array has a write-intent bitmap, it is strongly recommended that you remove the bitmap before increasing the size of the array. Failure to observe this precaution can lead to the destruction of the array if the existing bitmap is insufficiently large, especially if the increased array size necessitates a change to the bitmap's chunksize.

mdadm --grow /dev/mdX --bitmap none
mdadm --grow /dev/mdX --size max
mdadm --wait /dev/mdX
mdadm --grow /dev/mdX --bitmap internal

FIXME

If there is a partition on the RAID array /dev/md0 then this partition also needs to be grown, using a tool like parted.
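
For example, assuming the partition in question is partition number 1 on /dev/md0, it could be grown to the end of the device like this:

parted /dev/md0 resizepart 1 100%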

Finally the file system needs to be extended. First make sure the file system is consistent, then extend it. For ext file systems (here the file system is assumed to reside directly on /dev/md0; otherwise use the partition device, e.g. /dev/md0p1):

fsck /dev/md0
resize2fs /dev/md0

If the RAID array is to become part of an LVM volume group then an LVM physical volume has to be created from the RAID array.

Some actions on a RAID array can only be taken after the RAID array has been stopped:

mdadm --stop /dev/md1

If the RAID array can't be stopped then a partition on the device may have to be unmounted first, or, if the RAID array is part of an LVM setup, the associated logical volume has to be deactivated first.
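
A possible sequence, assuming the array /dev/md1 is used by a logical volume lv-data in the volume group vg-data (names chosen for illustration only):

lvchange -an /dev/vg-data/lv-data    # deactivate the logical volume using the array
mdadm --stop /dev/md1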

In some cases a RAID array needs to be renamed, for example if the hostname is part of the array name and has been changed.

In the example below the array is assembled as /dev/md125, and its old internal name is mediaplayer:2:

~ # mdadm --detail /dev/md125
/dev/md125:
        Version : 1.0
  Creation Time : Sun Oct 28 15:08:56 2012
     Raid Level : raid1
     Array Size : 5244916 (5.00 GiB 5.37 GB)
  Used Dev Size : 5244916 (5.00 GiB 5.37 GB)
   Raid Devices : 2
  Total Devices : 2
    Persistence : Superblock is persistent

  Intent Bitmap : Internal

    Update Time : Thu Nov  3 23:50:33 2016
          State : clean 
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

           Name : mediaplayer:2
           UUID : d35131f9:adf82215:10b667d6:9fab3ed3
         Events : 187

    Number   Major   Minor   RaidDevice State
       0       8       98        0      active sync   /dev/sdg2
       1       8       50        1      active sync   /dev/sdd2

We want to change the internal name to pc-martin:5 and assemble it as /dev/md/5 AKA /dev/md5. To do this, the array first has to be stopped, and then re-assembled with a new name:

~ # mdadm --stop /dev/md125
mdadm: stopped /dev/md125
~ # mdadm --assemble /dev/md/5 --name=pc-martin:5 --update=name /dev/sdg2 /dev/sdd2
mdadm: /dev/md/5 has been started with 2 drives.

It is important that both the parameters --name=pc-martin:5 and --update=name are given in the commands above.

~ # mdadm --detail /dev/md5
/dev/md5:
        Version : 1.0
  Creation Time : Sun Oct 28 15:08:56 2012
     Raid Level : raid1
     Array Size : 5244916 (5.00 GiB 5.37 GB)
  Used Dev Size : 5244916 (5.00 GiB 5.37 GB)
   Raid Devices : 2
  Total Devices : 2
    Persistence : Superblock is persistent

  Intent Bitmap : Internal

    Update Time : Thu Nov  3 23:50:33 2016
          State : clean 
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

           Name : pc-martin:5  (local to host pc-martin)
           UUID : d35131f9:adf82215:10b667d6:9fab3ed3
         Events : 187

    Number   Major   Minor   RaidDevice State
       0       8       98        0      active sync   /dev/sdg2
       1       8       50        1      active sync   /dev/sdd2

Finally an appropriate line has to be added to or replaced in /etc/mdadm.conf:

~ # mdadm --detail --brief /dev/md5
ARRAY /dev/md5 metadata=1.0 name=pc-martin:5 UUID=d35131f9:adf82215:10b667d6:9fab3ed3
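
If no entry for this array exists yet, the output can simply be appended to the configuration file (note that on some distributions the file is /etc/mdadm/mdadm.conf instead):

mdadm --detail --brief /dev/md5 >> /etc/mdadm.conf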

In many cases it is important to know the type of a partition table on a disk.

  • gpt is a newer partition table type which can be used for large disks, and is supported for UEFI boot
  • dos (sometimes displayed as ms-dos) is a legacy partition table type which doesn't support very large disks, and is not supported for UEFI boot

If the partition table type is gpt then the tools gdisk and sgdisk are appropriate to work with the partition table. If the partition table type is shown as dos or ms-dos then the tools fdisk or sfdisk have to be used.

The parted tool and its graphical frontend gparted support both gpt and dos partition tables. They can be used to create partitions as well as to determine the partition table type.

Determining The Partition Table Type

The parted tool can be used to display a disk's current partition table type, e.g.:

~ # parted /dev/sda print
Model: ATA OCZ-VERTEX3 (scsi)
Disk /dev/sda: 240GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags: 

Number  Start   End    Size    File system     Name        Flags
 1      1049kB  223GB  223GB   linux-swap(v1)  Linux RAID  raid
 2      223GB   240GB  17.2GB  linux-swap(v1)  Linux swap

In the example above, the partition table type is gpt.

Preparing A New RAID Partition On A New Disk

  • Determine the type of the existing partition table, or create a new partition table, e.g. with the parted tool
  • Create a partition on the new disk drive
  • Don't format the partition, but set the partition type to Linux RAID (type 0xFD for a dos partition table, or gdisk type code FD00 for gpt)

FIXME Add some example commands
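
A possible set of example commands, using parted and assuming the new disk is /dev/sdy and should get a gpt partition table:

parted -s /dev/sdy mklabel gpt                 # create a new gpt partition table (destroys existing data!)
parted -s /dev/sdy mkpart primary 1MiB 100%    # create one partition spanning the whole disk
parted -s /dev/sdy set 1 raid on               # mark partition 1 as a Linux RAID member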

Please note that the sequence of device parameters differs between the two commands below. In both cases /dev/sda is the existing disk with a valid partition table, and /dev/sdb is the new disk to which the partition table is to be copied.

Copy an msdos-type partition table from /dev/sda to /dev/sdb:

sfdisk -d /dev/sda | sfdisk /dev/sdb

Copy a gpt-type partition table from /dev/sda to /dev/sdb:

sgdisk -R /dev/sdb /dev/sda
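
Since gpt disk and partition GUIDs are supposed to be unique, it may be advisable to randomize them on the copy afterwards:

sgdisk -G /dev/sdb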

LVM distinguishes between different layers:

  • A Physical Volume (pv) can be a single disk drive, or a whole RAID array
  • Physical Volumes are combined to implement a Volume Group (vg)
  • A Volume Group can contain one or more Logical Volumes (lv)
  • A Logical Volume can be used like a disk partition; it can be formatted and mounted

For each layer there are different tools available:

  • pvdisplay, pvcreate etc. can be used to manage a physical LVM volume (pv)
  • vgdisplay, vgcreate etc. can be used to manage an LVM volume group (vg) consisting of one or more physical volume(s)
  • lvdisplay, lvcreate etc. can be used to manage a logical LVM volume (lv) allocated in a volume group

Each physical disk or RAID array to be used with LVM has to be registered as a Physical Volume (pv) first.

Assuming /dev/md1 is an existing RAID array used as a physical volume which is to be replaced by a newly installed RAID array /dev/md2:

pvcreate /dev/sda   # Use a whole physical disk drive
pvcreate /dev/md2   # Use a whole RAID array

pvdisplay can be used to display existing physical volumes. In the example below /dev/md0 is a pure RAID1 array from which the system boots, so it is not listed. The output shows an older physical volume /dev/md1 which already belongs to the volume group vg-data, and a newly created /dev/md2 which doesn't belong to a volume group yet:

pvdisplay
  --- Physical volume ---
  PV Name               /dev/md1
  VG Name               vg-data
  PV Size               465.76 GiB / not usable 1.87 MiB
  Allocatable           yes (but full)
  PE Size               4.00 MiB
  Total PE              119234
  Free PE               0
  Allocated PE          119234
  PV UUID               NEfhlq-t3cH-ThOM-YGGV-uHWp-GzfL-23Agaj

/dev/md2 is a new physical volume of "953.74 GiB" size:

  --- NEW Physical volume ---
  PV Name               /dev/md2
  VG Name               
  PV Size               953.74 GiB
  Allocatable           NO
  PE Size               0   
  Total PE              0
  Free PE               0
  Allocated PE          0
  PV UUID               kgj8w4-bU0c-5RAL-ETkY-Rnfk-arbh-TyNd8Q

A new volume group and a logical volume spanning all free space in the group can be created like this:

vgcreate data /dev/md0
lvcreate -l 100%FREE -n data data
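
The resulting logical volume can then be formatted and mounted like any other block device, e.g. (assuming the device path /dev/data/data resulting from the names above, and an ext4 file system):

mkfs.ext4 /dev/data/data
mount /dev/data/data /mnt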

Add the new physical volume /dev/md2 to the existing volume group vg-data:

vgextend vg-data /dev/md2

Move all data off the old physical volume /dev/md1, which is to be removed, to other free space in the volume group, e.g. on /dev/md2. This may take quite some time to complete, depending on the disk sizes:

pvmove /dev/md1
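
If the data should be moved to one specific physical volume, the target can be given explicitly:

pvmove /dev/md1 /dev/md2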

Remove an unused physical volume /dev/md1 from the volume group vg-data:

vgreduce vg-data /dev/md1

Remove an old physical volume (which could be a whole RAID1 array) /dev/md1 from LVM. This wipes the LVM label on the device so that LVM will no longer recognize it as a physical volume:

pvremove /dev/md1

If the size of a logical volume is to be increased to the maximum size of the volume group, e.g. after the volume group has been enlarged by a new physical volume, then the following command can be used:

lvresize -r -l 100%VG /dev/vg-data/lv-data

The parameter -r ensures that the size of the file system on the logical volume is also adjusted accordingly. If the logical volume is to be shrunk, however, then the file system needs to be shrunk first, before the logical volume is shrunk using the lvresize command, otherwise data may be lost. See man lvresize for details.
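
Instead of using all free space in the volume group, the logical volume can also be grown by a specific amount, e.g. by 100 GiB (the value here is just an example):

lvresize -r -L +100G /dev/vg-data/lv-data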

FIXME

lvchange -an /dev/….    # deactivate a logical volume
lvchange -ay /dev/….    # activate a logical volume
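
For example, for a logical volume lv-data in the volume group vg-data (hypothetical names matching the examples above):

lvchange -an /dev/vg-data/lv-data    # deactivate
lvchange -ay /dev/vg-data/lv-data    # activate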

The command

cat /proc/mdstat

can be used to monitor the state of all RAID arrays in a system. For example:

~ # cat /proc/mdstat 
Personalities : [raid1] 
md2 : active raid1 sdd1[0] sdb1[2]
      1000072192 blocks super 1.2 [2/2] [UU]
      bitmap: 0/8 pages [0KB], 65536KB chunk

md0 : active raid1 sdc1[0] sda1[2]
      217508864 blocks super 1.2 [2/2] [UU]
      bitmap: 2/2 pages [8KB], 65536KB chunk

unused devices: <none>

In the example output above md0 is a RAID1 array from which the system boots, md2 is a RAID array which is used as a physical volume in a volume group providing a logical volume used as data partition. The original RAID array for LVM was md1, but this has been replaced by a newer, larger array named md2, as described above.

The most important thing here is that both arrays are labelled [UU], which indicates that the array is healthy. If the status code is [U_] or [_U], an array drive is faulty or missing.

After a disk has been removed from a RAID array or from an LVM volume, RAID or LVM signatures may still be present on the disk, so if the disk is re-used the old metadata may appear again.

To fix this the metadata can be removed from the disk before it is re-used.

The safest way is to boot a live system like partedmagic from a USB stick, with only the old disk connected. The mdadm program refuses to remove the RAID metadata from the partition if the RAID array is still running, and the RAID array can only be stopped if it isn't in use, e.g. by an LVM volume group. So if the live system has recognized an old logical volume vgdata based on a RAID device /dev/md1 which consisted of the partition /dev/sda1, then the following actions need to be taken:
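
If the live system has already activated the old volume group and the RAID array, they have to be deactivated and stopped first (assuming vgdata is the volume group name and /dev/md1 the array, as in the description above):

vgchange -an vgdata
mdadm --stop /dev/md1

After that, the RAID superblock and any remaining signatures can be removed: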

mdadm --zero-superblock /dev/sda1
wipefs --all /dev/sda

Martin Burnicki 2016-04-08 12:20
