====== Linux RAID and LVM Setup ======

This is a short introduction to working with Linux [[https://en.wikipedia.org/wiki/RAID|RAID]] and [[https://en.wikipedia.org/wiki/Logical_volume_management|LVM]].

**Please note:** If the RAID array is to be built including SSDs (Solid State Disks) then there is also some general information on using [[ssd_under_linux|SSDs under Linux]].

**Linux RAID** provides **redundancy** of disks, which increases the fault tolerance of storage systems and avoids data loss in case a disk drive fails.

**LVM** is a concept where several physical disks or even complete RAID arrays can be combined to provide one or more disk volumes. The advantage of LVM is that at any time, even during normal operation, LVM volumes can be changed in size, or disks can be added, removed, or replaced.

From the point of view of LVM, a physical disk can be either a single disk drive, or a RAID array of disks. Using a RAID array of disks significantly increases the fault tolerance of the LVM volume.

  * Linux can be booted from a RAID1 array, but not from an LVM volume, so the boot partition should be located on a RAID1 array, not on an LVM volume.
  * LVM volumes can easily be increased in size by adding new disks, or by replacing existing disks with newer, larger ones, so they are well suited for large data storage partitions.

Linux installation on a RAID1 and/or LVM can be cumbersome, depending on the Linux distribution, so it possibly makes sense to prepare at least a partition on the RAID1 array for booting and system installation first, and then select these partitions during installation instead of letting the installer create the partitions. For example, at the time of this writing, openSUSE's YaST tool is unable to create a RAID1 array with a //missing// drive.

\\
===== Working With a RAID Array =====

The **''mdadm''** tool is used to manage Linux RAID arrays.

The index number of a RAID device, as in ''md**0**'', has to be unique. If ''md0'' already exists, a different, unused number has to be chosen for a new array.
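Which index numbers are already in use can be checked, for example, like this:

<code>
cat /proc/mdstat    # lists all active md arrays and their member devices
ls /dev/md*         # shows which md device nodes already exist
</code>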
\\
==== Setting Up a RAID1 Array From 2 New Disks ====

  * [[#Preparing A New RAID Partition On A New Disk|Prepare a new partition]] on ''/dev/sd**x**''
  * [[#Copying A Partition Table To A New Disk|Copy the partition table]] from ''/dev/sd**x**'' to ''/dev/sd**y**''
  * Create a new RAID1 device ''/dev/md**0**'' by specifying the 2 partitions ''/dev/sd**x**1'' and ''/dev/sd**y**1'' to be mirrored:

<code>
mdadm --create --verbose /dev/md0 --level=1 --raid-devices=2 /dev/sdx1 /dev/sdy1
</code>

\\
==== Converting an Existing Disk Drive Into a New RAID1 Array ====

Assume there is an existing non-RAID drive ''/dev/sd**x**'', and a single new drive ''/dev/sd**y**'' is available which should become part of the new RAID1 array:

  * [[#Preparing A New RAID Partition On A New Disk|Prepare a new partition]] on the new disk ''/dev/sd**y**''
  * Create a new RAID1 array using only the new partition ''/dev/sd**y**1'', and declare the other RAID component as **//''missing''//**:

<code>
mdadm --create --verbose /dev/md0 --level=1 --raid-devices=2 missing /dev/sdy1
</code>

  * [[#Detailed Commands For Disk Partitions and RAID|Partition the new RAID array]], [[#Setting Up A New LVM Volume|set up a new LVM volume]], or [[#Creating A New Physical Volume|make it a physical LVM volume]] and [[#Adding A New Physical Volume To An Existing Volume Group|add the new physical LVM volume]] to an existing LVM volume group
  * Copy all data from ''/dev/sd**x**'' to the RAID array or LVM volume, so ''/dev/sd**x**'' no longer contains any precious data
  * [[#Copying A Partition Table To A New Disk|Copy the partition table]] from the new RAID disk ''/dev/sd**y**'' to the old disk ''/dev/sd**x**''
  * Add ''/dev/sd**x**1'' to the RAID array ''/dev/md**0**'' to replace the ''//missing//'' drive:

<code>
mdadm --manage /dev/md0 --add /dev/sdx1
</code>

\\
==== Using A New RAID Array Standalone ====

  * Create a new partition on the RAID device ''/dev/md**0**''
  * Format and mount the new partition ''/dev/md0p1'' as usual
  * Add the mounting information to ''/etc/fstab'' so the partition can be mounted automatically
  * Done

\\
==== Growing an Existing RAID Array ====

Assuming there is an existing RAID1 array ''/dev/md**0**'' consisting of ''/dev/sda1'' and ''/dev/sdb1'', where the partitions (and thus the RAID array) are to be grown:

  * Fail and remove ''/dev/sda1'' from the array:

<code>
mdadm --fail /dev/md0 /dev/sda1
mdadm --remove /dev/md0 /dev/sda1
</code>

  * Resize or re-create the partition ''/dev/sda1'', then add the grown ''/dev/sda1'' to the existing RAID:

<code>
mdadm --add /dev/md0 /dev/sda1
</code>

  * Wait until the resync is complete (see the sketch below)
  * Then update the partition on ''/dev/sdb'':
    * Remove ''/dev/sdb1'' from ''/dev/md**0**''
    * Copy the partition table from ''/dev/sda'' to ''/dev/sdb''
    * Add the new partition ''/dev/sdb1'' to ''/dev/md**0**''
  * Wait until the resync is complete
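Monitoring or waiting for a resync could look like this (a minimal sketch, assuming the array in question is ''/dev/md**0**''):

<code>
watch -n 5 cat /proc/mdstat    # periodically show the rebuild progress
mdadm --wait /dev/md0          # block until the current resync has finished
</code>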
If the array has a **write-intent bitmap**, it is strongly recommended that you remove the bitmap before increasing the size of the array. Failure to observe this precaution can lead to the destruction of the array if the existing bitmap is insufficiently large, especially if the increased array size necessitates a change to the bitmap's chunksize.

<code>
mdadm --grow /dev/mdX --bitmap none
mdadm --grow /dev/mdX --size max
mdadm --wait /dev/mdX
mdadm --grow /dev/mdX --bitmap internal
</code>

FIXME If there is a partition on the RAID array ''/dev/md**0**'' then this partition also needs to be grown, using a tool like ''parted''.

Finally the file system on the partition needs to be extended. First make sure the file system is consistent, then extend it. For ''ext'' file systems:

<code>
fsck /dev/md0
resize2fs /dev/md0
</code>

\\
==== Setting up a new LVM Volume From a RAID1 Array ====

If the RAID array is to become part of an LVM volume group, [[#Creating A New Physical Volume|an LVM physical volume has to be created]] from the RAID array.

\\
==== Stopping a RAID Array ====

Some actions on a RAID array can only be taken after the RAID array has been stopped:

<code>
mdadm --stop /dev/md0
</code>

If the RAID array can't be stopped, a partition on the device may have to be unmounted first, or, if the RAID array is part of an LVM, the [[#Activating Or Deactivating A Logical Volume|logical volume has to be deactivated first]].

\\
==== Renaming a RAID Array ====

In some cases a RAID array needs to be renamed, for example, if the hostname is part of the array name and the hostname has been changed.

In the example below the array is assembled as ''/dev/md125'', and its old internal name is ''mediaplayer:2'':

<code>
~ # mdadm --detail /dev/md125
/dev/md125:
        Version : 1.0
  Creation Time : Sun Oct 28 15:08:56 2012
     Raid Level : raid1
     Array Size : 5244916 (5.00 GiB 5.37 GB)
  Used Dev Size : 5244916 (5.00 GiB 5.37 GB)
   Raid Devices : 2
  Total Devices : 2
    Persistence : Superblock is persistent

  Intent Bitmap : Internal

    Update Time : Thu Nov 3 23:50:33 2016
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

           Name : mediaplayer:2
           UUID : d35131f9:adf82215:10b667d6:9fab3ed3
         Events : 187

    Number   Major   Minor   RaidDevice State
       0       8       98        0      active sync   /dev/sdg2
       1       8       50        1      active sync   /dev/sdd2
</code>

We want to change the internal name to ''pc-martin:5'' and assemble it as ''/dev/md/5'' AKA ''/dev/md5''. To do this, the array first has to be [[#Stopping A RAID Array|stopped]], and then re-assembled with a new name:

<code>
~ # mdadm --stop /dev/md125
mdadm: stopped /dev/md125

~ # mdadm --assemble /dev/md/5 --name=pc-martin:5 --update=name /dev/sdg2 /dev/sdd2
mdadm: /dev/md/5 has been started with 2 drives.
</code>

It is important that **both** the parameters ''--name=pc-martin:5'' and ''--update=name'' are given in the commands above.
Afterwards the array shows up with the new name:

<code>
~ # mdadm --detail /dev/md5
/dev/md5:
        Version : 1.0
  Creation Time : Sun Oct 28 15:08:56 2012
     Raid Level : raid1
     Array Size : 5244916 (5.00 GiB 5.37 GB)
  Used Dev Size : 5244916 (5.00 GiB 5.37 GB)
   Raid Devices : 2
  Total Devices : 2
    Persistence : Superblock is persistent

  Intent Bitmap : Internal

    Update Time : Thu Nov 3 23:50:33 2016
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

           Name : pc-martin:5  (local to host pc-martin)
           UUID : d35131f9:adf82215:10b667d6:9fab3ed3
         Events : 187

    Number   Major   Minor   RaidDevice State
       0       8       98        0      active sync   /dev/sdg2
       1       8       50        1      active sync   /dev/sdd2
</code>

Finally an appropriate line has to be added to or replaced in ''/etc/mdadm.conf'':

<code>
~ # mdadm --detail --brief /dev/md5
ARRAY /dev/md5 metadata=1.0 name=pc-martin:5 UUID=d35131f9:adf82215:10b667d6:9fab3ed3
</code>

\\
===== Detailed Commands for Disk Partitions and RAID =====

==== Partition Table Types ====

In many cases it is important to know the type of the partition table on a disk.

  * **gpt** is the newer partition table type which can be used for large disks, and is supported for UEFI boot
  * **dos** (sometimes displayed as **ms-dos**) is a legacy partition table type which doesn't support very large disks, and is not supported for UEFI boot

If the partition table type is **gpt**, the tools **''gdisk''** and **''sgdisk''** are appropriate to work with the partition table. If the partition table type is shown as **dos** or **ms-dos**, the tools **''fdisk''** or **''sfdisk''** have to be used.

The **''parted''** tool and its graphical frontend **''gparted''** support both **gpt** and **dos** partition tables. They can be used to create partitions as well as to [[#Determining The Partition Table Type|determine the partition table type]].

\\
=== Determining the Partition Table Type ===

The **''parted''** tool can be used to display a disk's current partition table type, e.g.:

<code>
:~ # parted /dev/sda print
Model: ATA OCZ-VERTEX3 (scsi)
Disk /dev/sda: 240GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags:

Number  Start   End    Size    File system     Name        Flags
 1      1049kB  223GB  223GB   linux-swap(v1)  Linux RAID  raid
 2      223GB   240GB  17.2GB  linux-swap(v1)  Linux swap
</code>

In the example above, the partition table type is **gpt**.

\\
=== Preparing a new RAID Partition on a new Disk ===

  * Determine the [[#Partition Table Types|type of the existing partition table]], or create a new partition table, e.g. with the ''parted'' tool
  * Create a partition on the new disk drive
  * Don't format the partition, but set the partition type to ''linux raid'' (type 0xFD)

Some example commands are sketched below.
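On a **gpt** disk this could be done with ''sgdisk'', for example. This is only a sketch, assuming the new disk is ''/dev/sd**y**'' and the whole disk is to be used for a single RAID partition:

<code>
sgdisk --zap-all /dev/sdy              # wipe any existing partition tables (destroys all data on the disk!)
sgdisk -n 1:0:0 -t 1:fd00 /dev/sdy     # create partition 1 spanning the whole disk, type fd00 = "Linux RAID"
sgdisk -p /dev/sdy                     # print the new partition table to verify the result
</code>

On a **dos**-type disk, ''fdisk'' can be used instead, where the corresponding partition type is ''0xFD'' (Linux raid autodetect).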
\\
==== Copying a Partition Table to a new Disk ====

Copy an **msdos**-type partition table from ''/dev/sda'' to ''/dev/sdb'':

<code>
sfdisk -d /dev/sda | sfdisk /dev/sdb
</code>

Copy a **gpt**-type partition table from ''/dev/sda'' to ''/dev/sdb'':

<code>
sgdisk -R /dev/sdb /dev/sda
</code>

Please note that the **//sequence of device parameters differs//** for the two commands. In both cases above, ''/dev/sda'' is the **//existing//** disk with a valid partition table, and ''/dev/sdb'' is the **//new//** disk to which the partition table is to be copied.

\\
===== Working With LVM =====

LVM distinguishes between different layers:

  * A **Physical Volume** (''pv'') can be a single disk drive, or a whole RAID array
  * Physical Volumes are combined to form a **Volume Group** (''vg'')
  * A **Volume Group** can contain one or more Logical Volumes (''lv'')
  * A **Logical Volume** can be used like a disk partition; it can be formatted and mounted

For each layer there are different tools available:

  * **''pvdisplay''**, **''pvcreate''** etc. can be used to manage a physical LVM volume (''pv'')
  * **''vgdisplay''**, **''vgcreate''** etc. can be used to manage an LVM volume group (''vg'') consisting of one or more physical volume(s)
  * **''lvdisplay''**, **''lvcreate''** etc. can be used to manage a logical LVM volume (''lv'') allocated in a volume group

\\
==== Setting up a new LVM Volume ====

  * [[#Creating A New Physical Volume|Create a physical volume from a disk drive or existing RAID array]]
  * [[#Creating A Volume Group|Create a volume group from the physical volume]]
  * [[#Creating A New Logical Volume|Create a logical volume from the volume group]]
  * Format and mount the new logical volume like a normal partition (a minimal example is sketched after these lists)
  * Add the mounting information to ''/etc/fstab'' so the partition can be mounted automatically
  * Done.

\\
==== Replacing a Physical Volume ====

  * [[#Creating A New Physical Volume|Create a new physical volume from a disk drive or existing RAID array]]
  * [[#Adding A New Physical Volume To An Existing Volume Group|Add the new physical volume to the existing volume group]]
  * [[#Moving All Data Away From A Physical Volume|Move all data away]] from the old physical volume to some other free space in the volume group. This may take quite some time to complete, depending on the disk sizes. Of course the new physical volume must provide enough free space to take up all data from the old physical volume which is to be removed.
  * [[#Removing A Physical Volume From A Volume Group|Remove the old physical volume from the volume group]]
  * [[#Removing A Physical Volume From LVM|Remove the old physical volume from LVM]]
  * [[#Adjusting The Size Of A Logical Volume|Adjust the size of the logical volume]] if required
  * Done.
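The //Format and mount// step from the list above could look like this. This is only a minimal sketch; it assumes the logical volume is ''/dev/vg-data/lv-data'' (as in the ''lvresize'' example further below), that an ''ext4'' file system is wanted, and that ''/data'' is the desired mount point:

<code>
mkfs.ext4 /dev/vg-data/lv-data      # create a file system on the logical volume
mkdir -p /data                      # create the mount point (example path)
mount /dev/vg-data/lv-data /data    # mount it manually for a first test
# make the mount permanent by adding a line like this to /etc/fstab:
# /dev/vg-data/lv-data  /data  ext4  defaults  0  2
</code>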
\\
===== Detailed Commands for LVM =====

==== Creating a new Physical Volume ====

Each physical disk or RAID array to be used with LVM has to be registered as a Physical Volume (''pv'') first. Assuming ''/dev/md1'' is an existing RAID array used as physical volume which is to be replaced by a newly installed RAID array ''/dev/md2'':

<code>
pvcreate /dev/sda    # use a whole physical disk drive
pvcreate /dev/md2    # use a whole RAID array
</code>

**''pvdisplay''** can be used to display the existing physical volumes. In the example below ''/dev/md**0**'' is a pure RAID1 array from which the system boots, so it is not listed. The output shows an older volume ''/dev/md1'' which already belongs to the volume group ''vg-data'', and a newly created ''/dev/md2'' which doesn't belong to a volume group yet:

<code>
pvdisplay

  --- Physical volume ---
  PV Name               /dev/md1
  VG Name               vg-data
  PV Size               465.76 GiB / not usable 1.87 MiB
  Allocatable           yes (but full)
  PE Size               4.00 MiB
  Total PE              119234
  Free PE               0
  Allocated PE          119234
  PV UUID               NEfhlq-t3cH-ThOM-YGGV-uHWp-GzfL-23Agaj
</code>

''/dev/md2'' is a new physical volume of "953.74 GiB" size:

<code>
  --- NEW Physical volume ---
  PV Name               /dev/md2
  VG Name
  PV Size               953.74 GiB
  Allocatable           NO
  PE Size               0
  Total PE              0
  Free PE               0
  Allocated PE          0
  PV UUID               kgj8w4-bU0c-5RAL-ETkY-Rnfk-arbh-TyNd8Q
</code>

\\
==== Creating a Volume Group ====

<code>
vgcreate data /dev/md0
</code>

\\
==== Creating a new Logical Volume ====

<code>
lvcreate -l 100%FREE -n data data
</code>

\\
==== Adding a new Physical Volume to an Existing Volume Group ====

Add the new physical volume ''/dev/md2'' to the existing volume group ''vg-data'':

<code>
vgextend vg-data /dev/md2
</code>

\\
==== Moving all Data Away From a Physical Volume ====

Move all data away from the old physical volume ''/dev/md1'' which is to be removed, to some other free space in the volume group, e.g. on ''/dev/md2''. This may take quite some time to complete, depending on the disk sizes:

<code>
pvmove /dev/md1
</code>

\\
==== Removing a Physical Volume From a Volume Group ====

Remove an [[#Moving All Data Away From A Physical Volume|unused]] physical volume ''/dev/md1'' from the volume group ''vg-data'':

<code>
vgreduce vg-data /dev/md1
</code>

\\
==== Removing a Physical Volume From LVM ====

Remove an old physical volume (which could be a whole RAID1 array) ''/dev/md1'' from the LVM. This wipes the LVM label on the device so that LVM will no longer recognize it as a physical volume:

<code>
pvremove /dev/md1
</code>

\\
==== Adjusting the Size Of a Logical Volume ====

If the size of a logical volume is to be increased to the maximum size of the volume group, e.g. after the volume group has been enlarged by a new physical volume, then the following command can be used:

<code>
lvresize -r -l 100%VG /dev/vg-data/lv-data
</code>

The parameter ''-r'' takes care that the size of the underlying file system in the logical volume is also adjusted accordingly.

If the logical volume is to be //**shrunk**//, however, then the underlying //**file system needs to be shrunk first**//, before the logical volume is shrunk using the ''lvresize'' command, otherwise data may be lost. See ''man lvresize'' for details.

\\
==== Activating or Deactivating a Logical Volume ====

FIXME

<code>
lvchange -an /dev/....    # deactivate a logical volume
lvchange -ay /dev/....    # activate a logical volume
</code>

\\
===== Checking the RAID / LVM Status =====

The command ''cat /proc/mdstat'' can be used to monitor the state of all RAID arrays in a system. For example:

<code>
~ # cat /proc/mdstat
Personalities : [raid1]
md2 : active raid1 sdd1[0] sdb1[2]
      1000072192 blocks super 1.2 [2/2] [UU]
      bitmap: 0/8 pages [0KB], 65536KB chunk

md0 : active raid1 sdc1[0] sda1[2]
      217508864 blocks super 1.2 [2/2] [UU]
      bitmap: 2/2 pages [8KB], 65536KB chunk

unused devices: <none>
</code>

In the example output above ''md0'' is a RAID1 array from which the system boots, and ''md2'' is a RAID array which is used as a physical volume in a volume group providing a logical volume used as data partition. The original RAID array for LVM was ''md1'', but this has been replaced by a newer, larger array named ''md2'', as described [[#Replacing a Physical Volume|above]].

The most important thing here is that both arrays are labelled **''[UU]''**, which indicates that the arrays are healthy. If the status shows **''[U_]''** or **''[_U]''** instead, one of the array's drives is faulty or missing.
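The LVM side can be checked with the display tools mentioned above, or more compactly with their standard summary counterparts, for example:

<code>
pvs    # one line per physical volume: size, free space, volume group membership
vgs    # one line per volume group: number of PVs/LVs, size, free space
lvs    # one line per logical volume: size, attributes, volume group
</code>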
\\
===== Removing/Wiping RAID and LVM Metadata =====

After a disk has been removed from a RAID array or from an LVM volume, old signatures may still be present on the disk, so if the disk is re-used the old metadata may show up again. To avoid this, the metadata can be removed from the disk before it is re-used.

The safest way is to boot a live system like [[https://partedmagic.com|partedmagic]] from a USB stick, with only the old disk connected.

The ''mdadm'' program refuses to remove the RAID metadata from a partition if the RAID array is still running, and the RAID array can only be stopped if it isn't in use, e.g. by an LVM volume group. So if the live system has recognized an old logical volume ''vgdata'' based on a RAID device ''/dev/md1'' which consisted of the partition ''/dev/sda1'', then the following actions need to be taken:

  * [[#Activating Or Deactivating A Logical Volume|Deactivate the LVM logical volume]] so the RAID array can also be stopped
  * [[#Stopping A RAID Array|Stop the RAID array]]
  * Remove the metadata by using these commands:

<code>
mdadm --zero-superblock /dev/sda1
wipefs --all /dev/sda
</code>

\\
===== Related Bug Reports =====

  * Novell Bug 862076 - use_lvmetad = 1 in lvm.conf triggers systemd to get into emergency target on boot\\ https://bugzilla.novell.com/show_bug.cgi?id=862076

----

 --- //Martin Burnicki [[martin.burnicki@burnicki.net]], last updated 2022-11-30//