Red Hat Training

A Red Hat training course is available for RHEL 8

Chapter 17. Troubleshooting LVM

You can use Logical Volume Manager (LVM) tools to troubleshoot a variety of issues in LVM volumes and groups.

17.1. Gathering diagnostic data on LVM

If an LVM command is not working as expected, you can gather diagnostics in the following ways.

Procedure

  • Use the following methods to gather different kinds of diagnostic data:

    • Add the -v argument to any LVM command to increase the verbosity level of the command output. Verbosity can be further increased by adding additional v’s. A maximum of four such v’s is allowed, for example, -vvvv.
    • In the log section of the /etc/lvm/lvm.conf configuration file, increase the value of the level option. This causes LVM to provide more details in the system log.
    • If the problem is related to the logical volume activation, enable LVM to log messages during the activation:

      1. Set the activation = 1 option in the log section of the /etc/lvm/lvm.conf configuration file.
      2. Execute the LVM command with the -vvvv option.
      3. Examine the command output.
      4. Reset the activation option to 0.

        If you do not reset the option to 0, the system might become unresponsive during low memory situations.

    • Display an information dump for diagnostic purposes:

      # lvmdump
    • Display additional system information:

      # lvs -v
      # pvs --all
      # dmsetup info --columns
    • Examine the last backup of the LVM metadata in the /etc/lvm/backup/ directory and archived versions in the /etc/lvm/archive/ directory.
    • Check the current configuration information:

      # lvmconfig
    • Check the /run/lvm/hints cache file for a record of which devices have physical volumes on them.

Additional resources

  • lvmdump(8) man page

17.2. Displaying information about failed LVM devices

Troubleshooting information about a failed Logical Volume Manager (LVM) volume can help you determine the reason of the failure. You can check the following examples of the most common LVM volume failures.

Example 17.1. Failed volume groups

In this example, one of the devices that made up the volume group myvg failed. The volume group usability then depends on the type of failure. For example, the volume group is still usable if RAID volumes are also involved. You can also see information about the failed device.

# vgs --options +devices
 /dev/vdb1: open failed: No such device or address
 /dev/vdb1: open failed: No such device or address
  WARNING: Couldn't find device with uuid 42B7bu-YCMp-CEVD-CmKH-2rk6-fiO9-z1lf4s.
  WARNING: VG myvg is missing PV 42B7bu-YCMp-CEVD-CmKH-2rk6-fiO9-z1lf4s (last written to /dev/sdb1).
  WARNING: Couldn't find all devices for LV myvg/mylv while checking used and assumed devices.

VG    #PV #LV #SN Attr   VSize  VFree  Devices
myvg   2   2   0 wz-pn- <3.64t <3.60t [unknown](0)
myvg   2   2   0 wz-pn- <3.64t <3.60t [unknown](5120),/dev/vdb1(0)

Example 17.2. Failed logical volume

In this example, one of the devices failed. This can be a reason for the logical volume in the volume group to fail. The command output shows the failed logical volumes.

# lvs --all --options +devices

  /dev/vdb1: open failed: No such device or address
  /dev/vdb1: open failed: No such device or address
  WARNING: Couldn't find device with uuid 42B7bu-YCMp-CEVD-CmKH-2rk6-fiO9-z1lf4s.
  WARNING: VG myvg is missing PV 42B7bu-YCMp-CEVD-CmKH-2rk6-fiO9-z1lf4s (last written to /dev/sdb1).
  WARNING: Couldn't find all devices for LV myvg/mylv while checking used and assumed devices.

  LV    VG  Attr       LSize  Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert Devices
  mylv myvg -wi-a---p- 20.00g                                                     [unknown](0)                                                 [unknown](5120),/dev/sdc1(0)

Example 17.3. Failed image of a RAID logical volume

The following examples show the command output from the pvs and lvs utilities when an image of a RAID logical volume has failed. The logical volume is still usable.

# pvs

  Error reading device /dev/sdc1 at 0 length 4.

  Error reading device /dev/sdc1 at 4096 length 4.

  Couldn't find device with uuid b2J8oD-vdjw-tGCA-ema3-iXob-Jc6M-TC07Rn.

  WARNING: Couldn't find all devices for LV myvg/my_raid1_rimage_1 while checking used and assumed devices.

  WARNING: Couldn't find all devices for LV myvg/my_raid1_rmeta_1 while checking used and assumed devices.

  PV           VG         Fmt  Attr PSize    PFree
  /dev/sda2    rhel_bp-01 lvm2 a--  <464.76g    4.00m
  /dev/sdb1    myvg       lvm2 a--  <836.69g  736.68g
  /dev/sdd1    myvg       lvm2 a--  <836.69g <836.69g
  /dev/sde1    myvg       lvm2 a--  <836.69g <836.69g
  [unknown]    myvg       lvm2 a-m  <836.69g  736.68g
# lvs -a --options name,vgname,attr,size,devices myvg

  Couldn't find device with uuid b2J8oD-vdjw-tGCA-ema3-iXob-Jc6M-TC07Rn.

  WARNING: Couldn't find all devices for LV myvg/my_raid1_rimage_1 while checking used and assumed devices.

  WARNING: Couldn't find all devices for LV myvg/my_raid1_rmeta_1 while checking used and assumed devices.

  LV                  VG   Attr       LSize   Devices
  my_raid1            myvg rwi-a-r-p- 100.00g my_raid1_rimage_0(0),my_raid1_rimage_1(0)
  [my_raid1_rimage_0] myvg iwi-aor--- 100.00g /dev/sdb1(1)
  [my_raid1_rimage_1] myvg Iwi-aor-p- 100.00g [unknown](1)
  [my_raid1_rmeta_0]  myvg ewi-aor---   4.00m /dev/sdb1(0)
  [my_raid1_rmeta_1]  myvg ewi-aor-p-   4.00m [unknown](0)

17.3. Removing lost LVM physical volumes from a volume group

If a physical volume fails, you can activate the remaining physical volumes in the volume group and remove all the logical volumes that used that physical volume from the volume group.

Procedure

  1. Activate the remaining physical volumes in the volume group:

    # vgchange --activate y --partial myvg
  2. Check which logical volumes will be removed:

    # vgreduce --removemissing --test myvg
  3. Remove all the logical volumes that used the lost physical volume from the volume group:

    # vgreduce --removemissing --force myvg
  4. Optional: If you accidentally removed logical volumes that you wanted to keep, you can reverse the vgreduce operation:

    # vgcfgrestore myvg
    Warning

    If you remove a thin pool, LVM cannot reverse the operation.

17.4. Finding the metadata of a missing LVM physical volume

If the volume group’s metadata area of a physical volume is accidentally overwritten or otherwise destroyed, you get an error message indicating that the metadata area is incorrect, or that the system was unable to find a physical volume with a particular UUID.

This procedure finds the latest archived metadata of a physical volume that is missing or corrupted.

Procedure

  1. Find the archived metadata file of the volume group that contains the physical volume. The archived metadata files are located at the /etc/lvm/archive/volume-group-name_backup-number.vg path:

    # cat /etc/lvm/archive/myvg_00000-1248998876.vg

    Replace 00000-1248998876 with the backup-number. Select the last known valid metadata file, which has the highest number for the volume group.

  2. Find the UUID of the physical volume. Use one of the following methods.

    • List the logical volumes:

      # lvs --all --options +devices
      
        Couldn't find device with uuid 'FmGRh3-zhok-iVI8-7qTD-S5BI-MAEN-NYM5Sk'.
    • Examine the archived metadata file. Find the UUID as the value labeled id = in the physical_volumes section of the volume group configuration.
    • Deactivate the volume group using the --partial option:

      # vgchange --activate n --partial myvg
      
        PARTIAL MODE. Incomplete logical volumes will be processed.
        WARNING: Couldn't find device with uuid 42B7bu-YCMp-CEVD-CmKH-2rk6-fiO9-z1lf4s.
        WARNING: VG myvg is missing PV 42B7bu-YCMp-CEVD-CmKH-2rk6-fiO9-z1lf4s (last written to /dev/vdb1).
        0 logical volume(s) in volume group "myvg" now active

17.5. Restoring metadata on an LVM physical volume

This procedure restores metadata on a physical volume that is either corrupted or replaced with a new device. You might be able to recover the data from the physical volume by rewriting the metadata area on the physical volume.

Warning

Do not attempt this procedure on a working LVM logical volume. You will lose your data if you specify the incorrect UUID.

Prerequisites

Procedure

  1. Restore the metadata on the physical volume:

    # pvcreate --uuid physical-volume-uuid \
               --restorefile /etc/lvm/archive/volume-group-name_backup-number.vg \
               block-device
    Note

    The command overwrites only the LVM metadata areas and does not affect the existing data areas.

    Example 17.4. Restoring a physical volume on /dev/vdb1

    The following example labels the /dev/vdb1 device as a physical volume with the following properties:

    • The UUID of FmGRh3-zhok-iVI8-7qTD-S5BI-MAEN-NYM5Sk
    • The metadata information contained in VG_00050.vg, which is the most recent good archived metadata for the volume group
    # pvcreate --uuid "FmGRh3-zhok-iVI8-7qTD-S5BI-MAEN-NYM5Sk" \
               --restorefile /etc/lvm/archive/VG_00050.vg \
               /dev/vdb1
    
      ...
      Physical volume "/dev/vdb1" successfully created
  2. Restore the metadata of the volume group:

    # vgcfgrestore myvg
    
      Restored volume group myvg
  3. Display the logical volumes on the volume group:

    # lvs --all --options +devices myvg

    The logical volumes are currently inactive. For example:

      LV     VG   Attr   LSize   Origin Snap%  Move Log Copy%  Devices
      mylv myvg   -wi--- 300.00G                               /dev/vdb1 (0),/dev/vdb1(0)
      mylv myvg   -wi--- 300.00G                               /dev/vdb1 (34728),/dev/vdb1(0)
  4. If the segment type of the logical volumes is RAID, resynchronize the logical volumes:

    # lvchange --resync myvg/mylv
  5. Activate the logical volumes:

    # lvchange --activate y myvg/mylv
  6. If the on-disk LVM metadata takes at least as much space as what overrode it, this procedure can recover the physical volume. If what overrode the metadata went past the metadata area, the data on the volume may have been affected. You might be able to use the fsck command to recover that data.

Verification steps

  • Display the active logical volumes:

    # lvs --all --options +devices
    
      LV     VG   Attr   LSize   Origin Snap%  Move Log Copy%  Devices
     mylv myvg   -wi--- 300.00G                               /dev/vdb1 (0),/dev/vdb1(0)
     mylv myvg   -wi--- 300.00G                               /dev/vdb1 (34728),/dev/vdb1(0)

17.6. Rounding errors in LVM output

LVM commands that report the space usage in volume groups round the reported number to 2 decimal places to provide human-readable output. This includes the vgdisplay and vgs utilities.

As a result of the rounding, the reported value of free space might be larger than what the physical extents on the volume group provide. If you attempt to create a logical volume the size of the reported free space, you might get the following error:

Insufficient free extents

To work around the error, you must examine the number of free physical extents on the volume group, which is the accurate value of free space. You can then use the number of extents to create the logical volume successfully.

17.7. Preventing the rounding error when creating an LVM volume

When creating an LVM logical volume, you can specify the number of logical extents of the logical volume to avoid rounding error.

Procedure

  1. Find the number of free physical extents in the volume group:

    # vgdisplay myvg

    Example 17.5. Free extents in a volume group

    For example, the following volume group has 8780 free physical extents:

    --- Volume group ---
     VG Name               myvg
     System ID
     Format                lvm2
     Metadata Areas        4
     Metadata Sequence No  6
     VG Access             read/write
    [...]
    Free  PE / Size       8780 / 34.30 GB
  2. Create the logical volume. Enter the volume size in extents rather than bytes.

    Example 17.6. Creating a logical volume by specifying the number of extents

    # lvcreate --extents 8780 --name mylv myvg

    Example 17.7. Creating a logical volume to occupy all the remaining space

    Alternatively, you can extend the logical volume to use a percentage of the remaining free space in the volume group. For example:

    # lvcreate --extents 100%FREE --name mylv myvg

Verification steps

  • Check the number of extents that the volume group now uses:

    # vgs --options +vg_free_count,vg_extent_count
    
      VG     #PV #LV #SN  Attr   VSize   VFree  Free  #Ext
      myvg   2   1   0   wz--n- 34.30G    0    0     8780

17.8. LVM metadata and their location on disk

LVM headers and metadata areas are available in different offsets and sizes.

The default LVM disk header:

  • Is found in label_header and pv_header structures.
  • Is in the second 512-byte sector of the disk. Note that if a non-default location was specified when creating the physical volume (PV), the header can also be in the first or third sector.

The standard LVM metadata area:

  • Begins 4096 bytes from the start of the disk.
  • Ends 1 MiB from the start of the disk.
  • Begins with a 512 byte sector containing the mda_header structure.

A metadata text area begins after the mda_header sector and goes to the end of the metadata area. LVM VG metadata text is written in a circular fashion into the metadata text area. The mda_header points to the location of the latest VG metadata within the text area.

You can print LVM headers from a disk by using the # pvck --dump headers /dev/sda command. This command prints label_header, pv_header, mda_header, and the location of metadata text if found. Bad fields are printed with the CHECK prefix.

The LVM metadata area offset will match the page size of the machine that created the PV, so the metadata area can also begin 8K, 16K or 64K from the start of the disk.

Larger or smaller metadata areas can be specified when creating the PV, in which case the metadata area may end at locations other than 1 MiB. The pv_header specifies the size of the metadata area.

When creating a PV, a second metadata area can be optionally enabled at the end of the disk. The pv_header contains the locations of the metadata areas.

17.9. Extracting VG metadata from a disk

Choose one of the following procedures to extract VG metadata from a disk, depending on your situation. For information about how to save extracted metadata, see Saving extracted metadata to a file.

Note

For repair, you can use backup files in /etc/lvm/backup/ without extracting metadata from disk.

Procedure

  • Print current metadata text as referenced from valid mda_header:

    # pvck --dump metadata <disk>

    Example 17.8. Metadata text from valid mda_header

    # pvck --dump metadata /dev/sdb
      metadata text at 172032 crc Oxc627522f # vgname test segno 59
      ---
      <raw metadata from disk>
      ---
  • Print the locations of all metadata copies found in the metadata area, based on finding a valid mda_header:

    # pvck --dump metadata_all <disk>

    Example 17.9. Locations of metadata copies in the metadata area

    # pvck --dump metadata_all /dev/sdb
      metadata at 4608 length 815 crc 29fcd7ab vg test seqno 1 id FaCsSz-1ZZn-mTO4-Xl4i-zb6G-BYat-u53Fxv
      metadata at 5632 length 1144 crc 50ea61c3 vg test seqno 2 id FaCsSz-1ZZn-mTO4-Xl4i-zb6G-BYat-u53Fxv
      metadata at 7168 length 1450 crc 5652ea55 vg test seqno 3 id FaCsSz-1ZZn-mTO4-Xl4i-zb6G-BYat-u53Fxv
  • Search for all copies of metadata in the metadata area without using an mda_header, for example, if headers are missing or damaged:

    # pvck --dump metadata_search <disk>

    Example 17.10. Copies of metadata in the metadata area without using an mda_header

    # pvck --dump metadata_search /dev/sdb
      Searching for metadata at offset 4096 size 1044480
      metadata at 4608 length 815 crc 29fcd7ab vg test seqno 1 id FaCsSz-1ZZn-mTO4-Xl4i-zb6G-BYat-u53Fxv
      metadata at 5632 length 1144 crc 50ea61c3 vg test seqno 2 id FaCsSz-1ZZn-mTO4-Xl4i-zb6G-BYat-u53Fxv
      metadata at 7168 length 1450 crc 5652ea55 vg test seqno 3 id FaCsSz-1ZZn-mTO4-Xl4i-zb6G-BYat-u53Fxv
  • Include the -v option in the dump command to show the description from each copy of metadata:

    # pvck --dump metadata -v <disk>

    Example 17.11. Showing description from each copy of metadata

    # pvck --dump metadata -v /dev/sdb
      metadata text at 199680 crc 0x628cf243 # vgname my_vg seqno 40
      ---
    my_vg {
    id = "dmEbPi-gsgx-VbvS-Uaia-HczM-iu32-Rb7iOf"
    seqno = 40
    format = "lvm2"
    status = ["RESIZEABLE", "READ", "WRITE"]
    flags = []
    extent_size = 8192
    max_lv = 0
    max_pv = 0
    metadata_copies = 0
    
    physical_volumes {
    
    pv0 {
    id = "8gn0is-Hj8p-njgs-NM19-wuL9-mcB3-kUDiOQ"
    device = "/dev/sda"
    
    device_id_type = "sys_wwid"
    device_id = "naa.6001405e635dbaab125476d88030a196"
    status = ["ALLOCATABLE"]
    flags = []
    dev_size = 125829120
    pe_start = 8192
    pe_count = 15359
    }
    
    pv1 {
    id = "E9qChJ-5ElL-HVEp-rc7d-U5Fg-fHxL-2QLyID"
    device = "/dev/sdb"
    
    device_id_type = "sys_wwid"
    device_id = "naa.6001405f3f9396fddcd4012a50029a90"
    status = ["ALLOCATABLE"]
    flags = []
    dev_size = 125829120
    pe_start = 8192
    pe_count = 15359
    }

This file can be used for repair. The first metadata area is used by default for dump metadata. If the disk has a second metadata area at the end of the disk, you can use the --settings "mda_num=2" option to use the second metadata area for dump metadata instead.

17.10. Saving extracted metadata to a file

If you need to use dumped metadata for repair, it is required to save extracted metadata to a file with the -f option and the --setings option.

Procedure

  • If -f <filename> is added to --dump metadata, the raw metadata is written to the named file. You can use this file for repair.
  • If -f <filename> is added to --dump metadata_all or --dump metadata_search, then raw metadata from all locations is written to the named file.
  • To save one instance of metadata text from --dump metadata_all|metadata_search add --settings "metadata_offset=<offset>" where <offset> is from the listing output "metadata at <offset>".

    Example 17.12. Output of the command

    # pvck --dump metadata_search --settings metadata_offset=5632 -f meta.txt /dev/sdb
      Searching for metadata at offset 4096 size 1044480
      metadata at 5632 length 1144 crc 50ea61c3 vg test seqno 2 id FaCsSz-1ZZn-mTO4-Xl4i-zb6G-BYat-u53Fxv
    # head -2 meta.txt
    test {
    id = "FaCsSz-1ZZn-mTO4-Xl4i-zb6G-BYat-u53Fxv"

17.11. Repairing a disk with damaged LVM headers and metadata using the pvcreate and the vgcfgrestore commands

You can restore metadata and headers on a physical volume that is either corrupted or replaced with a new device. You might be able to recover the data from the physical volume by rewriting the metadata area on the physical volume.

Warning

These instructions should be used with extreme caution, and only if you are familiar with the implications of each command, the current layout of the volumes, the layout that you need to achieve, and the contents of the backup metadata file. These commands have the potential to corrupt data, and as such, it is recommended that you contact Red Hat Global Support Services for assistance in troubleshooting.

Prerequisites

Procedure

  1. Collect the following information needed for the pvcreate and vgcfgrestore commands. You can collect the information about your disk and UUID by running the # pvs -o+uuid command.

    • metadata-file is the path to the most recent metadata backup file for the VG, for example, /etc/lvm/backup/<vg-name>
    • vg-name is the name of the VG that has the damaged or missing PV.
    • UUID of the PV that was damaged on this device is the value taken from the output of the # pvs -i+uuid command.
    • disk is the name of the disk where the PV is supposed to be, for example, /dev/sdb. Be certain this is the correct disk, or seek help, otherwise following these steps may lead to data loss.
  2. Recreate LVM headers on the disk:

    # pvcreate --restorefile <metadata-file> --uuid <UUID> <disk>

    Optionally, verify that the headers are valid:

    # pvck --dump headers <disk>
  3. Restore the VG metadata on the disk:

    # vgcfgrestore --file <metadata-file> <vg-name>

    Optionally, verify the metadata is restored:

    # pvck --dump metadata <disk>

If there is no metadata backup file for the VG, you can get one by using the procedure in Saving extracted metadata to a file.

Verification

  • To verify that the new physical volume is intact and the volume group is functioning correctly, check the output of the following command:
# vgs

17.12. Repairing a disk with damaged LVM headers and metadata using the pvck command

This is an alternative to the Repairing a disk with damaged LVM headers and metadata using the pvcreate and the vgcfgrestore commands. There may be cases where the pvcreate and the vgcfgrestore commands do not work. This method is more targeted at the damaged disk.

This method uses a metadata input file that was extracted by pvck --dump, or a backup file from /etc/lvm/backup. When possible, use metadata saved by pvck --dump from another PV in the same VG, or from a second metadata area on the PV. For more information, see Saving extracted metadata to a file.

Procedure

  • Repair the headers and metadata on the disk:

    # pvck --repair -f <metadata-file> <disk>

    where

    • <metadata-file> is a file containing the most recent metadata for the VG. This can be /etc/lvm/backup/vg-name, or it can be a file containing raw metadata text from the pvck --dump metadata_search command output.
    • <disk> is the name of the disk where the PV is supposed to be, for example, /dev/sdb. To prevent data loss, verify that is the correct disk. If you are not certain the disk is correct, contact Red Hat Support.
Note

If the metadata file is a backup file, the pvck --repair should be run on each PV that holds metadata in VG. If the metadata file is raw metadata that has been extracted from another PV, the pvck --repair needs to be run only on the damaged PV.

Verification

  • To check that the new physical volume is intact and the volume group is functioning correctly, check outputs of the following commands:

    # vgs <vgname>
    # pvs <pvname>
    # lvs <lvname>

17.13. Troubleshooting LVM RAID

You can troubleshoot various issues in LVM RAID devices to correct data errors, recover devices, or replace failed devices.

17.13.1. Checking data coherency in a RAID logical volume

LVM provides scrubbing support for RAID logical volumes. RAID scrubbing is the process of reading all the data and parity blocks in an array and checking to see whether they are coherent. The lvchange --syncaction repair command initiates a background synchronization action on the array. The following attributes provide details about data coherency:

  • The raid_sync_action field displays the current synchronization action that the RAID logical volume is performing. It can be one of the following values:

    idle
    Completed all sync actions (doing nothing).
    resync
    Initializing or resynchronizing an array after an unclean machine shutdown.
    recover
    Replacing a device in the array.
    check
    Looking for array inconsistencies.
    repair
    Looking for and repairing inconsistencies.
  • The raid_mismatch_count field displays the number of discrepancies found during a check action.
  • The Cpy%Sync field displays the progress of the sync actions.
  • The lv_attr field provides additional indicators. Bit 9 of this field displays the health of the logical volume, and it supports the following indicators:

    m or mismatches
    Indicates that there are discrepancies in a RAID logical volume. You can see this character after the scrubbing operation detects the portions of the RAID, which are not coherent.
    r or refresh
    Indicates a failed device in a RAID array, even though LVM can read the device label and considers the device to be operational. Refresh the logical volume to notify the kernel that the device is now available, or replace the device if you suspect that it failed.

Procedure

  1. Optional: Limit the I/O bandwidth that the scrubbing process uses. When you perform a RAID scrubbing operation, the background I/O required by the sync actions can crowd out other I/O to LVM devices, such as updates to volume group metadata. This might cause the other LVM operations to slow down.

    You can control the rate of the scrubbing operation by implementing recovery throttling. You can set the recovery rate using --maxrecoveryrate Rate[bBsSkKmMgG] or --minrecoveryrate Rate[bBsSkKmMgG] with the lvchange --syncaction commands. For more information, see Minimum and maximum I/O rate options.

    Specify the Rate value as an amount per second for each device in the array. If you provide no suffix, the options assume kiB per second per device.

  2. Display the number of discrepancies in the array, without repairing them:

    # lvchange --syncaction check my_vg/my_lv

    This command initiates a background synchronization action on the array.

  3. Optional: View the var/log/syslog file for the kernel messages.
  4. Correct the discrepancies in the array:

    # lvchange --syncaction repair my_vg/my_lv

    This command repairs or replaces failed devices in a RAID logical volume. You can view the var/log/syslog file for the kernel messages after executing this command.

Verification

  1. Display information about the scrubbing operation:

    # lvs -o +raid_sync_action,raid_mismatch_count my_vg/my_lv
    LV    VG    Attr       LSize   Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert SyncAction Mismatches
    my_lv my_vg rwi-a-r--- 500.00m                                    100.00           idle        0

Additional resources

17.13.2. Replacing a failed RAID device in a logical volume

RAID is not similar to traditional LVM mirroring. In case of LVM mirroring, remove the failed devices. Otherwise, the mirrored logical volume would hang while RAID arrays continue running with failed devices. For RAID levels other than RAID1, removing a device would mean converting to a lower RAID level, for example, from RAID6 to RAID5, or from RAID4 or RAID5 to RAID0.

Instead of removing a failed device and allocating a replacement, with LVM, you can replace a failed device that serves as a physical volume in a RAID logical volume by using the --repair argument of the lvconvert command.

Prerequisites

  • The volume group includes a physical volume that provides enough free capacity to replace the failed device.

    If no physical volume with enough free extents is available on the volume group, add a new, sufficiently large physical volume by using the vgextend utility.

Procedure

  1. View the RAID logical volume:

    # lvs --all --options name,copy_percent,devices my_vg
      LV               Cpy%Sync Devices
      my_lv            100.00   my_lv_rimage_0(0),my_lv_rimage_1(0),my_lv_rimage_2(0)
      [my_lv_rimage_0]          /dev/sde1(1)
      [my_lv_rimage_1]          /dev/sdc1(1)
      [my_lv_rimage_2]          /dev/sdd1(1)
      [my_lv_rmeta_0]           /dev/sde1(0)
      [my_lv_rmeta_1]           /dev/sdc1(0)
      [my_lv_rmeta_2]           /dev/sdd1(0)
  2. View the RAID logical volume after the /dev/sdc device fails:

    # lvs --all --options name,copy_percent,devices my_vg
      /dev/sdc: open failed: No such device or address
      Couldn't find device with uuid A4kRl2-vIzA-uyCb-cci7-bOod-H5tX-IzH4Ee.
      WARNING: Couldn't find all devices for LV my_vg/my_lv_rimage_1 while checking used and assumed devices.
      WARNING: Couldn't find all devices for LV my_vg/my_lv_rmeta_1 while checking used and assumed devices.
      LV               Cpy%Sync Devices
      my_lv            100.00   my_lv_rimage_0(0),my_lv_rimage_1(0),my_lv_rimage_2(0)
      [my_lv_rimage_0]          /dev/sde1(1)
      [my_lv_rimage_1]          [unknown](1)
      [my_lv_rimage_2]          /dev/sdd1(1)
      [my_lv_rmeta_0]           /dev/sde1(0)
      [my_lv_rmeta_1]           [unknown](0)
      [my_lv_rmeta_2]           /dev/sdd1(0)
  3. Replace the failed device:

    # lvconvert --repair my_vg/my_lv
      /dev/sdc: open failed: No such device or address
      Couldn't find device with uuid A4kRl2-vIzA-uyCb-cci7-bOod-H5tX-IzH4Ee.
      WARNING: Couldn't find all devices for LV my_vg/my_lv_rimage_1 while checking used and assumed devices.
      WARNING: Couldn't find all devices for LV my_vg/my_lv_rmeta_1 while checking used and assumed devices.
    Attempt to replace failed RAID images (requires full device resync)? [y/n]: y
    Faulty devices in my_vg/my_lv successfully replaced.
  4. Optional: Manually specify the physical volume that replaces the failed device:

    # lvconvert --repair my_vg/my_lv replacement_pv
  5. Examine the logical volume with the replacement:

    # lvs --all --options name,copy_percent,devices my_vg
    
      /dev/sdc: open failed: No such device or address
      /dev/sdc1: open failed: No such device or address
      Couldn't find device with uuid A4kRl2-vIzA-uyCb-cci7-bOod-H5tX-IzH4Ee.
      LV               Cpy%Sync Devices
      my_lv            43.79    my_lv_rimage_0(0),my_lv_rimage_1(0),my_lv_rimage_2(0)
      [my_lv_rimage_0]          /dev/sde1(1)
      [my_lv_rimage_1]          /dev/sdb1(1)
      [my_lv_rimage_2]          /dev/sdd1(1)
      [my_lv_rmeta_0]           /dev/sde1(0)
      [my_lv_rmeta_1]           /dev/sdb1(0)
      [my_lv_rmeta_2]           /dev/sdd1(0)

    Until you remove the failed device from the volume group, LVM utilities still indicate that LVM cannot find the failed device.

  6. Remove the failed device from the volume group:

    # vgreduce --removemissing my_vg

Verification

  1. View the available physical volumes after removing the failed device:

    # pvscan
    PV /dev/sde1 VG rhel_virt-506 lvm2 [<7.00 GiB / 0 free]
    PV /dev/sdb1 VG my_vg lvm2 [<60.00 GiB / 59.50 GiB free]
    PV /dev/sdd1 VG my_vg lvm2 [<60.00 GiB / 59.50 GiB free]
    PV /dev/sdd1 VG my_vg lvm2 [<60.00 GiB / 59.50 GiB free]
  2. Examine the logical volume after the replacing the failed device:

    # lvs --all --options name,copy_percent,devices my_vg
    my_lv_rimage_0(0),my_lv_rimage_1(0),my_lv_rimage_2(0)
      [my_lv_rimage_0]          /dev/sde1(1)
      [my_lv_rimage_1]          /dev/sdb1(1)
      [my_lv_rimage_2]          /dev/sdd1(1)
      [my_lv_rmeta_0]           /dev/sde1(0)
      [my_lv_rmeta_1]           /dev/sdb1(0)
      [my_lv_rmeta_2]           /dev/sdd1(0)

Additional resources

  • lvconvert(8) and vgreduce(8) man pages

17.14. Troubleshooting duplicate physical volume warnings for multipathed LVM devices

When using LVM with multipathed storage, LVM commands that list a volume group or logical volume might display messages such as the following:

Found duplicate PV GDjTZf7Y03GJHjteqOwrye2dcSCjdaUi: using /dev/dm-5 not /dev/sdd
Found duplicate PV GDjTZf7Y03GJHjteqOwrye2dcSCjdaUi: using /dev/emcpowerb not /dev/sde
Found duplicate PV GDjTZf7Y03GJHjteqOwrye2dcSCjdaUi: using /dev/sddlmab not /dev/sdf

You can troubleshoot these warnings to understand why LVM displays them, or to hide the warnings.

17.14.1. Root cause of duplicate PV warnings

When a multipath software such as Device Mapper Multipath (DM Multipath), EMC PowerPath, or Hitachi Dynamic Link Manager (HDLM) manages storage devices on the system, each path to a particular logical unit (LUN) is registered as a different SCSI device.

The multipath software then creates a new device that maps to those individual paths. Because each LUN has multiple device nodes in the /dev directory that point to the same underlying data, all the device nodes contain the same LVM metadata.

Table 17.1. Example device mappings in different multipath software

Multipath softwareSCSI paths to a LUNMultipath device mapping to paths

DM Multipath

/dev/sdb and /dev/sdc

/dev/mapper/mpath1 or /dev/mapper/mpatha

EMC PowerPath

/dev/emcpowera

HDLM

/dev/sddlmab

As a result of the multiple device nodes, LVM tools find the same metadata multiple times and report them as duplicates.

17.14.2. Cases of duplicate PV warnings

LVM displays the duplicate PV warnings in either of the following cases:

Single paths to the same device

The two devices displayed in the output are both single paths to the same device.

The following example shows a duplicate PV warning in which the duplicate devices are both single paths to the same device.

Found duplicate PV GDjTZf7Y03GJHjteqOwrye2dcSCjdaUi: using /dev/sdd not /dev/sdf

If you list the current DM Multipath topology using the multipath -ll command, you can find both /dev/sdd and /dev/sdf under the same multipath map.

These duplicate messages are only warnings and do not mean that the LVM operation has failed. Rather, they are alerting you that LVM uses only one of the devices as a physical volume and ignores the others.

If the messages indicate that LVM chooses the incorrect device or if the warnings are disruptive to users, you can apply a filter. The filter configures LVM to search only the necessary devices for physical volumes, and to leave out any underlying paths to multipath devices. As a result, the warnings no longer appear.

Multipath maps

The two devices displayed in the output are both multipath maps.

The following examples show a duplicate PV warning for two devices that are both multipath maps. The duplicate physical volumes are located on two different devices rather than on two different paths to the same device.

Found duplicate PV GDjTZf7Y03GJHjteqOwrye2dcSCjdaUi: using /dev/mapper/mpatha not /dev/mapper/mpathc

Found duplicate PV GDjTZf7Y03GJHjteqOwrye2dcSCjdaUi: using /dev/emcpowera not /dev/emcpowerh

This situation is more serious than duplicate warnings for devices that are both single paths to the same device. These warnings often mean that the machine is accessing devices that it should not access: for example, LUN clones or mirrors.

Unless you clearly know which devices you should remove from the machine, this situation might be unrecoverable. Red Hat recommends that you contact Red Hat Technical Support to address this issue.

17.14.3. Example LVM device filters that prevent duplicate PV warnings

The following examples show LVM device filters that avoid the duplicate physical volume warnings that are caused by multiple storage paths to a single logical unit (LUN).

You can configure the filter for logical volume manager (LVM) to check metadata for all devices. Metadata includes local hard disk drive with the root volume group on it and any multipath devices. By rejecting the underlying paths to a multipath device (such as /dev/sdb, /dev/sdd), you can avoid these duplicate PV warnings, because LVM finds each unique metadata area once on the multipath device itself.

  • To accept the second partition on the first hard disk drive and any device mapper (DM) Multipath devices and reject everything else, enter:

    filter = [ "a|/dev/sda2$|", "a|/dev/mapper/mpath.*|", "r|.*|" ]
  • To accept all HP SmartArray controllers and any EMC PowerPath devices, enter:

    filter = [ "a|/dev/cciss/.*|", "a|/dev/emcpower.*|", "r|.*|" ]
  • To accept any partitions on the first IDE drive and any multipath devices, enter:

    filter = [ "a|/dev/hda.*|", "a|/dev/mapper/mpath.*|", "r|.*|" ]

17.14.4. Additional resources