LVM RAID1 mirror crashed with I/O errors


Issue

  • While testing an LVM RAID1 volume by failing one disk at a time, it was observed that the failure of the second disk triggers I/O errors on the filesystem. Below are the steps used for testing:

  • At the beginning of the test, the LVM RAID1 volume has both mirror legs in sync:

    $ lvs -a -o+devices
    LV                VG     Attr       LSize   Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert Devices
    testlv            testvg rwi-aor--- 200.00m                                        100.00         testlv_rimage_0(0),testlv_rimage_1(0)
    [testlv_rimage_0] testvg iwi-aor--- 200.00m                                                      /dev/mapper/mpatha(0)
    [testlv_rimage_1] testvg iwi-aor--- 200.00m                                                      /dev/mapper/mpathb(1)
    [testlv_rmeta_0]  testvg ewi-aor---   4.00m                                                      /dev/mapper/mpatha(50)
    [testlv_rmeta_1]  testvg ewi-aor---   4.00m                                                      /dev/mapper/mpathb(0)
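
    For reference, a minimal sketch of how a RAID1 LV like this one can be created; the VG/LV names, PVs, and size match the test setup above, while the mount point /mnt/test is an assumption for the I/O test:

      # Initialize the multipath devices as PVs and build the VG:
      $ pvcreate /dev/mapper/mpatha /dev/mapper/mpathb
      $ vgcreate testvg /dev/mapper/mpatha /dev/mapper/mpathb
      # Create a 200 MiB raid1 LV with one image on each PV:
      $ lvcreate --type raid1 -m 1 -L 200m -n testlv testvg /dev/mapper/mpatha /dev/mapper/mpathb
      # Put an XFS filesystem on it and mount it:
      $ mkfs.xfs /dev/testvg/testlv
      $ mount /dev/testvg/testlv /mnt/test    # /mnt/test is assumed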
    
    1. Fail the first PV /dev/mapper/mpatha (at the disk array side). The following errors are logged in /var/log/messages:

      kernel: md/raid1:mdX: Disk failure on dm-8, disabling device.
              md/raid1:mdX: Operation continuing on 1 devices.
      lvm[1121600]: WARNING: Device #0 of raid1 array, testvg-testlv, has failed.
      lvm[1121600]: WARNING: waiting for resynchronization to finish before initiating repair on RAID device testvg-testlv.
      kernel: blk_update_request: I/O error, dev dm-2, sector 0 op 0x0:(READ) flags 0x0 phys_seg 2 prio class 0
      lvm[1121600]: WARNING: Couldn't find device with uuid xxxxx-yyyy-zzzz-aaaa-bbbb-cccc-ddddd.
      lvm[1121600]: WARNING: VG testvg is missing PV xxxxx-yyyy-zzzz-aaaa-bbbb-cccc-ddddd (last written to /dev/mapper/mpatha).
      lvm[1121600]: WARNING: Couldn't find all devices for LV testvg/testlv_rmeta_0 while checking used and assumed devices.
      lvm[1121600]: WARNING: Couldn't find all devices for LV testvg/testlv_rimage_0 while checking used and assumed devices.
      lvm[1121600]: WARNING: Couldn't find device with uuid xxxxx-yyyy-zzzz-aaaa-bbbb-cccc-ddddd.
      lvm[1121600]: Use 'lvconvert --repair testvg/testlv' to replace failed device.
      

      With one mirror leg failed, the LVM volume and its filesystem are still accessible.
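
      The degraded-but-working state can be confirmed before moving on; a rough check (the mount point /mnt/test is an assumption):

        # While a leg is missing, the LV health (bit 9 of lv_attr) reports 'p' (partial):
        $ lvs -o name,lv_attr,lv_health_status testvg/testlv
        # Direct I/O through the filesystem should still succeed on the surviving leg:
        $ dd if=/dev/zero of=/mnt/test/probe bs=4k count=1 oflag=direct

      As the log suggests, 'lvconvert --repair testvg/testlv' would replace the failed leg with a free PV; in this test the original PV is re-added instead.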

    2. Add the PV (mpatha) back:

      kernel: sd 0:0:0:2: [sdc] Attached SCSI disk
      multipathd[694889]: mpatha: load table [0 2097152 multipath 0 0 1 1 service-time 0 1 1 8:32 1]
      multipathd[694889]: sdc [8:32]: path added to devmap mpatha
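
      How the path comes back depends on the array; on the host side, a SCSI rescan typically detects the re-presented LUN. A sketch (the host adapter number is illustrative):

        # Rescan all SCSI hosts for new/returned LUNs (rescan-scsi-bus.sh is in sg3_utils):
        $ rescan-scsi-bus.sh
        # Or trigger a scan on one specific adapter:
        $ echo "- - -" > /sys/class/scsi_host/host0/scan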
      

      The LVM volume shows both legs back in sync, but note the 'r' (refresh needed) flag in the attr field of the returned leg:

      $ lvs -a -o+devices
      LV                VG     Attr       LSize   Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert Devices
      testlv            testvg rwi-aor-r- 200.00m                                        100.00         testlv_rimage_0(0),testlv_rimage_1(0)
      [testlv_rimage_0] testvg Iwi-aor-r- 200.00m                                                      /dev/mapper/mpatha(0)
      [testlv_rimage_1] testvg iwi-aor--- 200.00m                                                      /dev/mapper/mpathb(1)
      [testlv_rmeta_0]  testvg ewi-aor-r-   4.00m                                                      /dev/mapper/mpatha(50)
      [testlv_rmeta_1]  testvg ewi-aor---   4.00m                                                      /dev/mapper/mpathb(0)
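
      The 'r' flag indicates the array has not been refreshed since the transient failure. A sketch of re-scanning and refreshing, assuming the same VG/LV names:

        # Make LVM re-read the returned PV's metadata:
        $ pvscan --cache
        # Reload the RAID LV's mappings and clear the transient-failure state:
        $ lvchange --refresh testvg/testlv
        # Watch resynchronization progress:
        $ lvs -o name,copy_percent testvg/testlv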
      
    3. Now, fail the second PV /dev/mapper/mpathb (at the disk array side).
      At this stage, the RAID1 volume can no longer handle I/O, and the XFS filesystem reports the following errors:

      multipathd[694889]: sdd [8:48]: path removed from map mpathb
      kernel: blk_update_request: I/O error, dev dm-3, sector 2056 op 0x1:(WRITE) flags 0x800 phys_seg 1 prio class 0
      kernel: md: super_written gets error=-5
      kernel: blk_update_request: I/O error, dev dm-3, sector 2048 op 0x1:(WRITE) flags 0x800 phys_seg 1 prio class 0
      kernel: md: super_written gets error=-5
      kernel: blk_update_request: I/O error, dev dm-3, sector 215160 op 0x1:(WRITE) flags 0x800 phys_seg 1 prio class 0
      kernel: XFS (dm-6): log I/O error -5
      kernel: XFS (dm-6): xfs_do_force_shutdown(0x2) called from line 1210 of file fs/xfs/xfs_log.c. Return address = 00000000134510ec
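
      Once at least one PV is available again, recovery can proceed along these lines (a sketch; the mount point /mnt/test is an assumption, and XFS replays its log on the next mount):

        # Refresh the RAID LV after the PV returns:
        $ lvchange --refresh testvg/testlv
        # The filesystem shut down after the log I/O error, so cycle the mount:
        $ umount /mnt/test
        $ mount /dev/testvg/testlv /mnt/test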
      

Environment

  • Red Hat Enterprise Linux 7
  • Red Hat Enterprise Linux 8
  • LVM RAID1 volumes
