partial_activation=true in RHEL8 fails to activate VG in partial mode in pacemaker


Environment

  • Red Hat Enterprise Linux 8, 9 (with the High Availability Add-on)
  • resource-agents-4.1.1-98.el8_5.2.x86_64
  • HA-LVM using system_id

Issue

  • A two-node Pacemaker cluster is configured with an HA-LVM resource for a mirrored (RAID1) logical volume.
  • The LVM-activate resource has been configured with the partial_activation=true attribute, and the activation_mode parameter in /etc/lvm/lvm.conf is set to "degraded" (see the configuration sketch after this list).
  • During cluster testing, one node was configured with a missing disk to simulate the scenario where one half of the RAID1 mirror is unavailable. The resource group is expected to start on the node with the missing disk, with the RAID1 LVs in a degraded state, but it does not. When the LVM-activate resource is started on the impacted node with debug-start in verbose mode, lvchange is run with the --partial flag, but vgchange is not.
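
For reference, a minimal sketch of the kind of configuration being tested (the resource, group, VG, and mount point names are taken from the Diagnostic Steps below; the xfs filesystem type is an assumption; adjust everything to your cluster):

# pcs resource create my_lvm ocf:heartbeat:LVM-activate vgname=vg01 vg_access_mode=system_id partial_activation=true --group mirror_grp
# pcs resource create my_fs ocf:heartbeat:Filesystem device=/dev/vg01/mirror_lv directory=/mirror_fs fstype=xfs --group mirror_grp
# grep activation_mode /etc/lvm/lvm.conf
        # Configuration option activation/activation_mode.
         activation_mode = "degraded"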

Resolution

Red Hat Enterprise Linux 8

  • The issue is tracked in Bugzilla bug 2066156: LVM-activate: partial_activation=true in RHEL8 fails to activate VG in partial mode in pacemaker. As of Wednesday, August 2, 2023, Bugzilla bug 2066156 is CLOSED; the bug has been deferred to the next release.

Red Hat Enterprise Linux 9

  • The issue (Bugzilla bug 2174911) has been resolved by the errata RHBA-2023:6312 with the following package(s): resource-agents-4.10.0-44.el9_3 or later.

The LVM-activate resource agent now supports two new options that allow volume group failover when the volume group is missing physical volumes (a hedged usage example follows the list):

  • The majoritypvs option allows the system ID to be changed on a volume group when a volume group is missing physical volumes, as long as a majority of physical volumes are present.
  • The degraded_activation option allows LVM RAID logical volumes in a volume group to be activated when legs are missing, as long as sufficient devices are available for LVM RAID to provide all the data in the logical volume.
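
A hedged example of enabling these options on an existing LVM-activate resource (the resource name my_lvm is taken from the Diagnostic Steps below, boolean values are assumed, and resource-agents-4.10.0-44.el9_3 or later is required):

# pcs resource update my_lvm majoritypvs=true degraded_activation=true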

Workaround

  • Perform manual LVM recovery as described in the scenarios below.

Scenario 1: PV sdX is missing from the HA-LVM mirror (RAID1) on node1 and node2

1) Remove the missing PV from the VG and mirror LV on the cluster node that owns the VG.

# vgreduce --removemissing --mirrorsonly --force <VGNAME>

2) Once the missing PV sdX is restored on node1 and node2, repair the RAID LV on the cluster node that owns the VG.

# vgck --updatemetadata <VGNAME>
# vgextend <VGNAME> /dev/sdX
# lvconvert --repair <VGNAME>/<LVNAME>

Scenario 2: PV sdX is missing from the HA-LVM mirror (RAID1) on both nodes, and the cluster node that "owns" the VG is not available (e.g., node1 is offline)

1) Identify the "owning" node, then remove the missing PV from the VG and mirror LV.

[root@node2]# vgs --foreign --noheadings -o systemid <VGNAME>
node1
[root@node2]# vgreduce --config 'local/extra_system_ids=["node1"]' --removemissing --mirrorsonly --force <VGNAME>

2) Once the missing PV sdX is restored on node1 and node2, repair the RAID LV on node2 (the non-owning node).

[root@node2]# vgck --updatemetadata <VGNAME>
[root@node2]# vgextend <VGNAME> /dev/sdX
[root@node2]# lvconvert --repair <VGNAME>/<LVNAME>

Root Cause

If a volume group has missing devices, the LVM-activate resource fails to start on a node that does not already own the VG. With system_id-based HA-LVM, the agent takes over the VG by running vgchange --systemid during start, and vgchange refuses to change a VG while PVs are missing (and does not accept the --partial option). On the node that already owns the VG this failure is harmless, because the subsequent lvchange --partial can still activate the LVs; on any other node the VG remains foreign and activation fails, as shown in the Diagnostic Steps below.
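
As an illustration, with the node and VG names used in this article (commands are paraphrased from the Issue description and the debug-start traces below; they are not the agent's exact invocations):

# On node1 the VG system ID already matches, so activation of the partial VG succeeds:
[root@node1]# lvchange -ay --partial vg01
# On node2 the VG still carries system ID node1; the takeover step fails because a PV is missing,
# and the subsequent lvchange is denied access to the foreign VG:
[root@node2]# vgchange -y --config 'local/extra_system_ids=["node1"]' --systemid node2 vg01
  Cannot change VG vg01 while PVs are missing.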

Diagnostic Steps

  • Create a two-node RHEL 8 HA-LVM cluster with LVM mirroring (RAID1 with 2 PVs); a hedged setup sketch follows.
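
A minimal, illustrative sketch of how such a test configuration can be built (device, VG, LV, and mount point names match the output below; the xfs filesystem type is an assumption, and the Pacemaker resources are created as in the sketch under the Issue section):

# (assumes lvm.conf is already configured for system_id-based HA-LVM, e.g. system_id_source = "uname")
# pvcreate /dev/sda /dev/sdb
# vgcreate vg01 /dev/sda /dev/sdb
# lvcreate --type raid1 -m 1 -L 500M -n mirror_lv vg01
# mkfs.xfs /dev/vg01/mirror_lv
# mkdir -p /mirror_fs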

    1) Ensure resource group is active on node1:

[root@node1]# pcs status
Cluster name: my_cluster
Cluster Summary:
  * Stack: corosync
  * Current DC: node1 (version 2.1.0-8.el8-7c3f660707) - partition with quorum
  * Last updated: Mon Mar 21 13:48:48 2022
  * Last change:  Mon Mar 21 13:40:36 2022 by root via crm_resource on node2
  * 2 nodes configured
  * 3 resource instances configured

Node List:
  * Online: [ node1 node2 ]

Full List of Resources:
  * rhev_fence  (stonith:fence_rhevm):   Started node1
  * Resource Group: mirror_grp:
    * my_lvm    (ocf::heartbeat:LVM-activate):   Started node1
    * my_fs     (ocf::heartbeat:Filesystem):     Started node1

LVM mirroring is configured as RAID1 with 2 PVs.

# lvs -a -o+devices
  LV                   VG            Attr       LSize   Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert Devices
  root                 rhel_vm37-135 -wi-ao---- <45.04g                                                     /dev/sdc2(1014)
  swap                 rhel_vm37-135 -wi-ao----   3.96g                                                     /dev/sdc2(0)
  mirror_lv            vg01          rwi-aor--- 500.00m                                    100.00           mirror_lv_rimage_0(0),mirror_lv_rimage_1(0)
  [mirror_lv_rimage_0] vg01          iwi-aor--- 500.00m                                                     /dev/sda(1)
  [mirror_lv_rimage_1] vg01          iwi-aor--- 500.00m                                                     /dev/sdb(1)
  [mirror_lv_rmeta_0]  vg01          ewi-aor---   4.00m                                                     /dev/sda(0)
  [mirror_lv_rmeta_1]  vg01          ewi-aor---   4.00m                                                     /dev/sdb(0)

2) Remove PV1 (echo 1 > /sys/block/sdX/device/delete) on node 1

[root@node1]# echo 1 > /sys/block/sdb/device/delete

[root@node1]# lsblk
NAME                      MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda                         8:0    0    1G  0 disk
├─vg01-mirror_lv_rmeta_0  253:2    0    4M  0 lvm
│ └─vg01-mirror_lv        253:6    0  500M  0 lvm  /mirror_fs
└─vg01-mirror_lv_rimage_0 253:3    0  500M  0 lvm
  └─vg01-mirror_lv        253:6    0  500M  0 lvm  /mirror_fs
sdc                         8:32   0   50G  0 disk
├─sdc1                      8:33   0    1G  0 part /boot
└─sdc2                      8:34   0   49G  0 part
  ├─rhel_vm37--135-root   253:0    0   45G  0 lvm  /
  └─rhel_vm37--135-swap   253:1    0    4G  0 lvm  [SWAP]
sr0                        11:0    1 1024M  0 rom

# lvs -a -o+devices

  WARNING: Couldn't find device with uuid 4fQbiH-XgxA-1c0s-fEuZ-svgt-7Dgm-Tk0kpj.
  WARNING: VG vg01 is missing PV 4fQbiH-XgxA-1c0s-fEuZ-svgt-7Dgm-Tk0kpj (last written to /dev/sdb).
  LV                   VG            Attr       LSize   Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert Devices
  root                 rhel_vm37-135 -wi-ao---- <45.04g                                                     /dev/sdc2(1014)
  swap                 rhel_vm37-135 -wi-ao----   3.96g                                                     /dev/sdc2(0)
  mirror_lv            vg01          rwi-aor-p- 500.00m                                    100.00           mirror_lv_rimage_0(0),mirror_lv_rimage_1(0)
  [mirror_lv_rimage_0] vg01          iwi-aor--- 500.00m                                                     /dev/sda(1)
  [mirror_lv_rimage_1] vg01          Iwi-aor-p- 500.00m                                                     [unknown](1)
  [mirror_lv_rmeta_0]  vg01          ewi-aor---   4.00m                                                     /dev/sda(0)
  [mirror_lv_rmeta_1]  vg01          ewi-aor-p-   4.00m                                                     [unknown](0)

Resource group 'mirror_grp' is still active and running on node1, even though 1 PV is missing.

[root@node1]# pcs status
Cluster name: my_cluster
Cluster Summary:
  * Stack: corosync
  * Current DC: node1 (version 2.1.0-8.el8-7c3f660707) - partition with quorum
  * Last updated: Mon Mar 21 13:48:48 2022
  * Last change:  Mon Mar 21 13:40:36 2022 by root via crm_resource on node2
  * 2 nodes configured
  * 3 resource instances configured

Node List:
  * Online: [ node1 node2 ]

Full List of Resources:
  * rhev_fence  (stonith:fence_rhevm):   Started node1
  * Resource Group: mirror_grp:
    * my_lvm    (ocf::heartbeat:LVM-activate):   Started node1
    * my_fs     (ocf::heartbeat:Filesystem):     Started node1

Actual Result: resource group is still running on node1.

3) Disable the resource group:

[root@node1]# pcs resource disable mirror_grp

[root@node1]# pcs status
Cluster name: my_cluster
Cluster Summary:
  * Stack: corosync
  * Current DC: node1 (version 2.1.0-8.el8-7c3f660707) - partition with quorum
  * Last updated: Mon Mar 21 13:51:05 2022
  * Last change:  Mon Mar 21 13:51:00 2022 by root via cibadmin on node1
  * 2 nodes configured
  * 3 resource instances configured (2 DISABLED)

Node List:
  * Online: [ node1 node2 ]

Full List of Resources:
  * rhev_fence  (stonith:fence_rhevm):   Started node1
  * Resource Group: mirror_grp (disabled):
    * my_lvm    (ocf::heartbeat:LVM-activate):   Stopped (disabled)
    * my_fs     (ocf::heartbeat:Filesystem):     Stopped (disabled)

Actual result: successfully disabled resource group 'mirror_grp'

4) Node 1: pcs resource debug-start of the LVM-activate resource

[root@node1]# grep activation_mode /etc/lvm/lvm.conf
        # Configuration option activation/activation_mode.
         activation_mode = "degraded"

[root@node1]# pcs resource debug-start my_lvm
Operation force-start for my_lvm (ocf:heartbeat:LVM-activate) returned: 'ok' (0)
  pvscan[17562] PV /dev/sda online.
  pvscan[17562] PV /dev/sdc2 online.
active

  WARNING: Couldn't find device with uuid 4fQbiH-XgxA-1c0s-fEuZ-svgt-7Dgm-Tk0kpj.
  WARNING: Couldn't find device with uuid 4fQbiH-XgxA-1c0s-fEuZ-svgt-7Dgm-Tk0kpj.
  WARNING: VG vg01 is missing PV 4fQbiH-XgxA-1c0s-fEuZ-svgt-7Dgm-Tk0kpj (last written to /dev/sdb).
Mar 21 14:08:33 WARNING: Volume group inconsistency detected with missing device(s) and partial_activation enabled.  Proceeding with requested action.
Mar 21 14:08:33 INFO: Activating vg01
  WARNING: VG vg01 is missing PV 4fQbiH-XgxA-1c0s-fEuZ-svgt-7Dgm-Tk0kpj (last written to /dev/sdb).
  WARNING: Couldn't find device with uuid 4fQbiH-XgxA-1c0s-fEuZ-svgt-7Dgm-Tk0kpj.
  WARNING: VG vg01 is missing PV 4fQbiH-XgxA-1c0s-fEuZ-svgt-7Dgm-Tk0kpj (last written to /dev/sdb).
  Cannot change VG vg01 while PVs are missing.
  See vgreduce --removemissing and vgextend --restoremissing.
  Cannot process volume group vg01
Mar 21 14:08:33 INFO:  PARTIAL MODE. Incomplete logical volumes will be processed. WARNING: Couldn't find device with uuid 4fQbiH-XgxA-1c0s-fEuZ-svgt-7Dgm-Tk0kpj. WARNING: VG vg01 is missing PV 4fQbiH-XgxA-1c0s-fEuZ-svgt-7Dgm-Tk0kpj (last written to /dev/sdb).
Mar 21 14:08:33 INFO: vg01: activated successfully. <------

Actual result: the VG started successfully in degraded mode, with the VG and PVs activated.

5) Node 1: pcs resource debug-stop of the LVM-activate resource

[root@node1]# pcs resource debug-stop my_lvm

Operation force-stop for my_lvm (ocf:heartbeat:LVM-activate) returned: 'ok' (0)
Mar 21 14:16:17 INFO: Deactivating vg01
Mar 21 14:16:18 INFO:  PARTIAL MODE. Incomplete logical volumes will be processed. WARNING: Couldn't find device with uuid 4fQbiH-XgxA-1c0s-fEuZ-svgt-7Dgm-Tk0kpj. WARNING: VG vg01 is missing PV 4fQbiH-XgxA-1c0s-fEuZ-svgt-7Dgm-Tk0kpj (last written to /dev/sdb).
Mar 21 14:16:18 INFO: vg01: deactivated successfully.

Actual result: the VG stopped successfully in degraded mode, with the VG and PVs deactivated.

6) Remove PV1 (echo 1 > /sys/block/sdX/device/delete) on node 2

[root@node2 ~]# echo 1 > /sys/block/sdb/device/delete
[root@node2 ~]# lsblk
NAME                    MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda                       8:0    0    1G  0 disk
sdc                       8:32   0   50G  0 disk
├─sdc1                    8:33   0    1G  0 part /boot
└─sdc2                    8:34   0   49G  0 part
  ├─rhel_vm37--135-root 253:0    0   45G  0 lvm  /
  └─rhel_vm37--135-swap 253:1    0    4G  0 lvm  [SWAP]

7) Node 2: pcs resource debug-start of the LVM-activate resource

[root@node2 ~]# pcs resource  debug-start my_lvm
crm_resource: Error performing operation: Error occurred
Operation force-start for my_lvm (ocf:heartbeat:LVM-activate) returned: 'error' (1)
  pvscan[20263] PV /dev/sda ignore foreign VG.
  pvscan[20263] PV /dev/sdc2 online.
active

  Cannot access VG vg01 with system ID node1 with local system ID node2.
Mar 21 14:23:09 INFO: Activating vg01
  WARNING: VG vg01 is missing PV 4fQbiH-XgxA-1c0s-fEuZ-svgt-7Dgm-Tk0kpj (last written to /dev/sdb).
  Cannot change VG vg01 while PVs are missing.
  See vgreduce --removemissing and vgextend --restoremissing.
  Cannot process volume group vg01
Mar 21 14:23:09 ERROR:  PARTIAL MODE. Incomplete logical volumes will be processed. Cannot access VG vg01 with system ID node1 with local system ID node2.
ocf-exit-reason:vg01: failed to activate.

Expected result: VG starts in degraded mode
Actual result: VG cannot start.

Running debug-start with --full shows that the failure occurs when the agent runs "vgchange -y --config 'local/extra_system_ids=["node1"]' --systemid node2 vg01":

[root@node2 ~]# pcs resource  debug-start my_lvm --full

+ 14:29:04: systemid_activate:662: vgchange -y --config 'local/extra_system_ids=["node1"]' --systemid node2 vg01
File descriptor 9 (pipe:[276918]) leaked on vgchange invocation. Parent PID 20790: /bin/sh
  Cannot change VG vg01 while PVs are missing.
  See vgreduce --removemissing and vgextend --restoremissing.
  Cannot process volume group vg01

The --partial option is not accepted by this command:

[root@node2 ~]# vgchange -y --config 'local/extra_system_ids=["node1"]' --systemid node2 vg01 --partial
  Command does not accept option: --partial.

Explicitly setting activation/activation_mode="partial" does not help either:

[root@node2 ~]# vgchange -y --config 'activation/activation_mode="partial" local/extra_system_ids=["node1"]' --systemid node2 vg01
  PARTIAL MODE. Incomplete logical volumes will be processed.
  Cannot change VG vg01 while PVs are missing.
  See vgreduce --removemissing and vgextend --restoremissing.
  Cannot process volume group vg01

Overriding the system ID with 'local/extra_system_ids=["node2"]' also fails:

[root@node2]# vgchange -ay --config  'local/extra_system_ids=["node2"]'  vg01 --partial
  PARTIAL MODE. Incomplete logical volumes will be processed.
  Cannot access VG vg01 with system ID node1 with local system ID node2.

  • In summary, the VG can be activated in degraded mode on node1 (the cluster node that owns the VG), but activation in degraded mode fails on node2.

This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.
