Service with lvm resource using tagging attempts to recover on passive node and fails with "Someone else owns this volume group" instead of failing to stop when a node loses access to storage in a RHEL 6 High Availability cluster
Issue
- When the active node loses all access to storage while the cluster is using HA-LVM with tagging, any service using LVM resources should fail outright, because the node can no longer remove its tags from the volume group. Instead, the service attempts to recover on the passive node, and that recovery fails.
- When a storage device fails, the <fs/> resource fails its status check, the service stops, and then recovery on the backup node fails with "Someone else owns this volume group"
Aug 14 11:15:31 rgmanager [fs] fs:dataFS: is_alive: failed read test on [/data]. Return code: 2
Aug 14 11:15:31 rgmanager [fs] fs:dataFS: Mount point is not accessible!
Aug 14 11:15:31 rgmanager status on fs "dataFS" returned 1 (generic error)
Aug 14 11:15:31 rgmanager Stopping service service:fs_service
Aug 14 11:15:31 rgmanager [ip] Removing IPv4 address 192.168.2.140/24 from bond0
Aug 14 11:15:41 rgmanager [fs] unmounting /data
Aug 14 11:15:43 rgmanager [lvm] HA LVM: Unable to get volume group attributes for clustVG
Aug 14 11:15:43 rgmanager [lvm] WARNING: An improper setup can cause data corruption!
Aug 14 11:15:45 rgmanager Service service:fs_service is recovering
Aug 14 11:15:55 rgmanager #2: Service service:fs_service returned failure code. Last Owner: node1
Aug 14 11:15:45 rgmanager Recovering failed service service:fs_service
Aug 14 11:15:46 rgmanager [lvm] Starting volume group, clustVG
Aug 14 11:15:47 rgmanager [lvm] node1 owns clustVG and is still a cluster member
Aug 14 11:15:47 rgmanager [lvm] Someone else owns this volume group
Aug 14 11:15:47 rgmanager start on lvm "clustVG-lvm" returned 1 (generic error)
Aug 14 11:15:47 rgmanager #68: Failed to start service:fs_service; return value: 1
Environment
- Red Hat Enterprise Linux (RHEL) 6 with the High Availability Add On
- rgmanager
- HA-LVM using the tagging variant
- <lvm/> resource without lv_name specified, or with it left blank
- All storage devices backing physical volumes in the volume group experience a simultaneous failure on one node, but not another
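
For illustration, a resource manager section of /etc/cluster/cluster.conf matching this environment might look roughly like the sketch below. The resource and service names (clustVG-lvm, clustVG, dataFS, /data, fs_service) and the IP address are taken from the log excerpt above. The logical volume name dataLV, the ext4 filesystem type, and the relocate recovery policy are assumptions made for the example, not details from the affected system.

```xml
<rm>
  <resources>
    <!-- lv_name is omitted, so the agent manages the whole volume group -->
    <lvm name="clustVG-lvm" vg_name="clustVG"/>
    <!-- device path and fstype are assumed for illustration -->
    <fs name="dataFS" device="/dev/clustVG/dataLV" mountpoint="/data" fstype="ext4"/>
    <ip address="192.168.2.140"/>
  </resources>
  <!-- recovery policy is assumed for illustration -->
  <service name="fs_service" recovery="relocate">
    <lvm ref="clustVG-lvm"/>
    <fs ref="dataFS"/>
    <ip ref="192.168.2.140"/>
  </service>
</rm>
```

The detail that matters for this issue is the <lvm/> resource carrying only vg_name: with no lv_name set, the tag is applied at the volume group level, which is the configuration listed above.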