The "lvmlockd" pacemaker resource enters a "FAILED" state when the lvmlockd service is started outside the cluster.
Issue
When lvmlockd is running within a cluster as a pacemaker-managed resource, it goes into a FAILED state if an attempt is made to start the service outside of the cluster. This additionally leads to a fence of the node experiencing the failure (a reproduction sketch follows the status output below):
$ pcs status
------------------------------->8------------------------------
Node List:
* Node rhel8-node2: UNCLEAN (online)
* Online: [ rhel8-node1 ]
Full List of Resources:
* xvmfence (stonith:fence_xvm): Started rhel8-node1
* Clone Set: locking-clone [locking]:
* Resource Group: locking:0:
* dlm (ocf::pacemaker:controld): Started rhel8-node2
* lvmlockd (ocf::heartbeat:lvmlockd): FAILED rhel8-node2 <-----
* Started: [ rhel8-node1 ]
Failed Resource Actions:
* lvmlockd_monitor_30000 on rhel8-node2 'not running' (7): call=42, status='complete', last-rc-change='Fri Jun 2 17:03:46 2023', queued=0ms, exec=0ms
Pending Fencing Actions:
* reboot of rhel8-node2 pending: client=pacemaker-controld.2368, origin=rhel8-node1 <---
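As the log excerpt further below shows, the out-of-band start typically comes from the lvmlockd systemd unit. A minimal reproduction, assuming the locking-clone resource is already running lvmlockd on the node, would be to start that unit directly:

rhel8-node2 $ systemctl start lvmlockd.service   # starts a second daemon outside pacemaker's control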
Leading up to the fence, the lvmlockd service may additionally log the following messages:
rhel8-node2 $ cat /var/log/messages
------------------------------->8------------------------------
Jun 2 13:51:55 rhel8-node2 systemd[1]: Starting LVM lock daemon...
Jun 2 13:51:55 rhel8-node2 lvmlockd[5078]: Cannot lock lockfile [/run/lvmlockd.pid], error was [Resource temporarily unavailable]
Jun 2 13:51:55 rhel8-node2 lvmlockd[5078]: Failed to acquire lock on /run/lvmlockd.pid. Already running?
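The "Resource temporarily unavailable" error is EAGAIN from the exclusive lock on the pidfile: the pacemaker-managed daemon already holds /run/lvmlockd.pid, so the second instance cannot lock it and exits. The same single-instance pattern can be demonstrated with flock(1) on a scratch file (a hedged sketch; /tmp/demo.pid is a stand-in, and flock(2) may not be the exact locking call lvmlockd uses, but the failure mode is analogous):

$ touch /tmp/demo.pid
$ flock -n /tmp/demo.pid -c 'sleep 60' &     # first "daemon" acquires and holds the lock
$ flock -n /tmp/demo.pid -c 'true' \
      || echo 'lock held: a second instance cannot start'
lock held: a second instance cannot start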
Environment
- Red Hat Enterprise Linux (RHEL) 8 and 9 with the High Availability Add-On and Resilient Storage Add-On
- Pacemaker
- lvmlockd