clvmd startup timed out and/or lvm commands never return in RHEL 5, 6, or 7 Resilient Storage clusters
Issue
- clvmd times out on start
# service clvmd start
Starting clvmd: clvmd startup timed out
- LVM commands hang indefinitely waiting on a cluster lock
# vgscan -vvvv
#lvmcmdline.c:1070 Processing: vgscan -vvvv
#lvmcmdline.c:1073 O_DIRECT will be used
#libdm-config.c:789 Setting global/locking_type to 3
#libdm-config.c:789 Setting global/wait_for_locks to 1
#locking/locking.c:271 Cluster locking selected
- One of our nodes hung and the service wasn't working on the first active node, so we restarted the first node and the service properly failed over to the second node. Now the first node rejoins the cluster, but we can't start the shared service and clvmd fails to start. There seems to be a lock on the logical volumes.
- An LVM command hung with an error on a cluster node
Error locking on node <other node>: Command timed out
- The cluster state is as expected: nodes are not waiting on fencing or anything else, all nodes have quorum, etc., but lvm commands just hang and never return.
Environment
- Red Hat Enterprise Linux 5, 6, or 7 with the Resilient Storage Add On
- lvm2-cluster
- locking_type = 3 in /etc/lvm/lvm.conf
- clvmd
- One or more volume groups with the clustered attribute set
- fence_tool ls shows a wait state of none (in RHEL 6), or other sources confirm that the cluster is not waiting on fencing or quorum
  - If the cluster is waiting on anything, then lvm commands hanging and clvmd failing to start is expected behavior
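The environment conditions above can be checked roughly as in the following sketch. The helper names (fence_wait_state, vg_is_clustered, lvm_locking_type) are hypothetical and not part of any Red Hat tooling. The parsing assumes that fence_tool ls prints a line of the form "wait state    none" and that the clustered attribute appears as a "c" in the sixth character of the vg_attr field reported by vgs. The helpers read saved text rather than calling the cluster tools directly, so they can be tried safely on captured output.

```shell
#!/bin/sh
# Minimal sketch of the environment checks; all function names are illustrative.

# Print the value of the "wait state" line from `fence_tool ls` output (RHEL 6).
# Assumed input line format: "wait state    none"
fence_wait_state() {
    awk '/^wait state/ { print $3 }'
}

# Succeed if a vg_attr string (as printed by `vgs -o vg_attr`) has the
# clustered bit set, i.e. a "c" in the sixth position (for example "wz--nc").
vg_is_clustered() {
    case "$1" in
        ?????c*) return 0 ;;
        *)       return 1 ;;
    esac
}

# Print the locking_type value from lvm.conf-style text on stdin.
# Commented-out lines are skipped because they do not start with the key.
lvm_locking_type() {
    awk -F= '/^[[:space:]]*locking_type[[:space:]]*=/ {
        gsub(/[[:space:]]/, "", $2)
        print $2
    }'
}

# Possible use on a live node (left commented out on purpose):
#   fence_tool ls | fence_wait_state          # expect "none"
#   lvm_locking_type < /etc/lvm/lvm.conf      # expect "3"
#   vgs --noheadings -o vg_name,vg_attr       # then pass each attr to vg_is_clustered
```

Note that on a node already showing the symptoms in this article, vgs may itself block while waiting on the cluster lock, so the fence_tool and lvm.conf checks are the safer ones to run first.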