clvmd startup timed out and/or lvm commands never return in RHEL 5, 6, or 7 Resilient Storage clusters

Solution Verified - Updated -

Issue

  • clvmd times out on start
# service clvmd start
Starting clvmd: clvmd startup timed out
  • LVM commands hang indefinitely waiting on a cluster lock
# vgscan -vvvv
#lvmcmdline.c:1070       Processing: vgscan -vvvv
#lvmcmdline.c:1073       O_DIRECT will be used
#libdm-config.c:789    Setting global/locking_type to 3
#libdm-config.c:789    Setting global/wait_for_locks to 1
#locking/locking.c:271    Cluster locking selected
  • One of our nodes hung and the service stopped working on the first active node, so we restarted that node and the service failed over properly to the second node. The first node now rejoins the cluster, but we cannot start the shared service on it and clvmd fails to start. There appears to be a lock held on the logical volumes.
  • An LVM command hung with an error on a cluster node
Error locking on node <other node>: Command timed out
  • The cluster state is as expected: no nodes are waiting on fencing or anything else and all nodes have quorum, but lvm commands hang and never return.

Environment

  • Red Hat Enterprise Linux 5, 6, or 7 with the Resilient Storage Add On
  • lvm2-cluster
    • locking_type = 3 in /etc/lvm/lvm.conf
    • clvmd
    • One or more volume groups with the clustered attribute set
  • fence_tool ls shows a wait state of none (in RHEL 6), or other sources confirm that the cluster is not waiting on fencing or quorum
    • If the cluster is waiting on anything, then hanging lvm commands and a clvmd that fails to start are expected behavior
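
The environment conditions above can be checked from text alone, which avoids running an LVM command that might itself block on a cluster lock. The following is a minimal sketch, not part of lvm2 or the cluster tooling: the function names are hypothetical, and the parsing assumes that lvm.conf sets locking_type on its own line, that the clustered attribute is the sixth character of the vg_attr field reported by vgs, and that fence_tool ls prints a line beginning with "wait state".

```shell
#!/bin/sh
# Hypothetical helpers (names are illustrative, not shipped with lvm2 or cman).
# They only parse text, so they never take a cluster lock themselves.

# Print the active (uncommented) locking_type value from an lvm.conf-style file.
# If the setting appears more than once, the last occurrence is reported.
locking_type_from_conf() {
    sed -n 's/^[[:space:]]*locking_type[[:space:]]*=[[:space:]]*\([0-9][0-9]*\).*/\1/p' "$1" | tail -n 1
}

# Succeed if a vg_attr string (for example "wz--nc") has "c" as its sixth
# character, which is where vgs reports the clustered attribute.
is_clustered_attr() {
    case "$1" in
        ?????c*) return 0 ;;
        *) return 1 ;;
    esac
}

# Read `fence_tool ls` output on stdin and print the value of the
# "wait state" line (assumed layout: "wait state    none").
fence_wait_state() {
    awk '$1 == "wait" && $2 == "state" { print $3 }'
}
```

On a RHEL 6 node these might be used as `locking_type_from_conf /etc/lvm/lvm.conf` and `fence_tool ls | fence_wait_state`. A result of 3 and none respectively matches the environment described here. The vg_attr value would have to come from a source that does not block, such as output saved earlier.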
