clvmd or other services hang while node rejoins cluster, `cman_tool services` shows services in FAIL_ALL_STOPPED, FAIL_START_WAIT, or JOIN_START_WAIT throughout cluster
Issue
- After a node was removed from the cluster, attempted to rejoin, and was removed again shortly after, `clvmd` won't start or GFS/GFS2 file systems can't be mounted. `cman_tool services` shows one or more services in bad states:
# cman_tool services
type level name id state
fence 0 default 0001000a FAIL_ALL_STOPPED
[1 2 3 4 5 6 7 8 9 10 11 12 13 14]
dlm 1 clvmd 00010003 none
[1 2 3 4 5 6 7 8 9 10 11 12 13]
dlm 1 gfsdata 00020005 none
[1 2 3 4 5 6 7 8 9 10 11 12 13]
dlm 1 rgmanager 00010002 none
[1 2 3 4 5 6 7 8 9 10 11 12 13]
gfs 2 gfsdata 00010005 FAIL_START_WAIT
[1 2 3 4 5 6 7 8 9 10 11 12 13]
- After a node is removed from the cluster, that node is unable to mount GFS file systems, and `clvmd` hangs when it is started.
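
The check described above can be sketched as a small script that reads captured `cman_tool services` output and lists the groups stuck in one of the states named in this article. This is a minimal illustration, not a supported tool: the function names `parse_services` and `stuck_groups` are hypothetical, the parser assumes the five-column layout shown in the sample output, and it only looks for the three states this article mentions.

```python
# Sketch: flag groups in the states this article names as symptoms.
# Assumption: input is text captured from `cman_tool services` in the
# five-column layout shown above (type, level, name, id, state).

BAD_STATES = {"FAIL_ALL_STOPPED", "FAIL_START_WAIT", "JOIN_START_WAIT"}

SAMPLE = """\
type level name id state
fence 0 default 0001000a FAIL_ALL_STOPPED
[1 2 3 4 5 6 7 8 9 10 11 12 13 14]
dlm 1 clvmd 00010003 none
[1 2 3 4 5 6 7 8 9 10 11 12 13]
gfs 2 gfsdata 00010005 FAIL_START_WAIT
[1 2 3 4 5 6 7 8 9 10 11 12 13]
"""


def parse_services(text):
    """Return one dict per group found in the captured output."""
    groups = []
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blanks, a leading shell prompt line, and the column header.
        if not line or line.startswith("#") or line.startswith("type"):
            continue
        if line.startswith("["):
            # Membership line: node IDs belonging to the preceding group.
            if groups:
                groups[-1]["members"] = [int(n) for n in line.strip("[]").split()]
            continue
        fields = line.split()
        if len(fields) == 5:
            gtype, level, name, gid, state = fields
            groups.append({
                "type": gtype,
                "level": int(level),
                "name": name,
                "id": gid,
                "state": state,
                "members": [],
            })
    return groups


def stuck_groups(groups):
    """Return the groups whose state is one of BAD_STATES."""
    return [g for g in groups if g["state"] in BAD_STATES]


if __name__ == "__main__":
    for g in stuck_groups(parse_services(SAMPLE)):
        print(g["type"], g["name"], g["state"], len(g["members"]), "members")
```

To use it against a live cluster, save the command output to a file and pass the file contents to `parse_services` in place of `SAMPLE`. Comparing the member counts between groups can also be informative: in the output above, the fence group lists 14 nodes while the other groups list 13.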
Environment
- Red Hat Enterprise Linux Server 5 (with the High Availability Add-On)
- A node has been removed from the cluster, rejoined, and then removed again shortly after
