cman and qdisk fail to start on some (not all) nodes in the cluster following a simultaneous cluster-wide reboot in RHEL 6
Issue
- If my heuristic fails across the cluster, causing all nodes to reboot, one node gets evicted on startup.
- If all nodes in a cluster reboot uncleanly at the same time, and then node2 happens to come up slightly ahead of node1, node1 cannot start cman:
Starting qdisk... cman_tool: Cannot open connection to cman, is it running ?
cman_tool: Cannot open connection to cman, is it running ?
Environment
- Red Hat Enterprise Linux (RHEL) 6 with the High Availability Add-On
- Cluster utilizes a quorum device (<quorumd> in /etc/cluster/cluster.conf)
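For orientation, a quorum device is declared with a <quorumd> element in /etc/cluster/cluster.conf, optionally with one or more <heuristic> children. The fragment below is only a sketch of what such a declaration might look like. The label, vote count, timings, and ping target are hypothetical placeholders, not values taken from the affected cluster.

```xml
<!-- Illustrative fragment only: label, votes, timings, and the ping target are hypothetical -->
<quorumd label="example_qdisk" votes="1" interval="1" tko="10">
    <!-- A heuristic is a command run periodically; exit status 0 counts as a pass -->
    <heuristic program="ping -c1 -w1 192.0.2.1" interval="2" score="1" tko="3"/>
</quorumd>
```

A cluster whose cluster.conf contains a <quorumd> section of this kind matches the environment described above.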