cman and qdisk fail to start on some (not all) nodes in the cluster following a simultaneous cluster-wide reboot in RHEL 6

Solution Unverified - Updated -

Issue

  • If my heuristic fails across the cluster, causing all nodes to reboot, one node gets evicted on startup.
  • If all nodes in a cluster reboot uncleanly at the same time, and then node2 happens to come up slightly ahead of node1, node1 cannot start cman:
Starting qdisk... cman_tool: Cannot open connection to cman, is it running ?
cman_tool: Cannot open connection to cman, is it running ?

Environment

  • Red Hat Enterprise Linux (RHEL) 6 with the High Availability Add-On
  • Cluster utilizes a quorum device (<quorumd> in /etc/cluster/cluster.conf)
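
To confirm that a node matches the second environment condition, one option is to check whether its cluster configuration contains a <quorumd> element. The following is a minimal sketch, not a Red Hat tool: the function name has_quorum_device and the SAMPLE configuration (cluster name, label value) are hypothetical and used only for illustration. On a real node, the contents of /etc/cluster/cluster.conf would be read and passed in instead.

```python
import xml.etree.ElementTree as ET


def has_quorum_device(conf_xml: str) -> bool:
    """Return True if the given cluster.conf text contains a <quorumd> element."""
    root = ET.fromstring(conf_xml)
    # iter() walks the whole tree (including the root) looking for matching tags.
    return next(root.iter("quorumd"), None) is not None


# Hypothetical, heavily trimmed configuration used only for illustration.
SAMPLE = """<cluster name="example" config_version="1">
  <quorumd label="example_qdisk"/>
</cluster>"""

if __name__ == "__main__":
    print(has_quorum_device(SAMPLE))
```

A result of True only indicates that a quorum device is configured; it does not by itself show that the node is affected by this issue.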
