cman and qdisk fail to start on some (not all) nodes in the cluster following a simultaneous cluster-wide reboot in RHEL 6

Solution Unverified - Updated -

Issue

  • If my qdisk heuristic fails on every node in the cluster, causing all nodes to reboot, one node gets evicted on startup.
  • If all nodes in a cluster reboot uncleanly at the same time, and then node2 happens to come up slightly ahead of node1, node1 cannot start cman:
    Starting qdisk... cman_tool: Cannot open connection to cman, is it running ?
    cman_tool: Cannot open connection to cman, is it running ?

Environment

  • Red Hat Enterprise Linux (RHEL) 6 with the High Availability Add-On
  • Cluster utilizes a quorum device (<quorumd> in /etc/cluster/cluster.conf)
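
To check whether a cluster matches this environment, one option is to look for a <quorumd> element in the cluster configuration. The following is a minimal sketch, not part of the original article: the sample configuration, node names, label, heuristic command, and the helper name describe_quorumd are all hypothetical, and a real check would read /etc/cluster/cluster.conf instead of the embedded string.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample for illustration only; on a real node the
# configuration lives in /etc/cluster/cluster.conf.
SAMPLE_CONF = """<cluster name="example" config_version="1">
  <clusternodes>
    <clusternode name="node1" nodeid="1"/>
    <clusternode name="node2" nodeid="2"/>
  </clusternodes>
  <quorumd label="example_qdisk">
    <heuristic program="ping -c1 -w1 192.0.2.1" score="1" interval="2"/>
  </quorumd>
</cluster>
"""


def describe_quorumd(conf_xml):
    """Return None if the config has no <quorumd> element; otherwise
    return its label and the list of heuristic programs it defines."""
    root = ET.fromstring(conf_xml)
    quorumd = root.find("quorumd")  # direct child of <cluster>
    if quorumd is None:
        return None
    return {
        "label": quorumd.get("label"),
        "heuristics": [h.get("program") for h in quorumd.findall("heuristic")],
    }


info = describe_quorumd(SAMPLE_CONF)
if info is None:
    print("No quorum device configured; this issue does not apply.")
else:
    print("Quorum device:", info["label"])
    print("Heuristics:", info["heuristics"])
```

A result of None means no quorum device is configured, so the environment described here does not apply to that cluster.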
