After corosync segfaults, qdiskd evicts other cluster nodes on Red Hat Enterprise Linux 6
Issue
- The following issue occurred suddenly on our 3-node cluster:
- We added a new service and started it from Luci.
- After the service started, node1 and node3 rebooted unexpectedly.
- We found that corosync had received a SIGABRT and dumped core on node2.
Environment
- Red Hat Enterprise Linux Server 6 (with the High Availability and Resilient Storage Add-Ons)
- Red Hat High Availability Cluster with 2 or more nodes
- Quorum disk configured
- Issue is triggered by a failure of a cluster process (observed when corosync segfaulted)