What are my options for avoiding fence races in High Availability clusters with an even number of nodes?
Issue
- How can fence races be avoided in 2-node RHEL clusters?
- When a network split occurs in a 2-node cluster, both nodes race to fence each other and the winner is not deterministic
- Do I need a quorum device in a 2-node cluster to avoid fence races?
- When I disconnect the heartbeat interface, the whole cluster goes down: the node with no cluster resources is fenced, and the node holding all the resources halts itself. Why?
- With four nodes and a quorum device, a network split down the middle creates two 2-node partitions, and both sides can race to fence each other.
Environment
- Red Hat Enterprise Linux (RHEL) 5 and later with the High Availability Add-On
- A cluster with an even number of nodes
- Two-node clusters are most commonly affected
- Larger clusters with an even number of nodes can also be affected if using a quorum device or some other mechanism that would allow two halves of a cluster to stay quorate independently
- Network-based fence devices that are reached over a different network interface from the one used for cluster communication
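
The conditions listed above can be illustrated with a small vote-counting sketch. This is a simplified model, not the actual quorum logic of the cluster software: the function names and the `tie_allowed` flag are hypothetical, with the flag standing in for any mechanism (a two-node special case, a quorum device, and so on) that lets an exact half of the cluster remain quorate. It assumes one vote per node and a split into exactly two partitions.

```python
def quorum_threshold(expected_votes: int) -> int:
    """Smallest vote count that is a strict majority of expected_votes."""
    return expected_votes // 2 + 1


def is_quorate(partition_votes: int, expected_votes: int,
               tie_allowed: bool = False) -> bool:
    """Return True if a partition holding partition_votes may keep running.

    tie_allowed is a hypothetical flag modelling any mechanism that lets
    an exact half of the cluster stay quorate.
    """
    if tie_allowed and partition_votes * 2 == expected_votes:
        return True
    return partition_votes >= quorum_threshold(expected_votes)


def split_outcome(side_a: int, side_b: int, tie_allowed: bool = False) -> str:
    """Classify what happens when the cluster splits into two partitions."""
    expected = side_a + side_b
    a_ok = is_quorate(side_a, expected, tie_allowed)
    b_ok = is_quorate(side_b, expected, tie_allowed)
    if a_ok and b_ok:
        # Both partitions believe they may fence the other: a fence race.
        return "fence race"
    if a_ok or b_ok:
        return "one side fences"
    return "no side quorate"


if __name__ == "__main__":
    print(split_outcome(1, 1, tie_allowed=True))  # 2-node cluster
    print(split_outcome(2, 2, tie_allowed=True))  # 4 nodes, even split
    print(split_outcome(3, 2, tie_allowed=True))  # odd node count
```

In this model, only an even split combined with a tie-permitting mechanism leaves both partitions quorate at once, which is the situation in which each side tries to fence the other. With an odd node count one partition always holds a strict majority, so only that side fences.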