RHEL 5 or 6 High Availability cluster with high post_fail_delay doesn't fence a node after it has left
Issue
- The other nodes observed token loss from one node, and another node deferred fencing to the lowest node; however, we don't see a fence attempt made there.
- I see a "processor failed" message from corosync, but it doesn't fence, rgmanager never takes over any resources, and then the node eventually rejoins the cluster and everything starts working again.
Environment
- Red Hat Enterprise Linux (RHEL) 6 with the High Availability Add On
- <fence_daemon post_fail_delay/> in /etc/cluster/cluster.conf is set to a high value
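For illustration, a configuration matching this environment might look like the fragment below. This is a minimal sketch rather than a configuration taken from an affected cluster: the cluster name, config_version, and the value of 600 are assumptions chosen only to show what a "high" delay could look like. The post_fail_delay attribute is the number of seconds fenced waits after a member fails before it fences that member, and it defaults to 0.

```xml
<!-- Illustrative fragment of /etc/cluster/cluster.conf.
     The name, config_version, and the value 600 are assumed examples. -->
<cluster name="example" config_version="1">
  <!-- fenced waits this many seconds after a node fails before fencing it -->
  <fence_daemon post_fail_delay="600"/>
</cluster>
```

With a delay this long, a node that loses its token and rejoins within the window may never be fenced, which would be consistent with the symptoms described under Issue.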