Clusters - quorum - fencing - resources
Hi Guys,
I would really appreciate it if someone could explain in detail how the following things work (RHEL 6 HA with rgmanager):
- What exactly is the expected_votes parameter? In a two-node cluster, should it be (node1 votes + node2 votes + quorum vote (1)) / 2, i.e. 1.5? Why?
- I noticed that even though I have no fencing device configured, a node that has had a problem is still fenced. For example, one node lost all network connections, and when it came back it was fenced immediately (see the messages below):
Sep 08 11:04:48 fenced receive_start 2:12 add node with started_count 8
Sep 08 11:04:48 fenced fence domain default - membership is disallowed
Sep 08 11:04:48 fenced kick fence victim 2 from cluster
Sep 08 11:04:48 fenced telling cman to remove nodeid 2 from cluster
- I also noticed that resources do not relocate when a node fails. I have a two-node cluster in VirtualBox. When I power off the node that was running the resources, the other node still shows the powered-off node as owning them, so they are not relocated. What is the issue?
Responses
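- expected_votes is the total number of votes the cluster expects when all members are present. cman derives quorum from it: the cluster stays quorate (i.e. operational) only while the votes of the current members add up to more than half of expected_votes.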
- By default, each node in a two-node cluster casts one vote. Two nodes are treated as a special case, so that a single surviving node can remain quorate, and this requires the following tag in the config file:
<cman expected_votes="1" two_node="1"/>
You can refer to the cluster documentation available on the Red Hat site for more details.
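To show where that tag sits, here is a minimal sketch of a two-node cluster.conf. The cluster name and node names are made up for illustration, and the fencing and service sections are deliberately left out:

```xml
<?xml version="1.0"?>
<!-- Sketch only: "democluster", "node1.example.com" and "node2.example.com" are hypothetical names -->
<cluster name="democluster" config_version="1">
  <!-- Two-node special case: one vote is enough for quorum -->
  <cman expected_votes="1" two_node="1"/>
  <clusternodes>
    <!-- Each node casts one vote -->
    <clusternode name="node1.example.com" nodeid="1" votes="1"/>
    <clusternode name="node2.example.com" nodeid="2" votes="1"/>
  </clusternodes>
</cluster>
```

With two_node="1" either node can stay quorate on its own, which is why working fencing matters so much in this layout: it is the only thing that keeps both nodes from running the same resources after they lose contact with each other.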
Regarding fencing happening without being configured, please paste your cluster config file here. Fencing is essential, as you can tell from the resources not relocating properly. In the ideal situation, with proper fencing configured (power or soft fencing), the failed node is cut off first and only then do the resources move. This is a strict rule: unless the failed node has been fenced, resource relocation will not happen. Check the cluster documentation for more details.
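As a rough sketch of what power fencing could look like in cluster.conf, assuming servers with an IPMI-capable management board and the fence_ipmilan agent. All names, addresses and credentials below are placeholders, not values from this thread. Plain VirtualBox guests do not have such a board, so a test setup like the one described would need a different fence agent:

```xml
<?xml version="1.0"?>
<!-- Sketch only: every name, address and password here is a placeholder -->
<cluster name="democluster" config_version="2">
  <cman expected_votes="1" two_node="1"/>
  <clusternodes>
    <clusternode name="node1.example.com" nodeid="1" votes="1">
      <fence>
        <!-- Fence method for node1: points at the device defined below -->
        <method name="power">
          <device name="ipmi-node1"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="node2.example.com" nodeid="2" votes="1">
      <fence>
        <method name="power">
          <device name="ipmi-node2"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <fencedevices>
    <!-- One device entry per management board; addresses are from the documentation range -->
    <fencedevice agent="fence_ipmilan" name="ipmi-node1" ipaddr="192.0.2.11" login="admin" passwd="changeme"/>
    <fencedevice agent="fence_ipmilan" name="ipmi-node2" ipaddr="192.0.2.12" login="admin" passwd="changeme"/>
  </fencedevices>
</cluster>
```

Each node refers to its own fence device, so the surviving node knows which board to contact to power off its peer before rgmanager relocates the services.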
All,
There is one case where fencing fails: when the fencing device is not reachable. For example, if you power down an HP server so that the iLO goes offline, iLO or IPMI fencing will not work. The second node will not take over the failed node's services, because it is unable to determine their current status.
Kind regards,
Jan Gerrit Kootstra
