8.6. Fencing the Controller Nodes

Fencing is the process of isolating a node to protect a cluster and its resources. Without fencing, a faulty node can cause data corruption in a cluster.
The director uses Pacemaker to provide a highly available cluster of Controller nodes. Pacemaker uses a process called STONITH (Shoot-The-Other-Node-In-The-Head) to help fence faulty nodes. By default, STONITH is disabled on your cluster and requires manual configuration so that Pacemaker can control the power management of each node in the cluster.

Note

Log in to each node as the heat-admin user from the stack user on the director. The Overcloud creation process automatically copies the stack user's SSH key to the heat-admin user on each node.
Verify you have a running cluster with pcs status:
  $ sudo pcs status
  Cluster name: openstackHA
  Last updated: Wed Jun 24 12:40:27 2015
  Last change: Wed Jun 24 11:36:18 2015
  Stack: corosync
  Current DC: lb-c1a2 (2) - partition with quorum
  Version: 1.1.12-a14efad
  3 Nodes configured
  141 Resources configured
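The quorum check can also be scripted. The following is a minimal sketch, assuming the status text contains the phrase "partition with quorum" as in the output above; the has_quorum function name is hypothetical and the sample text stands in for real output.

```shell
# Succeed if the `pcs status` text on standard input reports quorum.
# has_quorum is a hypothetical helper name, not part of pcs.
has_quorum() {
  grep -q 'partition with quorum'
}

# Sample input copied from the output above. On a real node, pipe in
# the output of `sudo pcs status` instead.
sample_status='Stack: corosync
Current DC: lb-c1a2 (2) - partition with quorum
3 Nodes configured'

if printf '%s\n' "$sample_status" | has_quorum; then
  echo "cluster has quorum"
else
  echo "cluster has NO quorum"
fi
```

On a Controller node, the equivalent call would be `sudo pcs status | has_quorum`.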
Verify that stonith is disabled with pcs property show:
$ sudo pcs property show
Cluster Properties:
cluster-infrastructure: corosync
cluster-name: openstackHA
dc-version: 1.1.12-a14efad
have-watchdog: false
stonith-enabled: false
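To check the property from a script rather than by eye, the value can be extracted from the property listing. This is a sketch only: the stonith_enabled function name is hypothetical, and it assumes the "name: value" layout shown in the output above.

```shell
# Print the value of stonith-enabled from `pcs property show` text on stdin.
# stonith_enabled is a hypothetical helper name, not part of pcs.
stonith_enabled() {
  awk -F':' '
    {
      key = $1
      gsub(/^[ \t]+|[ \t]+$/, "", key)   # trim whitespace around the property name
    }
    key == "stonith-enabled" {
      value = $2
      gsub(/[ \t]/, "", value)           # drop whitespace around the value
      print value
    }
  '
}

# Sample input based on the output above. On a real node, pipe in
# the output of `sudo pcs property show` instead.
sample_properties='Cluster Properties:
 cluster-infrastructure: corosync
 have-watchdog: false
 stonith-enabled: false'

printf '%s\n' "$sample_properties" | stonith_enabled
```

The same helper can be reused after fencing is enabled to confirm that the value has changed to true.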
The Controller nodes contain a set of fencing agents for the various power management devices the director supports. These include:

Table 8.1. Fence Agents

Device                     Type
fence_ipmilan              The Intelligent Platform Management Interface (IPMI)
fence_idrac, fence_drac5   Dell Remote Access Controller (DRAC)
fence_ilo                  Integrated Lights-Out (iLO)
fence_ucs                  Cisco Unified Computing System (UCS)
fence_xvm, fence_virt      Libvirt and SSH

The rest of this section uses the IPMI agent (fence_ipmilan) as an example.
View a full list of IPMI options that Pacemaker supports:
$ sudo pcs stonith describe fence_ipmilan
Each node requires configuration of IPMI devices to control the power management. This involves adding a stonith device to Pacemaker for each node. Use the following commands for the cluster:

Note

The second command in each example adds a location constraint that keeps the stonith device from running on the node it fences. This prevents the node from asking to fence itself.
For Controller node 0:
$ sudo pcs stonith create my-ipmilan-for-controller-0 fence_ipmilan pcmk_host_list=overcloud-controller-0 ipaddr=192.0.2.205 login=admin passwd=p@55w0rd! lanplus=1 cipher=1 op monitor interval=60s
$ sudo pcs constraint location my-ipmilan-for-controller-0 avoids overcloud-controller-0
For Controller node 1:
$ sudo pcs stonith create my-ipmilan-for-controller-1 fence_ipmilan pcmk_host_list=overcloud-controller-1 ipaddr=192.0.2.206 login=admin passwd=p@55w0rd! lanplus=1 cipher=1 op monitor interval=60s
$ sudo pcs constraint location my-ipmilan-for-controller-1 avoids overcloud-controller-1
For Controller node 2:
$ sudo pcs stonith create my-ipmilan-for-controller-2 fence_ipmilan pcmk_host_list=overcloud-controller-2 ipaddr=192.0.2.207 login=admin passwd=p@55w0rd! lanplus=1 cipher=1 op monitor interval=60s
$ sudo pcs constraint location my-ipmilan-for-controller-2 avoids overcloud-controller-2
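Because the three command pairs differ only in the node index and the IPMI address, they can be generated from a short list. The following is a dry-run sketch that prints the commands rather than running them. The fence_commands function name is hypothetical, the addresses and credentials are the example values used above, and the password is single-quoted in the printed command as a precaution against shell interpretation.

```shell
# Print the two pcs commands for one Controller node (dry run; nothing is executed).
# fence_commands is a hypothetical helper name, not part of pcs.
# Arguments: <node index> <IPMI address> <login> <password>
fence_commands() (
  index="$1"; ipaddr="$2"; login="$3"; passwd="$4"
  node="overcloud-controller-${index}"
  device="my-ipmilan-for-controller-${index}"
  # First command: create the stonith device for this node.
  printf "sudo pcs stonith create %s fence_ipmilan pcmk_host_list=%s ipaddr=%s login=%s passwd='%s' lanplus=1 cipher=1 op monitor interval=60s\n" \
    "$device" "$node" "$ipaddr" "$login" "$passwd"
  # Second command: keep the device off the node it fences.
  printf 'sudo pcs constraint location %s avoids %s\n' "$device" "$node"
)

# Index:address pairs taken from the examples above.
for pair in 0:192.0.2.205 1:192.0.2.206 2:192.0.2.207; do
  fence_commands "${pair%%:*}" "${pair#*:}" admin 'p@55w0rd!'
done
```

Review the printed commands before running them on a Controller node, and replace the addresses and credentials with the values for your own IPMI devices.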
Run the following command to see all stonith resources:
$ sudo pcs stonith show
Run the following command to see a specific stonith resource:
$ sudo pcs stonith show [stonith-name]
Finally, enable fencing by setting the stonith-enabled property to true:
$ sudo pcs property set stonith-enabled=true
Verify the property:
$ sudo pcs property show