5.2. Deploying and testing fencing on the overcloud

The fencing configuration process includes the following stages:

  1. Reviewing the state of STONITH and Pacemaker.
  2. Generating the fencing.yaml file.
  3. Redeploying the overcloud and testing the configuration.

Prerequisites

Make sure that you can access the nodes.json file that you created when you registered your Controller nodes in director. This file is a required input for the fencing.yaml file that you generate during deployment.
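
For reference, a minimal nodes.json entry is similar to the following sketch. The node name, power management values, and MAC address are placeholders; substitute the values that you recorded when you registered your nodes:

    {
      "nodes": [
        {
          "name": "overcloud-controller-0",
          "pm_type": "ipmi",
          "pm_addr": "10.0.0.101",
          "pm_user": "admin",
          "pm_password": "p@55w0rd!",
          "mac": ["11:11:11:11:11:11"]
        }
      ]
    }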

Review the state of STONITH and Pacemaker

  1. Log in to each Controller node as the heat-admin user.
  2. Verify that the cluster is running:

    $ sudo pcs status

    Example output:

    Cluster name: openstackHA
    Last updated: Wed Jun 24 12:40:27 2015
    Last change: Wed Jun 24 11:36:18 2015
    Stack: corosync
    Current DC: lb-c1a2 (2) - partition with quorum
    Version: 1.1.12-a14efad
    3 Nodes configured
    141 Resources configured
  3. Verify that STONITH is disabled:

    $ sudo pcs property show

    Example output:

    Cluster Properties:
    cluster-infrastructure: corosync
    cluster-name: openstackHA
    dc-version: 1.1.12-a14efad
    have-watchdog: false
    stonith-enabled: false
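
    You can also limit the output to the property that you are checking:

    $ sudo pcs property show stonith-enabled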

Generate the fencing.yaml environment file

Choose one of the following options:

  • If you use the IPMI or Red Hat Virtualization (RHV) fencing agent, run the following command to generate the fencing.yaml environment file:

    $ openstack overcloud generate fencing --output fencing.yaml nodes.json
    Note
    • This command converts ilo and drac power management details to IPMI equivalents.
    • Make sure that the nodes.json file contains the MAC address of one of the network interfaces (NICs) on the node. For more information, see Registering Nodes for the Overcloud.
    • If you use RHV, make sure that you use a role with permissions to create and launch virtual machines, such as UserVMManager.
  • If you use a different fencing agent, such as Storage Block Device (SBD), fence_kdump, or Redfish, generate the fencing.yaml file manually.

    Note

    If you use pre-provisioned nodes, you must also create the fencing.yaml file manually.

For more information about supported fencing agents, see "Supported fencing agents".
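
For reference, a fencing.yaml file, whether generated or created manually, is similar to the following sketch for the fence_ipmilan agent. The MAC address, IP address, and credentials are placeholders:

    parameter_defaults:
      EnableFencing: true
      FencingConfig:
        devices:
        - agent: fence_ipmilan
          host_mac: "11:11:11:11:11:11"
          params:
            ipaddr: 10.0.0.101
            lanplus: true
            login: admin
            passwd: p@55w0rd!
            privlvl: administrator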

Redeploy the overcloud and test the configuration

  1. Run the overcloud deploy command and include the fencing.yaml file that you generated to configure fencing on the Controller nodes:

    $ openstack overcloud deploy --templates \
    -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml \
    -e ~/templates/network-environment.yaml \
    -e ~/templates/storage-environment.yaml \
    --control-scale 3 --compute-scale 3 --ceph-storage-scale 3 \
    --control-flavor control --compute-flavor Compute --ceph-storage-flavor ceph-storage \
    --ntp-server pool.ntp.org \
    --neutron-network-type vxlan --neutron-tunnel-types vxlan \
    -e fencing.yaml
  2. Log in to the overcloud and verify that fencing is configured for each of the Controller nodes:

    1. Check that Pacemaker is configured as the resource manager:

      $ source stackrc
      $ nova list | grep controller
      $ ssh heat-admin@<controller-x_ip>
      $ sudo pcs status | grep fence
      stonith-overcloud-controller-x (stonith:fence_ipmilan): Started overcloud-controller-y

      In this example, Pacemaker is configured to use a STONITH resource for each of the Controller nodes that are specified in the fencing.yaml file.

      注記

      You must not run the fencing resource on the same node that it controls.

    2. Run the pcs stonith show command to check the fencing resource attributes:

      $ sudo pcs stonith show <stonith-resource-controller-x>

      The STONITH attribute values must match the values in the fencing.yaml file.
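
      The output is similar to the following sketch; the attribute values shown are placeholders:

      Resource: stonith-overcloud-controller-x (class=stonith type=fence_ipmilan)
       Attributes: ipaddr=10.0.0.101 lanplus=true login=admin passwd=p@55w0rd! pcmk_host_list=overcloud-controller-x
       Operations: monitor interval=60s (stonith-overcloud-controller-x-monitor-interval-60s)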

Verify fencing on the Controller nodes

To test whether fencing works correctly, you trigger fencing by blocking all connections to a Controller node except SSH and port 5016. The cluster loses contact with the node and fences it, which reboots the server.
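
Alternatively, to verify only that the fence device itself responds, you can fence a node manually from another cluster node, for example:

    $ sudo pcs stonith fence overcloud-controller-x

The following steps instead simulate a network failure so that the cluster triggers fencing on its own.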

  1. Log in to a Controller node:

    $ source stackrc
    $ nova list | grep controller
    $ ssh heat-admin@<controller-x_ip>
  2. Change to the root user and run the following iptables commands to block all connections except established sessions, SSH (port 22), and port 5016:

    $ sudo -i
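    # The following rules keep established sessions, SSH (port 22), and port
    # 5016 reachable and reject all other traffic, which cuts cluster
    # communication and causes the remaining cluster nodes to fence this node.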
    iptables -A INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT &&
    iptables -A INPUT -p tcp -m state --state NEW -m tcp --dport 22 -j ACCEPT &&
    iptables -A INPUT -p tcp -m state --state NEW -m tcp --dport 5016 -j ACCEPT &&
    iptables -A INPUT -p udp -m state --state NEW -m udp --dport 5016 -j ACCEPT &&
    iptables -A INPUT ! -i lo -j REJECT --reject-with icmp-host-prohibited &&
    iptables -A OUTPUT -p tcp --sport 22 -j ACCEPT &&
    iptables -A OUTPUT -p tcp --sport 5016 -j ACCEPT &&
    iptables -A OUTPUT -p udp --sport 5016 -j ACCEPT &&
    iptables -A OUTPUT ! -o lo -j REJECT --reject-with icmp-host-prohibited
    Important

    This step blocks all remaining connections to the Controller node. The cluster detects the loss of communication and fences the node, which causes the server to reboot.

  3. From a different Controller node, locate the fencing event in the Pacemaker log file:

    $ ssh heat-admin@<controller-x_ip>
    $ less /var/log/cluster/corosync.log
    (less): /fenc*

    If the STONITH service performed the fencing action on the Controller node, the log file contains a fencing event.
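
    You can also search for the event directly, for example:

    $ sudo grep -i stonith /var/log/cluster/corosync.log | tail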

  4. Wait a few minutes and then verify that the rebooted Controller node is running in the cluster again by running the pcs status command.
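
    For example, the following check confirms that all Controller nodes are back online; the node names are illustrative:

    $ sudo pcs status | grep Online
    Online: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]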