Use High Availability to Protect Instances in Red Hat Enterprise Linux OpenStack Platform 7


This article describes a procedure for protecting your instances using High Availability. The HA backend is Pacemaker, which adds the ability to detect (and predictably respond to) Compute node failures.

Note: This article assumes that you have an existing Red Hat Enterprise Linux OpenStack Platform 7 environment that was deployed using director, and has been configured in a fully HA state.

Environment requirements and assumptions

The following requirements and assumptions are made about the environment:

  • The environment was deployed using RHEL OpenStack Platform director.
  • Fencing has already been manually enabled on the control plane.
  • No overcloud stack updates will be run following the configuration of instance HA.
  • The following packages are installed on all nodes:
    • fence-agents-all-4.0.11-27.el7_2.5.x86_64 (or greater)
    • pacemaker-1.1.13-10.el7_2.2.x86_64 (or greater)
    • resource-agents-3.9.5-54.el7_2.6.x86_64 (or greater)
  • A full outage of both the Compute and control planes will be required.
  • Shared storage is enabled within the environment for ephemeral and block storage. See the note below for an exception to this requirement.

Exception for shared storage

Typically, this configuration requires shared storage. If you attempt to use the no-shared-storage option, you are likely to receive an InvalidSharedStorage error during evacuation, and instances will not power up on the other node. However, if all your instances are configured to boot from a Block Storage (cinder) volume, you do not need shared storage for the instance disk images, and you can evacuate all instances using the no-shared-storage option. During evacuation, each such instance can be expected to boot from the same cinder volume, but on another Compute node. As a result, the evacuated instances can immediately restart their jobs, because the OS image and application data are kept on the cinder volume.

If your deployment includes this use case, include the no_shared_storage=1 option in step 7.

Installation

1. Begin by stopping and disabling libvirtd and all OpenStack services on the Compute nodes:

heat-admin@compute-n # sudo openstack-service stop
heat-admin@compute-n # sudo openstack-service disable
heat-admin@compute-n # sudo systemctl stop libvirtd
heat-admin@compute-n # sudo systemctl disable libvirtd

2. Create an authentication key for use with pacemaker-remote.

Perform this step on one of the Compute nodes:

heat-admin@compute-1 # sudo mkdir -p /etc/pacemaker/
heat-admin@compute-1 # sudo dd if=/dev/urandom of=/etc/pacemaker/authkey bs=4096 count=1
heat-admin@compute-1 # sudo cp /etc/pacemaker/authkey ./
heat-admin@compute-1 # sudo chown heat-admin:heat-admin authkey
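As a quick sanity check on what the dd command in step 2 produces, the sketch below generates a key the same way in a throwaway directory and confirms its size. The temporary path and the variable names are assumptions for illustration only; nothing here touches /etc/pacemaker.

```shell
# Work in a throwaway directory so the real /etc/pacemaker/authkey is never touched.
tmpdir=$(mktemp -d)
dd if=/dev/urandom of="$tmpdir/authkey" bs=4096 count=1 2>/dev/null

# One 4096-byte block was requested, so the key should be exactly 4096 bytes.
keysize=$(wc -c < "$tmpdir/authkey")
if [ "$keysize" -eq 4096 ]; then
    echo "authkey size OK"
else
    echo "unexpected authkey size: $keysize"
fi
rm -rf "$tmpdir"
```

Once the key has been distributed, comparing the output of sudo sha256sum /etc/pacemaker/authkey across nodes is one way to confirm that every copy is identical.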

3. Copy this key to the director node, and then to the remaining Compute and Controller nodes:

stack@director # scp heat-admin@compute-1:~/authkey ./
stack@director # scp authkey heat-admin@node-n:~/
heat-admin@node-n # sudo mkdir -p --mode=0750 /etc/pacemaker/
heat-admin@node-n # sudo chgrp haclient /etc/pacemaker
heat-admin@node-n # sudo mv authkey /etc/pacemaker/
heat-admin@node-n # sudo chown root:haclient /etc/pacemaker/authkey

4. Enable pacemaker-remote on all Compute nodes:

heat-admin@compute-n # sudo systemctl enable pacemaker_remote
heat-admin@compute-n # sudo systemctl start pacemaker_remote

5. Confirm that the required versions of the pacemaker (1.1.13-10.el7_2.2.x86_64), fence-agents (fence-agents-all-4.0.11-27.el7_2.5.x86_64) and resource-agents (3.9.5-54.el7_2.6.x86_64) packages are installed on the Controller and Compute nodes:

heat-admin@controller-n # sudo rpm -qa | egrep '(pacemaker|fence-agents|resource-agents)'
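The "(or greater)" comparison can be scripted rather than read by eye. The following is a minimal sketch, assuming GNU sort is available; version_ge is a hypothetical helper name, and sort -V only approximates RPM's own version ordering, so treat a borderline result with care.

```shell
# Return success if version $1 is greater than or equal to version $2.
# Relies on GNU sort -V, which approximates (but is not identical to) RPM ordering.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Example with literal version strings taken from the requirements list.
if version_ge "1.1.13-11.el7" "1.1.13-10.el7_2.2"; then
    echo "pacemaker version is sufficient"
fi

# On a real node the installed version could be queried first, for example:
#   installed=$(rpm -q --qf '%{VERSION}-%{RELEASE}' pacemaker)
#   version_ge "$installed" "1.1.13-10.el7_2.2" || echo "pacemaker is too old"
```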

6.a Apply the following constraint workarounds required for BZ#1257414:

Note: This issue has been addressed in RHSA-2015:1862 and might not be required for your environment.

heat-admin@controller-1 # sudo pcs constraint order start openstack-nova-novncproxy-clone then openstack-nova-api-clone
heat-admin@controller-1 # sudo pcs constraint order start rabbitmq-clone then openstack-keystone-clone
heat-admin@controller-1 # sudo pcs constraint order promote galera-master then openstack-keystone-clone
heat-admin@controller-1 # sudo pcs constraint order start haproxy-clone then openstack-keystone-clone
heat-admin@controller-1 # sudo pcs constraint order start memcached-clone then openstack-keystone-clone
heat-admin@controller-1 # sudo pcs constraint order promote redis-master then start openstack-ceilometer-central-clone require-all=false
heat-admin@controller-1 # sudo pcs resource defaults resource-stickiness=INFINITY

6.b Apply the following constraint workarounds required for BZ#1295835:

Note: This issue has been addressed in RHBA-2016:0264-1 and might not be required for your environment.

heat-admin@controller-1 # sudo pcs config | grep systemd | awk '{print $2}' | while read RESOURCE; do sudo pcs resource update $RESOURCE op start timeout=200s op stop timeout=200s; done

7. Create a NovaEvacuate active/passive resource using the overcloudrc file to provide the auth_url, username, tenant and password values:

stack@director # scp overcloudrc heat-admin@controller-1:~/
heat-admin@controller-1 # . ~/overcloudrc
heat-admin@controller-1 # sudo pcs resource create nova-evacuate ocf:openstack:NovaEvacuate auth_url=$OS_AUTH_URL username=$OS_USERNAME \
password=$OS_PASSWORD tenant_name=$OS_TENANT_NAME

Note: If you are not using shared storage, include the no_shared_storage=1 option in your resource create ... command above. See Exception for shared storage for more information.

8. Confirm that nova-evacuate is started after the floating IP resources, and after the Image Service (glance), OpenStack Networking (neutron), and Compute (nova) services:

heat-admin@controller-1 # for i in $(sudo pcs status | grep IP | awk '{ print $1 }'); do sudo pcs constraint order start $i then nova-evacuate ; done
heat-admin@controller-1 # for i in openstack-glance-api-clone neutron-metadata-agent-clone openstack-nova-conductor-clone; do \
sudo pcs constraint order start $i then nova-evacuate require-all=false ; done

9. Disable all OpenStack resources across the control plane:

heat-admin@controller-1 # sudo pcs resource disable openstack-keystone --wait=540s

Depending on the time needed to stop the Identity Service (and on the power of your hardware), consider increasing the timeout period (--wait).

Note: This timeout value of 540 is only an example. If you experience issues, you can calculate a timeout period suitable for your environment.
For example, a typical deployment using RHEL OpenStack Platform director will need to consider the timeout period allocated for each service:

  • Identity Service:

    • openstack-keystone - 120s
  • Telemetry:

    • openstack-ceilometer-central - 120s
    • openstack-ceilometer-collector - 120s
    • openstack-ceilometer-api - 120s
    • openstack-ceilometer-alarm-evaluator - 120s
    • openstack-ceilometer-notification - 120s
  • Compute:

    • openstack-nova-consoleauth - 120s
    • openstack-nova-novncproxy - 120s
    • openstack-nova-api - 120s
    • openstack-nova-scheduler - 120s
    • openstack-nova-conductor - 120s
  • OpenStack Networking:

    • neutron-server - 120s
    • neutron-openvswitch-agent - 120s
    • neutron-dhcp-agent - 120s
    • neutron-l3-agent - 120s
    • neutron-metadata-agent - 120s

In this example, Telemetry, Compute, and OpenStack Networking each use a total duration of 600s. This can be considered a suitable value to begin testing with. You can validate your timeout calculations using pcs resource:

controller# pcs resource show openstack-ceilometer-central

   Resource: openstack-ceilometer-central (class=systemd type=openstack-ceilometer-central)
    Operations: start interval=0s timeout=120s (openstack-ceilometer-central-start-interval-0s)
                monitor interval=60s (openstack-ceilometer-central-monitor-interval-60s)
                stop interval=0s timeout=120s (openstack-ceilometer-central-stop-interval-0s)
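The per-service arithmetic above can be reproduced with a small helper. This is only a sketch: sum_timeouts is a hypothetical function name, and the values passed to it are the 120s figures from the list, which you would replace with the timeouts reported for your own resources.

```shell
# Add up a list of timeouts given in the "120s" form and print the total in seconds.
sum_timeouts() {
    total=0
    for t in "$@"; do
        # Strip the trailing "s" before adding.
        total=$(( total + ${t%s} ))
    done
    echo "$total"
}

# Five Telemetry services at 120s each.
sum_timeouts 120s 120s 120s 120s 120s   # prints 600
```

The same call with the Compute or OpenStack Networking values gives the total for that service group.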

10. Create a list of the current controllers using cibadmin data:

heat-admin@controller-1 # controllers=$(sudo cibadmin -Q -o nodes | grep uname | sed s/.*uname..// | awk -F\" '{print $1}')
heat-admin@controller-1 # echo $controllers
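To see what the grep, sed, and awk pipeline in step 10 extracts, the sketch below runs the same filters over a small sample. The XML fragment and node names are assumptions made up for illustration; real cibadmin output contains more attributes, and extract_unames is a hypothetical wrapper name.

```shell
# Same filter chain as step 10, reading the nodes section from standard input.
extract_unames() {
    grep uname | sed 's/.*uname..//' | awk -F'"' '{print $1}'
}

# Hypothetical sample resembling the output of "cibadmin -Q -o nodes".
sample_nodes='<nodes>
  <node id="1" uname="overcloud-controller-0"/>
  <node id="2" uname="overcloud-controller-1"/>
  <node id="3" uname="overcloud-controller-2"/>
</nodes>'

controllers=$(printf '%s\n' "$sample_nodes" | extract_unames)
echo $controllers
# prints: overcloud-controller-0 overcloud-controller-1 overcloud-controller-2
```

The sed expression discards everything up to and including uname=" and awk then keeps the text before the closing quote, which leaves one node name per line.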

11. Use this list to tag these nodes as controllers with the osprole=controller property:

heat-admin@controller-1 # for controller in ${controllers}; do sudo pcs property set --node ${controller} osprole=controller ; done

12. Build a list of stonith devices already present in the environment:

heat-admin@controller-1 # stonithdevs=$(sudo pcs stonith | awk '{print $1}')
heat-admin@controller-1 # echo $stonithdevs

13. Tag the control plane services to make sure they only run on the controllers identified above, skipping any stonith devices listed:

heat-admin@controller-1 # for i in $(sudo cibadmin -Q --xpath //primitive --node-path | tr ' ' '\n' | awk -F "id='" '{print $2}' | awk -F "'" '{print $1}' | uniq); do \
    found=0
    if [ -n "$stonithdevs" ]; then
        for x in $stonithdevs; do
            if [ $x = $i ]; then
                found=1
            fi
        done
    fi
    if [ $found = 0 ]; then
        sudo pcs constraint location $i rule resource-discovery=exclusive score=0 osprole eq controller
    fi
done
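The skip logic in step 13 can be tried out without a cluster. The sketch below applies the same membership test to two hard-coded lists; the resource and device names are hypothetical, is_in_list is an illustrative helper, and the pcs call is replaced by collecting the names that would receive a constraint.

```shell
# Hypothetical inputs: stonith devices to skip, and all primitive resource IDs.
stonithdevs="ipmilan-controller-0 ipmilan-controller-1"
resources="haproxy ipmilan-controller-0 rabbitmq ipmilan-controller-1 galera"

# Succeed if the first argument equals any of the remaining arguments.
is_in_list() {
    needle=$1
    shift
    for item in "$@"; do
        [ "$item" = "$needle" ] && return 0
    done
    return 1
}

# Collect every resource that is not a stonith device.
to_constrain=""
for i in $resources; do
    if ! is_in_list "$i" $stonithdevs; then
        to_constrain="$to_constrain $i"
    fi
done
echo $to_constrain
# prints: haproxy rabbitmq galera
```

In the real procedure, each name collected here corresponds to one pcs constraint location ... osprole eq controller command.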

14. Begin to populate the Compute node resources within pacemaker, starting with neutron-openvswitch-agent:

heat-admin@controller-1 # sudo pcs resource create neutron-openvswitch-agent-compute \
systemd:neutron-openvswitch-agent --clone interleave=true --disabled --force
heat-admin@controller-1 # sudo pcs constraint location neutron-openvswitch-agent-compute-clone \
rule resource-discovery=exclusive score=0 osprole eq compute
heat-admin@controller-1 # sudo pcs constraint order start neutron-server-clone then \
neutron-openvswitch-agent-compute-clone require-all=false

Then the Compute libvirtd resource:

heat-admin@controller-1 # sudo pcs resource create libvirtd-compute systemd:libvirtd --clone interleave=true --disabled --force
heat-admin@controller-1 # sudo pcs constraint location libvirtd-compute-clone rule resource-discovery=exclusive score=0 osprole eq compute
heat-admin@controller-1 # sudo pcs constraint order start neutron-openvswitch-agent-compute-clone then libvirtd-compute-clone
heat-admin@controller-1 # sudo pcs constraint colocation add libvirtd-compute-clone with neutron-openvswitch-agent-compute-clone

Then the openstack-ceilometer-compute resource:

heat-admin@controller-1 # sudo pcs resource create ceilometer-compute systemd:openstack-ceilometer-compute --clone interleave=true --disabled --force
heat-admin@controller-1 # sudo pcs constraint location ceilometer-compute-clone rule resource-discovery=exclusive score=0 osprole eq compute
heat-admin@controller-1 # sudo pcs constraint order start openstack-ceilometer-notification-clone then ceilometer-compute-clone require-all=false
heat-admin@controller-1 # sudo pcs constraint order start libvirtd-compute-clone then ceilometer-compute-clone
heat-admin@controller-1 # sudo pcs constraint colocation add ceilometer-compute-clone with libvirtd-compute-clone

Then the nova-compute resource:

heat-admin@controller-1 # . /home/heat-admin/overcloudrc
heat-admin@controller-1 # sudo pcs resource create nova-compute-checkevacuate ocf:openstack:nova-compute-wait auth_url=$OS_AUTH_URL username=$OS_USERNAME password=$OS_PASSWORD tenant_name=$OS_TENANT_NAME domain=localdomain op start timeout=300 --clone interleave=true --disabled --force
heat-admin@controller-1 # sudo pcs constraint location nova-compute-checkevacuate-clone rule resource-discovery=exclusive score=0 osprole eq compute
heat-admin@controller-1 # sudo pcs constraint order start openstack-nova-conductor-clone then nova-compute-checkevacuate-clone require-all=false
heat-admin@controller-1 # sudo pcs resource create nova-compute systemd:openstack-nova-compute --clone interleave=true --disabled --force
heat-admin@controller-1 # sudo pcs constraint location nova-compute-clone rule resource-discovery=exclusive score=0 osprole eq compute
heat-admin@controller-1 # sudo pcs constraint order start nova-compute-checkevacuate-clone then nova-compute-clone require-all=true
heat-admin@controller-1 # sudo pcs constraint order start nova-compute-clone then nova-evacuate require-all=false
heat-admin@controller-1 # sudo pcs constraint order start libvirtd-compute-clone then nova-compute-clone
heat-admin@controller-1 # sudo pcs constraint colocation add nova-compute-clone with libvirtd-compute-clone

15. Add stonith devices for the Compute nodes. Replace the $IPMILAN_USERNAME and $IPMILAN_PASSWORD values to suit your IPMI device:

heat-admin@controller-1 # sudo pcs stonith create ipmilan-overcloud-compute-0 fence_ipmilan pcmk_host_list=overcloud-compute-0 ipaddr=10.35.160.78 login=$IPMILAN_USERNAME passwd=$IPMILAN_PASSWORD lanplus=1 cipher=1 op monitor interval=60s

16. Create a separate fence-nova stonith device:

heat-admin@controller-1 # . overcloudrc
heat-admin@controller-1 # sudo pcs stonith create fence-nova fence_compute \
auth-url=$OS_AUTH_URL \
login=$OS_USERNAME \
passwd=$OS_PASSWORD \
tenant-name=$OS_TENANT_NAME \
domain=localdomain \
record-only=1 --force

17. Make certain the Compute nodes are able to recover after fencing:

heat-admin@controller-1 # sudo pcs property set cluster-recheck-interval=1min

18. Create Compute node resources and set the stonith level 1 to include both the node's physical fence device and fence-nova:

heat-admin@controller-1 # sudo pcs resource create overcloud-compute-n ocf:pacemaker:remote reconnect_interval=60 op monitor interval=20
heat-admin@controller-1 # sudo pcs property set --node overcloud-compute-n osprole=compute
heat-admin@controller-1 # sudo pcs stonith level add 1 overcloud-compute-0 ipmilan-overcloud-compute-0,fence-nova
heat-admin@controller-1 # sudo pcs stonith
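Because the commands in step 18 are repeated for every Compute node, it can help to generate them first and review them before running anything. The sketch below only prints the commands. The node names are hypothetical, print_compute_node_commands is an illustrative helper, and it assumes each physical fence device was named ipmilan-<node> as in step 15.

```shell
# Hypothetical node names; substitute the Compute nodes in your overcloud.
compute_nodes="overcloud-compute-0 overcloud-compute-1"

# Print (do not run) the step 18 commands for each node given as an argument.
print_compute_node_commands() {
    for node in "$@"; do
        echo "sudo pcs resource create $node ocf:pacemaker:remote reconnect_interval=60 op monitor interval=20"
        echo "sudo pcs property set --node $node osprole=compute"
        # Assumes the fence device for this node is called ipmilan-<node>.
        echo "sudo pcs stonith level add 1 $node ipmilan-$node,fence-nova"
    done
}

print_compute_node_commands $compute_nodes
```

After checking the printed lines against your environment, they can be run one at a time on a Controller node.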

19. Enable the control and Compute plane services:

heat-admin@controller-1 # sudo pcs resource enable openstack-keystone
heat-admin@controller-1 # sudo pcs resource enable neutron-openvswitch-agent-compute
heat-admin@controller-1 # sudo pcs resource enable libvirtd-compute
heat-admin@controller-1 # sudo pcs resource enable ceilometer-compute
heat-admin@controller-1 # sudo pcs resource enable nova-compute-checkevacuate
heat-admin@controller-1 # sudo pcs resource enable nova-compute

20. Allow some time for the environment to settle before cleaning up any failed resources:

heat-admin@controller-1 # sleep 60
heat-admin@controller-1 # sudo pcs resource cleanup
heat-admin@controller-1 # sudo pcs status
heat-admin@controller-1 # sudo pcs property set stonith-enabled=true

Test High Availability

Note: These steps deliberately reboot the Compute node without warning.

1. The following step boots an instance on the overcloud, and then crashes the Compute node:

stack@director # . overcloudrc
stack@director # nova boot --image cirros --flavor 2 test-failover
stack@director # nova list --fields name,status,host
stack@director # . stackrc
stack@director # ssh -lheat-admin compute-n
heat-admin@compute-n # sudo su -
root@compute-n # echo c > /proc/sysrq-trigger

2. A short time later, the instance should be restarted on a working Compute node:

stack@director # nova list --fields name,status,host
stack@director # nova service-list
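Rather than rerunning nova list by hand, a polling helper can wait for the instance to come back. This is a generic sketch, not part of the documented procedure: wait_until is a hypothetical function, and the retry count and delay shown in the comment are arbitrary example values.

```shell
# wait_until TRIES DELAY COMMAND...
# Rerun COMMAND until it succeeds or TRIES attempts have been used up.
wait_until() {
    tries=$1
    delay=$2
    shift 2
    while [ "$tries" -gt 0 ]; do
        "$@" && return 0
        tries=$(( tries - 1 ))
        sleep "$delay"
    done
    return 1
}

# Trivial demonstration: "true" succeeds on the first attempt.
wait_until 3 0 true && echo "condition met"

# Possible use on the director node (left as a comment because it needs a live overcloud):
#   wait_until 30 10 sh -c 'nova list --fields name,status,host | grep test-failover | grep -q ACTIVE'
```

A non-zero return from wait_until means the condition was never met within the allowed attempts, which would be the cue to inspect sudo pcs status on a Controller node.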

References

* https://github.com/beekhof/osp-ha-deploy/blob/master/pcmk/compute-managed.scenario
* https://github.com/beekhof/osp-ha-deploy/blob/master/pcmk/controller-managed.scenario
