Red Hat Training

A Red Hat training course is available for Red Hat OpenStack Platform

Appendix B. Manual Procedure Automated by Ansible Playbooks

The Ansible-based solution provided by this document is designed to automate a manual procedure for configuring Instance HA in a supported manner. For reference, this appendix provides the steps automated by the solution.

1. Begin by disabling libvirtd and all OpenStack services on the Compute nodes:

heat-admin@compute-n # sudo openstack-service stop
heat-admin@compute-n # sudo openstack-service disable
heat-admin@compute-n # sudo systemctl stop libvirtd
heat-admin@compute-n # sudo systemctl disable libvirtd

2. Create an authentication key for use with pacemaker-remote.
Perform this step on one of the Compute nodes:

heat-admin@compute-1 # sudo mkdir -p /etc/pacemaker/
heat-admin@compute-1 # sudo dd if=/dev/urandom of=/etc/pacemaker/authkey bs=4096 count=1
heat-admin@compute-1 # sudo cp /etc/pacemaker/authkey ./
heat-admin@compute-1 # sudo chown heat-admin:heat-admin authkey

3. Copy this key to the director node, and then to the remaining Compute and Controller nodes:

stack@director # scp authkey heat-admin@node-n:~/
heat-admin@node-n # sudo mkdir -p --mode=0750 /etc/pacemaker
heat-admin@node-n # sudo chgrp haclient /etc/pacemaker
heat-admin@node-n # sudo chown root:haclient /etc/pacemaker/authkey

4. Enable pacemaker-remote on all Compute nodes:

heat-admin@compute-n # sudo systemctl enable pacemaker_remote
heat-admin@compute-n # sudo systemctl start pacemaker_remote

5. Confirm that the required versions of the pacemaker (1.1.12-22.el7_1.4.x86_64) and resource-agents (3.9.5-40.el7_1.5.x86_64) packages are installed on the controller and Compute nodes:

heat-admin@controller-n # sudo rpm -qa | egrep \'(pacemaker|resource-agents)'

6. Apply the following constraint workarounds required for BZ#1257414.

Note

This issue has been addressed in RHSA-2015:1862, and might not be required for your environment.

heat-admin@controller-1 # sudo pcs constraint order start openstack-nova-novncproxy-clone then openstack-nova-api-clone
heat-admin@controller-1 # sudo pcs constraint order start rabbitmq-clone then openstack-keystone-clone
heat-admin@controller-1 # sudo pcs constraint order promote galera-master then openstack-keystone-clone
heat-admin@controller-1 # sudo pcs constraint order start haproxy-clone then openstack-keystone-clone
heat-admin@controller-1 # sudo pcs constraint order start memcached-clone then openstack-keystone-clone
heat-admin@controller-1 # sudo pcs constraint order promote redis-master then start openstack-ceilometer-central-clone require-all=false
heat-admin@controller-1 # sudo pcs resource defaults resource-stickiness=INFINITY

7. Create a NovaEvacuate active/passive resource using the overcloudrc file to provide the auth_url, username, tenant and password values:

stack@director # scp overcloudrc heat-admin@controller-1:~/
heat-admin@controller-1 # . ~/overcloudrc
heat-admin@controller-1 # sudo pcs resource create nova-evacuate ocf:openstack:NovaEvacuate auth_url=$OS_AUTH_URL username=$OS_USERNAME password=$OS_PASSWORD tenant_name=$OS_TENANT_NAME \ 1
1
If you are not using shared storage, include the no_shared_storage=1 option. See Section 2.1, “Exceptions for Shared Storage” for more information.

8. Confirm that nova-evacuate is started after the floating IP resources, and the Image Service (glance), OpenStack Networking (neutron), Compute (nova) services:

heat-admin@controller-1 # for i in $(sudo pcs status | grep IP | awk \'{ print $1 }\'); do sudo pcs constraint order start $i then nova-evacuate ; done
heat-admin@controller-1 # for i in openstack-glance-api-clone neutron-metadata-agent-clone openstack-nova-conductor-clone; do sudo pcs constraint order start $i then nova-evacuate require-all=false ; done

9. Disable all OpenStack resources across the control plane:

heat-admin@controller-1 # sudo pcs resource disable openstack-keystone --wait=540

The timeout used here (--wait=540) is only used as an example. Depending on the time needed to stop the Identity Service (and on the power of your hardware), you can consider increasing the timeout period.

10. Create a list of the current controllers using cibadmin data :

heat-admin@controller-1 # controllers=$(sudo cibadmin -Q -o nodes | grep uname | sed s/.*uname..// | awk -F\" \'{print $1}')
heat-admin@controller-1 # echo $controllers

11. Use this list to tag these nodes as controllers with the osprole=controller property:

heat-admin@controller-1 # for controller in ${controllers}; do sudo pcs property set --node ${controller} osprole=controller ; done

12. Build a list of stonith devices already present in the environment:

heat-admin@controller-1 # stonithdevs=$(sudo pcs stonith | awk \'{print $1}')
heat-admin@controller-1 # echo $stonithdevs

13. Tag the control plane services to make sure they only run on the controllers identified above, skipping any stonith devices listed:

heat-admin@controller-1 # for i in $(sudo cibadmin -Q --xpath //primitive --node-path | tr ' ' \'\n' | awk -F "id=\'" '{print $2}' | awk -F "\'" '{print $1}' | uniq); do
    found=0
    if [ -n "$stonithdevs" ]; then
        for x in $stonithdevs; do
            if [ $x = $i ]; then
                found=1
            fi
	    done
    fi
    if [ $found = 0 ]; then
        sudo pcs constraint location $i rule resource-discovery=exclusive score=0 osprole eq controller
    fi
done

14. Begin to populate the Compute node resources within pacemaker, starting with neutron-openvswitch-agent:

heat-admin@controller-1 # sudo pcs resource create neutron-openvswitch-agent-compute systemd:neutron-openvswitch-agent op start timeout 200s stop timeout 200s --clone interleave=true --disabled --force
heat-admin@controller-1 # sudo pcs constraint location neutron-openvswitch-agent-compute-clone rule resource-discovery=exclusive score=0 osprole eq compute
heat-admin@controller-1 # sudo pcs constraint order start neutron-server-clone then neutron-openvswitch-agent-compute-clone require-all=false

Then the compute libvirtd resource:

heat-admin@controller-1 # sudo pcs resource create libvirtd-compute systemd:libvirtd op start timeout 200s stop timeout 200s --clone interleave=true --disabled --force
heat-admin@controller-1 # sudo pcs constraint location libvirtd-compute-clone rule resource-discovery=exclusive score=0 osprole eq compute
heat-admin@controller-1 # sudo pcs constraint order start neutron-openvswitch-agent-compute-clone then libvirtd-compute-clone
heat-admin@controller-1 # sudo pcs constraint colocation add libvirtd-compute-clone with neutron-openvswitch-agent-compute-clone

Then the openstack-ceilometer-compute resource:

heat-admin@controller-1 # sudo pcs resource create ceilometer-compute systemd:openstack-ceilometer-compute op start timeout 200s stop timeout 200s --clone interleave=true --disabled --force
heat-admin@controller-1 # sudo pcs constraint location ceilometer-compute-clone rule resource-discovery=exclusive score=0 osprole eq compute
heat-admin@controller-1 # sudo pcs constraint order start openstack-ceilometer-notification-clone then ceilometer-compute-clone require-all=false
heat-admin@controller-1 # sudo pcs constraint order start libvirtd-compute-clone then ceilometer-compute-clone
heat-admin@controller-1 # sudo pcs constraint colocation add ceilometer-compute-clone with libvirtd-compute-clone

Then the nova-compute resource:

heat-admin@controller-1 # . /home/heat-admin/overcloudrc
heat-admin@controller-1 # sudo pcs resource create nova-compute-checkevacuate ocf:openstack:nova-compute-wait auth_url=$OS_AUTH_URL username=$OS_USERNAME password=$OS_PASSWORD tenant_name=$OS_TENANT_NAME domain=localdomain op start timeout=300 --clone interleave=true --disabled --force
heat-admin@controller-1 # sudo pcs constraint location nova-compute-checkevacuate-clone rule resource-discovery=exclusive score=0 osprole eq compute
heat-admin@controller-1 # sudo pcs constraint order start openstack-nova-conductor-clone then nova-compute-checkevacuate-clone require-all=false
heat-admin@controller-1 # sudo pcs resource create nova-compute systemd:openstack-nova-compute --clone interleave=true --disabled --force
heat-admin@controller-1 # sudo pcs constraint location nova-compute-clone rule resource-discovery=exclusive score=0 osprole eq compute
heat-admin@controller-1 # sudo pcs constraint order start nova-compute-checkevacuate-clone then nova-compute-clone require-all=true
heat-admin@controller-1 # sudo pcs constraint order start nova-compute-clone then nova-evacuate require-all=false
heat-admin@controller-1 # sudo pcs constraint order start libvirtd-compute-clone then nova-compute-clone
heat-admin@controller-1 # sudo pcs constraint colocation add nova-compute-clone with libvirtd-compute-clone

15. Add stonith devices for the Compute nodes. Run the following command for each Compute node:

heat-admin@controller-1 # sudo pcs stonith create ipmilan-overcloud-compute-N  fence_ipmilan pcmk_host_list=overcloud-compute-0 ipaddr=10.35.160.78 login=IPMILANUSER passwd=IPMILANPW lanplus=1 cipher=1 op monitor interval=60s;

Where:

  • N is the identifying number of each compute node (for example, ipmilan-overcloud-compute-1, ipmilan-overcloud-compute-2, and so on).
  • IPMILANUSER and IPMILANPW are the username and password to the IPMI device.

16. Create a seperate fence-nova stonith device:

heat-admin@controller-1 # . overcloudrc
heat-admin@controller-1 # sudo pcs stonith create fence-nova fence_compute \
                                    auth-url=$OS_AUTH_URL \
                                    login=$OS_USERNAME \
                                    passwd=$OS_PASSWORD \
                                    tenant-name=$OS_TENANT_NAME \
                                    record-only=1 --force

17. Make certain the Compute nodes are able to recover after fencing:

heat-admin@controller-1 # sudo pcs property set cluster-recheck-interval=1min

18. Create Compute node resources and set the stonith level 1 to include both the nodes’s physical fence device and fence-nova. Run the following commands for each Compute node:

heat-admin@controller-1 # sudo pcs resource create overcloud-compute-N ocf:pacemaker:remote reconnect_interval=60 op monitor interval=20
heat-admin@controller-1 # sudo pcs property set --node overcloud-compute-N osprole=compute
heat-admin@controller-1 # sudo pcs stonith level add 1 overcloud-compute-N ipmilan-overcloud-compute-N,fence-nova
heat-admin@controller-1 # sudo pcs stonith

Replace N with the identifying number of each compute node (for example, overcloud-compute-1, overcloud-compute-2, and so on). Use these identifying numbers to match each compute nodes with the stonith devices created earlier (for example, overcloud-compute-1 and ipmilan-overcloud-compute-1).

19. Enable the control and Compute plane services:

heat-admin@controller-1 # sudo pcs resource enable openstack-keystone
heat-admin@controller-1 # sudo pcs resource enable neutron-openvswitch-agent-compute
heat-admin@controller-1 # sudo pcs resource enable libvirtd-compute
heat-admin@controller-1 # sudo pcs resource enable ceilometer-compute
heat-admin@controller-1 # sudo pcs resource enable nova-compute-checkevacuate
heat-admin@controller-1 # sudo pcs resource enable nova-compute

20. Allow some time for the environment to settle before cleaning up any failed resources:

heat-admin@controller-1 # sleep 60
heat-admin@controller-1 # sudo pcs resource cleanup
heat-admin@controller-1 # sudo pcs status
heat-admin@controller-1 # sudo pcs property set stonith-enabled=true