
Chapter 4. Using Pacemaker

In the OpenStack configuration illustrated in Figure 1.1, “OpenStack HA environment deployed through director”, most OpenStack services run on the three controller nodes. To investigate the high availability features of those services, log into any of the controllers as the heat-admin user and look at the services controlled by Pacemaker. Output from the pcs status command includes general cluster information, virtual IP addresses, services, and other Pacemaker resources.

4.1. General Pacemaker Information

The first part of the pcs status output displays the name of the cluster, when the cluster most recently changed, the current DC, the number of resources configured in the cluster, and the nodes in the cluster:

$ sudo pcs status
    Cluster name: tripleo_cluster
    Last updated: Mon Oct  5 13:42:37 2015
    Last change: Mon Oct  5 13:03:06 2015
    Stack: corosync
    Current DC: overcloud-controller-1 (2) - partition with quorum
    Version: 1.1.12-a14efad
    3 Nodes configured
    115 Resources configured
    Online: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]

    Full list of resources:
...

The initial output from sudo pcs status indicates that the cluster is named tripleo_cluster and it consists of three nodes (overcloud-controller-0, -1, and -2). All three nodes are currently online.

The number of resources configured to be managed within the cluster named tripleo_cluster can change, depending on how the systems are deployed. For this example, there were 115 resources.

The next part of the output from pcs status tells you exactly which resources have been started (IP addresses, services, and so on) and which controller nodes they are running on. The next several sections show examples of that output.
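
The header fields above can be picked out programmatically. The following sketch parses a sample of the pcs status header (embedded as a string for illustration; on a live controller you would pipe the real command output instead) to extract the node count and confirm the cluster has quorum:

```shell
# Sample header lines from `sudo pcs status`, embedded for illustration.
pcs_header='Cluster name: tripleo_cluster
Current DC: overcloud-controller-1 (2) - partition with quorum
3 Nodes configured
115 Resources configured'

# Extract the configured node count and check for quorum.
node_count=$(printf '%s\n' "$pcs_header" | awk '/Nodes configured/ {print $1}')
has_quorum=$(printf '%s\n' "$pcs_header" | grep -c 'partition with quorum')

echo "nodes=${node_count} quorum=${has_quorum}"
```

On a controller, the same extraction runs against `sudo pcs status` directly, which can be handy in monitoring scripts that alert when quorum is lost.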

4.2. Virtual IP Addresses Configured in Pacemaker

Each IPaddr2 resource sets a virtual IP address that clients use to request access to a service. If the Controller Node assigned to that IP address goes down, the IP address gets reassigned to a different controller. In this example, you can see each controller (overcloud-controller-0, -1, etc.) that is currently set to listen on a particular virtual IP address.

 ip-192.168.1.150	(ocf::heartbeat:IPaddr2):	Started overcloud-controller-0
 ip-10.200.0.6	(ocf::heartbeat:IPaddr2):	Started overcloud-controller-1
 ip-172.16.0.10	(ocf::heartbeat:IPaddr2):	Started overcloud-controller-1
 ip-172.16.0.11	(ocf::heartbeat:IPaddr2):	Started overcloud-controller-0
 ip-172.18.0.10	(ocf::heartbeat:IPaddr2):	Started overcloud-controller-2
 ip-172.19.0.10	(ocf::heartbeat:IPaddr2):	Started overcloud-controller-2

Notice that each IP address is initially attached to a particular controller (for example, 192.168.1.150 is started on overcloud-controller-0). However, if that controller goes down, its IP address would be reassigned to other controllers in the cluster. Here are descriptions of the IP addresses just shown and how they were originally allocated:

  • 192.168.1.150: Public IP address (allocated from ExternalAllocationPools in network-environment.yaml)
  • 10.200.0.6: Controller Virtual IP address (part of the dhcp_start and dhcp_end range set to 10.200.0.5-10.200.0.24 in undercloud.conf)
  • 172.16.0.10: IP address providing access to OpenStack API services on a controller (allocated from InternalApiAllocationPools in network-environment.yaml)
  • 172.16.0.11: IP address providing access to Redis service on a controller (allocated from InternalApiAllocationPools in network-environment.yaml)
  • 172.18.0.10: Storage Virtual IP address, providing access to Glance API and Swift Proxy services (allocated from StorageAllocationPools in network-environment.yaml)
  • 172.19.0.10: IP address providing access to storage management (allocated from StorageMgmtAllocationPools in network-environment.yaml)
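
To see at a glance which controller currently hosts each virtual IP, you can filter the IPaddr2 lines from pcs status. This sketch works on a sample of the output shown above (embedded as a string; on a controller you would pipe `sudo pcs status` instead):

```shell
# IPaddr2 resource lines as printed by `pcs status`, embedded as sample data.
vip_lines='ip-192.168.1.150 (ocf::heartbeat:IPaddr2): Started overcloud-controller-0
ip-10.200.0.6 (ocf::heartbeat:IPaddr2): Started overcloud-controller-1
ip-172.18.0.10 (ocf::heartbeat:IPaddr2): Started overcloud-controller-2'

# Map each virtual IP to the controller currently hosting it.
vip_map=$(printf '%s\n' "$vip_lines" | awk '{sub(/^ip-/, "", $1); print $1, $NF}')
echo "$vip_map"
```

Running the equivalent `sudo pcs status | grep IPaddr2` after a controller failure shows where Pacemaker has moved each address.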

You can use the pcs command to see details about a particular IPaddr2 address set in Pacemaker. For example, to see timeouts and other pertinent information for a virtual IP address, type the following for one of the IPaddr2 resources:

$ sudo pcs resource show ip-192.168.1.150
 Resource: ip-192.168.1.150 (class=ocf provider=heartbeat type=IPaddr2)
  Attributes: ip=192.168.1.150 cidr_netmask=32
  Operations: start interval=0s timeout=20s (ip-192.168.1.150-start-timeout-20s)
              stop interval=0s timeout=20s (ip-192.168.1.150-stop-timeout-20s)
              monitor interval=10s timeout=20s (ip-192.168.1.150-monitor-interval-10s)

If you are logged into the controller that is currently assigned the 192.168.1.150 address, run the following commands to confirm that the address is active and that services are actively listening on it:

$ ip addr show vlan100
  9: vlan100: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN
    link/ether be:ab:aa:37:34:e7 brd ff:ff:ff:ff:ff:ff
    inet 192.168.1.151/24 brd 192.168.1.255 scope global vlan100
       valid_lft forever preferred_lft forever
    inet 192.168.1.150/32 brd 192.168.1.255 scope global vlan100
       valid_lft forever preferred_lft forever

$ sudo netstat -tupln | grep 192.168.1.150
    tcp  0  0 192.168.1.150:6080      0.0.0.0:*  LISTEN      4333/haproxy
    tcp  0  0 192.168.1.150:9696      0.0.0.0:*  LISTEN      4333/haproxy
    tcp  0  0 192.168.1.150:8000      0.0.0.0:*  LISTEN      4333/haproxy
    tcp  0  0 192.168.1.150:8003      0.0.0.0:*  LISTEN      4333/haproxy
    tcp  0  0 192.168.1.150:8004      0.0.0.0:*  LISTEN      4333/haproxy
    tcp  0  0 192.168.1.150:8773      0.0.0.0:*  LISTEN      4333/haproxy
    tcp  0  0 192.168.1.150:8774      0.0.0.0:*  LISTEN      4333/haproxy
    tcp  0  0 192.168.1.150:5000      0.0.0.0:*  LISTEN      4333/haproxy
    tcp  0  0 192.168.1.150:8776      0.0.0.0:*  LISTEN      4333/haproxy
    tcp  0  0 192.168.1.150:8777      0.0.0.0:*  LISTEN      4333/haproxy
    tcp  0  0 192.168.1.150:9292      0.0.0.0:*  LISTEN      4333/haproxy
    tcp  0  0 192.168.1.150:8080      0.0.0.0:*  LISTEN      4333/haproxy
    tcp  0  0 192.168.1.150:80        0.0.0.0:*  LISTEN      4333/haproxy
    tcp  0  0 192.168.1.150:35357     0.0.0.0:*  LISTEN      4333/haproxy
    udp  0  0 192.168.1.150:123       0.0.0.0:*              459/ntpd
    ...
    tcp  0  0 0.0.0.0:22              0.0.0.0:*  LISTEN      2471/sshd
    tcp  0  0 0.0.0.0:4567            0.0.0.0:*  LISTEN      10064/mysqld
    ...
    udp  0  0 0.0.0.0:51475           0.0.0.0:*              545/dhclient
    udp  0  0 0.0.0.0:123             0.0.0.0:*              459/ntpd
    udp  0  0 0.0.0.0:161             0.0.0.0:*              1633/snmpd
    ...

The ip command shows that the vlan100 interface is listening on both the 192.168.1.150 and 192.168.1.151 IPv4 addresses. In the output from the netstat command, you can see all the processes listening on the 192.168.1.150 address. Besides the ntpd process (listening on port 123), haproxy is the only process bound specifically to 192.168.1.150. Also keep in mind that processes listening on all local addresses (0.0.0.0), such as sshd, mysqld, dhclient, and ntpd, are also reachable through 192.168.1.150.
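
The distinction between VIP-specific binds and wildcard (0.0.0.0) binds can be made mechanically. This sketch separates the two on a few sample netstat lines (embedded as a string; on a controller you would pipe `sudo netstat -tupln` instead):

```shell
vip=192.168.1.150

# Sample `netstat -tupln` lines, embedded for illustration.
netstat_out='tcp 0 0 192.168.1.150:80 0.0.0.0:* LISTEN 4333/haproxy
udp 0 0 192.168.1.150:123 0.0.0.0:* 459/ntpd
tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN 2471/sshd'

# Processes bound specifically to the virtual IP (column 4 is Local Address).
vip_only=$(printf '%s\n' "$netstat_out" |
  awk -v vip="$vip" 'index($4, vip ":") == 1 {print $NF}' | sort -u)

# Processes bound to all local addresses, which the VIP also reaches.
wildcard=$(printf '%s\n' "$netstat_out" |
  awk '$4 ~ /^0\.0\.0\.0:/ {print $NF}' | sort -u)

echo "vip-only: $vip_only"
echo "wildcard: $wildcard"
```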

The port numbers shown in the netstat output help you identify exactly which services haproxy is load balancing. You can look inside the /etc/haproxy/haproxy.cfg file to see which services those port numbers represent. Here are just a few examples:

  • TCP port 6080: nova_novncproxy
  • TCP port 9696: neutron
  • TCP port 8000: heat_cfn
  • TCP port 8003: heat_cloudwatch
  • TCP port 80: horizon

At this time, there are 14 services in haproxy.cfg listening specifically on 192.168.1.150 on all three controllers. However, only controller-0 is actually listening externally on that address. So, if controller-0 goes down, Pacemaker only needs to reassign 192.168.1.150 to another controller; all the services will already be running there.
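
The port-to-service mapping can be extracted from the listen/bind pairs in haproxy.cfg. The fragment below is a minimal, assumed layout using service names and ports from the list above; the exact file on your controllers may be organized differently:

```shell
# Minimal haproxy.cfg fragment in the listen/bind style; the exact layout
# on a deployed controller may differ.
cfg='listen nova_novncproxy
  bind 192.168.1.150:6080
listen neutron
  bind 192.168.1.150:9696
listen horizon
  bind 192.168.1.150:80'

# Build a "port service" map from each listen block and its bind line.
port_map=$(printf '%s\n' "$cfg" | awk '
  $1 == "listen" { svc = $2 }
  $1 == "bind"   { split($2, a, ":"); print a[2], svc }')
echo "$port_map"
```

Against the real file, the equivalent one-liner is `awk '...' /etc/haproxy/haproxy.cfg`, which gives you a quick lookup table when an unfamiliar port shows up in netstat output.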

4.3. OpenStack Services Configured in Pacemaker

Most services are configured as Clone Set resources (or clones), where the service starts the same way on each controller and is set to always run on every controller. Services are cloned when they need to be active on multiple nodes. As such, you can only clone services that can be active on multiple nodes simultaneously (that is, cluster-aware services).

Other services are configured as Multi-state resources. Multi-state resources are a specialized type of clone: unlike ordinary Clone Set resources, a Multi-state resource can be in either a master or a slave state. When an instance is started, it must come up in the slave state. Beyond that, the names of the two states have no special meaning; however, they allow clones of the same service to run under different rules or constraints.

Keep in mind that, even though a service may be running on multiple controllers at the same time, the controller itself may not be listening on the IP address needed to actually reach those services.

Clone Set resources (clones)

Here are the clone settings from pcs status:

Clone Set: haproxy-clone [haproxy]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: mongod-clone [mongod]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: rabbitmq-clone [rabbitmq]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: memcached-clone [memcached]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-nova-scheduler-clone [openstack-nova-scheduler]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: neutron-l3-agent-clone [neutron-l3-agent]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-ceilometer-alarm-notifier-clone [openstack-ceilometer-alarm-notifier]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-heat-engine-clone [openstack-heat-engine]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-ceilometer-api-clone [openstack-ceilometer-api]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: neutron-metadata-agent-clone [neutron-metadata-agent]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: neutron-ovs-cleanup-clone [neutron-ovs-cleanup]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: neutron-netns-cleanup-clone [neutron-netns-cleanup]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-heat-api-clone [openstack-heat-api]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-cinder-scheduler-clone [openstack-cinder-scheduler]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-nova-api-clone [openstack-nova-api]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-heat-api-cloudwatch-clone [openstack-heat-api-cloudwatch]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-ceilometer-collector-clone [openstack-ceilometer-collector]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-keystone-clone [openstack-keystone]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-nova-consoleauth-clone [openstack-nova-consoleauth]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-glance-registry-clone [openstack-glance-registry]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-ceilometer-notification-clone [openstack-ceilometer-notification]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-cinder-api-clone [openstack-cinder-api]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: neutron-dhcp-agent-clone [neutron-dhcp-agent]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-glance-api-clone [openstack-glance-api]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: neutron-openvswitch-agent-clone [neutron-openvswitch-agent]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-nova-novncproxy-clone [openstack-nova-novncproxy]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: delay-clone [delay]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: neutron-server-clone [neutron-server]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: httpd-clone [httpd]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-ceilometer-central-clone [openstack-ceilometer-central]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-ceilometer-alarm-evaluator-clone [openstack-ceilometer-alarm-evaluator]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-heat-api-cfn-clone [openstack-heat-api-cfn]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 openstack-cinder-volume  (systemd:openstack-cinder-volume):  Started overcloud-controller-0
 Clone Set: openstack-nova-conductor-clone [openstack-nova-conductor]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]

For each of the Clone Set resources, you can see the following:

  • The name Pacemaker assigns to the service
  • The actual service name
  • The controllers on which the services are started or stopped

With Clone Set, the service is intended to start the same way on all controllers. To see details for a particular clone service (such as the haproxy service), use the pcs resource show command. For example:

$ sudo pcs resource show haproxy-clone
 Clone: haproxy-clone
  Resource: haproxy (class=systemd type=haproxy)
   Operations: start interval=0s timeout=60s (haproxy-start-timeout-60s)
               monitor interval=60s (haproxy-monitor-interval-60s)
$ sudo systemctl status haproxy
haproxy.service - Cluster Controlled haproxy
   Loaded: loaded (/usr/lib/systemd/system/haproxy.service; disabled)
  Drop-In: /run/systemd/system/haproxy.service.d
           └─50-pacemaker.conf
   Active: active (running) since Tue 2015-10-06 08:58:49 EDT; 1h 52min ago
 Main PID: 4215 (haproxy-systemd)
   CGroup: /system.slice/haproxy.service
           ├─4215 /usr/sbin/haproxy-systemd-wrapper -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid
           ├─4216 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -Ds
           └─4217 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -Ds

The haproxy-clone example displays the resource settings for HAProxy. Although HAProxy provides high availability services by load-balancing traffic to selected services, keeping HAProxy itself highly available is done here by configuring it as a Pacemaker clone service.

From the output, notice that the resource is a systemd service named haproxy. It also has start interval and timeout values as well as monitor intervals. The systemctl status command shows that haproxy is currently active. The actual running processes for the haproxy service are listed at the end of the output. Because the whole command line is shown, you can see the configuration file (haproxy.cfg) and PID file (haproxy.pid) associated with the command.

Run those same commands on any Clone Set resource to see its current level of activity and details about the commands the service runs. Note that systemd services controlled by Pacemaker are set to disabled by systemd, since you want Pacemaker and not the system’s boot process to control when the service comes up or goes down.
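
That disabled-but-active pattern is easy to verify. The sketch below wraps the check in a small helper that takes the two systemd states as plain strings for illustration; on a controller you would feed it the output of `systemctl is-enabled haproxy` and `systemctl is-active haproxy`:

```shell
# A Pacemaker-controlled systemd service should be "disabled" (so the boot
# process leaves it alone) yet "active" (because Pacemaker started it).
pacemaker_controlled() {
  enabled_state=$1
  active_state=$2
  [ "$enabled_state" = "disabled" ] && [ "$active_state" = "active" ]
}

# Sample states, as they would appear for haproxy on a healthy controller.
if pacemaker_controlled disabled active; then
  result=ok
else
  result=suspicious
fi
echo "$result"
```

A service that reports enabled here has likely been re-enabled by hand, which can cause it to race Pacemaker at boot.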

For more information about Clone Set resources, see Resource Clones in the High Availability Add-On Reference.

Multi-state resources (master/slave)

The Galera and Redis services are run as Multi-state resources. Here is what the pcs status output looks like for those two types of services:

[...]
Master/Slave Set: galera-master [galera]
     Masters: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]

Master/Slave Set: redis-master [redis]
     Masters: [ overcloud-controller-2 ]
     Slaves: [ overcloud-controller-0 overcloud-controller-1 ]
[...]

For the galera-master resource, all three controllers are running as Galera masters. For the redis-master resource, overcloud-controller-2 is running as the master, while the other two controllers are running as slaves. This means that at the moment, the galera service is running under one set of constraints on all three controllers, while redis may be subject to different constraints on the master and slave controllers.
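
A quick way to see current master/slave placement is to parse the Multi-state lines. This sketch works on a sample of the redis output shown above (embedded as a string; on a controller you would pipe `sudo pcs status` instead):

```shell
# Sample Multi-state output from `pcs status`, embedded for illustration.
ms_out='Master/Slave Set: redis-master [redis]
     Masters: [ overcloud-controller-2 ]
     Slaves: [ overcloud-controller-0 overcloud-controller-1 ]'

# Pull out the current master and count the slaves.
# A "Slaves:" line has the form: Slaves: [ node1 node2 ... ],
# so the node count is NF minus the three bracketing fields.
redis_master=$(printf '%s\n' "$ms_out" | awk '/Masters:/ {print $3}')
slave_count=$(printf '%s\n' "$ms_out" | awk '/Slaves:/ {print NF - 3}')
echo "master=${redis_master} slaves=${slave_count}"
```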

For more information about Multi-State resources, see Multi-State Resources: Resources That Have Multiple Modes in the High Availability Add-On Reference.

For more information about troubleshooting the Galera resource, see Chapter 6, Using Galera.

4.4. Pacemaker Failed Actions

If a resource fails in any way, it is listed under the Failed actions heading of the pcs status output. Here is an example where the httpd service stopped working on controller-0:

Failed actions:
    httpd_monitor_60000 on overcloud-controller-0 'not running' (7): call=190, status=complete, exit-reason='none', last-rc-change='Thu Oct  8 10:12:32 2015', queued=0ms, exec=0ms

In this case, the systemd service httpd just needed to be restarted. In other cases, you need to track down and fix the problem, then clean up the resources. See Section 7.1, “Correcting Resource Problems on Controllers” for details.
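
When scripting around failed actions, the resource name and node can be recovered from the failed-action line itself. This sketch parses a sample line in the shape shown above (embedded as a string for illustration):

```shell
# A failed-action line in the shape printed by `pcs status`, as sample data.
failed="httpd_monitor_60000 on overcloud-controller-0 'not running' (7): call=190, status=complete"

# The first field is <resource>_<operation>_<interval-ms>; strip the
# operation suffix to recover the resource name. The node follows "on".
res=$(printf '%s\n' "$failed" | awk '{sub(/_[a-z]+_[0-9]+$/, "", $1); print $1}')
node=$(printf '%s\n' "$failed" | awk '{print $3}')
echo "resource=${res} node=${node}"
```

Once the underlying problem is fixed, clearing the entry with `sudo pcs resource cleanup` on the affected resource removes it from the Failed actions list.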

4.5. Other Pacemaker Information for Controllers

The last sections of the pcs status output show information about your power management fencing (IPMI in this case) and the status of the Pacemaker service itself:

 my-ipmilan-for-controller-0	(stonith:fence_ipmilan): Started my-ipmilan-for-controller-0
 my-ipmilan-for-controller-1	(stonith:fence_ipmilan): Started my-ipmilan-for-controller-1
 my-ipmilan-for-controller-2	(stonith:fence_ipmilan): Started my-ipmilan-for-controller-2

PCSD Status:
  overcloud-controller-0: Online
  overcloud-controller-1: Online
  overcloud-controller-2: Online

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled

The my-ipmilan-for-controller settings show the type of fencing used for each node (stonith:fence_ipmilan) and whether the IPMI service is stopped or running. The PCSD Status section shows that all three controllers are currently online. The Pacemaker service itself consists of three daemons: corosync, pacemaker, and pcsd. Here, all three daemons are active and enabled.

4.6. Fencing Hardware

When a controller node fails a health check, the controller acting as the Pacemaker designated coordinator (DC) uses the Pacemaker stonith service to fence off the offending node. Stonith is an acronym for "Shoot The Other Node In The Head": in effect, the DC kicks the failed node out of the cluster.

To see how your fencing devices are configured by stonith for your OpenStack Platform HA cluster, run the following command:

$ sudo pcs stonith show --full
 Resource: my-ipmilan-for-controller-0 (class=stonith type=fence_ipmilan)
  Attributes: pcmk_host_list=overcloud-controller-0 ipaddr=10.100.0.51 login=admin passwd=abc lanplus=1 cipher=3
  Operations: monitor interval=60s (my-ipmilan-for-controller-0-monitor-interval-60s)
 Resource: my-ipmilan-for-controller-1 (class=stonith type=fence_ipmilan)
  Attributes: pcmk_host_list=overcloud-controller-1 ipaddr=10.100.0.52 login=admin passwd=abc lanplus=1 cipher=3
  Operations: monitor interval=60s (my-ipmilan-for-controller-1-monitor-interval-60s)
 Resource: my-ipmilan-for-controller-2 (class=stonith type=fence_ipmilan)
  Attributes: pcmk_host_list=overcloud-controller-2 ipaddr=10.100.0.53 login=admin passwd=abc lanplus=1 cipher=3
  Operations: monitor interval=60s (my-ipmilan-for-controller-2-monitor-interval-60s)

The show --full listing shows details about the three controller nodes that relate to fencing. The fence device uses IPMI power management (fence_ipmilan) to turn the machines on and off as required. Information about the IPMI interface for each node includes the IP address of the IPMI interface (for example, 10.100.0.51), the user name to log in as (admin), and the password to use (abc). You can also see the interval at which each host is monitored (60 seconds).

For more information on fencing with Pacemaker, see "Fencing Configuration" in Red Hat Enterprise Linux 7 High Availability Add-On Administration.