Chapter 4. Using Pacemaker
In the OpenStack configuration described in Figure 1.1, “OpenStack HA environment deployed through director”, most OpenStack services run on the three controller nodes. To investigate the high availability features of those services, log in to any of the controllers as the heat-admin user and look at the services that Pacemaker controls.
Output from the Pacemaker pcs status command includes general Pacemaker information, virtual IP addresses, services, and other Pacemaker information.
For general information about Pacemaker in Red Hat Enterprise Linux, see Configuring and Managing High Availability Clusters in the Red Hat Enterprise Linux documentation.
4.1. General Pacemaker Information
The following example shows the general Pacemaker information section of the pcs status command output:
    $ sudo pcs status
    Cluster name: tripleo_cluster (1)
    Stack: corosync
    Current DC: overcloud-controller-1 (version 2.0.1-4.el8-0eb7991564) - partition with quorum
    Last updated: Thu Feb 8 14:29:21 2018
    Last change: Sat Feb 3 11:37:17 2018 by root via cibadmin on overcloud-controller-2

    12 nodes configured (2)
    37 resources configured (3)

    Online: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ] (4)
    GuestOnline: [ galera-bundle-0@overcloud-controller-0 galera-bundle-1@overcloud-controller-1 galera-bundle-2@overcloud-controller-2 rabbitmq-bundle-0@overcloud-controller-0 rabbitmq-bundle-1@overcloud-controller-1 rabbitmq-bundle-2@overcloud-controller-2 redis-bundle-0@overcloud-controller-0 redis-bundle-1@overcloud-controller-1 redis-bundle-2@overcloud-controller-2 ] (5)

    Full list of resources:
    [...]
The main sections of the output show the following information about the cluster:
- Name of the cluster.
- Number of nodes that are configured for the cluster.
- Number of resources that are configured for the cluster.
- Names of the controller nodes that are currently online.
- Names of the guest nodes that are currently online. Each guest node consists of a complex Bundle Set resource. For more information about bundle sets, see Section 4.3, “OpenStack Services Configured in Pacemaker”.
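As a quick health check, the quorum state and the number of online controllers can be pulled out of this header with a small filter. The following is a minimal sketch, not part of the product: the function name cluster_summary is hypothetical, and it assumes that the text layout matches the example output above.

```shell
# cluster_summary: hypothetical helper, not shipped with pcs.
# Reads `pcs status` text on stdin and prints the quorum state and the
# number of controllers listed on the "Online:" line.
# Assumption: the layout matches the example output in this section.
cluster_summary() {
  awk '
    /partition with quorum/ { quorum = "yes" }
    /^ *Online: \[/ {
      # Count only the names between "[" and "]".
      n = 0; inside = 0
      for (i = 1; i <= NF; i++) {
        if ($i == "[") { inside = 1; continue }
        if ($i == "]") { inside = 0; continue }
        if (inside) n++
      }
    }
    END { printf "quorum=%s online=%d\n", (quorum ? quorum : "no"), n }
  '
}

# Usage on a controller:
#   sudo pcs status | cluster_summary
```

The GuestOnline line is deliberately ignored because the pattern is anchored to a line that starts with "Online:".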
4.2. Virtual IP Addresses Configured in Pacemaker
Each IPaddr2 resource sets a virtual IP address that clients use to request access to a service. If the controller node that is assigned to that IP address fails, the IP address is reassigned to a different controller.
In this example, you can see each controller node that is currently set to listen to a particular virtual IP address.
    ip-10.200.0.6 (ocf::heartbeat:IPaddr2): Started overcloud-controller-1
    ip-192.168.1.150 (ocf::heartbeat:IPaddr2): Started overcloud-controller-0
    ip-172.16.0.10 (ocf::heartbeat:IPaddr2): Started overcloud-controller-1
    ip-172.16.0.11 (ocf::heartbeat:IPaddr2): Started overcloud-controller-0
    ip-172.18.0.10 (ocf::heartbeat:IPaddr2): Started overcloud-controller-2
    ip-172.19.0.10 (ocf::heartbeat:IPaddr2): Started overcloud-controller-2
In the output, each IP address is initially attached to a particular controller. For example, 192.168.1.150 is started on overcloud-controller-0. However, if that controller fails, the IP address is reassigned to another controller in the cluster.
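To see at a glance which controller currently holds each virtual IP address, you can reduce the resource list to address and node pairs. The following is a sketch under the assumption that IPaddr2 resources are named ip-<address> and reported as in the example above; the function name list_vips is hypothetical.

```shell
# list_vips: hypothetical helper, not shipped with pcs.
# Reads `pcs status` text on stdin and prints "<address> <controller>"
# for every IPaddr2 resource that is in the Started state.
# Assumption: resources are named "ip-<address>" as in the example.
list_vips() {
  awk '
    /IPaddr2/ && $(NF-1) == "Started" {
      addr = $1
      sub(/^ip-/, "", addr)   # strip the resource name prefix
      print addr, $NF
    }
  '
}

# Usage on a controller:
#   sudo pcs status | list_vips
```

Resources that are not in the Started state have no node name in the last column, so the filter skips them.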
The following table describes the IP addresses in the example and shows how each address was originally allocated.
Table 4.1. IP address description and allocation source
| IP Address | Description | Allocated From |
|---|---|---|
| 192.168.1.150 | Public IP address | |
| | Controller Virtual IP address | Part of the |
| | Provides access to OpenStack API services on a controller | |
| | Storage Virtual IP address that provides access to the Glance API and to Swift Proxy services | |
| | Provides access to Redis service on a controller | |
| | Provides access to storage management | |
You can view details about a specific IP address that is managed by Pacemaker with the pcs resource show command. For example, you can view timeout information or the netmask.
The following example shows the output of the pcs resource show command when you run it on the ip-192.168.1.150 public IP address.
    $ sudo pcs resource show ip-192.168.1.150
     Resource: ip-192.168.1.150 (class=ocf provider=heartbeat type=IPaddr2)
      Attributes: ip=192.168.1.150 cidr_netmask=32
      Operations: start interval=0s timeout=20s (ip-192.168.1.150-start-timeout-20s)
                  stop interval=0s timeout=20s (ip-192.168.1.150-stop-timeout-20s)
                  monitor interval=10s timeout=20s (ip-192.168.1.150-monitor-interval-10s)
If you are logged into the controller that is currently assigned to listen to the IP address 192.168.1.150, you can run the following commands to make sure that the controller is active and that the services are actively listening to that address:
    $ ip addr show vlan100
    9: vlan100: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN
        link/ether be:ab:aa:37:34:e7 brd ff:ff:ff:ff:ff:ff
        inet 192.168.1.151/24 brd 192.168.1.255 scope global vlan100
           valid_lft forever preferred_lft forever
        inet 192.168.1.150/32 brd 192.168.1.255 scope global vlan100
           valid_lft forever preferred_lft forever

    $ sudo netstat -tupln | grep "192.168.1.150.*haproxy"
    tcp 0 0 192.168.1.150:8778 0.0.0.0:* LISTEN 61029/haproxy
    tcp 0 0 192.168.1.150:8042 0.0.0.0:* LISTEN 61029/haproxy
    tcp 0 0 192.168.1.150:9292 0.0.0.0:* LISTEN 61029/haproxy
    tcp 0 0 192.168.1.150:8080 0.0.0.0:* LISTEN 61029/haproxy
    tcp 0 0 192.168.1.150:80 0.0.0.0:* LISTEN 61029/haproxy
    tcp 0 0 192.168.1.150:8977 0.0.0.0:* LISTEN 61029/haproxy
    tcp 0 0 192.168.1.150:6080 0.0.0.0:* LISTEN 61029/haproxy
    tcp 0 0 192.168.1.150:9696 0.0.0.0:* LISTEN 61029/haproxy
    tcp 0 0 192.168.1.150:8000 0.0.0.0:* LISTEN 61029/haproxy
    tcp 0 0 192.168.1.150:8004 0.0.0.0:* LISTEN 61029/haproxy
    tcp 0 0 192.168.1.150:8774 0.0.0.0:* LISTEN 61029/haproxy
    tcp 0 0 192.168.1.150:5000 0.0.0.0:* LISTEN 61029/haproxy
    tcp 0 0 192.168.1.150:8776 0.0.0.0:* LISTEN 61029/haproxy
    tcp 0 0 192.168.1.150:8041 0.0.0.0:* LISTEN 61029/haproxy
The ip command output shows that the vlan100 interface is listening to both the 192.168.1.150 and 192.168.1.151 IPv4 addresses.
The netstat command output shows the processes that are listening to the 192.168.1.150 address. In addition to the ntpd process that is listening at port 123, the haproxy process is the only other one listening specifically to 192.168.1.150.
Processes that are listening to all local addresses, such as 0.0.0.0, are also available through 192.168.1.150. These processes include ntpd, among others.
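The first of these checks, whether the local controller currently holds a given virtual IP address, can be scripted. The following is a minimal sketch that assumes the one-line output format of ip -o -4 addr show (interface name in the second field, address and prefix in the fourth); the function name holds_vip is hypothetical.

```shell
# holds_vip: hypothetical helper.
# $1: the virtual IP address to look for.
# stdin: output of `ip -o -4 addr show`.
# Prints the interface that carries the address and returns 0,
# or returns 1 if the address is not configured on this host.
holds_vip() {
  awk -v vip="$1" '
    $3 == "inet" {
      split($4, a, "/")          # drop the prefix length
      if (a[1] == vip) { print $2; found = 1 }
    }
    END { exit (found ? 0 : 1) }
  '
}

# Usage on a controller:
#   ip -o -4 addr show | holds_vip 192.168.1.150 && echo "VIP is local"
```

Because the function reports through its exit status, it can be combined with && or used in an if statement on each controller in turn.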
The port numbers in the netstat command output can help you to identify the specific service that HAProxy is listening for. You can view the /var/lib/config-data/puppet-generated/haproxy/etc/haproxy/haproxy.cfg file to see which services these port numbers represent.
The following list shows several examples of port numbers and the default services that are assigned to them:
- TCP port 6080: nova_novncproxy
- TCP port 9696: neutron
- TCP port 8000: heat_cfn
- TCP port 80: horizon
- TCP port 8776: cinder
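Rather than reading the haproxy.cfg file by hand, you can list the port-to-service mapping for one address directly. The following is a sketch, assuming that each service is defined in a listen (or frontend) section with one or more bind <address>:<port> lines, which is standard HAProxy syntax; the function name haproxy_ports is hypothetical.

```shell
# haproxy_ports: hypothetical helper.
# $1 (optional): only report bind lines for this IP address.
# stdin: contents of an haproxy.cfg file.
# Prints "<port> <section name>" for each matching bind line.
haproxy_ports() {
  awk -v vip="$1" '
    $1 == "listen" || $1 == "frontend" { section = $2 }
    $1 == "bind" && section != "" {
      # Keep the line if no address filter is set or the bind
      # address starts with "<vip>:".
      if (vip == "" || index($2, vip ":") == 1) {
        n = split($2, parts, ":")
        print parts[n], section   # the port is the last ":" field
      }
    }
  '
}

# Usage on a controller:
#   sudo cat /var/lib/config-data/puppet-generated/haproxy/etc/haproxy/haproxy.cfg \
#     | haproxy_ports 192.168.1.150
```

Comparing this list with the netstat output shows whether every configured service is actually being listened for on the address.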
Currently, most services that are defined in the haproxy.cfg file listen to the 192.168.1.150 IP address on all three controllers. However, only the controller-0 node is listening externally to the 192.168.1.150 IP address.
Therefore, if the controller-0 node fails, Pacemaker only needs to reassign 192.168.1.150 to another controller, and all other services will already be running on the fallback controller node.
4.3. OpenStack Services Configured in Pacemaker
The majority of the services that are managed by the cluster in Red Hat OpenStack Platform 12 and later are configured as Bundle Set resources, or bundles. These services can be started in the same way on each controller node and are set to always run on each controller.
- A bundle resource handles configuring and replicating the same container on all controller nodes, mapping the necessary storage paths to the container directories, and setting specific attributes related to the resource itself.
- A container can run different kinds of resources, from simple systemd-based services like haproxy to complex services like Galera, which require specific resource agents that control and set the state of the service on the different nodes.
- Using systemctl commands to manage bundles or containers is not supported. You can use the commands to check the status of the services, but you should use only Pacemaker to perform actions on these services.
- Podman containers that are controlled by Pacemaker have a RestartPolicy set to no by Podman. This is to ensure that Pacemaker and not Podman controls the container start and stop actions.
4.3.1. Simple Bundle Set resources (simple bundles)
A simple Bundle Set resource, or simple bundle, is a set of containers that each include the same Pacemaker services to be deployed across the controller nodes.
The following example shows the bundle settings from the pcs status command:
    Podman container set: haproxy-bundle [192.168.24.1:8787/rhosp15/openstack-haproxy:pcmklatest]
      haproxy-bundle-podman-0 (ocf::heartbeat:podman): Started overcloud-controller-0
      haproxy-bundle-podman-1 (ocf::heartbeat:podman): Started overcloud-controller-1
      haproxy-bundle-podman-2 (ocf::heartbeat:podman): Started overcloud-controller-2
For each bundle, you can see the following details:
- The name that Pacemaker assigns to the service
- The reference to the container that is associated with the bundle
- The list of the replicas that are running on the different controllers with their status
4.3.1.1. Simple bundle settings
To see details about a particular bundle service, such as the haproxy-bundle service, use the pcs resource show command. For example:
    $ sudo pcs resource show haproxy-bundle
    Bundle: haproxy-bundle
     Podman: image=192.168.24.1:8787/rhosp15/openstack-haproxy:pcmklatest network=host options="--user=root --log-driver=journald -e KOLLA_CONFIG_STRATEGY=COPY_ALWAYS" replicas=3 run-command="/bin/bash /usr/local/bin/kolla_start"
     Storage Mapping:
      options=ro source-dir=/var/lib/kolla/config_files/haproxy.json target-dir=/var/lib/kolla/config_files/config.json (haproxy-cfg-files)
      options=ro source-dir=/var/lib/config-data/puppet-generated/haproxy/ target-dir=/var/lib/kolla/config_files/src (haproxy-cfg-data)
      options=ro source-dir=/etc/hosts target-dir=/etc/hosts (haproxy-hosts)
      options=ro source-dir=/etc/localtime target-dir=/etc/localtime (haproxy-localtime)
      options=ro source-dir=/etc/pki/ca-trust/extracted target-dir=/etc/pki/ca-trust/extracted (haproxy-pki-extracted)
      options=ro source-dir=/etc/pki/tls/certs/ca-bundle.crt target-dir=/etc/pki/tls/certs/ca-bundle.crt (haproxy-pki-ca-bundle-crt)
      options=ro source-dir=/etc/pki/tls/certs/ca-bundle.trust.crt target-dir=/etc/pki/tls/certs/ca-bundle.trust.crt (haproxy-pki-ca-bundle-trust-crt)
      options=ro source-dir=/etc/pki/tls/cert.pem target-dir=/etc/pki/tls/cert.pem (haproxy-pki-cert)
      options=rw source-dir=/dev/log target-dir=/dev/log (haproxy-dev-log)
The haproxy-bundle example also shows the resource settings for HAProxy. Although HAProxy provides high availability services by load-balancing traffic to selected services, you keep HAProxy itself highly available by configuring it as a Pacemaker bundle service.
From the example output, you see that the bundle configures a Podman container with several specific parameters:
- image: Image used by the container, which refers to the local registry of the undercloud.
- network: Container network type, which is "host" in the example.
- options: Specific options for the container.
- replicas: Number that indicates how many copies of the container should be created in the cluster. Each bundle includes three containers, one for each controller node.
- run-command: System command used to spawn the container.
In addition to the Podman container specification, the bundle configuration also contains the Storage Mapping section, in which local paths on the host are mapped into the container. Therefore, to check the haproxy configuration from the host, you open the /var/lib/config-data/puppet-generated/haproxy/etc/haproxy/haproxy.cfg file instead of the /etc/haproxy/haproxy.cfg file.
4.3.1.2. Checking simple bundle status
You can check the status of the bundle by using the podman exec command to launch a command inside the container:
    $ sudo podman exec -it haproxy-bundle-podman-0 ps -efww | grep haproxy*
    root 7 1 0 06:08 ? 00:00:00 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -Ws
    haproxy 11 7 0 06:08 ? 00:00:17 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -Ws
The output shows that the process is running inside the container.
You can also check the bundle status directly from the host:
    $ ps -ef | grep haproxy*
    root 17774 17729 0 06:08 ? 00:00:00 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -Ws
    42454 17819 17774 0 06:08 ? 00:00:21 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -Ws
    root 288508 237714 0 07:04 pts/0 00:00:00 grep --color=auto haproxy*
    [root@controller-0 ~]# ps -ef | grep -e 17774 -e 17819
    root 17774 17729 0 06:08 ? 00:00:00 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -Ws
    42454 17819 17774 0 06:08 ? 00:00:22 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -Ws
    root 301950 237714 0 07:07 pts/0 00:00:00 grep --color=auto -e 17774 -e 17819
You can run the same commands on any bundle to see the current level of activity and details about the commands that the service runs.
4.3.2. Complex Bundle Set resources (complex bundles)
Complex Bundle Set resources, or complex bundles, are Pacemaker services that specify a resource configuration in addition to the basic container configuration, which is also included in simple bundles.
This additional configuration is needed to manage Multi-State resources, which are services that can have different states depending on which controller node they run on.
This example shows a list of complex bundles from the output of the pcs status command:
    Podman container set: rabbitmq-bundle [192.168.24.1:8787/rhosp15/openstack-rabbitmq:pcmklatest]
      rabbitmq-bundle-0 (ocf::heartbeat:rabbitmq-cluster): Started overcloud-controller-0
      rabbitmq-bundle-1 (ocf::heartbeat:rabbitmq-cluster): Started overcloud-controller-1
      rabbitmq-bundle-2 (ocf::heartbeat:rabbitmq-cluster): Started overcloud-controller-2
    Podman container set: galera-bundle [192.168.24.1:8787/rhosp15/openstack-mariadb:pcmklatest]
      galera-bundle-0 (ocf::heartbeat:galera): Master overcloud-controller-0
      galera-bundle-1 (ocf::heartbeat:galera): Master overcloud-controller-1
      galera-bundle-2 (ocf::heartbeat:galera): Master overcloud-controller-2
    Podman container set: redis-bundle [192.168.24.1:8787/rhosp15/openstack-redis:pcmklatest]
      redis-bundle-0 (ocf::heartbeat:redis): Master overcloud-controller-0
      redis-bundle-1 (ocf::heartbeat:redis): Slave overcloud-controller-1
      redis-bundle-2 (ocf::heartbeat:redis): Slave overcloud-controller-2
In the output, you see that unlike RabbitMQ, the Galera and Redis bundles are run as multi-state resources inside their containers.
For the galera-bundle resource, all three controllers are running as Galera masters. For the redis-bundle resource, the overcloud-controller-0 container is running as the master, while the other two controllers are running as slaves.
This means that the Galera service is running under one set of constraints on all three controllers, while the Redis service might run under different constraints on the master and the slave controllers.
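To compare the role distribution across bundles without scanning the full listing, you can count the replicas in each state per bundle. The following is a sketch under the assumption that replica lines look like the example above (a name that ends in -<number>, the agent in parentheses, then the role); the function name bundle_roles is hypothetical.

```shell
# bundle_roles: hypothetical helper, not shipped with pcs.
# Reads `pcs status` text on stdin and prints
# "<bundle> <role> <count>" for every bundle replica line.
# Assumption: replica lines match the layout of the example output.
bundle_roles() {
  awk '
    /-bundle-/ && /\(ocf::/ {
      name = $1
      sub(/-[0-9]+$/, "", name)     # galera-bundle-0 -> galera-bundle
      count[name " " $3]++          # $3 is Started, Master, or Slave
    }
    END { for (k in count) print k, count[k] }
  ' | LC_ALL=C sort
}

# Usage on a controller:
#   sudo pcs status | bundle_roles
```

For the example output, this reports three masters for galera-bundle and one master plus two slaves for redis-bundle, which matches the description above.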
The following example shows the output of the pcs resource show galera-bundle command:
    [...]
    Bundle: galera-bundle
     Podman: image=192.168.24.1:8787/rhosp15/openstack-mariadb:pcmklatest masters=3 network=host options="--user=root --log-driver=journald -e KOLLA_CONFIG_STRATEGY=COPY_ALWAYS" replicas=3 run-command="/bin/bash /usr/local/bin/kolla_start"
     Network: control-port=3123
     Storage Mapping:
      options=ro source-dir=/var/lib/kolla/config_files/mysql.json target-dir=/var/lib/kolla/config_files/config.json (mysql-cfg-files)
      options=ro source-dir=/var/lib/config-data/puppet-generated/mysql/ target-dir=/var/lib/kolla/config_files/src (mysql-cfg-data)
      options=ro source-dir=/etc/hosts target-dir=/etc/hosts (mysql-hosts)
      options=ro source-dir=/etc/localtime target-dir=/etc/localtime (mysql-localtime)
      options=rw source-dir=/var/lib/mysql target-dir=/var/lib/mysql (mysql-lib)
      options=rw source-dir=/var/log/mariadb target-dir=/var/log/mariadb (mysql-log-mariadb)
      options=rw source-dir=/dev/log target-dir=/dev/log (mysql-dev-log)
     Resource: galera (class=ocf provider=heartbeat type=galera)
      Attributes: additional_parameters=--open-files-limit=16384 cluster_host_map=overcloud-controller-0:overcloud-controller-0.internalapi.localdomain;overcloud-controller-1:overcloud-controller-1.internalapi.localdomain;overcloud-controller-2:overcloud-controller-2.internalapi.localdomain enable_creation=true wsrep_cluster_address=gcomm://overcloud-controller-0.internalapi.localdomain,overcloud-controller-1.internalapi.localdomain,overcloud-controller-2.internalapi.localdomain
      Meta Attrs: container-attribute-target=host master-max=3 ordered=true
      Operations: demote interval=0s timeout=120 (galera-demote-interval-0s)
                  monitor interval=20 timeout=30 (galera-monitor-interval-20)
                  monitor interval=10 role=Master timeout=30 (galera-monitor-interval-10)
                  monitor interval=30 role=Slave timeout=30 (galera-monitor-interval-30)
                  promote interval=0s on-fail=block timeout=300s (galera-promote-interval-0s)
                  start interval=0s timeout=120 (galera-start-interval-0s)
                  stop interval=0s timeout=120 (galera-stop-interval-0s)
    [...]
This output shows that unlike in a simple bundle, the galera-bundle resource includes explicit resource configuration, which determines all aspects of the multi-state resource.
Even though a service might be running on multiple controllers at the same time, the controller itself might not be listening at the IP address that is needed to actually reach those services.
For more information about troubleshooting the Galera resource, see Chapter 6, Using Galera.
4.4. Pacemaker Failed Actions
If any of the resources fail in any way, they will be listed under the Failed Actions heading of the pcs status output. In the following example, the openstack-cinder-volume service stopped working on controller-0:
    Failed Actions:
    * openstack-cinder-volume_monitor_60000 on overcloud-controller-0 'not running' (7): call=74, status=complete, exitreason='none',
        last-rc-change='Wed Dec 14 08:33:14 2016', queued=0ms, exec=0ms
In this case, the systemd service openstack-cinder-volume needs to be re-enabled. In other cases, you need to track down and fix the problem, then clean up the resources. See Section 7.1, “Correcting Resource Problems on Controllers” for details.
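When several resources fail at once, it can help to reduce the Failed Actions section to a list of resource and node pairs before you start troubleshooting. The following is a sketch that assumes the entry format shown in the example above (an asterisk, then <resource>_<operation>_<interval> on <node>); the function name failed_actions is hypothetical.

```shell
# failed_actions: hypothetical helper, not shipped with pcs.
# Reads `pcs status` text on stdin and prints "<resource> <node>"
# for each entry in the Failed Actions section.
# Assumption: entries use the format shown in the example above.
failed_actions() {
  awk '
    /^ *Failed (Resource )?Actions:/ { in_failed = 1; next }
    in_failed && NF == 0 { in_failed = 0 }     # a blank line ends the section
    in_failed && $1 == "*" && $3 == "on" {
      res = $2
      sub(/_[a-z]+_[0-9]+$/, "", res)          # drop "_monitor_60000"
      print res, $4
    }
  '
}

# Usage on a controller:
#   sudo pcs status | failed_actions
```

After you fix the underlying problem, the pcs resource cleanup command clears the failure record for a resource; Section 7.1 covers that procedure.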
4.5. Other Pacemaker Information for Controllers
The last sections of the pcs status output show information about your power management fencing (IPMI in this case) and the status of the Pacemaker service itself:
    my-ipmilan-for-controller-0 (stonith:fence_ipmilan): Started my-ipmilan-for-controller-0
    my-ipmilan-for-controller-1 (stonith:fence_ipmilan): Started my-ipmilan-for-controller-1
    my-ipmilan-for-controller-2 (stonith:fence_ipmilan): Started my-ipmilan-for-controller-2

    PCSD Status:
      overcloud-controller-0: Online
      overcloud-controller-1: Online
      overcloud-controller-2: Online

    Daemon Status:
      corosync: active/enabled
      pacemaker: active/enabled
      openstack-cinder-volume (systemd:openstack-cinder-volume): Started overcloud-controller-0
      pcsd: active/enabled
The my-ipmilan-for-controller settings show the type of fencing done for each node (stonith:fence_ipmilan) and whether or not the IPMI service is stopped or running. The PCSD Status shows that all three controllers are currently online. The Pacemaker service itself consists of three daemons: corosync, pacemaker, and pcsd. Here, all three services are active and enabled.
4.6. Fencing Hardware
When a controller node fails a health check, the controller acting as the Pacemaker designated coordinator (DC) uses the Pacemaker stonith service to fence off the offending node. Stonith is an acronym for the term "Shoot the other node in the head". So, the DC basically kicks the node out of the cluster.
To see how your fencing devices are configured by stonith for your OpenStack Platform HA cluster, run the following command:
    $ sudo pcs stonith show --full
     Resource: my-ipmilan-for-controller-0 (class=stonith type=fence_ipmilan)
      Attributes: pcmk_host_list=overcloud-controller-0 ipaddr=10.100.0.51 login=admin passwd=abc lanplus=1 cipher=3
      Operations: monitor interval=60s (my-ipmilan-for-controller-0-monitor-interval-60s)
     Resource: my-ipmilan-for-controller-1 (class=stonith type=fence_ipmilan)
      Attributes: pcmk_host_list=overcloud-controller-1 ipaddr=10.100.0.52 login=admin passwd=abc lanplus=1 cipher=3
      Operations: monitor interval=60s (my-ipmilan-for-controller-1-monitor-interval-60s)
     Resource: my-ipmilan-for-controller-2 (class=stonith type=fence_ipmilan)
      Attributes: pcmk_host_list=overcloud-controller-2 ipaddr=10.100.0.53 login=admin passwd=abc lanplus=1 cipher=3
      Operations: monitor interval=60s (my-ipmilan-for-controller-2-monitor-interval-60s)
The show --full listing shows details about the three controller nodes that relate to fencing. The fence device uses IPMI power management (fence_ipmilan) to turn the machines on and off as required. Information about the IPMI interface for each node includes the IP address of the IPMI interface (for example, 10.100.0.51 for overcloud-controller-0), the user name to log in as (admin), and the password to use (abc). You can also see the interval at which each host is monitored (60 seconds).
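Because the full listing includes the IPMI password in clear text, you might prefer a reduced view when you only need to confirm which IPMI address fences which host. The following is a sketch that assumes the Attributes line format shown in the example above; the function name fence_targets is hypothetical.

```shell
# fence_targets: hypothetical helper, not shipped with pcs.
# Reads `pcs stonith show --full` text on stdin and prints
# "<host> <IPMI address>" for each fence device, omitting credentials.
# Assumption: the Attributes line uses key=value pairs as in the example.
fence_targets() {
  awk '
    $1 == "Attributes:" {
      host = ""; addr = ""
      for (i = 2; i <= NF; i++) {
        split($i, kv, "=")
        if (kv[1] == "pcmk_host_list") host = kv[2]
        if (kv[1] == "ipaddr") addr = kv[2]
      }
      if (host != "" && addr != "") print host, addr
    }
  '
}

# Usage on a controller:
#   sudo pcs stonith show --full | fence_targets
```

The login and passwd attributes are read but never printed, so the output is safer to paste into a ticket or a log.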
For more information on fencing with Pacemaker, see "Configuring fencing in a Red Hat High Availability cluster" in the Red Hat Enterprise Linux 8 Configuring and Managing High Availability Clusters guide.