Chapter 29. Multi-site Pacemaker clusters
When a cluster spans more than one site, issues with network connectivity between the sites can lead to split-brain situations. When connectivity drops, there is no way for a node on one site to determine whether a node on another site has failed or is still functioning with a failed site interlink. In addition, it can be problematic to provide high availability services across two sites that are too far apart to keep synchronous. To address these issues, Pacemaker fully supports configuring high availability clusters that span multiple sites through the use of a Booth cluster ticket manager.
29.1. Overview of Booth cluster ticket manager
The Booth ticket manager is a distributed service that is meant to be run on a different physical network than the networks that connect the cluster nodes at particular sites. It yields another, loose cluster, a Booth formation, that sits on top of the regular clusters at the sites. This aggregated communication layer facilitates consensus-based decision processes for individual Booth tickets.
A Booth ticket is a singleton in the Booth formation and represents a time-sensitive, movable unit of authorization. Resources can be configured to require a certain ticket to run. This can ensure that resources are run at only one site at a time, for which a ticket or tickets have been granted.
You can think of a Booth formation as an overlay cluster consisting of clusters running at different sites, where all the original clusters are independent of each other. The Booth service communicates to the clusters whether they have been granted a ticket, and Pacemaker determines whether to run resources in a cluster based on a Pacemaker ticket constraint. This means that when using the ticket manager, each of the clusters can run its own resources as well as shared resources. For example, there can be resources A, B, and C running only in one cluster, resources D, E, and F running only in the other cluster, and resources G and H running in either of the two clusters as determined by a ticket. It is also possible to have an additional resource J that could run in either of the two clusters as determined by a separate ticket.
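The resource layout described above can be sketched as a small dry-run script. It only prints the ticket constraint commands and does not change any cluster. The ticket names ticketGH and ticketJ are hypothetical, chosen for illustration; the resource names G, H, and J come from the example in the text.

```shell
#!/bin/sh
# Dry-run sketch: print the ticket constraints for the shared resources.
# ticketGH and ticketJ are hypothetical ticket names, not part of this chapter.

ticket_constraints() {
    # $1 = ticket name, remaining arguments = resource IDs tied to that ticket
    ticket=$1
    shift
    for resource in "$@"; do
        echo "pcs constraint ticket add $ticket $resource"
    done
}

# Resources G and H follow one ticket; resource J follows a separate ticket.
ticket_constraints ticketGH G H
ticket_constraints ticketJ J
```

Resources A through F do not appear in the output because they run in only one cluster and need no ticket constraint. The printed commands would be entered on each cluster that can host the shared resources.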
29.2. Configuring multi-site clusters with Pacemaker
You can set up a multi-site configuration that uses the Booth ticket manager with the following procedure.
These example commands use the following arrangement:
Cluster 1 consists of the nodes cluster1-node1 and cluster1-node2
- Cluster 1 has a floating IP address assigned to it of 192.168.11.100
Cluster 2 consists of cluster2-node1 and cluster2-node2
- Cluster 2 has a floating IP address assigned to it of 192.168.22.100
The arbitrator node is arbitrator-node with an IP address of 192.168.99.100
The name of the Booth ticket that this configuration uses is apacheticket
These example commands assume that the cluster resources for an Apache service have been configured as part of the resource group apachegroup for each cluster. It is not required that the resources and resource groups be the same on each cluster to configure a ticket constraint for those resources, since the Pacemaker instance for each cluster is independent, but that is a common failover scenario.
Note that at any time in the configuration procedure you can enter the pcs booth config command to display the booth configuration for the current node or cluster or the pcs booth status command to display the current status of booth on the local node.
Install the booth-site Booth ticket manager package on each node of both clusters.
[root@cluster1-node1 ~]# dnf install -y booth-site
[root@cluster1-node2 ~]# dnf install -y booth-site
[root@cluster2-node1 ~]# dnf install -y booth-site
[root@cluster2-node2 ~]# dnf install -y booth-site
Install the pcs, booth-core, and booth-arbitrator packages on the arbitrator node.
[root@arbitrator-node ~]# dnf install -y pcs booth-core booth-arbitrator
If you are running the firewalld daemon, execute the following commands on all nodes in both clusters as well as on the arbitrator node to enable the ports that are required by the Red Hat High Availability Add-On.
# firewall-cmd --permanent --add-service=high-availability
# firewall-cmd --add-service=high-availability
You may need to modify which ports are open to suit local conditions. For more information on the ports that are required by the Red Hat High Availability Add-On, see Enabling ports for the High Availability Add-On.
Create a Booth configuration on one node of one cluster. The addresses you specify for each cluster and for the arbitrator must be IP addresses. For each cluster, you specify a floating IP address.
[cluster1-node1 ~] # pcs booth setup sites 192.168.11.100 192.168.22.100 arbitrators 192.168.99.100
This command creates the configuration files, including /etc/booth/booth.key, on the node from which it is run.
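Because the setup command accepts only IP addresses for the sites and the arbitrator, it can help to check the values before entering it. The following is a minimal sketch, assuming IPv4 addresses only. The helper names is_ipv4 and booth_setup_command are hypothetical, and the script only prints the command rather than running it.

```shell
#!/bin/sh
# Sketch: validate addresses, then print the pcs booth setup command.
# is_ipv4 and booth_setup_command are hypothetical helpers for illustration.

is_ipv4() {
    # Succeeds only for dotted-quad strings with every octet in 0-255.
    echo "$1" | grep -Eq '^([0-9]{1,3}\.){3}[0-9]{1,3}$' || return 1
    old_ifs=$IFS
    IFS=.
    set -- $1          # split the address into its four octets
    IFS=$old_ifs
    for octet in "$@"; do
        [ "$octet" -le 255 ] || return 1
    done
}

booth_setup_command() {
    # $1 and $2 = floating IP addresses of the sites, $3 = arbitrator IP address
    for addr in "$@"; do
        is_ipv4 "$addr" || { echo "not an IPv4 address: $addr" >&2; return 1; }
    done
    echo "pcs booth setup sites $1 $2 arbitrators $3"
}

booth_setup_command 192.168.11.100 192.168.22.100 192.168.99.100
```

Passing a host name such as cluster1-node1 instead of an address makes the helper print an error and return a non-zero status.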
Create a ticket for the Booth configuration. This is the ticket that you will use to define the resource constraint that will allow resources to run only when this ticket has been granted to the cluster.
This basic failover configuration procedure uses only one ticket, but you can create additional tickets for more complicated scenarios where each ticket is associated with a different resource or resources.
[cluster1-node1 ~] # pcs booth ticket add apacheticket
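At this point the Booth configuration should list both sites, the arbitrator, and the new ticket. The sketch below embeds a sample of what the generated configuration is expected to look like and extracts the values for a given key. The sample layout is an assumption and may differ between versions, and booth_conf_values is a hypothetical helper.

```shell
#!/bin/sh
# Sample configuration text (assumed layout; compare with your own
# pcs booth config output rather than relying on this sample).
sample_conf='authfile = /etc/booth/booth.key
site = 192.168.11.100
site = 192.168.22.100
arbitrator = 192.168.99.100
ticket = "apacheticket"'

booth_conf_values() {
    # Reads configuration text on stdin and prints each value set for key $1,
    # with surrounding double quotes removed.
    sed -n "s/^$1 *= *//p" | tr -d '"'
}

printf '%s\n' "$sample_conf" | booth_conf_values site
printf '%s\n' "$sample_conf" | booth_conf_values ticket
```

On a configured node, the same helper could read the real configuration from the pcs booth config output instead of the sample text.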
Synchronize the Booth configuration to all nodes in the current cluster.
[cluster1-node1 ~] # pcs booth sync
From the arbitrator node, pull the Booth configuration to the arbitrator. If you have not previously done so, you must first authenticate pcs to the node from which you are pulling the configuration.
[arbitrator-node ~] # pcs host auth cluster1-node1
[arbitrator-node ~] # pcs booth pull cluster1-node1
Pull the Booth configuration to the other cluster and synchronize to all the nodes of that cluster. As with the arbitrator node, if you have not previously done so, you must first authenticate pcs to the node from which you are pulling the configuration.
[cluster2-node1 ~] # pcs host auth cluster1-node1
[cluster2-node1 ~] # pcs booth pull cluster1-node1
[cluster2-node1 ~] # pcs booth sync
Start and enable Booth on the arbitrator.
Note
You must not manually start or enable Booth on any of the nodes of the clusters since Booth runs as a Pacemaker resource in those clusters.
[arbitrator-node ~] # pcs booth start
[arbitrator-node ~] # pcs booth enable
Configure Booth to run as a cluster resource on both cluster sites. This creates a resource group whose members include booth-service.
[cluster1-node1 ~] # pcs booth create ip 192.168.11.100
[cluster2-node1 ~] # pcs booth create ip 192.168.22.100
Add a ticket constraint to the resource group you have defined for each cluster.
[cluster1-node1 ~] # pcs constraint ticket add apacheticket apachegroup
[cluster2-node1 ~] # pcs constraint ticket add apacheticket apachegroup
You can enter the following command to display the currently configured ticket constraints.
pcs constraint ticket [show]
Grant the ticket you created for this setup to the first cluster.
Note that it is not necessary to have defined ticket constraints before granting a ticket. Once you have initially granted a ticket to a cluster, Booth takes over ticket management unless you override this manually with the pcs booth ticket revoke command. For information on the pcs booth administration commands, see the PCS help screen for the pcs booth command.
[cluster1-node1 ~] # pcs booth ticket grant apacheticket
You can add or remove tickets at any time, even after completing this procedure. After adding or removing a ticket, however, you must synchronize the configuration files to the other nodes and clusters as well as to the arbitrator, and grant the ticket as shown in this procedure.
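The propagation sequence for a ticket added later can be sketched as a dry-run script that prints which command is entered on which host. It reuses only commands shown in this procedure and the host names from the example arrangement. The ticket name newticket and the helper names step and propagate_ticket are hypothetical. Whether any Booth service must be restarted to pick up the change is not covered by this sketch.

```shell
#!/bin/sh
# Dry-run sketch: print, per host, the commands that propagate a new ticket.
# NEW_TICKET defaults to a hypothetical ticket name; nothing is executed.
NEW_TICKET=${NEW_TICKET:-newticket}
SOURCE_NODE=cluster1-node1

step() {
    # $1 = host where the command is entered, remaining arguments = the command
    host=$1
    shift
    echo "[$host] $*"
}

propagate_ticket() {
    # $1 = name of the ticket to add and grant
    step "$SOURCE_NODE" pcs booth ticket add "$1"
    step "$SOURCE_NODE" pcs booth sync                  # rest of cluster 1
    step arbitrator-node pcs booth pull "$SOURCE_NODE"  # arbitrator
    step cluster2-node1 pcs booth pull "$SOURCE_NODE"   # first node of cluster 2
    step cluster2-node1 pcs booth sync                  # rest of cluster 2
    step "$SOURCE_NODE" pcs booth ticket grant "$1"
}

propagate_ticket "$NEW_TICKET"
```

Printing the steps instead of running them keeps the sketch safe to try on any machine; the output can serve as a checklist when entering the commands by hand on each host.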
For information on additional Booth administration commands that you can use for cleaning up and removing Booth configuration files, tickets, and resources, see the PCS help screen for the pcs booth command.