Chapter 56. Creating a Red Hat High-Availability cluster with Pacemaker
The following procedure creates a Red Hat High Availability two-node cluster using the
pcs command line interface.
Configuring the cluster in this example requires that your system include the following components:
- 2 nodes, which will be used to create the cluster. In this example, the nodes used are z1.example.com and z2.example.com.
- Network switches for the private network. We recommend but do not require a private network for communication among the cluster nodes and other cluster hardware such as network power switches and Fibre Channel switches.
- A fencing device for each node of the cluster. This example uses two ports of the APC power switch with a host name of zapc.example.com.
56.1. Installing cluster software
This procedure installs the cluster software and configures your system for cluster creation.
On each node in the cluster, install the Red Hat High Availability Add-On software packages along with all available fence agents from the High Availability channel.
yum install pcs pacemaker fence-agents-all
Alternatively, you can install the Red Hat High Availability Add-On software packages along with only the fence agent that you require with the following command.
yum install pcs pacemaker fence-agents-model
The following command displays a list of the available fence agents.
rpm -q -a | grep fence
fence-agents-rhevm-4.0.2-3.el7.x86_64
fence-agents-ilo-mp-4.0.2-3.el7.x86_64
fence-agents-ipmilan-4.0.2-3.el7.x86_64
...
Warning
After you install the Red Hat High Availability Add-On packages, you should ensure that your software update preferences are set so that nothing is installed automatically. Installation on a running cluster can cause unexpected behaviors. For more information, see Recommended Practices for Applying Software Updates to a RHEL High Availability or Resilient Storage Cluster.
If you are running the firewalld daemon, execute the following commands to enable the ports that are required by the Red Hat High Availability Add-On.
Note
You can determine whether the firewalld daemon is installed on your system with the rpm -q firewalld command. If it is installed, you can determine whether it is running with the firewall-cmd --state command.
firewall-cmd --permanent --add-service=high-availability
The ideal firewall configuration for cluster components depends on the local environment, where you may need to take into account such considerations as whether the nodes have multiple network interfaces or whether off-host firewalling is present. The example here, which opens the ports that are generally required by a Pacemaker cluster, should be modified to suit local conditions. Enabling ports for the High Availability Add-On shows the ports to enable for the Red Hat High Availability Add-On and provides an explanation for what each port is used for.
In order to use pcs to configure the cluster and communicate among the nodes, you must set a password on each node for the user ID hacluster, which is the pcs administration account. It is recommended that the password for user hacluster be the same on each node.
passwd hacluster
Changing password for user hacluster.
New password:
Retype new password:
passwd: all authentication tokens updated successfully.
Before the cluster can be configured, the pcsd daemon must be started and enabled to start up on boot on each node. This daemon works with the pcs command to manage configuration across the nodes in the cluster.
On each node in the cluster, execute the following commands to start the pcsd service and to enable pcsd at system start.
systemctl start pcsd.service
systemctl enable pcsd.service
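The per-node steps in this section can be collected into one script. The following is a minimal sketch rather than part of the documented procedure: the run helper and the DRY_RUN variable are illustrative names, and the script only prints the commands unless DRY_RUN=0 is set. Setting the hacluster password is interactive and is deliberately left out, and the firewall-cmd line applies only where firewalld is running.

```shell
#!/bin/sh
# Sketch: per-node preparation steps from this section, in order.
# DRY_RUN defaults to 1, so the commands are printed instead of executed.
DRY_RUN="${DRY_RUN:-1}"

# run: hypothetical helper that prints a command in dry-run mode
# and executes it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

prepare_node() {
    # Install the cluster packages and all fence agents.
    run yum install -y pcs pacemaker fence-agents-all
    # Open the cluster ports (only needed when firewalld is running).
    run firewall-cmd --permanent --add-service=high-availability
    # Start pcsd now and enable it at boot.
    run systemctl start pcsd.service
    run systemctl enable pcsd.service
    # Not automated here: run "passwd hacluster" interactively on each node.
}

prepare_node
```

Run it once with the default settings to review the command list, then again as root with DRY_RUN=0 on each node.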
56.2. Installing the pcp-zeroconf package (recommended)
When you set up your cluster, it is recommended that you install the
pcp-zeroconf package for the Performance Co-Pilot (PCP) tool. PCP is Red Hat’s recommended resource-monitoring tool for RHEL systems. Installing the
pcp-zeroconf package allows you to have PCP running and collecting performance-monitoring data for the benefit of investigations into fencing, resource failures, and other events that disrupt the cluster.
Cluster deployments where PCP is enabled will need sufficient space available for PCP’s captured data on the file system that contains /var/log/pcp/. Typical space usage by PCP varies across deployments, but 10 GB is usually sufficient when using the pcp-zeroconf default settings, and some environments may require less. Monitoring usage in this directory over a 14-day period of typical activity can provide a more accurate usage expectation.
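As a rough way to check the space guidance above, the following sketch compares the free space on the file system that holds a directory against a threshold. The has_space function and the PCP_DIR variable are illustrative names, not part of PCP, and the 10 GB figure is taken from the paragraph above as 10485760 KB.

```shell
#!/bin/sh
# Sketch: check free space on the file system that contains a directory.

# has_space PATH MIN_KB
# Succeeds when the file system holding PATH has at least MIN_KB
# kilobytes available. Fails if PATH does not exist.
has_space() {
    path="$1"
    min_kb="$2"
    # df -P prints one POSIX-format line per file system; -k reports KB.
    avail_kb="$(df -Pk "$path" 2>/dev/null | awk 'NR==2 {print $4}')"
    [ -n "$avail_kb" ] && [ "$avail_kb" -ge "$min_kb" ]
}

PCP_DIR="${PCP_DIR:-/var/log/pcp}"
if has_space "$PCP_DIR" 10485760; then
    echo "$PCP_DIR: at least 10 GB available"
else
    echo "$PCP_DIR: less than 10 GB available, or the directory does not exist yet"
fi
```

To follow actual usage over the suggested 14-day period, du -sh /var/log/pcp/ can be run at intervals and the results compared.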
To install the
pcp-zeroconf package, run the following command.
yum install pcp-zeroconf
This package enables
pmcd and sets up data capture at a 10-second interval.
For information on reviewing PCP data, see Why did a RHEL High Availability cluster node reboot - and how can I prevent it from happening again? on the Red Hat Customer Portal.
56.3. Creating a high availability cluster
This procedure creates a Red Hat High Availability Add-On cluster that consists of the nodes z1.example.com and z2.example.com.
Authenticate the pcs user hacluster for each node in the cluster on the node from which you will be running pcs.
The following command authenticates user hacluster on z1.example.com for both of the nodes in a two-node cluster that will consist of z1.example.com and z2.example.com.
pcs host auth z1.example.com z2.example.com
Username: hacluster
Password:
z1.example.com: Authorized
z2.example.com: Authorized
Execute the following command from z1.example.com to create the two-node cluster my_cluster that consists of nodes z1.example.com and z2.example.com. This will propagate the cluster configuration files to both nodes in the cluster. This command includes the --start option, which will start the cluster services on both nodes in the cluster.
pcs cluster setup my_cluster --start z1.example.com z2.example.com
Enable the cluster services to run on each node in the cluster when the node is booted.
Note
For your particular environment, you may choose to leave the cluster services disabled by skipping this step. This allows you to ensure that if a node goes down, any issues with your cluster or your resources are resolved before the node rejoins the cluster. If you leave the cluster services disabled, you will need to manually start the services when you reboot a node by executing the pcs cluster start command on that node.
pcs cluster enable --all
You can display the current status of the cluster with the
pcs cluster status command. Because there may be a slight delay before the cluster is up and running when you start the cluster services with the
--start option of the
pcs cluster setup command, you should ensure that the cluster is up and running before performing any subsequent actions on the cluster and its configuration.
pcs cluster status
Cluster Status:
 Stack: corosync
 Current DC: z2.example.com (version 2.0.0-10.el8-b67d8d0de9) - partition with quorum
 Last updated: Thu Oct 11 16:11:18 2018
 Last change: Thu Oct 11 16:11:00 2018 by hacluster via crmd on z2.example.com
 2 Nodes configured
 0 Resources configured
...
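Because the cluster may take a moment to come up after setup, a script that continues with further configuration can poll before proceeding. The following is a minimal sketch: wait_until_ok is an illustrative helper, and it assumes that pcs cluster status exits with a non-zero status while the cluster is not yet running on the node, which you should confirm on your own system.

```shell
#!/bin/sh
# Sketch: retry a command until it succeeds or the attempts run out.

# wait_until_ok TRIES DELAY_SECONDS COMMAND [ARGS...]
# Returns 0 as soon as COMMAND succeeds, 1 after TRIES failed attempts.
wait_until_ok() {
    tries="$1"
    delay="$2"
    shift 2
    while [ "$tries" -gt 0 ]; do
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        tries=$((tries - 1))
        sleep "$delay"
    done
    return 1
}

# Intended use on a cluster node (assumption: non-zero exit while down):
#   wait_until_ok 30 2 pcs cluster status || echo "cluster did not start" >&2

# Harmless demonstration with a command that always succeeds.
wait_until_ok 3 0 true && echo "ready"
```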
56.4. Creating a high availability cluster with multiple links
You can use the
pcs cluster setup command to create a Red Hat High Availability cluster with multiple links by specifying all of the links for each node.
The format for the command to create a two-node cluster with two links is as follows.
pcs cluster setup cluster_name node1_name addr=node1_link0_address addr=node1_link1_address node2_name addr=node2_link0_address addr=node2_link1_address
When creating a cluster with multiple links, you should take the following into account.
- The order of the addr=address parameters is important. The first address specified after a node name is for link0, the second one for link1, and so forth.
- It is possible to specify up to eight links using the knet transport protocol, which is the default transport protocol.
- All nodes must have the same number of addr= parameters.
- As of RHEL 8.1, it is possible to add, remove, and change links in an existing cluster using the pcs cluster link add, the pcs cluster link remove, the pcs cluster link delete, and the pcs cluster link update commands.
- As with single-link clusters, do not mix IPv4 and IPv6 addresses in one link, although you can have one link running IPv4 and the other running IPv6.
- As with single-link clusters, you can specify addresses as IP addresses or as names as long as the names resolve to IPv4 or IPv6 addresses for which IPv4 and IPv6 addresses are not mixed in one link.
The following example creates a two-node cluster named my_twolink_cluster with two nodes, rh80-node1 and rh80-node2. rh80-node1 has two interfaces, IP address 192.168.122.201 as link0 and 192.168.123.201 as link1. rh80-node2 has two interfaces, IP address 192.168.122.202 as link0 and 192.168.123.202 as link1.
pcs cluster setup my_twolink_cluster rh80-node1 addr=192.168.122.201 addr=192.168.123.201 rh80-node2 addr=192.168.122.202 addr=192.168.123.202
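When the node and address lists come from an inventory, the command line above can be assembled by a small helper. The following is a sketch only: build_setup_cmd and its NODE:ADDR,ADDR argument format are illustrative and not part of pcs. It prints the command instead of running it, keeps the addresses in link order, and rejects input where nodes have different numbers of links.

```shell
#!/bin/sh
# Sketch: assemble a multi-link "pcs cluster setup" command line.

# build_setup_cmd CLUSTER NODE:ADDR0,ADDR1[,...] [NODE:ADDR0,ADDR1 ...]
# The first address after a node name becomes link0, the second link1.
build_setup_cmd() {
    cluster="$1"
    shift
    cmd="pcs cluster setup $cluster"
    expected=""
    for spec in "$@"; do
        node="${spec%%:*}"    # text before the first colon
        addrs="${spec#*:}"    # text after the first colon
        cmd="$cmd $node"
        n=0
        for a in $(echo "$addrs" | tr ',' ' '); do
            cmd="$cmd addr=$a"
            n=$((n + 1))
        done
        # Every node must supply the same number of links.
        if [ -n "$expected" ] && [ "$n" -ne "$expected" ]; then
            echo "error: $node has $n links, expected $expected" >&2
            return 1
        fi
        expected="$n"
    done
    echo "$cmd"
}

build_setup_cmd my_twolink_cluster \
    rh80-node1:192.168.122.201,192.168.123.201 \
    rh80-node2:192.168.122.202,192.168.123.202
```

With the arguments shown, the printed line is the same command as in the example above. Review the output before running it on a cluster node.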
For information on adding nodes to an existing cluster with multiple links, see Adding a node to a cluster with multiple links.
For information on changing the links in an existing cluster with multiple links, see Adding and modifying links in an existing cluster.
56.5. Configuring fencing
You must configure a fencing device for each node in the cluster. For information about the fence configuration commands and options, see Configuring fencing in a Red Hat High Availability cluster.
For general information on fencing and its importance in a Red Hat High Availability cluster, see Fencing in a Red Hat High Availability Cluster.
When configuring a fencing device, attention should be given to whether that device shares power with any nodes or devices in the cluster. If a node and its fence device do share power, then the cluster may be at risk of being unable to fence that node if the power to it and its fence device should be lost. Such a cluster should either have redundant power supplies for fence devices and nodes, or redundant fence devices that do not share power. Alternative methods of fencing such as SBD or storage fencing may also bring redundancy in the event of isolated power losses.
This example uses the APC power switch with a host name of zapc.example.com to fence the nodes, and it uses the fence_apc_snmp fencing agent. Because both nodes will be fenced by the same fencing agent, you can configure both fencing devices as a single resource, using the pcmk_host_map option.
You create a fencing device by configuring the device as a stonith resource with the pcs stonith create command. The following command configures a stonith resource named myapc that uses the fence_apc_snmp fencing agent for nodes z1.example.com and z2.example.com. The pcmk_host_map option maps z1.example.com to port 1, and z2.example.com to port 2. The login value and password for the APC device are both apc. By default, this device will use a monitor interval of sixty seconds for each node.
Note that you can use an IP address when specifying the host name for the nodes.
pcs stonith create myapc fence_apc_snmp \
ipaddr="zapc.example.com" \
pcmk_host_map="z1.example.com:1;z2.example.com:2" \
login="apc" passwd="apc"
The following command displays the parameters of an existing STONITH device.
pcs stonith config myapc
 Resource: myapc (class=stonith type=fence_apc_snmp)
  Attributes: ipaddr=zapc.example.com pcmk_host_map=z1.example.com:1;z2.example.com:2 login=apc passwd=apc
  Operations: monitor interval=60s (myapc-monitor-interval-60s)
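The pcmk_host_map value shown in the Attributes output is a semicolon-separated list of node:port pairs. For clusters with more nodes, the value can be generated rather than typed. The following sketch uses an illustrative helper, build_host_map, which is not part of pcs, and only prints the resulting command. The ipaddr, login, and passwd values are the example values from the output above.

```shell
#!/bin/sh
# Sketch: build a pcmk_host_map value from node:port pairs.

# build_host_map NODE:PORT [NODE:PORT ...]
# Joins the pairs with semicolons, as pcmk_host_map expects.
build_host_map() {
    map=""
    for pair in "$@"; do
        if [ -z "$map" ]; then
            map="$pair"
        else
            map="$map;$pair"
        fi
    done
    echo "$map"
}

host_map="$(build_host_map z1.example.com:1 z2.example.com:2)"

# Print the command for review instead of executing it.
echo "pcs stonith create myapc fence_apc_snmp ipaddr=\"zapc.example.com\" pcmk_host_map=\"$host_map\" login=\"apc\" passwd=\"apc\""
```

The quotes around the pcmk_host_map value matter when the command is run in a shell, because an unquoted semicolon would end the command.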
After configuring your fence device, you should test the device. For information on testing a fence device, see Testing a fence device.
Do not test your fence device by disabling the network interface, as this will not properly test fencing.
Once fencing is configured and a cluster has been started, a network restart will trigger fencing for the node which restarts the network even when the timeout is not exceeded. For this reason, do not restart the network service while the cluster service is running because it will trigger unintentional fencing on the node.
56.6. Backing up and restoring a cluster configuration
The following commands back up a cluster configuration in a tar archive and restore the cluster configuration files on all nodes from the backup.
Use the following command to back up the cluster configuration in a tar archive. If you do not specify a file name, the standard output will be used.
pcs config backup filename
The pcs config backup command backs up only the cluster configuration itself as configured in the CIB; the configuration of resource daemons is out of the scope of this command. For example, if you have configured an Apache resource in the cluster, the resource settings (which are in the CIB) will be backed up, while the Apache daemon settings (as set in /etc/httpd) and the files it serves will not be backed up. Similarly, if there is a database resource configured in the cluster, the database itself will not be backed up, while the database resource configuration (CIB) will be.
Use the following command to restore the cluster configuration files on all nodes from the backup. If you do not specify a file name, the standard input will be used. Specifying the
--local option restores only the files on the current node.
pcs config restore [--local] [filename]
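For regular backups it helps to give each archive a dated name. The following sketch prints a backup command and the matching restore commands for review. The backup_name function and the cluster-config prefix are illustrative choices, not pcs conventions, and the exact file name that pcs writes (for example, any archive extension it appends) should be checked on your system before scripting a restore.

```shell
#!/bin/sh
# Sketch: generate a dated file name for "pcs config backup".

# backup_name [PREFIX] [DATESTAMP]
# Defaults: PREFIX=cluster-config, DATESTAMP=today as YYYYMMDD.
backup_name() {
    echo "${1:-cluster-config}-${2:-$(date +%Y%m%d)}"
}

name="$(backup_name)"

# Print the commands for review instead of executing them.
echo "pcs config backup $name"
echo "pcs config restore $name          # restores on all nodes"
echo "pcs config restore --local $name  # restores on the current node only"
```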
56.7. Enabling ports for the High Availability Add-On
The ideal firewall configuration for cluster components depends on the local environment, where you may need to take into account such considerations as whether the nodes have multiple network interfaces or whether off-host firewalling is present.
If you are running the
firewalld daemon, execute the following commands to enable the ports that are required by the Red Hat High Availability Add-On.
firewall-cmd --permanent --add-service=high-availability
You may need to modify which ports are open to suit local conditions.
You can determine whether the
firewalld daemon is installed on your system with the
rpm -q firewalld command. If the
firewalld daemon is installed, you can determine whether it is running with the
firewall-cmd --state command.
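The installed and running checks described above can be combined in a short script. The following is a sketch under two assumptions: it tests for the firewall-cmd program with command -v as a stand-in for rpm -q firewalld, and it prints a firewall-cmd --reload step, which is not in this procedure but is a common way to make a --permanent change take effect in the running firewall. The firewalld_status function is an illustrative name.

```shell
#!/bin/sh
# Sketch: report whether firewalld is installed and running, then
# print the commands that would open the High Availability ports.

# firewalld_status
# Prints one of: not-installed, running, not-running.
firewalld_status() {
    if ! command -v firewall-cmd >/dev/null 2>&1; then
        echo "not-installed"
    elif firewall-cmd --state >/dev/null 2>&1; then
        echo "running"
    else
        echo "not-running"
    fi
}

status="$(firewalld_status)"
case "$status" in
    running)
        # Printed for review; run these as root on each node.
        echo "firewall-cmd --permanent --add-service=high-availability"
        echo "firewall-cmd --reload"
        ;;
    *)
        echo "firewalld is $status; no firewalld changes to make"
        ;;
esac
```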
The following table shows the ports to enable for the Red Hat High Availability Add-On and explains what each port is used for.
Table 56.1. Ports to Enable for High Availability Add-On
TCP 2224
It is crucial to open port 2224 in such a way that ...
TCP 3121
Required on all nodes if the cluster has any Pacemaker Remote nodes.
TCP 5403
Required on the quorum device host when using a quorum device with ...
UDP 5404-5412
Required on corosync nodes to facilitate communication between nodes. It is crucial to open ports 5404-5412 in such a way that ...
TCP 21064
Required on all nodes if the cluster contains any resources requiring DLM (such as ...).
TCP 9929, UDP 9929
Required to be open on all cluster nodes and booth arbitrator nodes to connections from any of those same nodes when the Booth ticket manager is used to establish a multi-site cluster.