Chapter 4. Configuring Red Hat High Availability clusters on AWS
This chapter includes information and procedures for configuring a Red Hat High Availability (HA) cluster on Amazon Web Services (AWS) using EC2 instances as cluster nodes. You have a number of options for obtaining the Red Hat Enterprise Linux (RHEL) images you use for your cluster. For information on image options for AWS, see Red Hat Enterprise Linux Image Options on AWS.
This chapter includes prerequisite procedures for setting up your environment for AWS. Once you have set up your environment, you can create and configure EC2 instances.
This chapter also includes procedures specific to the creation of HA clusters, which transform individual nodes into a cluster of HA nodes on AWS. These include procedures for installing the High Availability packages and agents on each cluster node, configuring fencing, and installing AWS network resource agents.
This chapter refers to the Amazon documentation in a number of places. For many procedures, see the referenced Amazon documentation for more information.
Prerequisites
- You need to install the AWS command line interface (CLI). For more information on installing AWS CLI, see Installing the AWS CLI.
- Enable your subscriptions in the Red Hat Cloud Access program. The Red Hat Cloud Access program allows you to move your Red Hat subscriptions from physical or on-premise systems onto AWS with full support from Red Hat.
4.1. Creating the AWS Access Key and AWS Secret Access Key
You need to create an AWS Access Key and AWS Secret Access Key before you install the AWS CLI. The fencing and resource agent APIs use the AWS Access Key and Secret Access Key to connect to each node in the cluster.
Complete the following steps to create these keys.
Prerequisites
Your IAM user account must have Programmatic access. See Setting up the AWS Environment for more information.
Procedure
- Launch the AWS Console.
- Click on your AWS Account ID to display the drop-down menu and select My Security Credentials.
- Click Users.
- Select the user to open the Summary screen.
- Click the Security credentials tab.
- Click Create access key.
- Download the .csv file (or save both keys). You need to enter these keys when creating the fencing device.
4.2. Installing the HA packages and agents
Complete the following steps on all nodes to install the HA packages and agents.
Procedure
Enter the following command to remove the AWS Red Hat Update Infrastructure (RHUI) client. Because you are going to use a Red Hat Cloud Access subscription, you should not use AWS RHUI in addition to your subscription.
$ sudo -i
# yum -y remove rh-amazon-rhui-client*
Register the VM with Red Hat.
# subscription-manager register --auto-attach
Disable all repositories.
# subscription-manager repos --disable=*
Enable the RHEL 7 Server and RHEL 7 Server HA repositories.
# subscription-manager repos --enable=rhel-7-server-rpms
# subscription-manager repos --enable=rhel-ha-for-rhel-7-server-rpms
Update all packages.
# yum update -y
Reboot if the kernel is updated.
# reboot
Install pcs, pacemaker, the AWS fence agent, and the resource agents.
# yum -y install pcs pacemaker fence-agents-aws resource-agents
The user hacluster was created during the pcs and pacemaker installation in the previous step. Create a password for hacluster on all cluster nodes. Use the same password for all nodes.
# passwd hacluster
Add the high availability service to the RHEL Firewall if firewalld.service is enabled.
# firewall-cmd --permanent --add-service=high-availability
# firewall-cmd --reload
Start the pcs service and enable it to start on boot.
# systemctl enable pcsd.service --now
Verification step
Ensure the pcs service is running.
# systemctl is-active pcsd.service
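Because the steps in this section run on every node, it can be convenient to confirm the result from a single host. The following is a minimal sketch rather than part of the documented procedure. The function name check_pcsd is hypothetical, and the sketch assumes that you can reach each node over SSH without an interactive prompt.

```shell
#!/bin/sh
# check_pcsd NODE...
# Hypothetical helper: ask each node whether pcsd.service is active.
# Assumption: non-interactive SSH access to every node is already set up.
# Prints one "node: state" line per node and exits non-zero if any node
# reports a state other than "active".
check_pcsd() (
    rc=0
    for node in "$@"; do
        # systemctl is-active prints the unit state, for example "active".
        state=$(ssh -n "$node" systemctl is-active pcsd.service 2>/dev/null) || true
        [ -n "$state" ] || state=unknown
        printf '%s: %s\n' "$node" "$state"
        [ "$state" = "active" ] || rc=1
    done
    exit "$rc"
)
```

For example, check_pcsd node01 node02 node03 prints the state reported by each of the three nodes and returns 0 only when all of them report active.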
4.3. Creating a cluster
Complete the following steps to create the cluster of nodes.
Procedure
On one of the nodes, enter the following command to authenticate the pcs user hacluster. Specify the name of each node in the cluster.
# pcs host auth _hostname1_ _hostname2_ _hostname3_
Example:
[root@node01 clouduser]# pcs host auth node01 node02 node03
Username: hacluster
Password:
node01: Authorized
node02: Authorized
node03: Authorized
Create the cluster.
# pcs cluster setup --name _cluster-name_ _hostname1_ _hostname2_ _hostname3_
Example:
[root@node01 clouduser]# pcs cluster setup --name newcluster node01 node02 node03
...omitted
Synchronizing pcsd certificates on nodes node01, node02, node03...
node02: Success
node03: Success
node01: Success
Restarting pcsd on the nodes in order to reload the certificates...
node02: Success
node03: Success
node01: Success
Verification steps
Enable the cluster.
# pcs cluster enable --all
Start the cluster.
# pcs cluster start --all
Example:
[root@node01 clouduser]# pcs cluster enable --all
node02: Cluster Enabled
node03: Cluster Enabled
node01: Cluster Enabled
[root@node01 clouduser]# pcs cluster start --all
node02: Starting Cluster...
node03: Starting Cluster...
node01: Starting Cluster...
4.4. Creating a fencing device
Complete the following steps to configure fencing.
Procedure
Enter the following AWS metadata query to get the Instance ID for each node. You need these IDs to configure the fence device. See Instance Metadata and User Data for additional information.
# echo $(curl -s http://169.254.169.254/latest/meta-data/instance-id)
Example:
[root@ip-10-0-0-48 ~]# echo $(curl -s http://169.254.169.254/latest/meta-data/instance-id)
i-07f1ac63af0ec0ac6
Create a fence device. Use the pcmk_host_map option to map the RHEL host name to the Instance ID. Use the AWS Access Key and AWS Secret Access Key you previously set up in Creating the AWS Access Key and AWS Secret Access Key.
# pcs stonith create cluster_fence fence_aws access_key=_access-key_ secret_key=_secret-access-key_ region=_region_ pcmk_host_map="rhel-hostname-1:Instance-ID-1;rhel-hostname-2:Instance-ID-2;rhel-hostname-3:Instance-ID-3"
Example:
[root@ip-10-0-0-48 ~]# pcs stonith create clusterfence fence_aws access_key=AKIAI*******6MRMJA secret_key=a75EYIG4RVL3h*******K7koQ8dzaDyn5yoIZ/ region=us-east-1 pcmk_host_map="ip-10-0-0-48:i-07f1ac63af0ec0ac6;ip-10-0-0-46:i-063fc5fe93b4167b2;ip-10-0-0-58:i-08bd39eb03a6fd2c7" power_timeout=240 pcmk_reboot_timeout=480 pcmk_reboot_retries=4
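The pcmk_host_map value is a semicolon-separated list of hostname:Instance-ID pairs, which is easy to mistype by hand. The following sketch shows one possible way to assemble it. The function name build_host_map is hypothetical, and the check that each Instance ID starts with i- is an assumption based on the IDs shown in this chapter.

```shell
#!/bin/sh
# build_host_map HOSTNAME:INSTANCE_ID...
# Hypothetical helper: join "hostname:instance-id" pairs with ";" so the
# result can be passed as the pcmk_host_map value.
# Assumption: every Instance ID starts with "i-", as in the examples here.
build_host_map() (
    map=""
    for pair in "$@"; do
        case "$pair" in
            ?*:i-?*) ;;  # shaped like hostname:i-xxxxxxxx
            *) echo "invalid pair: $pair" >&2; exit 1 ;;
        esac
        if [ -n "$map" ]; then
            map="$map;$pair"
        else
            map=$pair
        fi
    done
    [ -n "$map" ] || { echo "no pairs given" >&2; exit 1; }
    printf '%s\n' "$map"
)
```

With the instances from the example, build_host_map ip-10-0-0-48:i-07f1ac63af0ec0ac6 ip-10-0-0-46:i-063fc5fe93b4167b2 ip-10-0-0-58:i-08bd39eb03a6fd2c7 prints the same string that the example passes to pcmk_host_map.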
Verification steps
Test the fencing agent for one of the other nodes.
# pcs stonith fence _awsnodename_
Example:
[root@ip-10-0-0-48 ~]# pcs stonith fence ip-10-0-0-58
Node: ip-10-0-0-58 fenced
Check the status to verify that the node is fenced.
# watch pcs status
Example:
[root@ip-10-0-0-48 ~]# pcs status
Cluster name: newcluster
Stack: corosync
Current DC: ip-10-0-0-46 (version 1.1.18-11.el7-2b07d5c5a9) - partition with quorum
Last updated: Fri Mar 2 20:01:31 2018
Last change: Fri Mar 2 19:24:59 2018 by root via cibadmin on ip-10-0-0-48
3 nodes configured
1 resource configured
Online: [ ip-10-0-0-46 ip-10-0-0-48 ip-10-0-0-58 ]
Full list of resources:
clusterfence (stonith:fence_aws): Started ip-10-0-0-46
Daemon Status:
corosync: active/disabled
pacemaker: active/disabled
pcsd: active/enabled
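Instead of reading the status output by eye, you can extract the list of online nodes from it. The following sketch assumes the output layout shown in the example above, where the line starts with Online: followed by a bracketed list. The function names online_nodes and node_is_online are hypothetical, and other pcs versions may format this line differently.

```shell
#!/bin/sh
# online_nodes
# Hypothetical helper: read `pcs status` output on stdin and print each
# node named on the "Online: [ ... ]" line, one per line.
# Assumption: the line layout matches the example output in this chapter.
online_nodes() {
    awk '$1 == "Online:" {
        for (i = 2; i <= NF; i++)
            if ($i != "[" && $i != "]") print $i
    }'
}

# node_is_online NODE
# Succeeds only if NODE appears in the Online list of the running cluster.
node_is_online() {
    pcs status | online_nodes | grep -Fqx -- "$1"
}
```

For example, node_is_online ip-10-0-0-58 returns 0 once the node is listed as online again, which makes it usable in a loop that waits for a fenced node to rejoin.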
4.5. Installing the AWS CLI on cluster nodes
Previously, you installed the AWS CLI on your host system. You now need to install the AWS CLI on cluster nodes before you configure the network resource agents.
Complete the following procedure on each cluster node.
Prerequisites
You must have created an AWS Access Key and AWS Secret Access Key. For more information, see Creating the AWS Access Key and AWS Secret Access Key.
Procedure
- Perform the procedure Installing the AWS CLI.
Enter the following command to verify that the AWS CLI is configured properly. The instance IDs and instance names should display.
Example:
[root@ip-10-0-0-48 ~]# aws ec2 describe-instances --output text --query 'Reservations[*].Instances[*].[InstanceId,Tags[?Key==`Name`].Value]'
i-07f1ac63af0ec0ac6 ip-10-0-0-48
i-063fc5fe93b4167b2 ip-10-0-0-46
i-08bd39eb03a6fd2c7 ip-10-0-0-58
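If the describe-instances command fails, it helps to know whether the AWS CLI is missing or whether the credentials are the problem. The following sketch separates the two cases. The function name aws_cli_ready is hypothetical, and using aws sts get-caller-identity as the credential check is a choice made for this sketch, not a step from the procedure.

```shell
#!/bin/sh
# aws_cli_ready
# Hypothetical helper: returns 0 if the aws command exists and the
# configured credentials are accepted, 1 if the command is missing,
# and 2 if the credential check fails.
aws_cli_ready() {
    if ! command -v aws >/dev/null 2>&1; then
        echo "aws CLI not found in PATH" >&2
        return 1
    fi
    # get-caller-identity needs valid credentials but no extra permissions.
    if ! aws sts get-caller-identity --output text --query Account >/dev/null 2>&1; then
        echo "aws CLI is installed but the credential check failed" >&2
        return 2
    fi
    return 0
}
```

Running aws_cli_ready on each cluster node before you configure the network resource agents gives an early signal that the keys were entered correctly.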
4.6. Installing network resource agents
For HA operations to work, the cluster uses AWS networking resource agents to enable failover functionality. If a node does not respond to a heartbeat check within a set time, the node is fenced and operations fail over to another node in the cluster. Network resource agents need to be configured for this to work.
Add the two resources to the same group to enforce order and colocation constraints.
Create a secondary private IP resource and virtual IP resource
Complete the following procedure to add a secondary private IP address and create a virtual IP. You can complete this procedure from any node in the cluster.
Procedure
Enter the following command to view the AWS Secondary Private IP Address resource agent (awsvip) description. This shows the options and default operations for this agent.
# pcs resource describe awsvip
Enter the following command to create the Secondary Private IP address using an unused private IP address in the VPC CIDR block.
# pcs resource create privip awsvip secondary_private_ip=_Unused-IP-Address_ --group _group-name_
Example:
[root@ip-10-0-0-48 ~]# pcs resource create privip awsvip secondary_private_ip=10.0.0.68 --group networking-group
Create a virtual IP resource. This is a VPC IP address that can be rapidly remapped from the fenced node to the failover node, masking the failure of the fenced node within the subnet.
# pcs resource create vip IPaddr2 ip=_secondary-private-IP_ --group _group-name_
Example:
[root@ip-10-0-0-48 ~]# pcs resource create vip IPaddr2 ip=10.0.0.68 --group networking-group
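The secondary private IP address must fall inside the VPC CIDR block. The following sketch shows the underlying arithmetic: both addresses are converted to 32-bit integers and compared under the network mask. The function names ip_to_int and ip_in_cidr are hypothetical, and the block 10.0.0.0/24 used in the usage note is an assumption, because this chapter does not state the CIDR of the example VPC. The sketch does not check whether the address is already in use.

```shell
#!/bin/sh
# ip_to_int A.B.C.D
# Hypothetical helper: print a dotted IPv4 address as a 32-bit integer.
ip_to_int() (
    IFS=.
    set -- $1
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
)

# ip_in_cidr IP NETWORK/PREFIX
# Succeeds only if IP lies inside the given CIDR block.
ip_in_cidr() (
    net=${2%/*}    # part before the slash
    bits=${2#*/}   # prefix length after the slash
    mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
    [ $(( $(ip_to_int "$1") & mask )) -eq $(( $(ip_to_int "$net") & mask )) ]
)
```

For example, assuming the VPC block is 10.0.0.0/24, ip_in_cidr 10.0.0.68 10.0.0.0/24 succeeds, while ip_in_cidr 10.0.1.68 10.0.0.0/24 fails.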
Verification step
Enter the pcs status command to verify that the resources are running.
# pcs status
Example:
[root@ip-10-0-0-48 ~]# pcs status
Cluster name: newcluster
Stack: corosync
Current DC: ip-10-0-0-46 (version 1.1.18-11.el7-2b07d5c5a9) - partition with quorum
Last updated: Fri Mar 2 22:34:24 2018
Last change: Fri Mar 2 22:14:58 2018 by root via cibadmin on ip-10-0-0-46
3 nodes configured
3 resources configured
Online: [ ip-10-0-0-46 ip-10-0-0-48 ip-10-0-0-58 ]
Full list of resources:
clusterfence (stonith:fence_aws): Started ip-10-0-0-46
Resource Group: networking-group
privip (ocf::heartbeat:awsvip): Started ip-10-0-0-48
vip (ocf::heartbeat:IPaddr2): Started ip-10-0-0-58
Daemon Status:
corosync: active/disabled
pacemaker: active/disabled
pcsd: active/enabled
Create an elastic IP address
An elastic IP address is a public IP address that can be rapidly remapped from the fenced node to the failover node, masking the failure of the fenced node.
Note that this is different from the virtual IP resource created earlier. The elastic IP address is used for public-facing Internet connections instead of subnet connections.
Add the two resources to the same group that was previously created to enforce order and colocation constraints.
Enter the following AWS CLI command to create an elastic IP address.
[root@ip-10-0-0-48 ~]# aws ec2 allocate-address --domain vpc --output text
eipalloc-4c4a2c45 vpc 35.169.153.122
Enter the following command to view the AWS Secondary Elastic IP Address resource agent (awseip) description. This shows the options and default operations for this agent.
# pcs resource describe awseip
Create the Secondary Elastic IP address resource using the allocated IP address created in Step 1.
# pcs resource create elastic awseip elastic_ip=_Elastic-IP-Address_ allocation_id=_Elastic-IP-Association-ID_ --group networking-group
Example:
# pcs resource create elastic awseip elastic_ip=35.169.153.122 allocation_id=eipalloc-4c4a2c45 --group networking-group
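The awseip resource needs both the public IP address and the allocation ID returned by allocate-address. The following sketch turns those two values into the option string for the resource. The function name eip_resource_args is hypothetical. The sketch assumes that the input is the allocation ID followed by the public IP address, which is what a --query '[AllocationId,PublicIp]' filter with text output produces.

```shell
#!/bin/sh
# eip_resource_args
# Hypothetical helper: read "ALLOCATION_ID PUBLIC_IP" from stdin and print
# the matching options for the awseip resource agent.
# Assumption: the allocation ID comes first and starts with "eipalloc-".
eip_resource_args() (
    read -r alloc ip || [ -n "$alloc" ] || exit 1
    case "$alloc" in
        eipalloc-?*) ;;
        *) echo "unexpected allocation ID: $alloc" >&2; exit 1 ;;
    esac
    [ -n "$ip" ] || { echo "missing public IP address" >&2; exit 1; }
    printf 'elastic_ip=%s allocation_id=%s\n' "$ip" "$alloc"
)
```

One possible use is to pipe aws ec2 allocate-address --domain vpc --output text --query '[AllocationId,PublicIp]' into eip_resource_args. Note that each run of allocate-address allocates a new elastic IP address.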
Verification step
Enter the pcs status command to verify that the resource is running.
# pcs status
Example:
[root@ip-10-0-0-58 ~]# pcs status
Cluster name: newcluster
Stack: corosync
Current DC: ip-10-0-0-58 (version 1.1.18-11.el7-2b07d5c5a9) - partition with quorum
Last updated: Mon Mar 5 16:27:55 2018
Last change: Mon Mar 5 15:57:51 2018 by root via cibadmin on ip-10-0-0-46
3 nodes configured
4 resources configured
Online: [ ip-10-0-0-46 ip-10-0-0-48 ip-10-0-0-58 ]
Full list of resources:
clusterfence (stonith:fence_aws): Started ip-10-0-0-46
Resource Group: networking-group
privip (ocf::heartbeat:awsvip): Started ip-10-0-0-48
vip (ocf::heartbeat:IPaddr2): Started ip-10-0-0-48
elastic (ocf::heartbeat:awseip): Started ip-10-0-0-48
Daemon Status:
corosync: active/disabled
pacemaker: active/disabled
pcsd: active/enabled
Test the elastic IP address
Enter the following commands to verify the virtual IP (awsvip) and elastic IP (awseip) resources are working.
Procedure
Launch an SSH session from your local workstation to the elastic IP address previously created.
$ ssh -l ec2-user -i ~/.ssh/<KeyName>.pem elastic-IP
Example:
$ ssh -l ec2-user -i ~/.ssh/cluster-admin.pem 35.169.153.122
- Verify that the host you connected to via SSH is the host associated with the elastic resource created.
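To find out which host is associated with the elastic resource, you can read it from the pcs status output on any cluster node. The following sketch assumes the resource line layout shown in the example output in this section, with the resource name first and the node name after the word Started. The function name resource_node is hypothetical.

```shell
#!/bin/sh
# resource_node NAME
# Hypothetical helper: read `pcs status` output on stdin and print the node
# on which resource NAME is reported as Started.
# Exits non-zero if no such line is found.
# Assumption: the line layout matches the example output in this chapter.
resource_node() {
    awk -v name="$1" '
        $1 == name && $3 == "Started" { print $4; found = 1 }
        END { exit found ? 0 : 1 }
    '
}
```

For example, pcs status | resource_node elastic prints the node that currently holds the elastic IP address. You can compare that name with the output of the hostname command in the SSH session opened through the elastic IP address.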
4.7. Configuring shared block storage
This section provides an optional procedure for configuring shared block storage for a Red Hat High Availability cluster with Amazon EBS Multi-Attach volumes. The procedure assumes three instances (a three-node cluster) with a 1TB shared disk.
Procedure
Create a shared block volume using the AWS command create-volume.
$ aws ec2 create-volume --availability-zone availability_zone --no-encrypted --size 1024 --volume-type io1 --iops 51200 --multi-attach-enabled
For example, the following command creates a volume in the us-east-1a availability zone.
$ aws ec2 create-volume --availability-zone us-east-1a --no-encrypted --size 1024 --volume-type io1 --iops 51200 --multi-attach-enabled
{
    "AvailabilityZone": "us-east-1a",
    "CreateTime": "2020-08-27T19:16:42.000Z",
    "Encrypted": false,
    "Size": 1024,
    "SnapshotId": "",
    "State": "creating",
    "VolumeId": "vol-042a5652867304f09",
    "Iops": 51200,
    "Tags": [ ],
    "VolumeType": "io1"
}
Note
You need the VolumeId in the next step.
For each instance in your cluster, attach a shared block volume using the AWS command attach-volume. Use your <instance_id> and <volume_id>.
$ aws ec2 attach-volume --device /dev/xvdd --instance-id instance_id --volume-id volume_id
For example, the following command attaches a shared block volume vol-042a5652867304f09 to instance i-0eb803361c2c887f2.
$ aws ec2 attach-volume --device /dev/xvdd --instance-id i-0eb803361c2c887f2 --volume-id vol-042a5652867304f09
{
    "AttachTime": "2020-08-27T19:26:16.086Z",
    "Device": "/dev/xvdd",
    "InstanceId": "i-0eb803361c2c887f2",
    "State": "attaching",
    "VolumeId": "vol-042a5652867304f09"
}
Verification steps
For each instance in your cluster, verify that the block device is available by using the SSH command with your instance <ip_address>.
# ssh <ip_address> "hostname ; lsblk -d | grep ' 1T '"
For example, the following command lists details including the host name and block device for the instance IP 198.51.100.3.
# ssh 198.51.100.3 "hostname ; lsblk -d | grep ' 1T '"
nodea
nvme2n1 259:1 0 1T 0 disk
Use the ssh command to verify that each instance in your cluster uses the same shared disk.
# ssh ip_address "hostname ; lsblk -d | grep ' 1T ' | awk '{print \$1}' | xargs -i udevadm info --query=all --name=/dev/{} | grep '^E: ID_SERIAL='"
For example, the following command lists details including the host name and shared disk volume ID for the instance IP address 198.51.100.3.
# ssh 198.51.100.3 "hostname ; lsblk -d | grep ' 1T ' | awk '{print \$1}' | xargs -i udevadm info --query=all --name=/dev/{} | grep '^E: ID_SERIAL='"
nodea
E: ID_SERIAL=Amazon Elastic Block Store_vol0fa5342e7aedf09f7
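Comparing the ID_SERIAL lines across several nodes by eye is error-prone. The following sketch checks the collected output for you. The function name same_serial is hypothetical. It assumes that you concatenate the output of the verification command from every node and pass the number of nodes as the argument.

```shell
#!/bin/sh
# same_serial NODE_COUNT
# Hypothetical helper: read the combined output of the ID_SERIAL check from
# all nodes on stdin. Succeeds only if exactly NODE_COUNT ID_SERIAL lines
# are present and all of them are identical.
same_serial() (
    expected=$1
    serials=$(grep '^E: ID_SERIAL=') || true
    total=$(printf '%s\n' "$serials" | grep -c '^E: ID_SERIAL=') || true
    unique=$(printf '%s\n' "$serials" | sort -u | grep -c '^E: ID_SERIAL=') || true
    [ "$total" -eq "$expected" ] && [ "$unique" -eq 1 ]
)
```

For a three-node cluster, you could run the ssh verification command against each instance IP address in a loop and pipe the combined output into same_serial 3.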
After you have verified that the shared disk is attached to each instance, you can configure resilient storage for the cluster. For information on configuring resilient storage for a Red Hat High Availability cluster, see Configuring a GFS2 File System in a Cluster. For general information on GFS2 file systems, see Configuring and managing GFS2 file systems.