Chapter 4. Configuring Red Hat High Availability clusters on AWS

This chapter includes information and procedures for configuring a Red Hat High Availability (HA) cluster on Amazon Web Services (AWS) using EC2 instances as cluster nodes. You have a number of options for obtaining the Red Hat Enterprise Linux (RHEL) images you use for your cluster. For information on image options for AWS, see Red Hat Enterprise Linux Image Options on AWS.

This chapter includes prerequisite procedures for setting up your environment for AWS. Once you have set up your environment, you can create and configure EC2 instances.

This chapter also includes procedures specific to the creation of HA clusters, which transform individual nodes into a cluster of HA nodes on AWS. These include procedures for installing the High Availability packages and agents on each cluster node, configuring fencing, and installing AWS network resource agents.

Several procedures in this chapter refer to the Amazon documentation. See the referenced Amazon documentation for details that this chapter does not cover.

Prerequisites

  • Install the AWS command line interface (CLI). For more information on installing AWS CLI, see Installing the AWS CLI.
  • Enable your subscriptions in the Red Hat Cloud Access program. The Red Hat Cloud Access program allows you to move your Red Hat subscriptions from physical or on-premises systems onto AWS with full support from Red Hat.

4.1. Creating the AWS Access Key and AWS Secret Access Key

You need to create an AWS Access Key and AWS Secret Access Key before you install the AWS CLI. The fencing and resource agent APIs use the AWS Access Key and Secret Access Key to connect to each node in the cluster.

Complete the following steps to create these keys.

Prerequisites

Your IAM user account must have Programmatic access. See Setting up the AWS Environment for more information.

Procedure

  1. Launch the AWS Console.
  2. Click on your AWS Account ID to display the drop-down menu and select My Security Credentials.
  3. Click Users.
  4. Select the user to open the Summary screen.
  5. Click the Security credentials tab.
  6. Click Create access key.
  7. Download the .csv file (or save both keys). You need to enter these keys when creating the fencing device.

4.2. Installing the HA packages and agents

Complete the following steps on all nodes to install the HA packages and agents.

Procedure

  1. Enter the following command to remove the AWS Red Hat Update Infrastructure (RHUI) client. Because you are going to use a Red Hat Cloud Access subscription, you should not use AWS RHUI in addition to your subscription.

    $ sudo -i
    # yum -y remove rh-amazon-rhui-client*
  2. Register the VM with Red Hat.

    # subscription-manager register --auto-attach
  3. Disable all repositories.

    # subscription-manager repos --disable="*"
  4. Enable the RHEL 7 Server and RHEL 7 Server HA repositories.

    # subscription-manager repos --enable=rhel-7-server-rpms
    # subscription-manager repos --enable=rhel-ha-for-rhel-7-server-rpms
  5. Update all packages.

    # yum update -y
  6. Reboot if the kernel is updated.

    # reboot
  7. Install pcs, pacemaker, the fence agents, and the resource agents.

    # yum -y install pcs pacemaker fence-agents-aws resource-agents
  8. The user hacluster was created during the pcs and pacemaker installation in the previous step. Create a password for hacluster on all cluster nodes. Use the same password for all nodes.

    # passwd hacluster
  9. Add the high availability service to the RHEL Firewall if firewalld.service is enabled.

    # firewall-cmd --permanent --add-service=high-availability
    # firewall-cmd --reload
  10. Start the pcs service and enable it to start on boot.

    # systemctl enable pcsd.service --now

Verification step

Ensure the pcs service is running.

# systemctl is-active pcsd.service
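
Before you move on to creating the cluster, a small preflight check on each node can confirm that the tools from the previous steps are on the PATH. The following is a minimal sketch, not part of the official procedure. The function name missing_commands is hypothetical, and the binary names in the commented usage line (pcs, pacemakerd, fence_aws) are assumptions about what the installed packages provide.

```shell
# Print the name of every listed command that is not found on PATH.
# Prints nothing and returns 0 when all commands are present.
missing_commands() {
    rc=0
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            printf '%s\n' "$cmd"
            rc=1
        fi
    done
    return "$rc"
}

# Example usage on a cluster node (binary names are assumptions):
# missing_commands pcs pacemakerd fence_aws || echo "install step incomplete"
```

An empty result suggests that the installation step completed on that node. It does not prove that the services are configured correctly.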

4.3. Creating a cluster

Complete the following steps to create the cluster of nodes.

Procedure

  1. On one of the nodes, enter the following command to authenticate the pcs user hacluster. Specify the name of each node in the cluster.

    # pcs cluster auth _hostname1_ _hostname2_ _hostname3_

    Example:

    [root@node01 clouduser]# pcs cluster auth node01 node02 node03
    Username: hacluster
    Password:
    node01: Authorized
    node02: Authorized
    node03: Authorized
  2. Create the cluster.

    # pcs cluster setup --name _cluster-name_ _hostname1_ _hostname2_ _hostname3_

    Example:

    [root@node01 clouduser]# pcs cluster setup --name newcluster node01 node02 node03
    
    ...omitted
    
    Synchronizing pcsd certificates on nodes node01, node02, node03...
    node02: Success
    node03: Success
    node01: Success
    Restarting pcsd on the nodes in order to reload the certificates...
    node02: Success
    node03: Success
    node01: Success

Verification steps

  1. Enable the cluster.

    # pcs cluster enable --all
  2. Start the cluster.

    # pcs cluster start --all

    Example:

    [root@node01 clouduser]# pcs cluster enable --all
    node02: Cluster Enabled
    node03: Cluster Enabled
    node01: Cluster Enabled
    
    [root@node01 clouduser]# pcs cluster start --all
    node02: Starting Cluster...
    node03: Starting Cluster...
    node01: Starting Cluster...

4.4. Creating a fencing device

Complete the following steps to configure fencing.

Procedure

  1. Enter the following AWS metadata query to get the Instance ID for each node. You need these IDs to configure the fence device. See Instance Metadata and User Data for additional information.

    # echo $(curl -s http://169.254.169.254/latest/meta-data/instance-id)

    Example:

    [root@ip-10-0-0-48 ~]# echo $(curl -s http://169.254.169.254/latest/meta-data/instance-id)
    i-07f1ac63af0ec0ac6
  2. Create a fence device. Use the pcmk_host_map option to map each RHEL host name to its Instance ID. Use the AWS Access Key and AWS Secret Access Key that you created in Creating the AWS Access Key and AWS Secret Access Key.

    # pcs stonith create cluster_fence fence_aws access_key=_access-key_ secret_key=_secret-access-key_ region=_region_ pcmk_host_map="rhel-hostname-1:Instance-ID-1;rhel-hostname-2:Instance-ID-2;rhel-hostname-3:Instance-ID-3"

    Example:

    [root@ip-10-0-0-48 ~]# pcs stonith create clusterfence fence_aws access_key=AKIAI*******6MRMJA secret_key=a75EYIG4RVL3h*******K7koQ8dzaDyn5yoIZ/ region=us-east-1 pcmk_host_map="ip-10-0-0-48:i-07f1ac63af0ec0ac6;ip-10-0-0-46:i-063fc5fe93b4167b2;ip-10-0-0-58:i-08bd39eb03a6fd2c7" power_timeout=240 pcmk_reboot_timeout=480 pcmk_reboot_retries=4
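
The pcmk_host_map value is easy to mistype because every entry must follow the hostname:Instance-ID form and entries are separated by semicolons. As a rough sketch, the value could be assembled and sanity-checked with a small shell function before it is pasted into the pcs stonith create command. The function name build_host_map is hypothetical, and the check only confirms that each entry looks like hostname:i-..., not that the instance exists.

```shell
# Build a pcmk_host_map value from "hostname:instance-id" arguments.
# Entries are joined with ";" as in the fence device example.
build_host_map() {
    map=""
    for pair in "$@"; do
        case "$pair" in
            ?*:i-?*) ;;   # looks like hostname:i-xxxxxxxx
            *) printf 'invalid entry: %s\n' "$pair" >&2; return 1 ;;
        esac
        map="${map:+$map;}$pair"
    done
    printf '%s\n' "$map"
}

# Host names and instance IDs taken from the example in this section.
build_host_map ip-10-0-0-48:i-07f1ac63af0ec0ac6 \
               ip-10-0-0-46:i-063fc5fe93b4167b2 \
               ip-10-0-0-58:i-08bd39eb03a6fd2c7
```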

Verification steps

  1. Test the fencing agent for one of the other nodes.

    # pcs stonith fence _awsnodename_

    Example:

    [root@ip-10-0-0-48 ~]# pcs stonith fence ip-10-0-0-58
    Node: ip-10-0-0-58 fenced
  2. Check the status to verify that the node is fenced.

    # watch pcs status

    Example:

    [root@ip-10-0-0-48 ~]# pcs status
    Cluster name: newcluster
    Stack: corosync
    Current DC: ip-10-0-0-46 (version 1.1.18-11.el7-2b07d5c5a9) - partition with quorum
    Last updated: Fri Mar  2 20:01:31 2018
    Last change: Fri Mar  2 19:24:59 2018 by root via cibadmin on ip-10-0-0-48
    
    3 nodes configured
    1 resource configured
    
    Online: [ ip-10-0-0-46 ip-10-0-0-48 ip-10-0-0-58 ]
    
    Full list of resources:
    
      clusterfence  (stonith:fence_aws):    Started ip-10-0-0-46
    
    Daemon Status:
      corosync: active/disabled
      pacemaker: active/disabled
      pcsd: active/enabled
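
Instead of watching the status output by eye, you could check the Online line for a specific node. The following is a minimal sketch that assumes the pcs status layout shown in the example above. The function name node_is_online is hypothetical.

```shell
# Succeed if the given node appears in the "Online: [ ... ]" line of
# `pcs status` output read from standard input.
node_is_online() {
    awk -v node="$1" '
        /^[[:space:]]*Online:/ {
            for (i = 1; i <= NF; i++) if ($i == node) found = 1
        }
        END { exit (found ? 0 : 1) }
    '
}

# Usage on a cluster node (not run here):
# pcs status | node_is_online ip-10-0-0-58 && echo "node has rejoined"
```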

4.5. Installing the AWS CLI on cluster nodes

Previously, you installed the AWS CLI on your host system. You now need to install the AWS CLI on cluster nodes before you configure the network resource agents.

Complete the following procedure on each cluster node.

Prerequisites

You must have created an AWS Access Key and AWS Secret Access Key. For more information, see Creating the AWS Access Key and AWS Secret Access Key.

Procedure

  1. Perform the procedure Installing the AWS CLI.
  2. Enter the following command to verify that the AWS CLI is configured properly. The instance IDs and instance names should display.

    Example:

    [root@ip-10-0-0-48 ~]# aws ec2 describe-instances --output text --query 'Reservations[*].Instances[*].[InstanceId,Tags[?Key==`Name`].Value]'
    i-07f1ac63af0ec0ac6
    ip-10-0-0-48
    i-063fc5fe93b4167b2
    ip-10-0-0-46
    i-08bd39eb03a6fd2c7
    ip-10-0-0-58
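
The text output above alternates between an instance ID line and a Name tag line. As a sketch, those pairs could be folded into hostname:Instance-ID entries for cross-checking against the pcmk_host_map value used for the fence device. The function name pair_instances is hypothetical, and the parsing assumes the two-line layout shown in the example and Name tags that do not start with "i-".

```shell
# Read alternating "instance-id" / "name" lines on standard input and
# print one "name:instance-id" entry per instance.
pair_instances() {
    awk '
        /^i-/    { id = $1; next }                # remember the instance ID
        id != "" { print $1 ":" id; id = "" }     # next line is its Name tag
    '
}

# Sample data copied from the example output in this section.
pair_instances <<'EOF'
i-07f1ac63af0ec0ac6
ip-10-0-0-48
i-063fc5fe93b4167b2
ip-10-0-0-46
i-08bd39eb03a6fd2c7
ip-10-0-0-58
EOF
```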

4.6. Installing network resource agents

The cluster uses AWS networking resource agents to enable failover. If a node does not respond to a heartbeat check within a set time, the node is fenced and operations fail over to another node in the cluster. You must configure the network resource agents for this to work.

Add the two resources to the same group to enforce order and colocation constraints.

Create a secondary private IP resource and virtual IP resource

Complete the following procedure to add a secondary private IP address and create a virtual IP. You can complete this procedure from any node in the cluster.

Procedure

  1. Enter the following command to view the AWS Secondary Private IP Address resource agent (awsvip) description. This shows the options and default operations for this agent.

    # pcs resource describe awsvip
  2. Enter the following command to create the Secondary Private IP address using an unused private IP address in the VPC CIDR block.

    # pcs resource create privip awsvip secondary_private_ip=_Unused-IP-Address_ --group _group-name_

    Example:

    [root@ip-10-0-0-48 ~]# pcs resource create privip awsvip secondary_private_ip=10.0.0.68 --group networking-group
  3. Create a virtual IP resource. This is a VPC IP address that can be rapidly remapped from the fenced node to the failover node, masking the failure of the fenced node within the subnet.

    # pcs resource create vip IPaddr2 ip=_secondary-private-IP_ --group _group-name_

    Example:

    [root@ip-10-0-0-48 ~]# pcs resource create vip IPaddr2 ip=10.0.0.68 --group networking-group
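
The secondary private IP address must fall inside the VPC CIDR block. As a minimal sketch, that can be checked with shell arithmetic before the resource is created. The function names ip_to_int and ip_in_cidr are hypothetical, and the 10.0.0.0/16 block is an assumed VPC range that the source does not state. The check does not tell you whether the address is already in use.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
    old_ifs=$IFS; IFS=.
    set -- $1            # split the address on dots
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Succeed if address $1 lies inside CIDR block $2 (for example 10.0.0.0/16).
ip_in_cidr() {
    addr=$(ip_to_int "$1")
    base=$(ip_to_int "${2%/*}")
    bits=${2#*/}
    mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
    [ $(( addr & mask )) -eq $(( base & mask )) ]
}

# 10.0.0.0/16 is an assumed VPC CIDR block; substitute your own.
if ip_in_cidr 10.0.0.68 10.0.0.0/16; then
    echo "10.0.0.68 is inside the block"
fi
```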

Verification step

Enter the pcs status command to verify that the resources are running.

# pcs status

Example:

[root@ip-10-0-0-48 ~]# pcs status
Cluster name: newcluster
Stack: corosync
Current DC: ip-10-0-0-46 (version 1.1.18-11.el7-2b07d5c5a9) - partition with quorum
Last updated: Fri Mar  2 22:34:24 2018
Last change: Fri Mar  2 22:14:58 2018 by root via cibadmin on ip-10-0-0-46

3 nodes configured
3 resources configured

Online: [ ip-10-0-0-46 ip-10-0-0-48 ip-10-0-0-58 ]

Full list of resources:

clusterfence    (stonith:fence_aws):    Started ip-10-0-0-46
 Resource Group: networking-group
     privip (ocf::heartbeat:awsvip):    Started ip-10-0-0-48
     vip    (ocf::heartbeat:IPaddr2):   Started ip-10-0-0-58

Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/enabled

Create an elastic IP address

An elastic IP address is a public IP address that can be rapidly remapped from the fenced node to the failover node, masking the failure of the fenced node.

Note that this is different from the virtual IP resource created earlier. The elastic IP address is used for public-facing Internet connections instead of subnet connections.

  1. Add the two resources to the same group that was previously created to enforce order and colocation constraints.
  2. Enter the following AWS CLI command to create an elastic IP address.

    [root@ip-10-0-0-48 ~]# aws ec2 allocate-address --domain vpc --output text
    eipalloc-4c4a2c45   vpc 35.169.153.122
  3. Enter the following command to view the AWS Secondary Elastic IP Address resource agent (awseip) description. This shows the options and default operations for this agent.

    # pcs resource describe awseip
  4. Create the Secondary Elastic IP address resource using the IP address allocated in Step 2.

    # pcs resource create elastic awseip elastic_ip=_Elastic-IP-Address_ allocation_id=_Elastic-IP-Allocation-ID_ --group networking-group

    Example:

    # pcs resource create elastic awseip elastic_ip=35.169.153.122 allocation_id=eipalloc-4c4a2c45 --group networking-group
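
The allocation ID and the elastic IP address both come from the allocate-address output, so copying them by hand is a likely source of mistakes. The following sketch extracts both values by pattern rather than by column position, because the column order of the text output may differ between AWS CLI versions. The function name eip_args is hypothetical.

```shell
# Read `aws ec2 allocate-address --output text` output on standard input
# and print "elastic_ip=<address> allocation_id=<id>".
eip_args() {
    awk '{
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^eipalloc-/) alloc = $i
            else if ($i ~ /^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$/) ip = $i
        }
    }
    END {
        if (alloc == "" || ip == "") exit 1   # a value is missing
        print "elastic_ip=" ip " allocation_id=" alloc
    }'
}

# Sample line copied from the allocate-address example in this section.
printf 'eipalloc-4c4a2c45   vpc 35.169.153.122\n' | eip_args
```

The printed pair can be pasted into the pcs resource create command in place of the two placeholders.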

Verification step

Enter the pcs status command to verify that the resource is running.

# pcs status

Example:

[root@ip-10-0-0-58 ~]# pcs status
Cluster name: newcluster
Stack: corosync
Current DC: ip-10-0-0-58 (version 1.1.18-11.el7-2b07d5c5a9) - partition with quorum
Last updated: Mon Mar  5 16:27:55 2018
Last change: Mon Mar  5 15:57:51 2018 by root via cibadmin on ip-10-0-0-46

3 nodes configured
4 resources configured

Online: [ ip-10-0-0-46 ip-10-0-0-48 ip-10-0-0-58 ]

Full list of resources:

 clusterfence   (stonith:fence_aws):    Started ip-10-0-0-46
 Resource Group: networking-group
     privip (ocf::heartbeat:awsvip):  Started ip-10-0-0-48
     vip    (ocf::heartbeat:IPaddr2):    Started ip-10-0-0-48
     elastic (ocf::heartbeat:awseip):    Started ip-10-0-0-48

Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/enabled
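
Because the three network resources belong to one group, they are expected to be started on the same node. As a rough sketch, the status output could be checked for that condition automatically. The function name group_is_colocated is hypothetical, and the parsing assumes the pcs status layout shown in the example above.

```shell
# Read `pcs status` output on standard input and succeed only if every
# resource listed under the named resource group is started on one node.
group_is_colocated() {
    awk -v group="$1" '
        /Resource Group:/ { in_group = ($NF == group); next }
        in_group && /\(ocf::/ {
            if ($(NF-1) != "Started") bad = 1   # resource is not running
            nodes[$NF] = 1                      # remember the hosting node
            next
        }
        { in_group = 0 }                        # any other line ends the group
        END {
            n = 0
            for (k in nodes) n++
            exit ((n == 1 && !bad) ? 0 : 1)
        }
    '
}

# Usage on a cluster node (not run here):
# pcs status | group_is_colocated networking-group && echo "group is on one node"
```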

Test the elastic IP address

Enter the following commands to verify the virtual IP (awsvip) and elastic IP (awseip) resources are working.

Procedure

  1. Launch an SSH session from your local workstation to the elastic IP address previously created.

    $ ssh -l ec2-user -i ~/.ssh/<KeyName>.pem elastic-IP

    Example:

    $ ssh -l ec2-user -i ~/.ssh/cluster-admin.pem 35.169.153.122
  2. Verify that the host you connected to via SSH is the host associated with the elastic resource created.

4.7. Configuring shared block storage

This section provides an optional procedure for configuring shared block storage for a Red Hat High Availability cluster with Amazon EBS Multi-Attach volumes. The procedure assumes three instances (a three-node cluster) with a 1TB shared disk.

Procedure

  1. Create a shared block volume using the AWS command create-volume.

    $ aws ec2 create-volume --availability-zone availability_zone --no-encrypted --size 1024 --volume-type io1 --iops 51200 --multi-attach-enabled

    For example, the following command creates a volume in the us-east-1a availability zone.

    $ aws ec2 create-volume --availability-zone us-east-1a --no-encrypted --size 1024 --volume-type io1 --iops 51200 --multi-attach-enabled
    
    {
        "AvailabilityZone": "us-east-1a",
        "CreateTime": "2020-08-27T19:16:42.000Z",
        "Encrypted": false,
        "Size": 1024,
        "SnapshotId": "",
        "State": "creating",
        "VolumeId": "vol-042a5652867304f09",
        "Iops": 51200,
        "Tags": [ ],
        "VolumeType": "io1"
    }

    Note

    You need the VolumeId in the next step.

  2. For each instance in your cluster, attach a shared block volume using the AWS command attach-volume. Use your <instance_id> and <volume_id>.

    $ aws ec2 attach-volume --device /dev/xvdd --instance-id instance_id --volume-id volume_id

    For example, the following command attaches a shared block volume vol-042a5652867304f09 to instance i-0eb803361c2c887f2.

    $ aws ec2 attach-volume --device /dev/xvdd --instance-id i-0eb803361c2c887f2 --volume-id vol-042a5652867304f09
    
    {
        "AttachTime": "2020-08-27T19:26:16.086Z",
        "Device": "/dev/xvdd",
        "InstanceId": "i-0eb803361c2c887f2",
        "State": "attaching",
        "VolumeId": "vol-042a5652867304f09"
    }
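
Because the attach-volume command must be repeated for every instance, it can help to generate the commands first and review them before running anything. The following is a dry-run sketch that only prints the commands. The function name print_attach_commands is hypothetical, and the second instance ID in the usage example is a made-up placeholder.

```shell
# Print one attach-volume command per instance ID. Nothing is executed.
# $1 is the volume ID, $2 the device name, the rest are instance IDs.
print_attach_commands() {
    volume_id=$1; device=$2; shift 2
    for instance_id in "$@"; do
        printf 'aws ec2 attach-volume --device %s --instance-id %s --volume-id %s\n' \
            "$device" "$instance_id" "$volume_id"
    done
}

# The first instance ID is from the example above; the second is a placeholder.
print_attach_commands vol-042a5652867304f09 /dev/xvdd \
    i-0eb803361c2c887f2 i-EXAMPLE-SECOND-NODE
```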

Verification steps

  1. For each instance in your cluster, verify that the block device is available by using the SSH command with your instance <ip_address>.

    # ssh <ip_address> "hostname ; lsblk -d | grep ' 1T '"

    For example, the following command lists details including the host name and block device for the instance IP 198.51.100.3.

    # ssh 198.51.100.3 "hostname ; lsblk -d | grep ' 1T '"
    
    nodea
    nvme2n1 259:1    0   1T  0 disk
  2. Use the ssh command to verify that each instance in your cluster uses the same shared disk.

    # ssh ip_address "hostname ; lsblk -d | grep ' 1T ' | awk '{print \$1}' | xargs -i udevadm info --query=all --name=/dev/{} | grep '^E: ID_SERIAL='"

    For example, the following command lists details including the host name and shared disk volume ID for the instance IP address 198.51.100.3.

    # ssh 198.51.100.3 "hostname ; lsblk -d | grep ' 1T ' | awk '{print \$1}' | xargs -i udevadm info --query=all --name=/dev/{} | grep '^E: ID_SERIAL='"
    
    nodea
    E: ID_SERIAL=Amazon Elastic Block Store_vol0fa5342e7aedf09f7
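
To compare the results across nodes, you could collect the ID_SERIAL value from each instance and confirm that they are identical. The following is a minimal sketch with hypothetical function names. It assumes that the serial carries the volume ID without the hyphen after "vol", as in the example output above.

```shell
# Convert an ID_SERIAL value such as
# "Amazon Elastic Block Store_vol0fa5342e7aedf09f7" to "vol-0fa5342e7aedf09f7".
serial_to_volume_id() {
    printf 'vol-%s\n' "${1##*_vol}"
}

# Succeed if every line read from standard input is identical.
all_same() {
    [ "$(sort -u | wc -l | tr -d ' ')" = "1" ]
}

serial_to_volume_id "Amazon Elastic Block Store_vol0fa5342e7aedf09f7"

# Usage with one collected serial per line (not run here):
# printf '%s\n' "$serial_a" "$serial_b" "$serial_c" | all_same && echo "same disk"
```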

After you have verified that the shared disk is attached to each instance, you can configure resilient storage for the cluster. For information on configuring resilient storage for a Red Hat High Availability cluster, see Configuring a GFS2 File System in a Cluster. For general information on GFS2 file systems, see Configuring and managing GFS2 file systems.