Chapter 4. Configuring a Red Hat High Availability cluster on AWS

This chapter includes information and procedures for configuring a Red Hat High Availability (HA) cluster on Amazon Web Services (AWS) using EC2 instances as cluster nodes. Note that you have a number of options for obtaining the Red Hat Enterprise Linux (RHEL) images you use for your cluster. For information on image options for AWS, see Red Hat Enterprise Linux Image Options on AWS.

The chapter includes prerequisite procedures for setting up your environment for AWS. Once you have set up your environment, you can create and configure EC2 instances.

The chapter also includes procedures specific to the creation of HA clusters, which transform individual nodes into a cluster of HA nodes on AWS. These include procedures for installing the High Availability packages and agents on each cluster node, configuring fencing, and installing AWS network resource agents.

The chapter refers to the Amazon documentation in a number of places. For many procedures, see the referenced Amazon documentation for more information.

4.1. Creating the AWS Access Key and AWS Secret Access Key

You need to create an AWS Access Key and AWS Secret Access Key before you install the AWS CLI. The fencing and resource agents use the AWS Access Key and Secret Access Key to connect to the AWS API and manage the nodes in the cluster.

Complete the following steps to create these keys.

Prerequisites

Your IAM user account must have Programmatic access. For more information, see Setting up the AWS Environment.

Procedure

  1. Launch the AWS Console.
  2. Click on your AWS Account ID to display the drop-down menu and select My Security Credentials.
  3. Click Users.
  4. Select the user and open the Summary screen.
  5. Click the Security credentials tab.
  6. Click Create access key.
  7. Download the .csv file (or save both keys). You need to enter these keys when creating the fencing device.
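
If you prefer the command line and already have the AWS CLI configured on another system, you can create the access key there instead. The following is a sketch; the user name ClusterAdmin is a placeholder for your IAM user, and the output is abridged.

    $ aws iam create-access-key --user-name ClusterAdmin
    {
        "AccessKey": {
            "UserName": "ClusterAdmin",
            "AccessKeyId": "AKIA...",
            "Status": "Active",
            "SecretAccessKey": "..."
        }
    }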

4.2. Installing the AWS CLI

Many of the procedures in this chapter include using the AWS CLI. Complete the following steps to install the AWS CLI.

Prerequisites

You need to have created and have access to an AWS Access Key ID and an AWS Secret Access Key. See Quickly Configuring the AWS CLI for information and instructions.

Procedure

  1. Install Python 3 and the pip tool.

    # yum install python3
    # yum install python3-pip
  2. Install the AWS command line tools with the pip command.

    # pip3 install awscli
  3. Run the aws --version command to verify that you installed the AWS CLI.

    $ aws --version
    aws-cli/1.16.182 Python/3.6.8 Linux/4.18.0-80.el8.x86_64 botocore/1.12.172
  4. Configure the AWS command line client according to your AWS access details.

    $ aws configure
    AWS Access Key ID [None]:
    AWS Secret Access Key [None]:
    Default region name [None]:
    Default output format [None]:
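
To confirm that the credentials you entered are valid, you can optionally query your account identity. This is a sketch; the account ID and ARN shown are placeholders.

    $ aws sts get-caller-identity
    {
        "UserId": "AIDA...",
        "Account": "123456789012",
        "Arn": "arn:aws:iam::123456789012:user/ClusterAdmin"
    }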

4.3. Creating an HA EC2 instance

Complete the following steps to create the instances that you then use as your HA cluster nodes. Note that you have a number of options for obtaining the RHEL images you use for your cluster. See Red Hat Enterprise Linux Image Options on AWS for information on image options for AWS.

You can create and upload a custom image to use for your cluster nodes, or you can choose a Gold Image (Cloud Access image) or an on-demand image.

Prerequisites

You need to have set up an AWS environment. See Setting Up with Amazon EC2 for more information.

Procedure

  1. From the AWS EC2 Dashboard, select Images and then AMIs.
  2. Right-click on your image and select Launch.
  3. Choose an Instance Type that meets or exceeds the requirements of your workload. Depending on your HA application, each instance may need higher capacity.

See Amazon EC2 Instance Types for information on instance types.

  4. Click Next: Configure Instance Details.

    1. Enter the Number of instances you want to create for the cluster. The examples in this chapter use three cluster nodes.

      Note

      Do not launch into an Auto Scaling Group.

    2. For Network, select the VPC you created in Setting up the AWS environment. Select the subnet for the instance, or select Create new subnet to create a new one.
    3. Select Enable for Auto-assign Public IP.

      Note

      These are the minimum configuration options necessary to create a basic instance. Review additional options based on your HA application requirements.

  5. Click Next: Add Storage and verify that the default storage is sufficient. You do not need to modify these settings unless your HA application requires other storage options.
  6. Click Next: Add Tags.

    Note

    Tags can help you manage your AWS resources. See Tagging Your Amazon EC2 Resources for information on tagging.

  7. Click Next: Configure Security Group. Select the existing security group you created in Setting up the AWS environment.
  8. Click Review and Launch and verify your selections.
  9. Click Launch. You are prompted to select an existing key pair or create a new key pair. Select the key pair you created in Setting up the AWS environment.
  10. Click Launch Instances.
  11. Click View Instances. You can then name your instances.

    Note

    Alternatively, you can launch instances using the AWS CLI. See Launching, Listing, and Terminating Amazon EC2 Instances in the Amazon documentation for more information.
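
    For example, a CLI launch roughly equivalent to the console steps above might look like the following sketch. The AMI, subnet, and security group IDs and the key pair name are placeholders that you replace with your own values.

    $ aws ec2 run-instances --image-id ami-0123456789abcdef0 \
        --count 3 --instance-type m5.large \
        --key-name cluster-admin \
        --security-group-ids sg-0123456789abcdef0 \
        --subnet-id subnet-0123456789abcdef0 \
        --associate-public-ip-address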

4.4. Configuring the private key

Complete the following configuration tasks before you can use the private SSH key file (.pem) in an SSH session.

Procedure

  1. Move the key file from the Downloads directory to your Home directory or to your ~/.ssh directory.
  2. Enter the following command to change the permissions of the key file so that only the file's owner can read it.

    # chmod 400 KeyName.pem
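
You can verify the result with ls. The owner and timestamp in this sketch are placeholders; the permissions column is what matters.

    $ ls -l KeyName.pem
    -r--------. 1 admin admin 1674 Mar  1 14:22 KeyName.pem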

4.5. Connecting to an instance

Complete the following steps on all nodes to connect to an instance.

Procedure

  1. Launch the AWS Console and select the EC2 instance.
  2. Click Connect and select A standalone SSH client.
  3. From your SSH terminal session, connect to the instance using the AWS example provided in the pop-up window. Add the correct path to your KeyName.pem file if the path is not shown in the example.
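
    For example, the connection command usually takes the following form; the public DNS name here is a placeholder for the value shown in the pop-up window.

    $ ssh -i ~/.ssh/KeyName.pem ec2-user@ec2-12-34-56-78.compute-1.amazonaws.com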

4.6. Installing the HA packages and agents

Complete the following steps on all nodes to install the HA packages and agents.

Procedure

  1. Enter the following commands to remove the AWS Red Hat Update Infrastructure (RHUI) client. Because you are going to use a Red Hat Cloud Access subscription, you should not use AWS RHUI in addition to your subscription.

    $ sudo -i
    # yum -y remove rh-amazon-rhui-client*
  2. Register the VM with Red Hat.

    # subscription-manager register --auto-attach
  3. Disable all repositories.

    # subscription-manager repos --disable=*
  4. Enable the RHEL 8 BaseOS, AppStream, and High Availability repositories.

    # subscription-manager repos --enable=rhel-8-for-x86_64-baseos-rpms
    # subscription-manager repos --enable=rhel-8-for-x86_64-appstream-rpms
    # subscription-manager repos --enable=rhel-8-for-x86_64-highavailability-rpms
  5. Update the RHEL AWS instance.

    # yum update -y
  6. Install the Red Hat High Availability Add-On software packages, along with the AWS fencing agent, from the High Availability channel.

    # yum install pcs pacemaker fence-agents-aws
  7. The user hacluster was created during the pcs and pacemaker installation in the previous step. Create a password for hacluster on all cluster nodes. Use the same password for all nodes.

    # passwd hacluster
  8. Add the high availability service to the RHEL Firewall if firewalld.service is installed.

    # firewall-cmd --permanent --add-service=high-availability
    # firewall-cmd --reload
  9. Start the pcsd service and enable it to start on boot.

    # systemctl start pcsd.service
    # systemctl enable pcsd.service
  10. Edit /etc/hosts and add RHEL host names and internal IP addresses. See How should the /etc/hosts file be set up on RHEL cluster nodes? for details.
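
    For example, using the node names and private IP addresses from the examples in this chapter (the pairing of names to addresses here is illustrative), the entries might look like the following:

    10.0.0.48 node01 ip-10-0-0-48.ec2.internal
    10.0.0.46 node02 ip-10-0-0-46.ec2.internal
    10.0.0.58 node03 ip-10-0-0-58.ec2.internal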

Verification step

Ensure the pcs service is running.

# systemctl status pcsd.service

pcsd.service - PCS GUI and remote configuration interface
   Loaded: loaded (/usr/lib/systemd/system/pcsd.service; enabled; vendor preset: disabled)
   Active: active (running) since Thu 2018-03-01 14:53:28 UTC; 28min ago
     Docs: man:pcsd(8)
           man:pcs(8)
 Main PID: 5437 (pcsd)
   CGroup: /system.slice/pcsd.service
           └─5437 /usr/bin/ruby /usr/lib/pcsd/pcsd > /dev/null &

Mar 01 14:53:27 ip-10-0-0-48.ec2.internal systemd[1]: Starting PCS GUI and remote configuration interface…
Mar 01 14:53:28 ip-10-0-0-48.ec2.internal systemd[1]: Started PCS GUI and remote configuration interface.

4.7. Creating a cluster

Complete the following steps to create the cluster of nodes.

Procedure

  1. On one of the nodes, enter the following command to authenticate the pcs user hacluster. In the command, specify the name of each node in the cluster.

    # pcs host auth hostname1 hostname2 hostname3
    Username: hacluster
    Password:
    hostname1: Authorized
    hostname2: Authorized
    hostname3: Authorized

    Example:

    [root@node01 clouduser]# pcs host auth node01 node02 node03
    Username: hacluster
    Password:
    node01: Authorized
    node02: Authorized
    node03: Authorized
  2. Create the cluster.

    # pcs cluster setup cluster-name hostname1 hostname2 hostname3

    Example:

    [root@node01 clouduser]# pcs cluster setup newcluster node01 node02 node03
    
    ...omitted
    
    Synchronizing pcsd certificates on nodes node01, node02, node03...
    node02: Success
    node03: Success
    node01: Success
    Restarting pcsd on the nodes in order to reload the certificates...
    node02: Success
    node03: Success
    node01: Success

Verification steps

  1. Enable the cluster.

    [root@node01 clouduser]# pcs cluster enable --all
  2. Start the cluster.

    [root@node01 clouduser]# pcs cluster start --all

    Example:

    [root@node01 clouduser]# pcs cluster enable --all
    node02: Cluster Enabled
    node03: Cluster Enabled
    node01: Cluster Enabled
    
    [root@node01 clouduser]# pcs cluster start --all
    node02: Starting Cluster...
    node03: Starting Cluster...
    node01: Starting Cluster...

4.8. Configuring fencing

Complete the following steps to configure fencing.

Procedure

  1. Enter the following AWS metadata query to get the Instance ID for each node. You need these IDs to configure the fence device. See Instance Metadata and User Data for additional information.

    # echo $(curl -s http://169.254.169.254/latest/meta-data/instance-id)

    Example:

    [root@ip-10-0-0-48 ~]# echo $(curl -s http://169.254.169.254/latest/meta-data/instance-id)
    i-07f1ac63af0ec0ac6
  2. Enter the following command to configure the fence device. Use pcmk_host_map to map the RHEL host name to the Instance ID. Use the AWS Access Key and AWS Secret Access Key you previously set up.

    # pcs stonith create name fence_aws access_key=access-key secret_key=secret-access-key region=region pcmk_host_map="rhel-hostname-1:Instance-ID-1;rhel-hostname-2:Instance-ID-2;rhel-hostname-3:Instance-ID-3" power_timeout=240 pcmk_reboot_timeout=480 pcmk_reboot_retries=4

    Example:

    [root@ip-10-0-0-48 ~]# pcs stonith create clusterfence fence_aws access_key=AKIAI*******6MRMJA secret_key=a75EYIG4RVL3h*******K7koQ8dzaDyn5yoIZ/ region=us-east-1 pcmk_host_map="ip-10-0-0-48:i-07f1ac63af0ec0ac6;ip-10-0-0-46:i-063fc5fe93b4167b2;ip-10-0-0-58:i-08bd39eb03a6fd2c7" power_timeout=240 pcmk_reboot_timeout=480 pcmk_reboot_retries=4
  3. Test the fencing agent for one of the other nodes.

    # pcs stonith fence awsnodename

    Note

    The command response may take several minutes to display. If you watch the active terminal session for the node being fenced, you see that the terminal connection is immediately terminated after you enter the fence command.

    Example:

    [root@ip-10-0-0-48 ~]# pcs stonith fence ip-10-0-0-58
    Node: ip-10-0-0-58 fenced
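
You can also review the parameters of the fence device at any time. A quick check, assuming the example device name clusterfence:

    # pcs stonith config clusterfence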

Verification steps

  1. Check the status to verify that the node is fenced.

    # pcs status

    Example:

    [root@ip-10-0-0-48 ~]# pcs status
    Cluster name: newcluster
    Stack: corosync
    Current DC: ip-10-0-0-46 (version 1.1.18-11.el7-2b07d5c5a9) - partition with quorum
    Last updated: Fri Mar  2 19:55:41 2018
    Last change: Fri Mar  2 19:24:59 2018 by root via cibadmin on ip-10-0-0-46
    
    3 nodes configured
    1 resource configured
    
    Online: [ ip-10-0-0-46 ip-10-0-0-48 ]
    OFFLINE: [ ip-10-0-0-58 ]
    
    Full list of resources:
    clusterfence  (stonith:fence_aws):    Started ip-10-0-0-46
    
    Daemon Status:
    corosync: active/disabled
    pacemaker: active/disabled
    pcsd: active/enabled
  2. Start the node that was fenced in the previous step.

    # pcs cluster start awshostname
  3. Check the status to verify the node started.

    # pcs status

    Example:

    [root@ip-10-0-0-48 ~]# pcs status
    Cluster name: newcluster
    Stack: corosync
    Current DC: ip-10-0-0-46 (version 1.1.18-11.el7-2b07d5c5a9) - partition with quorum
    Last updated: Fri Mar  2 20:01:31 2018
    Last change: Fri Mar  2 19:24:59 2018 by root via cibadmin on ip-10-0-0-48
    
    3 nodes configured
    1 resource configured
    
    Online: [ ip-10-0-0-46 ip-10-0-0-48 ip-10-0-0-58 ]
    
    Full list of resources:
    
      clusterfence  (stonith:fence_aws):    Started ip-10-0-0-46
    
    Daemon Status:
      corosync: active/disabled
      pacemaker: active/disabled
      pcsd: active/enabled

4.9. Installing the AWS CLI on cluster nodes

Previously, you installed the AWS CLI on your host system. You now need to install the AWS CLI on cluster nodes before you configure the network resource agents.

Complete the following procedure on each cluster node.

Prerequisites

You must have created an AWS Access Key and AWS Secret Access Key. See Creating the AWS Access Key and AWS Secret Access Key for more information.

Procedure

  1. Perform the procedure Installing the AWS CLI.
  2. Enter the following command to verify that the AWS CLI is configured properly. The instance IDs and instance names should display.

    Example:

    [root@ip-10-0-0-48 ~]# aws ec2 describe-instances --output text --query 'Reservations[*].Instances[*].[InstanceId,Tags[?Key==`Name`].Value]'
    i-07f1ac63af0ec0ac6
    ip-10-0-0-48
    i-063fc5fe93b4167b2
    ip-10-0-0-46
    i-08bd39eb03a6fd2c7
    ip-10-0-0-58

4.10. Installing network resource agents

For HA operations to work, the cluster uses AWS networking resource agents to enable failover functionality. If a node does not respond to a heartbeat check within a set time, the node is fenced and operations fail over to another node in the cluster. You must configure the network resource agents before failover can work.

Add the two resources that you create in the following procedures to the same resource group to enforce order and colocation constraints.

Create a secondary private IP resource and virtual IP resource

Complete the following procedure to add a secondary private IP address and create a virtual IP. You can complete this procedure from any node in the cluster.

Procedure

  1. Enter the following command to view the AWS Secondary Private IP Address resource agent (awsvip) description. This shows the options and default operations for this agent.

    # pcs resource describe awsvip
  2. Enter the following command to create the Secondary Private IP address using an unused private IP address in the VPC CIDR block.

    # pcs resource create privip awsvip secondary_private_ip=Unused-IP-Address --group group-name

    Example:

    [root@ip-10-0-0-48 ~]# pcs resource create privip awsvip secondary_private_ip=10.0.0.68 --group networking-group
  3. Create a virtual IP resource. This is a VPC IP address that can be rapidly remapped from the fenced node to the failover node, masking the failure of the fenced node within the subnet.

    # pcs resource create vip IPaddr2 ip=secondary-private-IP --group group-name

    Example:

    [root@ip-10-0-0-48 ~]# pcs resource create vip IPaddr2 ip=10.0.0.68 --group networking-group

Verification step

Enter the pcs status command to verify that the resources are running.

# pcs status

Example:

[root@ip-10-0-0-48 ~]# pcs status
Cluster name: newcluster
Stack: corosync
Current DC: ip-10-0-0-46 (version 1.1.18-11.el7-2b07d5c5a9) - partition with quorum
Last updated: Fri Mar  2 22:34:24 2018
Last change: Fri Mar  2 22:14:58 2018 by root via cibadmin on ip-10-0-0-46

3 nodes configured
3 resources configured

Online: [ ip-10-0-0-46 ip-10-0-0-48 ip-10-0-0-58 ]

Full list of resources:

clusterfence    (stonith:fence_aws):    Started ip-10-0-0-46
 Resource Group: networking-group
     privip (ocf::heartbeat:awsvip):    Started ip-10-0-0-48
     vip    (ocf::heartbeat:IPaddr2):   Started ip-10-0-0-48

Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/enabled
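
Optionally, you can also confirm the addresses at the operating system level on the node that is running the resource group. This sketch assumes the example secondary address 10.0.0.68 and a primary interface named eth0; your interface name and netmask may differ.

    [root@ip-10-0-0-48 ~]# ip -4 addr show eth0
    2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9001 qdisc mq state UP group default qlen 1000
        inet 10.0.0.48/24 brd 10.0.0.255 scope global dynamic eth0
        inet 10.0.0.68/24 brd 10.0.0.255 scope global secondary eth0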

Create an elastic IP address

An elastic IP address is a public IP address that can be rapidly remapped from the fenced node to the failover node, masking the failure of the fenced node.

Note that this is different from the virtual IP resource created earlier. The elastic IP address is used for public-facing Internet connections instead of subnet connections.

  1. Enter the following AWS CLI command to create an elastic IP address.

    [root@ip-10-0-0-48 ~]# aws ec2 allocate-address --domain vpc --output text
    eipalloc-4c4a2c45   vpc 35.169.153.122
  2. Enter the following command to view the AWS Secondary Elastic IP Address resource agent (awseip) description. This shows the options and default operations for this agent.

    # pcs resource describe awseip
  3. Create the Secondary Elastic IP address resource using the allocated IP address created in Step 1. Add the resource to the networking-group you created previously so that order and colocation constraints are enforced.

    # pcs resource create elastic awseip elastic_ip=Elastic-IP-Address allocation_id=Elastic-IP-Allocation-ID --group networking-group

    Example:

    # pcs resource create elastic awseip elastic_ip=35.169.153.122 allocation_id=eipalloc-4c4a2c45 --group networking-group

Verification step

Enter the pcs status command to verify that the resource is running.

# pcs status

Example:

[root@ip-10-0-0-58 ~]# pcs status
Cluster name: newcluster
Stack: corosync
Current DC: ip-10-0-0-58 (version 1.1.18-11.el7-2b07d5c5a9) - partition with quorum
Last updated: Mon Mar  5 16:27:55 2018
Last change: Mon Mar  5 15:57:51 2018 by root via cibadmin on ip-10-0-0-46

3 nodes configured
4 resources configured

Online: [ ip-10-0-0-46 ip-10-0-0-48 ip-10-0-0-58 ]

Full list of resources:

 clusterfence   (stonith:fence_aws):    Started ip-10-0-0-46
 Resource Group: networking-group
     privip (ocf::heartbeat:awsvip):  Started ip-10-0-0-48
     vip    (ocf::heartbeat:IPaddr2):    Started ip-10-0-0-48
     elastic (ocf::heartbeat:awseip):    Started ip-10-0-0-48

Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/enabled
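
Optionally, you can also confirm the association at the AWS level. This check assumes the example allocation ID created earlier; the output should show the elastic IP address associated with the instance that is currently running the resource group.

    # aws ec2 describe-addresses --allocation-ids eipalloc-4c4a2c45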

Test the elastic IP address

Enter the following commands to verify that the virtual IP (awsvip) and elastic IP (awseip) resources are working.

Procedure

  1. Launch an SSH session from your local workstation to the elastic IP address previously created.

    $ ssh -l ec2-user -i ~/.ssh/<KeyName>.pem elastic-IP

    Example:

    $ ssh -l ec2-user -i ~/.ssh/cluster-admin.pem 35.169.153.122
  2. Verify that the host you connected to via SSH is the host associated with the elastic resource created.
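
    For example, you can compare the host name reported by the instance with the node listed as running the elastic resource in the pcs status output. The host name in this sketch follows the examples above.

    $ hostname
    ip-10-0-0-48.ec2.internal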