Chapter 4. Database Cluster Setup

Cluster services are used to automatically manage the primary and standby databases. For the reference environment, the Red Hat Enterprise Linux High Availability Add-On is used. Refer to Red Hat Enterprise Linux 7: High Availability Add-On Reference [4] for additional information.

4.1. Setup

Before setting up the cluster, stop the database on both nodes and open the firewall for the High Availability Add-On services:

#systemctl stop $APPLIANCE_PG_SERVICE

#firewall-cmd --permanent --add-service=high-availability
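
The --permanent rule does not take effect on the running firewall until firewalld is reloaded (or the system is rebooted). To apply it immediately, reload the firewall:

#firewall-cmd --reload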

Set the password for the 'hacluster' user, which is created when pacemaker is installed. Be sure the hacluster password is the same on both systems.

#passwd hacluster

Enable the pcsd daemon:

#systemctl enable pcsd

#systemctl start pcsd

Install the json_pure gem which is required by the pcs command:

#gem install json_pure

Set the following shell variables on each database system, replacing the placeholders with the actual hostnames. Pick a cluster name that is unique.

#NODE1=NODE1_HOSTNAME

#NODE2=NODE2_HOSTNAME

#CLUSTER_NAME=CLUSTER_NAME
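
For example, with hypothetical hostnames and cluster name (substitute the values used in the actual environment):

#NODE1=db01.example.com

#NODE2=db02.example.com

#CLUSTER_NAME=cfme_db_cluster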

Run this command to set up authentication to pcsd on both nodes. It prompts for the hacluster password:

#pcs cluster auth $NODE1 $NODE2 -u hacluster

On NODE1 (primary), create the cluster, then start and enable it on both nodes:

#pcs cluster setup --start --enable --name $CLUSTER_NAME $NODE1 $NODE2

Verify the cluster configuration with the following command:

#pcs status

Next, configure fencing to allow a surviving cluster node to forcibly remove a non-responsive node from the cluster.

Note

For the reference environment, VMware components are used as the virtualization provider. If Red Hat virtualization technologies are used instead, the VMware-specific configuration that follows does not apply. Refer to Red Hat Enterprise Linux 7: High Availability Add-On Administration [5] for additional information on fencing configuration and setup.

Check that both database systems can access the SOAP API of the vCenter they are running on. The vCenter user needs to be able to power virtual machines on and off.

#fence_vmware_soap -o list -a VCENTER_HOSTNAME -l VCENTER_USERNAME -p VCENTER_PASSWORD -z
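
To confirm that both database virtual machines are visible to the fencing agent, the list can be filtered, for example (this assumes the virtual machine names match the cluster node hostnames):

#fence_vmware_soap -o list -a VCENTER_HOSTNAME -l VCENTER_USERNAME -p VCENTER_PASSWORD -z | grep -i -e "$NODE1" -e "$NODE2"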

Add vCenter as a fence device to the cluster. This only needs to be run on the first database system:

#pcs stonith create VCENTER_NAME_stonith fence_vmware_soap ipaddr="VCENTER_HOSTNAME" ssl="1" login='VCENTER_USERNAME' passwd='VCENTER_PASSWORD' pcmk_host_list="$NODE1,$NODE2" pcmk_host_check="static-list"
Note

For virtualization environments it is recommended to configure pcmk_host_map. Refer to the following knowledge base article for configuration: https://access.redhat.com/solutions/701463
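
For example, if the virtual machine names in vCenter differ from the cluster node hostnames, a mapping could be added to the existing stonith resource as follows, where VM_NAME_1 and VM_NAME_2 are placeholder virtual machine names:

#pcs stonith update VCENTER_NAME_stonith pcmk_host_map="$NODE1:VM_NAME_1;$NODE2:VM_NAME_2"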

On both database systems, verify that fencing works. The following command reboots NODE2 when run on NODE1.

#pcs stonith fence $NODE2

When it has finished rebooting, fence NODE1 from NODE2:

#pcs stonith fence $NODE1

After each of the systems comes back, verify that the following command shows both cluster nodes as online, and the fence_vmware_soap agent as started:

#pcs status

Add the virtual IP address that the CloudForms appliances will use to connect to the primary database.

#pcs resource create pgvip ocf:heartbeat:IPaddr2 ip=VIP cidr_netmask=NETWORK_PREFIX iflabel=pgvip meta target-role=Started
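
For example, with a hypothetical virtual IP address and network prefix:

#pcs resource create pgvip ocf:heartbeat:IPaddr2 ip=192.0.2.10 cidr_netmask=24 iflabel=pgvip meta target-role=Started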

Next, create the resource for the cluster to run the PostgreSQL database. Because a newer PostgreSQL version from the Red Hat Software Collections channel is used, create two wrapper scripts on both database systems:

/usr/local/bin/pg_ctl:

#cat >/usr/local/bin/pg_ctl <<'EOF'
#!/bin/bash
scl enable rh-postgresql94 -- pg_ctl "$@"
EOF

/usr/local/bin/psql:

#cat >/usr/local/bin/psql <<'EOF'
#!/bin/bash
scl enable rh-postgresql94 -- psql "$@"
EOF

#chmod 755 /usr/local/bin/{pg_ctl,psql}
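
To confirm that the wrapper scripts work, they can be invoked directly, for example:

#/usr/local/bin/pg_ctl --version

#/usr/local/bin/psql --version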

On NODE1, create the resource for the cluster to manage the PostgreSQL service. Replace REPLICATOR_PASSWORD and REPLICATION_VIP with the actual values.

#pcs resource create postgresql pgsql \
  pgctl=/usr/local/bin/pg_ctl \
  pgdata=/var/opt/rh/rh-postgresql94/lib/pgsql/data/ \
  psql=/usr/local/bin/psql \
  config=/var/opt/rh/rh-postgresql94/lib/pgsql/data/postgresql.conf \
  rep_mode=async \
  repuser=replicator \
  primary_conninfo_opt="password=REPLICATOR_PASSWORD" \
  node_list="$NODE1 $NODE2" \
  restore_command='cp /var/opt/rh/rh-postgresql94/lib/pgsql/wal-archive/%f "%p"' \
  master_ip=REPLICATION_VIP \
  tmpdir=/var/opt/rh/rh-postgresql94/lib/pgsql/tmp \
  check_wal_receiver=false \
  restart_on_promote=true \
  op start timeout="60s" interval="0s" on-fail="restart" \
  op monitor timeout="60s" interval="4s" on-fail="restart" \
  op monitor timeout="60s" interval="3s" on-fail="restart" role="Master" \
  op promote timeout="60s" interval="0s" on-fail="restart" \
  op demote timeout="60s" interval="0s" on-fail="stop" \
  op stop timeout="60s" interval="0s" \
  op notify timeout="60s" interval="0s"

#pcs resource master postgresql-ms postgresql master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true
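
The full list of parameters accepted by the pgsql resource agent, including those used above, can be reviewed with:

#pcs resource describe pgsql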

The low timeout and interval values on the monitor operations speed up the master/slave negotiation when the resource is started.

Configure the cluster to keep the replication and service IP address on the same cluster node as the primary database.

#pcs constraint colocation add pgvip with Master postgresql-ms INFINITY

#pcs constraint order promote postgresql-ms then start pgvip symmetrical=false score=INFINITY

#pcs constraint order demote postgresql-ms then stop pgvip symmetrical=false score=0
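
The colocation and ordering constraints just created can be reviewed with:

#pcs constraint show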

Replace the postgresql resource agent file to support pglogical failover. The new resource agent runs the syncsubs method every 10 minutes, as defined in the resource configuration. Syncsubs reads the subscription information from the pglogical tables and writes it to a target location, which is either a local file or a remote location reachable over SSH. When a failure occurs, the subscription information is already available to the new master, which reads it in and recreates the subscription.

#cp pgsql /usr/lib/ocf/resource.d/heartbeat/
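
If the replacement agent was not copied with execute permissions, make it executable so the cluster can run it (this assumes the destination directory above):

#chmod 755 /usr/lib/ocf/resource.d/heartbeat/pgsql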

Verify PostgreSQL is started:

#pcs status

This should show that the database is running as master on one node, and as slave on the other node. If it shows that the database is stopped on both nodes and provides "failed actions" details, run the following to help diagnose the issue:

#pcs resource debug-start postgresql

One more task is to ensure that old WAL files are deleted. This could be done with the archive_cleanup_command option of the pgsql cluster resource script; however, it is desirable to keep WAL files longer than replication needs them so that the database can be reconstructed from an older backup.

To do this, create a script in /etc/cron.daily on both nodes. The find command at the end of the script is run as the postgres user because, in production, the wal-archive directory is an NFS mount that is not readable by the root user.

/etc/cron.daily/pgsql-replication:

#cat >/etc/cron.daily/pgsql-replication <<'EOF'
#!/bin/bash
# Delete archived WAL files after N days. This assumes that we take a full backup more often than that.
set -eu

# Exit if we are on the standby node. We only need to run the delete command once,
# and the primary node is the one that writes the files.
if [ -e /var/opt/rh/rh-postgresql94/lib/pgsql/data/recovery.conf ]; then
    exit 0
fi

# The number of days after which archived WAL files will be deleted.
MAX_AGE=3

su -c "find /var/opt/rh/rh-postgresql94/lib/pgsql/wal-archive -maxdepth 1 -mtime +$MAX_AGE -type f -delete" postgres
EOF

#chmod +x /etc/cron.daily/pgsql-replication
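
The script can be run once by hand on the primary database node to confirm that it completes without errors before relying on the daily cron job:

#/etc/cron.daily/pgsql-replication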

4.1.1. Operations

To stop all cluster resources on one node:

#pcs cluster standby $HOSTNAME

If that node is the primary database, the remaining node should automatically switch from standby to primary.

To re-enable the node to host cluster resources:

#pcs cluster unstandby $HOSTNAME
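
For example, assuming NODE1 currently hosts the primary database, a controlled switchover test could look like this:

#pcs cluster standby $NODE1

#pcs status

#pcs cluster unstandby $NODE1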

The pgsql cluster resource agent uses a lock file to track clean shutdowns of the primary/standby combination and refuses to make a node the primary if the lock file is present.

This lock file is only deleted when the standby is stopped first, and then the primary is stopped. If the primary is stopped first, or is fenced, and the standby takes over, the lock file is not deleted.

In the reference environment, this precaution is not necessary because the Write Ahead Logs are archived. If PostgreSQL remains in a “Stopped” state on a node and the "Failed actions" list shows "unknown error", run the following command to help diagnose the issue:

#pcs resource debug-start postgresql

If the node displays:

ERROR: My data may be inconsistent.

You have to remove /var/opt/rh/rh-postgresql94/lib/pgsql/tmp/PGSQL.lock file to force start.

then delete /var/opt/rh/rh-postgresql94/lib/pgsql/tmp/PGSQL.lock and run:

#pcs resource cleanup postgresql