Automating SAP HANA Multi Target System Replication in a Pacemaker-based cluster on Red Hat Enterprise Linux (RHEL)


1. Overview

This document describes how to configure Multi Target Replication (MTR) in a pacemaker-based automated SAP HANA system replication cluster on Red Hat Enterprise Linux (RHEL).

The basis of the setup is an Automated HANA System Replication cluster with 2 sites in a Scale-Out environment. See Red Hat Enterprise Linux HA Solution for SAP HANA Scale-Out and System Replication for more information.

Two secondary HANA instances are required.
All instances must be sized identically, i.e. they require the same:

  • amount of RAM
  • number of CPUs
  • number of nodes

Scale-Out requires multiple nodes per HANA instance. See SAP HANA Administration Guide for SAP HANA Platform for more information.

Additionally, 3 sites are required:

  • SITE 1 or DC1
  • SITE 2 or DC2
  • SITE 3 or DC3

The initial setup is as follows:

  • Replicate Primary Site 1 (DC1) to Secondary Site 2 (DC2)
  • Replicate Primary Site 1 (DC1) to Secondary Site 3 (DC3)

If the primary fails, Secondary Site 2 (DC2) automatically becomes the new primary, and Site 3 (DC3) follows the new primary.

When a failover occurs, this solution ensures that the replication source for the 3rd site is switched as well. The configuration after failover is as follows:

Assuming Site 1 (DC1) fails:

  • Replicate the new Primary Site 2 (DC2) to Secondary Site 3 (DC3)

If the primary falls back to Site 1 (DC1), Site 3 (DC3) needs to be re-registered to Site 1 (DC1) again.

2. Supported Scenarios

For more information about supported scenarios refer to Support Policies for RHEL High Availability Clusters - Management of SAP HANA in a Cluster.

3. Parameters

Parameter    Example  Description
SID          MTR      System ID of the HANA database
1st SITE     DC1      Name of the 1st site, where the primary is running
2nd SITE     DC2      Name of the 2nd site, where the 1st secondary is running
3rd SITE     DC3      Name of the 3rd site, where the 2nd secondary is running
InstanceNr   00       HANA instance number

4. Preconditions

To support Multi Target Replication (MTR) you need the following:

  • SAP HANA 2.0 SPS04 or later
  • resource-agents-sap-hana-scaleout version 180 or later

The parameter register_secondaries_on_takeover is available for HANA 2.0 SPS04 and later versions.
It enables automatic registration of site 3 (DC3) to the new primary should a failover of the primary HANA server occur. The setup is based on Red Hat Enterprise Linux HA Solution for SAP HANA Scale Out and System Replication.

The system replication setup of all the nodes must follow the SAP requirements. For more information, refer to the SAP guidelines in the SAP HANA Administration Guide.
The nodes of the additional 3rd location (site) will then be added to the cluster.

5. Installation

The steps below describe how to install a 3-site HANA multi target replication cluster.

If your starting point is already a 2 site cluster setup, some of the steps are not needed.

5.1. Overview

  • This section describes how to set up a 3 site cluster for a primary and two secondary SAP HANA instances (in different availability zones), or how to upgrade an existing 2 site cluster. For more information see 6. Upgrading to 3 site MTR.

  • Install HANA using the hdblcm utility

  • Copy the database keys from the primary server to the secondary database servers
  • Register the secondary HANA instance
  • Check the HANA replication status with python systemReplicationStatus.py or hdbnsutil -sr_state
  • Add register_secondaries_on_takeover=true in global.ini on primary and secondary instances
  • Edit /etc/sudoers.d/20-saphana
  • Add the nodes of the third site to the cluster, including corosync and fencing
  • Add constraints for the 3rd site nodes for the resources SAPHanaController and SAPHanaTopology
  • Verify that the installation is successful by running the necessary tests

5.2. Checking base setup

Note: Please ensure that you are using the correct resource-agents-sap-hana-scaleout package on all of the cluster nodes.

While creating the cluster you can either include the additional nodes for the third site during this step, or add them later to your configuration.

The steps for adding the new nodes are as follows:

  • install the cluster packages (pcs, pacemaker, fence-agents)
  • systemctl start pcsd.service
  • passwd hacluster
  • pcs host auth <nodename>

With this you can do the following:

  • pcs cluster setup .. # create a cluster
  • pcs cluster node add .. # add nodes to an existing cluster

It is recommended to configure at least 2 internal networks between the nodes. A consolidated example of the steps above is shown below.
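
The following is a minimal sketch of the node preparation and cluster creation, assuming hypothetical node names (dc1hana01, dc2hana01, majoritymaker), a hypothetical cluster name cluster1 and two example internal networks; a real Scale-Out cluster has more nodes per site, and the DC3 nodes can either be included here or added later as described in section 5.5. Adapt node names and addresses to your environment:

# on every cluster node: install the packages, start pcsd and set the hacluster password
yum install -y pcs pacemaker fence-agents
systemctl enable --now pcsd.service
passwd hacluster

# on one node: authorize all nodes (prompts for the hacluster user and password)
pcs host auth dc1hana01 dc2hana01 majoritymaker

# create the cluster with two links per node (addresses are examples)
pcs cluster setup cluster1 \
    dc1hana01 addr=192.168.1.11 addr=192.168.2.11 \
    dc2hana01 addr=192.168.1.12 addr=192.168.2.12 \
    majoritymaker addr=192.168.1.14 addr=192.168.2.14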

See Red Hat Enterprise Linux HA Solution for SAP HANA Scale Out and System Replication for further details on how to configure the cluster and its resources.

5.3. HANA Installation

The HANA installation should be completed before creating the SAPHana resources.
If you have already installed HANA on the primary node, you can continue to install HANA on the new secondary node using the same:

  • SID
  • InstanceNumber
  • sidadm user ID

The installation of the 3rd site is similar to the installation of the 1st secondary site.
To disable control of the SAP resource agents, you can stop the cluster with the following commands:

  • pcs cluster stop
  • pcs cluster stop --all

or stop all resources

  • pcs resource # This will list all Resources
  • pcs resource disable <resourcename> # This will stop (disable) the resources

If the filesystems are controlled by the cluster, the filesystems will need to be manually mounted.
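
A short sketch of checking the mount points (assuming the HANA filesystems are defined in /etc/fstab on the DC3 nodes; adapt the mount points to your layout):

df -h /hana/shared /hana/data /hana/log   # verify that the HANA filesystems are mounted
mount /hana/shared                        # mount manually if needed (uses the /etc/fstab entry)
mount /hana/data
mount /hana/log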

To install HANA follow the steps below:

  • Check if the cluster is stopped
  • Check the mountpoints
  • Install HANA on the 3rd site
  • Copy the key files from that of the primary
  • Register the secondary HANA node on DC3

Example output of HANA Installation:

DC1/DC2# pcs cluster stop --all
DC3# df
DC3# cd /sapcd252/DATA_UNITS/HDB_SERVER_LINUX_X86_64
[root@DC3:/sapcd252/DATA_UNITS/HDB_SERVER_LINUX_X86_64]# ./hdbuninst
# Option 0 will remove an already existing HANA installation
# "No SAP HANA Installation found" is the expected answer
[root@DC3:/sapcd252/DATA_UNITS/HDB_SERVER_LINUX_X86_64]# ./hdblcm
1 install
2 server
/hana/shared is default directory
Enter Local Host Name [dc3host]: use the default name
Do you want to add additional hosts to the system? (y/n) [n]: add hosts only during a Scale-Out installation
Enter SAP HANA System ID: MTR
Enter Instance Number [00]:
Enter Local Host Worker Group [default]:
Select System Usage / Enter Index [4]:
Choose encryption
Enter Location of Data Volumes [/hana/data/MTR]:
Enter Location of Log Volumes [/hana/log/MTR]:
Restrict maximum memory allocation? [n]:
Enter Certificate Host Name
Enter System Administrator (mtradm) Password: <Y0urPasswd>
Confirm System Administrator (mtradm) Password: <Y0urPasswd>
Enter System Administrator Home Directory [/usr/sap/MTR/home]:
Enter System Administrator Login Shell [/bin/sh]:
Enter System Administrator User ID [1000]:
Enter System Database User (SYSTEM) Password: <Y0urPasswd>
Confirm System Database User (SYSTEM) Password: <Y0urPasswd>
Restart system after machine reboot? [n]:

Before the installation starts a summary is listed:

SAP HANA Database System Installation
   Installation Parameters
      Remote Execution: ssh
      Database Isolation: low
      Install Execution Mode: standard
      Installation Path: /hana/shared
      Local Host Name: dc3host
      SAP HANA System ID: MTR
      Instance Number: 00
      Local Host Worker Group: default
      System Usage: custom
      Location of Data Volumes: /hana/data/MTR
      Location of Log Volumes: /hana/log/MTR
      SAP HANA Database secure store: ssfs
      Certificate Host Names: dc3host -> dc3host
      System Administrator Home Directory: /usr/sap/MTR/home
      System Administrator Login Shell: /bin/sh
      System Administrator User ID: 1000
      ID of User Group (sapsys): 1010
   Software Components
      SAP HANA Database
         Install version 2.00.052.00.1599235305
         Location: /sapcd252/DATA_UNITS/HDB_SERVER_LINUX_X86_64/server
      SAP HANA Local Secure Store
         Do not install
      SAP HANA AFL (incl.PAL,BFL,OFL)
         Do not install
      SAP HANA EML AFL
         Do not install
      SAP HANA EPM-MDS
         Do not install
      SAP HANA Database Client
         Do not install
      SAP HANA Studio
         Do not install
      SAP HANA Smart Data Access
         Do not install
      SAP HANA XS Advanced Runtime
         Do not install
   Log File Locations
      Log directory: /var/tmp/hdb_MTR_hdblcm_install_2021-06-09_18.48.13
      Trace location: /var/tmp/hdblcm_2021-06-09_18.48.13_31307.trc

Do you want to continue? (y/n):

Entering y starts the installation.

If your Scale-Out architecture runs a HANA instance with multiple nodes, you need to use the same mount point /hana/shared per site. You can either install the additional HANA nodes of a site by adding the hosts during the installation, or install them later with the command /hana/shared/MTR/hdblcm/hdblcm.
For more information on how to install SAP HANA Scale-Out nodes, refer to SAP HANA Administration Guide for SAP HANA Platform.
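
As a sketch (assuming a hypothetical additional worker host dc3hana02 and the resident hdblcm of the newly installed system; check the SAP documentation for the options valid for your HANA revision), additional hosts can be added to an existing installation like this:

# run as root on a host of the DC3 HANA instance
/hana/shared/MTR/hdblcm/hdblcm --action=add_hosts --addhosts=dc3hana02:role=worker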

5.4. Registering the 3rd HANA instance as a new additional secondary

This step is similar to the registration of the first secondary HANA instance (DC2).

To check if the HANA instance on the primary is up and running, run the command sapcontrol -nr $TINSTANCE -function GetProcessList; echo $?

If the return code is 3, this indicates that the HANA instance is successfully running on the node where the command was started.

To check if HANA System Replication is enabled on the primary system, run the following command:

> hdbnsutil -sr_state | grep ^mode:
mode: primary

To copy the keys from the primary to the secondary site, run the following commands (in this example, SID=MTR). Note that in Scale-Out environments only one copy per site is necessary.

# scp -rp /usr/sap/MTR/SYS/global/security/rsecssfs/data/SSFS_MTR.DAT  dc3hana01:/usr/sap/MTR/SYS/global/security/rsecssfs/data/SSFS_MTR.DAT
# scp -rp /usr/sap/MTR/SYS/global/security/rsecssfs/key/SSFS_MTR.KEY dc3hana01:/usr/sap/MTR/SYS/global/security/rsecssfs/key/SSFS_MTR.KEY

Register the HANA instance on the 3rd site as an additional secondary. As the sidadm user, run the following command:

mtradm@dc3hana01: hdbnsutil -sr_register --name=DC3 --remoteHost=dc1hana01 --remoteInstance=00 --replicationMode=sync --operationMode=logreplay --online

At this point the HANA instance should be up and running. If the instance is running and you do not want to stop it, you can use the option --online, which registers the instance while it is online. The necessary restart (stop and start) of the instance is then initiated automatically.

Note: The --online option works for both offline and online databases.
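
After the registration, a quick check on the primary (a sketch assuming the hypothetical primary master node dc1hana01 and the mtradm user) should list both DC2 and DC3 as secondaries:

mtradm@dc1hana01% python /usr/sap/MTR/HDB00/exe/python_support/systemReplicationStatus.py
mtradm@dc1hana01% hdbnsutil -sr_state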

5.5. Adding the node(s) of the third site to the cluster (optional)

Note: This section can be skipped if the site 3 nodes are already part of the cluster.

This chapter describes the necessary steps to add 3rd site nodes to an existing 2 site cluster.
If all of the HANA nodes have been installed, the new site can be integrated into the existing cluster. It is recommended to stop the existing SAP HANA resources first by running the following commands:

dc1hana01# pcs resource disable rsc_SAPHanaTopology_MTR_HDB00-clone
dc1hana01# pcs resource disable rsc_SAPHana_MTR_HDB00-clone

1. Install the same cluster software packages on the DC3 nodes as on the existing cluster nodes:

Example:

yum install pcs pacemaker fence-agents resource-agents-sap-hana-scaleout

2. Enable and start the cluster service pcsd:

systemctl enable pcsd.service; systemctl start pcsd.service

3. Set the password of the hacluster user on all new nodes:

passwd hacluster

4. Authorize the new nodes:

pcs host auth dc3hana01
pcs host auth dc3hana02

5. Add the new nodes to the existing cluster:

pcs cluster node add dc3hana01
pcs cluster node add dc3hana02

If you use several networks, the syntax of the node add command is similar to that of the cluster setup command, as shown in the sketch below.
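
A sketch, assuming the same two hypothetical internal networks as in the cluster setup example in section 5.2:

pcs cluster node add dc3hana01 addr=192.168.1.13 addr=192.168.2.13
pcs cluster node add dc3hana02 addr=192.168.1.15 addr=192.168.2.15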

5.6. Checking the cluster node(s) on the 3rd site

If the cluster has been started on all nodes you can check if all the nodes are online with
pcs status or pcs status --full

You can start the nodes with the following commands:

pcs cluster start or
pcs cluster start --all.

You can also enable the cluster with
pcs cluster enable [--all].

The SAP HANA resource should be disabled until the constraints are defined. This can be checked with pcs resource. The resources can be disabled with the following command:

pcs resource disable rsc_SAPHana_MTR_HDB00
pcs resource disable rsc_SAPHanaTopology_MTR_HDB00

5.7. Adding MTR support

The 3rd site must be automatically switched to the new primary when a failover occurs.
HANA 2.0 SPS04 provides the system replication option register_secondaries_on_takeover = true, which forces the attached site to re-register to the new primary.

5.7.1 Configuring global.ini

This option needs to be added to the global.ini file of the HANA instances on site 1 and site 2, which are managed by the pacemaker cluster.

Note: The global.ini file should only be edited manually while the HANA instance of that site is stopped.

The global.ini file can be edited by the sidadm (mtradm) user:

vim /usr/sap/${SAPSYSTEMNAME}/SYS/global/hdb/custom/config/global.ini

Example of global.ini:

# global.ini last modified 2021-08-02 06:22:24.786543 by hdbnsutil -sr_register --remoteHost=lsh40402 --remoteInstance=00 --replicationMode=syncmem --operationMode=logreplay --name=DC1
[communication]
listeninterface = .internal

[ha_dr_provider_SAPHanaSR]
provider = SAPHanaSR
path = /hana/shared/myHooks
execution_order = 1

[system_replication]
timetravel_logreplay_mode = auto
operation_mode = logreplay
site_id = 1
site_name = DC1
register_secondaries_on_takeover = true
mode = primary
actual_mode = syncmem

[system_replication_site_masters]
2 = lsh40402:30001 lsh40404:30001

[trace]
ha_dr_saphanasr = info
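
If the instance is running and you prefer not to edit the file by hand, the same parameter can alternatively be set online via SQL. A sketch, assuming access to the SYSTEMDB as a user with the required privileges:

mtradm% hdbsql -i 00 -d SYSTEMDB -u SYSTEM "ALTER SYSTEM ALTER CONFIGURATION ('global.ini', 'SYSTEM') SET ('system_replication', 'register_secondaries_on_takeover') = 'true' WITH RECONFIGURE"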

5.7.2 HA/DR Hook SAPHanaSR.py

In addition to MTR, the HA/DR hook SAPHanaSR.py needs to be added to the global.ini file of the HANA instances controlled by the cluster.

(This is only required on dc1hana01, dc1hana02, dc1hana03, dc2hana01, dc2hana02 and dc2hana03, but not on the dc3hana* nodes.)

[ha_dr_provider_SAPHanaSR]
provider = SAPHanaSR
path = /hana/shared/myHooks
execution_order = 1

[trace]
ha_dr_saphanasr = info

The SAPHanaSR.py script needs to be copied to the hook directory with the following commands (if /hana/shared is shared per site, one copy per site is sufficient):

cp /usr/share/SAPHanaSR-ScaleOut/SAPHanaSR.py /hana/shared/myHooks/
chown mtradm:sapsys /hana/shared/myHooks/SAPHanaSR.py
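
Note: The commands above assume that the directory /hana/shared/myHooks already exists. If it does not, create it first:

mkdir -p /hana/shared/myHooks
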
5.7.3 Adding the sudoers config file

Status updates are written into the CIB (cluster information base) using the command /usr/sbin/crm_attribute, which normally requires root access.
To allow the sidadm user to update the CIB, a new file is created that permits sudo execution of this command.

To enable access, create a file /etc/sudoers.d/20-saphana on all nodes of site 1 and site 2 with the following contents:

mtradm ALL=(ALL) NOPASSWD:  /usr/sbin/crm_attribute -n hana_mtr_*

To check if the permissions work as expected, log in as the <sid>adm user (mtradm in this case) and run the following command:

mtradm% sudo /usr/sbin/crm_attribute -n hana_mtr_*
scope=crm_config  name=hana_mtr_* value=(null)
Error performing operation: No such device or address

The output above shows a successful configuration with the correct parameter hana_mtr_*.

The second call uses a parameter that is not covered by the sudoers entry:

[mtradm@dc1hana01: HDB00]# sudo /usr/sbin/crm_attribute -n hana_rh1_*
[sudo] password for mtradm:

The wrong parameter hana_rh1_* causes the system to request a password, which shows that the sudoers entry only grants password-less sudo access for the hana_mtr_* attributes.

5.8 Resource Configuration

This chapter shows examples of how to create SAPHanaTopology and SAPHanaController resources.

In this example the configuration is as follows:

SID=MTR
InstanceNr=00
Name of the 3rd site: DC3
rsc_SAPHanaTopology_MTR_HDB00
rsc_SAPHana_MTR_HDB00
rsc_SAPHanaDR_MTR_HDB00

Example creating a SAPHanaTopology resource:

pcs resource create rsc_SAPHanaTopology_MTR_HDB00 SAPHanaTopology SID=MTR InstanceNumber=00 op methods interval=0s timeout=5 op monitor interval=10 timeout=600  --disabled

pcs resource clone rsc_SAPHanaTopology_MTR_HDB00 clone-node-max=1 interleave=true

Example creating a SAPHanaController resource:

pcs resource create rsc_SAPHana_MTR_HDB00 SAPHanaController SID=MTR \
InstanceNumber=00 PREFER_SITE_TAKEOVER=true \
DUPLICATE_PRIMARY_TIMEOUT=7200 AUTOMATED_REGISTER=true \
op demote interval=0s timeout=320 op methods interval=0s timeout=5 \
op monitor interval=59 role="Master" timeout=700 op monitor interval=61 \
role="Slave" timeout=700 op promote interval=0 timeout=3600 \
op start interval=0 timeout=3600 op stop interval=0 timeout=3600 \
meta migration-threshold=4 --disabled

pcs resource promotable rsc_SAPHana_MTR_HDB00 promoted-max=1 clone-node-max=1 interleave=true

Current help can be displayed with
pcs resource describe SAPHanaTopology and
pcs resource describe SAPHanaController

The resources are created with the option --disabled and should be started after all constraints are created.

The meta option migration-threshold defines the number of failures after which a resource is moved to another node. For tests, lower numbers are recommended (see the example below).

For more information about migration-threshold please check Moving Resources Due to Failure.
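
A sketch of temporarily lowering the value for tests (resource name as in the examples above):

pcs resource update rsc_SAPHana_MTR_HDB00 meta migration-threshold=2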

Note: Resources created with --disabled must be started manually at a later stage.

5.9. Constraints for a Scale-Out MTR

The necessary constraints need to be added in order to isolate the 2 site configuration from the additional 3rd site.

The SAPHana resources may only run on site 1 and site 2, but not on site 3.

The following constraints will need to be configured:

pcs constraint location rsc_SAPHanaTopology_MTR_HDB00-clone avoids  dc3hana01
pcs constraint location rsc_SAPHanaTopology_MTR_HDB00-clone avoids  dc3hana02
pcs constraint location rsc_SAPHanaTopology_MTR_HDB00-clone avoids  dc3hana03
pcs constraint location rsc_SAPHanaTopology_MTR_HDB00-clone avoids  majoritymaker

pcs constraint location rsc_SAPHana_MTR_HDB00-clone avoids dc3hana01
pcs constraint location rsc_SAPHana_MTR_HDB00-clone avoids dc3hana02
pcs constraint location rsc_SAPHana_MTR_HDB00-clone avoids dc3hana03
pcs constraint location rsc_SAPHana_MTR_HDB00-clone avoids majoritymaker

The SAPHanaTopology resource needs to be started before the SAPHana resource.

Use the command pcs constraint order to check whether the order constraints exist. If they are missing, add them with the following commands:

pcs constraint order rsc_SAPHanaTopology_MTR_HDB00 then start rsc_SAPHana_MTR_HDB00
pcs constraint order rsc_SAPHana_MTR_HDB00 then start rsc_SAPHanaDR_MTR_HDB00

6. Upgrading to 3 site MTR

To upgrade an existing cluster the following steps are necessary:

  • Install HANA on site 3
  • Copy the HANA keys from the primary HANA server
  • Register HANA on site 3 as an additional secondary HANA instance
  • Add the site 3 nodes to the cluster
  • Update the existing package resource-agents-sap-hana-scaleout (version 180 or higher)
  • Add register_secondaries_on_takeover = true to global.ini
  • Edit /etc/sudoers.d/20-saphana
  • Add constraints for the 3rd site nodes for the resources SAPHanaController and SAPHanaTopology

7. Verifying the installation

Testing the cluster is recommended, but not mandatory.

To test the cluster the following steps are required:

  • Check if the cluster is running
  • Check if the resources have been started
  • Check if the databases are running
  • Check the status of system replication
  • Shut down the HANA instance and verify

7.1. Check if the cluster is running or start the cluster

pcs cluster status # This will show if all nodes are online
pcs cluster start --all # This will start the cluster on all nodes

Note: Fencing must be configured and tested. To obtain a solution that is as automated as possible, the cluster must be enabled permanently so that it starts automatically after a reboot. In a production environment it can be preferable to disable the automatic start, so that an administrator can intervene manually.

Example of fencing a node: pcs stonith fence dc2hana01. To enable or disable the cluster use the commands pcs cluster enable --all or pcs cluster disable --all.

7.2. Check if the resources are started

Use pcs resource to check the status of all resources.
If a resource is not running you can start it with the commands below:

pcs resource enable rsc_SAPHanaTopology_MTR_HDB00
pcs resource enable rsc_SAPHana_MTR_HDB00

7.3. Check if the databases are running

The running HANA instances can be checked with the commands below:

[mtradm@dc1hana01]# sapcontrol -nr $TINSTANCE -function GetProcessList

03.08.2021 18:13:55
GetProcessList
OK
name, description, dispstatus, textstatus, starttime, elapsedtime, pid
hdbdaemon, HDB Daemon, GREEN, Running, 2021 08 03 14:43:33, 3:30:22, 4012592
hdbcompileserver, HDB Compileserver, GREEN, Running, 2021 08 03 14:44:29, 3:29:26, 4016801
hdbindexserver, HDB Indexserver-MTR, GREEN, Running, 2021 08 03 14:44:31, 3:29:24, 4016936
hdbnameserver, HDB Nameserver, GREEN, Running, 2021 08 03 14:43:34, 3:30:21, 4012616
hdbpreprocessor, HDB Preprocessor, GREEN, Running, 2021 08 03 14:44:29, 3:29:26, 4016804
hdbwebdispatcher, HDB Web Dispatcher, GREEN, Running, 2021 08 03 14:45:44, 3:28:11, 4018667
hdbxsengine, HDB XSEngine-MTR, GREEN, Running, 2021 08 03 14:44:31, 3:29:24, 4016939
[mtradm@dc1hana01]# echo $?
3

A return code of 3 means that the HANA instance has been successfully configured and is running.

If the database is not running, it can be started with the command below:

(mtradm)% sapcontrol -nr ${TINSTANCE} -function StartSystem HDB

The database on the primary and secondary site can be started by enabling the SAPHana resource. This will prompt the cluster to start the databases.

7.4. Checking the System Replication Status

The correct System Replication Status can be checked on the primary node as sidadm with the command below:

python /usr/sap/$SAPSYSTEMNAME/HDB${TINSTANCE}/exe/python_support/systemReplicationStatus.py ; echo Status $?

The expected results for the return code are:

  • NoHSR = 10
  • Error = 11
  • Unknown = 12
  • Initializing = 13
  • Syncing = 14
  • Active = 15

In most cases the command will return with return code 15 (Active).
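
A minimal sketch of using the return code in a script, assuming it is run as the sidadm user on the primary node:

python /usr/sap/$SAPSYSTEMNAME/HDB${TINSTANCE}/exe/python_support/systemReplicationStatus.py > /dev/null
rc=$?
if [ $rc -eq 15 ]; then
    echo "System replication is ACTIVE"
else
    echo "System replication is not active (return code $rc)"
fi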

7.5. Check Cluster Consistency

During the installation, the resources are sometimes started before the configuration is completed. This can result in entries in the cluster information base (CIB) that cause incorrect behavior of the cluster.
These entries can easily be checked and also corrected after the configuration has been completed.

If the name of the third site is for example DC3 you can check the CIB with the command:

[root@dc1hana01: ~]# cibadmin --query |grep 'DC3"'

The command can be executed on any node in the cluster.

Usually the output of the command is empty. If there are still leftover entries in the configuration, the output could look like this:

        <nvpair id="SAPHanaSR-hana_rh1_glob_sec" name="hana_rh1_glob_sec" value="DC3"/>
        <nvpair id="SAPHanaSR-hana_rh1_site_lss_DC3" name="hana_rh1_site_lss_DC3" value="4"/>
        <nvpair id="SAPHanaSR-hana_rh1_site_srr_DC3" name="hana_rh1_site_srr_DC3" value="S"/>
        <nvpair id="SAPHanaSR-hana_rh1_site_lpt_DC3" name="hana_rh1_site_lpt_DC3" value="30"/>
        <nvpair id="SAPHanaSR-hana_rh1_site_mns_DC3" name="hana_rh1_site_mns_DC3" value="dc3hana01"/>
          <nvpair id="nodes-7-hana_rh1_site" name="hana_rh1_site" value="DC3"/>
          <nvpair id="nodes-8-hana_rh1_site" name="hana_rh1_site" value="DC3"/>
          <nvpair id="nodes-9-hana_rh1_site" name="hana_rh1_site" value="DC3"/>

These entries can be removed with the following command:

cibadmin --delete --xml-text '<...>'

Please use single quotes ('') because the values contain double quotes ("").

To remove the entries from the example above, enter:

cibadmin --delete --xml-text '        <nvpair id="SAPHanaSR-hana_rh1_glob_sec" name="hana_rh1_glob_sec" value="DC3"/>'
cibadmin --delete --xml-text '        <nvpair id="SAPHanaSR-hana_rh1_site_lss_DC3" name="hana_rh1_site_lss_DC3" value="4"/>'
cibadmin --delete --xml-text '        <nvpair id="SAPHanaSR-hana_rh1_site_srr_DC3" name="hana_rh1_site_srr_DC3" value="S"/>'
cibadmin --delete --xml-text '        <nvpair id="SAPHanaSR-hana_rh1_site_lpt_DC3" name="hana_rh1_site_lpt_DC3" value="30"/>'
cibadmin --delete --xml-text '        <nvpair id="SAPHanaSR-hana_rh1_site_mns_DC3" name="hana_rh1_site_mns_DC3" value="dc3hana01"/>'
cibadmin --delete --xml-text '          <nvpair id="nodes-7-hana_rh1_site" name="hana_rh1_site" value="DC3"/>'
cibadmin --delete --xml-text '          <nvpair id="nodes-8-hana_rh1_site" name="hana_rh1_site" value="DC3"/>'
cibadmin --delete --xml-text '          <nvpair id="nodes-9-hana_rh1_site" name="hana_rh1_site" value="DC3"/>'

After deleting all the entries you can check it with cibadmin --query |grep 'DC3"'.
The response should be empty.

7.6. Failover Testing

Before failover testing, it is recommended to check the consistency of the cluster.
A basic failover test is to stop the primary HANA database.
This can be done with sapcontrol -nr ${TINSTANCE} -function StopSystem HDB.
The command returns immediately, but it takes some time for the database to stop.

The primary will switch to the 2nd site, upon which the instance on the 3rd site will be automatically reregistered to the new primary.
If everything is up and running again, clean up the cluster and terminate the new primary node on the 2nd site. The primary will then be switched to the 1st site and the instance on the 3rd site will be re-registered to the 1st site.
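
Cleaning up the resource failures after a test can be done, for example, with the following command (resource name as in the examples above):

pcs resource cleanup rsc_SAPHana_MTR_HDB00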

7.7. Recovering from Failover Examples

In some cases, the failed database system comes back as the old primary while the replication relationships are inconsistent, and the environment needs to be reconfigured. This can occur even if the cluster has switched the primary and the 3rd site has been re-registered to the new primary database server.

One way of solving this is to re-register this site as the new secondary.

A potential scenario is that there are still open replication relations on the former primary site, which cannot easily be removed. If this is the case, follow the steps below:

  • Remove the replication relations on the former primary with hdbnsutil -sr_disable --force
  • Register this node as secondary with hdbnsutil -sr_register --name=DC1 --remoteHost=dc2hana01 --remoteInstance=00 --replicationMode=sync --operationMode=logreplay --online
  • Check/start the database
  • Check the systemReplicationStatus

If fixing a single relationship does not work, put the cluster in maintenance mode with pcs property set maintenance-mode=true and rebuild the replication relationships from scratch, starting with the primary node.

Sometimes a secondary instance will still not come up. Should this occur, follow the steps below:

  • Unregister the site with hdbnsutil -sr_unregister --name=DCN
  • Start the unregistered database
  • Re-register the site to the primary database server with hdbnsutil -sr_register --name=DC1 --remoteHost=dc2hana01 --remoteInstance=00 --replicationMode=sync --operationMode=logreplay --online

8. Useful commands

For HANA:

sidadm% sapcontrol -nr 00 -function GetProcessList # displays on each HANA node whether HANA is running ($?=3)
sidadm% python systemReplicationStatus.py # shows the replication status, on the master node
sidadm% hdbnsutil -sr_state # lists the system replication state, on the master and on all the clients

For the cluster:

root# crm_mon -1Arf # Provides an overview
root# pcs resource # Lists all resources and shows if they are running
root# pcs constraint --full # Lists all constraints with their ids (needed when removing constraints)
root# pcs cluster start --all # This will start the cluster on all nodes
root# pcs cluster stop --all # This will stop the cluster on all nodes
root# pcs status --full
root# pcs node attribute # Lists the node attributes
root# SAPHanaSR-monitor

9. Additional Resources

2 Comments

Hi Team, I have a similar setup: System A & System B at the primary side, both configured with a Pacemaker cluster in a Scale-Up setup. At the DR site, System C & System D are also in a cluster. Replication is configured A to B, A to C and C to D. How should the configuration be done? Shall I keep the DR cluster in maintenance mode the whole time until the primary site fails?

I am not getting those parts of your comment. You say: "Both are in Cluster". Do you mean that Node A and Node B are in one cluster, and that Node C and Node D are in a different one? If that's the case then you have 2 distinct clusters, and the document does not apply to your case. And if you indeed have 2 separate clusters, this would be a very strange design, since the whole point of a cluster is to automatically fail over and fail back the DB between nodes that are part of the same cluster.

You say: "Replication configured A to B, A to C and C to D". Why would you replicate from Node C to Node D if they are in the same data center? It is unusual to see a replication from a "DR" towards another "DR".