Automating SAP HANA Multi Target System Replication in a Pacemaker-based cluster on Red Hat Enterprise Linux (RHEL)
Contents
- 1. Overview
- 2. Supported Scenarios
- 3. Parameters
- 4. Preconditions
- 5. Installation
- 5.1. Overview
- 5.2. Checking the 2 site base setup
- 5.3. HANA Installation
- 5.4. Registering the 3rd HANA instance as a new additional secondary
- 5.5. Adding the node of the third site to the cluster (optional)
- 5.6. Checking the cluster node on the 3rd site
- 5.7. Adding MTR support
- 5.8. Resource Configuration
- 5.9. Constraints for a Scale-Out MTR environment
- 6. Upgrading to 3 site MTR
- 7. Verifying the installation
- 8. Useful Commands
- 9. Additional Resources
1. Overview
This document describes how to configure Multi Target Replication (MTR) in a pacemaker-based automated SAP HANA system replication cluster on Red Hat Enterprise Linux (RHEL).
The basis of the setup is an Automated HANA System Replication cluster with 2 sites in a Scale-Out environment. See Red Hat Enterprise Linux HA Solution for SAP HANA Scale-Out and System Replication for more information.
2 secondary HANA instances are required.
All instances require the same amount of the following:
- RAM
- CPUs
- nodes
Scale-Out requires multiple nodes per HANA instance. See SAP HANA Administration Guide for SAP HANA Platform for more information.
Additionally, 3 sites are required:
- SITE 1 or DC1
- SITE 2 or DC2
- SITE 3 or DC3
The initial setup is as follows:
- Replicate Primary Site 1 (DC1) to Secondary Site 2 (DC2)
- Replicate Primary Site 1 (DC1) to Secondary Site 3 (DC3)
If the primary fails, Secondary Site 2 (DC2) becomes the new primary and Site 3 (DC3) is automatically re-registered to it.
When a failover occurs, this solution ensures that the 3rd site is switched to the new primary as well. The configuration after failover is as follows:
Assuming Site 1 (DC1) fails:
- Replicate the new Primary Site 2 (DC2) to Secondary Site 3 (DC3)
If the primary falls back to Site 1 (DC1), Site 3 (DC3) needs to be re-registered to Site 1 (DC1) again.
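For illustration, the re-registration of the 3rd site after a failback is done with hdbnsutil as the sidadm user. This is only a sketch; the hostnames, SID and instance number are the examples used later in this document:
mtradm@dc3hana01: hdbnsutil -sr_register --name=DC3 --remoteHost=dc1hana01 --remoteInstance=00 --replicationMode=sync --operationMode=logreplay --online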
2. Supported Scenarios
For more information about supported scenarios refer to Support Policies for RHEL High Availability Clusters - Management of SAP HANA in a Cluster.
3. Parameters
Parameter | Example | Description |
---|---|---|
SID | MTR | System ID of the HANA Database |
1st SITE | DC1 | Name of the 1st site, where the primary is running |
2nd SITE | DC2 | Name of the 2nd site, where the 1st secondary is running |
3rd SITE | DC3 | Name of the 3rd site, where the 2nd secondary is running |
InstanceNr | 00 | HANA Instance Number |
4. Preconditions
To support Multi Target Replication, you will need to install the following:
- SAP HANA 2.0 SPS04 or later
- resource-agents-sap-hana-scaleout version 180 or later
The parameter register_secondaries_on_takeover is available for HANA 2 SPS04 and later versions.
This enables automatic registration of site 3 (DC3) to the new primary, should a failover occur on the primary HANA server. The setup is based on Red Hat Enterprise Linux HA Solution for SAP HANA Scale Out and System Replication.
The system replication between all the nodes is set up according to SAP requirements. For more information, refer to the guidelines in the SAP HANA Administration Guide.
The nodes of the additional 3rd location (site) will then be added to the cluster.
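Before starting, it can be useful to verify the installed versions. The following is a minimal check, assuming the package name and sidadm user used in this document:
rpm -q resource-agents-sap-hana-scaleout
mtradm% HDB version
# run as the sidadm user; displays the installed SAP HANA version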
5. Installation
The steps below describe how to install a 3-site HANA multi target replication cluster.
If you are starting from an existing 2 site cluster setup, some of the steps are not needed.
5.1. Overview
This section describes how to set up a 3 site cluster for a primary and two secondary SAP HANA instances (in different availability zones), or how to upgrade an existing 2 site cluster. For more information see 6. Upgrading to 3 site MTR.
- Install HANA using the hdblcm utility
- Copy the database keys from the primary server to the secondary database servers
- Register the secondary HANA instance
- Check the HANA replication status with python systemReplicationStatus.py or hdbnsutil -sr_state
- Add register_secondaries_on_takeover=true in global.ini on the primary and secondary instances
- Edit /etc/sudoers.d/20-saphana
- Add the nodes of the third site to the cluster, including corosync and fencing
- Add constraints for the 3rd site nodes for the resources SAPHanaController and SAPHanaTopology
- Verify that the installation is successful by running the necessary tests
5.2. Checking the 2 site base setup
Note: Please ensure that you are using the correct resource-agents-sap-hana-scaleout package on all of the cluster nodes.
While creating the cluster you can either include the additional nodes for the third site during this step, or add them later to your configuration.
The steps for adding the new nodes are as follows:
- Install the cluster packages: yum install pcs pacemaker fence-agents
- Start the pcs daemon: systemctl start pcsd.service
- Set the hacluster user password: passwd hacluster
- Authorize the nodes: pcs host auth <nodename>
With this you can do the following:
pcs cluster setup ..       # create a cluster
pcs cluster node add ..    # add nodes to an existing cluster
It is recommended to configure at least 2 internal networks between the nodes.
See Red Hat Enterprise Linux HA Solution for SAP HANA Scale Out and System Replication for further details on how to configure the base cluster.
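As a sketch, the complete preparation of an additional cluster node could look like this (the node names are placeholders; adjust them to your environment):
yum install -y pcs pacemaker fence-agents resource-agents-sap-hana-scaleout
systemctl enable --now pcsd.service
passwd hacluster
pcs host auth dc3hana01 dc3hana02
pcs cluster node add dc3hana01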
5.3. HANA Installation
The HANA installation should be completed before creating the SAPHana
resources.
If you have already installed HANA on the primary site, you can continue to install HANA on the new secondary nodes using the same:
- SID
- InstanceNumber
- sidadm user id
The installation of the 3rd site is similar to the installation of the 1st secondary site.
To disable control of the SAP resource agents, you can stop the cluster with the following commands:
pcs cluster stop
pcs cluster stop --all
or stop all resources
pcs resource                          # This will list all resources
pcs resource disable <resourcename>   # This will stop (disable) the resource
If the filesystems are controlled by the cluster, the filesystems will need to be manually mounted.
To install HANA follow the steps below:
- Check if the cluster is stopped
- Check the mountpoints
- Install HANA on the 3rd site
- Copy the key files from the primary
- Register the secondary HANA node on DC3
Example output of HANA Installation:
DC1/DC2# pcs cluster stop --all
DC3# df
DC3# cd /sapcd/DATA_UNITS/HDB_SERVER_LINUX_X86_64
[root@DC3:/sapcd252/DATA_UNITS/HDB_SERVER_LINUX_X86_64]# ./hdbuninst
# Option 0 will remove an already existing HANA Installation
# No SAP HANA Installation found is the expected answer
[root@DC3:/sapcd252/DATA_UNITS/HDB_SERVER_LINUX_X86_64]# ./hdblcm
1 install
2 server
/hana/shared is default directory
Enter Local Hostname [dc3host]: use the default name
Add additional hosts? [n]: only needed for a Scale-Out installation; the default is n
Enter SAP HANA System ID: MTR
Enter Instance Number [00]:
Enter Local Host Worker Group [default]:
Select System Usage / Enter Index [4]:
Choose encryption
Enter Location of Data Volumes [/hana/data/MTR]:
Enter Location of Log Volumes [/hana/log/MTR]:
Restrict maximum memory allocation? [n]:
Enter Certificate Host Name
Enter System Administrator (mtradm) Password: <Y0urPasswd>
Confirm System Administrator (mtradm) Password: <Y0urPasswd>
Enter System Administrator Home Directory [/usr/sap/MTR/home]:
Enter System Administrator Login Shell [/bin/sh]:
Enter System Administrator User ID [1000]:
Enter System Database User (SYSTEM) Password: <Y0urPasswd>
Confirm System Database User (SYSTEM) Password: <Y0urPasswd>
Restart system after machine reboot? [n]:
Before the installation starts a summary is listed:
SAP HANA Database System Installation
Installation Parameters
Remote Execution: ssh
Database Isolation: low
Install Execution Mode: standard
Installation Path: /hana/shared
Local Host Name: dc3host
SAP HANA System ID: MTR
Instance Number: 00
Local Host Worker Group: default
System Usage: custom
Location of Data Volumes: /hana/data/MTR
Location of Log Volumes: /hana/log/MTR
SAP HANA Database secure store: ssfs
Certificate Host Names: dc3host -> dc3host
System Administrator Home Directory: /usr/sap/MTR/home
System Administrator Login Shell: /bin/sh
System Administrator User ID: 1000
ID of User Group (sapsys): 1010
Software Components
SAP HANA Database
Install version 2.00.052.00.1599235305
Location: /sapcd252/DATA_UNITS/HDB_SERVER_LINUX_X86_64/server
SAP HANA Local Secure Store
Do not install
SAP HANA AFL (incl.PAL,BFL,OFL)
Do not install
SAP HANA EML AFL
Do not install
SAP HANA EPM-MDS
Do not install
SAP HANA Database Client
Do not install
SAP HANA Studio
Do not install
SAP HANA Smart Data Access
Do not install
SAP HANA XS Advanced Runtime
Do not install
Log File Locations
Log directory: /var/tmp/hdb_MTR_hdblcm_install_2021-06-09_18.48.13
Trace location: /var/tmp/hdblcm_2021-06-09_18.48.13_31307.trc
Do you want to continue? (y/n):
Entering y will start the installation.
If your Scale-Out architecture is running a HANA instance with multiple nodes, you will need to use the same mount point /hana/shared per site. You can either install the other HANA nodes of a site by adding the hosts during the installation, or add them later with the command /hana/shared/MTR/hdblcm/hdblcm.
For more information on how to install SAP HANA Scale-Out nodes, refer to SAP HANA Administration Guide for SAP HANA Platform.
5.4. Registering the 3rd HANA instance as a new additional secondary
This step is similar to the registration of the first secondary HANA instance (DC2).
To check if the HANA instance on the primary is up and running, run the command sapcontrol -nr $TINSTANCE -function GetProcessList; echo $?
If the return code is 3, this indicates that the HANA instance is successfully running on the node where the command was started.
To check if HANA System Replication is enabled on the primary system, run the following command:
> hdbnsutil -sr_state | grep ^mode:
mode: primary
To copy the keys from the primary to the secondary site, run the following commands (in this example we assume SID=MTR). Note that in Scale-Out environments only one copy is necessary.
# scp -rp /usr/sap/MTR/SYS/global/security/rsecssfs/data/SSFS_MTR.DAT dc3hana01:/usr/sap/MTR/SYS/global/security/rsecssfs/data/SSFS_MTR.DAT
# scp -rp /usr/sap/MTR/SYS/global/security/rsecssfs/key/SSFS_MTR.KEY dc3hana01:/usr/sap/MTR/SYS/global/security/rsecssfs/key/SSFS_MTR.KEY
Register the HANA instance on the 3rd site as an additional secondary, as the sidadm user, with the following command:
mtradm@dc3hana01: hdbnsutil -sr_register --name=DC3 --remoteHost=dc1hana01 --remoteInstance=00 --replicationMode=sync --operationMode=logreplay --online
At this point the HANA instance on the 3rd site may already be up and running. If the instance is running and you don't want to stop it, you can use the option --online, which will register the instance while it is online. The necessary restart (stop and start) of the instance will then be initiated automatically.
Note: The --online option will work on both offline and online databases.
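To verify the registration, the replication state can be checked on the new secondary as the sidadm user (a sketch; the exact output wording may differ between HANA revisions):
mtradm@dc3hana01: hdbnsutil -sr_state
# the output should show the configured replication mode (e.g. sync) and the site name DC3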
5.5. Adding the node(s) of the third site to the cluster (optional)
Note: This section can be skipped if the site 3 nodes are already part of the cluster.
This chapter describes the necessary steps to add 3rd site nodes to an existing 2 site cluster.
If all of the HANA nodes have been installed, the new site can be integrated into the existing cluster. It is recommended to stop the existing SAP HANA resources first by running the following commands:
dc1hana01# pcs resource disable rsc_SAPHanaTopology_MTR_HDB00-clone
dc1hana01# pcs resource disable rsc_SAPHana_MTR_HDB00-clone
1. Add the DC3 nodes to the cluster and install the same software packages as the ones installed on the primary node:
Example:
yum install pacemaker pcs fence-agents resource-agents-sap-hana-scaleout
2. Enable and start the cluster service pcsd:
systemctl enable pcsd; systemctl start pcsd
3. Set the password of the hacluster
user on all new nodes:
passwd hacluster
4. Authorize the new nodes:
pcs host auth dc3hana01
pcs host auth dc3hana02
5. Add the new nodes to the existing cluster:
pcs cluster node add dc3hana01
pcs cluster node add dc3hana02
If you have several networks, the add command syntax is similar to the create command.
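For example, if a second network/ring is configured, the node can be added together with its addresses (a sketch; the IP addresses below are placeholders):
pcs cluster node add dc3hana01 addr=192.0.2.31 addr=198.51.100.31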
5.6. Checking the cluster node(s) on the 3rd site
If the cluster has been started on all nodes you can check if all the nodes are online with
pcs status
or pcs status --full
You can start the nodes with the following commands:
pcs cluster start
or
pcs cluster start --all
.
You can also enable the cluster with
pcs cluster enable [--all]
.
The SAP HANA resources should remain disabled until the constraints are defined. This can be checked with pcs resource. The resources can be disabled with the following commands:
pcs resource disable rsc_SAPHana_MTR_HDB00
pcs resource disable rsc_SAPHanaTopology_MTR_HDB00
5.7. Adding MTR support
The 3rd site must be automatically switched to the new primary when a failover occurs.
HANA 2.0 SPS04 provides the system replication option register_secondaries_on_takeover = true, which forces the attached secondary site to re-register to the new primary.
5.7.1 Configuring global.ini
This option needs to be added to the global.ini file of the HANA instances on site 1 and site 2, which are managed by the pacemaker cluster.
Note: The global.ini file should only be edited manually while the HANA instance of that site is stopped.
The global.ini file can be edited by the sidadm (mtradm) user:
vim /usr/sap/${SAPSYSTEMNAME}/SYS/global/hdb/custom/config/global.ini
Example of global.ini
:
# global.ini last modified 2021-08-02 06:22:24.786543 by hdbnsutil -sr_register --remoteHost=lsh40402 --remoteInstance=00 --replicationMode=syncmem --operationMode=logreplay --name=DC1
[communication]
listeninterface = .internal
[ha_dr_provider_SAPHanaSR]
provider = SAPHanaSR
path = /hana/shared/myHooks
execution_order = 1
[system_replication]
timetravel_logreplay_mode = auto
operation_mode = logreplay
site_id = 1
site_name = DC1
register_secondaries_on_takeover = true
mode = primary
actual_mode = syncmem
[system_replication_site_masters]
2 = lsh40402:30001 lsh40404:30001
[trace]
ha_dr_saphanasr = info
5.7.2 HA/DR Hook SAPHanaSR.py
In addition to MTR, the HA/DR hook SAPHanaSR.py
needs to be added to the global.ini
file of the HANA instances controlled by the cluster.
(This is only required on dc1hana01, dc1hana02, dc1hana03, dc2hana01, dc2hana02 and dc2hana03, but not on the dc3hana* nodes.)
[ha_dr_provider_SAPHanaSR]
provider = SAPHanaSR
path = /hana/shared/myHooks
execution_order = 1
[trace]
ha_dr_saphanasr = info
The SAPHanaSR.py
script needs to be copied onto all nodes, using the following command (if /hana/shared is shared only once per site):
cp /usr/share/SAPHanaSR-ScaleOut/SAPHanaSR.py /hana/shared/myHooks/
chown mtradm:sapsys /hana/shared/myHooks/SAPHanaSR.py
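If the hook directory does not exist yet, it can be created first. A minimal sketch, using the paths and the sidadm user from this example:
mkdir -p /hana/shared/myHooks
cp /usr/share/SAPHanaSR-ScaleOut/SAPHanaSR.py /hana/shared/myHooks/
chown -R mtradm:sapsys /hana/shared/myHooks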
5.7.3 Adding the sudoers config file
The pacemaker cluster writes configuration updates into the CIB (cluster information base) using the command /usr/sbin/crm_attribute, which requires root access.
To allow the sidadm user to update the CIB, a sudoers file is created that permits sudo execution of this command.
To enable this access, create a file /etc/sudoers.d/20-saphana on all nodes of the cluster-managed sites (site 1 and site 2 in this case) with the following content:
mtradm ALL=(ALL) NOPASSWD: /usr/sbin/crm_attribute -n hana_mtr_*
To check if the permissions are working as expected, log in as the <sid>adm user (mtradm in this case) and run the following command:
mtradm% sudo /usr/sbin/crm_attribute -n hana_mtr_*
scope=crm_config name=hana_mtr_* value=(null)
Error performing operation: No such device or address
The output above shows a successful configuration with the correct parameter hana_mtr_*.
The second example uses a parameter that is not covered by the sudoers entry:
[rh1adm@lsh40401: HDB00]# sudo /usr/sbin/crm_attribute -n hana_rh1_*
[sudo] password for rh1adm:
Notice that with the wrong parameter hana_rh1_*, the system prompts for a password, which indicates that the sudoers entry only permits the configured attribute prefix.
5.8 Resource Configuration
This chapter shows examples of how to create SAPHanaTopology
and SAPHanaController
resources.
For example, for a configuration with:
SID=MTR
InstanceNr=00
Name of 3rd site: DC3
the resources are named:
rsc_SAPHanaTopology_MTR_HDB00
rsc_SAPHana_MTR_HDB00
rsc_SAPHanaDR_MTR_HDB00
Example creating a SAPHanaTopology resource:
pcs resource create rsc_SAPHanaTopology_MTR_HDB00 SAPHanaTopology SID=MTR InstanceNumber=00 op methods interval=0s timeout=5 op monitor interval=10 timeout=600 --disabled
pcs resource clone rsc_SAPHanaTopology_MTR_HDB00 clone-node-max=1 interleave=true
Example creating a SAPHanaController resource:
pcs resource create rsc_SAPHana_MTR_HDB00 SAPHanaController SID=MTR \
InstanceNumber=00 PREFER_SITE_TAKEOVER=true \
DUPLICATE_PRIMARY_TIMEOUT=7200 AUTOMATED_REGISTER=true \
op demote interval=0s timeout=320 op methods interval=0s timeout=5 \
op monitor interval=59 role="Master" timeout=700 op monitor interval=61 \
role="Slave" timeout=700 op promote interval=0 timeout=3600 \
op start interval=0 timeout=3600 op stop interval=0 timeout=3600 \
meta migration-threshold=4 --disabled
pcs resource promotable rsc_SAPHana_MTR_HDB00 promoted-max=1 clone-node-max=1 interleave=true
Current help can be displayed with
pcs resource describe SAPHanaTopology
and
pcs resource describe SAPHanaController
The resources are created with the option --disabled and should be started only after all constraints have been created.
The meta attribute migration-threshold defines the number of failures after which a resource will be moved to a new node. For tests, lower numbers are recommended.
For more information about migration-threshold, please check Moving Resources Due to Failure.
Note: Resources created with --disabled
must be started manually at a later stage.
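Before enabling them, the created (still disabled) resources can be reviewed, for example with (depending on the pcs version, the subcommand is config or show):
pcs resource config rsc_SAPHanaTopology_MTR_HDB00-clone
pcs resource config rsc_SAPHana_MTR_HDB00-clone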
5.9. Constraints for a Scale-Out MTR environment
The necessary constraints need to be added in order to isolate the 2 site configuration from the additional 3rd site.
The SAPHana resources may only run on site 1 and site 2, but not on site 3.
The following constraints will need to be configured:
pcs constraint location rsc_SAPHanaTopology_MTR_HDB00-clone avoids dc3hana01
pcs constraint location rsc_SAPHanaTopology_MTR_HDB00-clone avoids dc3hana02
pcs constraint location rsc_SAPHanaTopology_MTR_HDB00-clone avoids dc3hana03
pcs constraint location rsc_SAPHanaTopology_MTR_HDB00-clone avoids majoritymaker
pcs constraint location rsc_SAPHana_MTR_HDB00-clone avoids dc3hana01
pcs constraint location rsc_SAPHana_MTR_HDB00-clone avoids dc3hana02
pcs constraint location rsc_SAPHana_MTR_HDB00-clone avoids dc3hana03
pcs constraint location rsc_SAPHana_MTR_HDB00-clone avoids majoritymaker
The SAPHanaTopology
resource needs to be started before the SAPHana
resource.
Use the command pcs constraint order
to check if the order constraint exists. If it is found to be missing, add the missing constraint with the following command:
pcs constraint order rsc_SAPHanaTopology_MTR_HDB00 then start rsc_SAPHana_MTR_HDB00
pcs constraint order rsc_SAPHana_MTR_HDB00 then start rsc_SAPHanaDR_MTR_HDB00
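Once all constraints are in place, they can be reviewed and the resources created with --disabled can be enabled, for example:
pcs constraint --full          # review all constraints
pcs resource enable rsc_SAPHanaTopology_MTR_HDB00
pcs resource enable rsc_SAPHana_MTR_HDB00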
6. Upgrading to 3 site MTR
To upgrade an existing 2 site cluster, the following steps are necessary:
- Install HANA on site 3
- Copy the HANA keys from the primary HANA server
- Register HANA on site 3 as an additional secondary HANA instance
- Add the site 3 nodes to the cluster
- Update the existing resource-agents-sap-hana-scaleout package to version 180 or higher
- Add register_secondaries_on_takeover = true to global.ini
- Edit /etc/sudoers.d/20-saphana
- Add constraints for the 3rd site nodes for the resources SAPHanaController and SAPHanaTopology
7. Verifying the installation
Testing the cluster is recommended, but not mandatory.
To test the cluster the following steps are required:
- Check if the cluster is running
- Check if the resources have been started
- Check if the databases are running
- Check the status of system replication
- Shut down the HANA instance and verify
7.1. Check if the cluster is running or start the cluster
pcs cluster status # This will show if all nodes are online
pcs cluster start --all # This will start the cluster on all nodes
Note: Fencing must be configured and tested. To obtain a solution that is as automated as possible, the cluster must be permanently enabled, so that it starts automatically after a reboot. In a production environment, it can be useful to disable the automatic restart so that an administrator can intervene manually after a node failure.
Example of fencing a node: pcs stonith fence dc2hana01
To enable or disable the cluster, use the commands pcs cluster enable --all or pcs cluster disable --all.
7.2. Check if the resources are started
Use pcs resource
to check the status of all resources.
If a resource is not running you can start it with the commands below:
pcs resource enable rsc_SAPHanaTopology_MTR_HDB00
pcs resource enable rsc_SAPHana_MTR_HDB00
7.3. Check if the databases are running
The running HANA instances can be checked with the commands below:
[mtradm@dc1hana01]# sapcontrol -nr $TINSTANCE -function GetProcessList
03.08.2021 18:13:55
GetProcessList
OK
name, description, dispstatus, textstatus, starttime, elapsedtime, pid
hdbdaemon, HDB Daemon, GREEN, Running, 2021 08 03 14:43:33, 3:30:22, 4012592
hdbcompileserver, HDB Compileserver, GREEN, Running, 2021 08 03 14:44:29, 3:29:26, 4016801
hdbindexserver, HDB Indexserver-MTR, GREEN, Running, 2021 08 03 14:44:31, 3:29:24, 4016936
hdbnameserver, HDB Nameserver, GREEN, Running, 2021 08 03 14:43:34, 3:30:21, 4012616
hdbpreprocessor, HDB Preprocessor, GREEN, Running, 2021 08 03 14:44:29, 3:29:26, 4016804
hdbwebdispatcher, HDB Web Dispatcher, GREEN, Running, 2021 08 03 14:45:44, 3:28:11, 4018667
hdbxsengine, HDB XSEngine-MTR, GREEN, Running, 2021 08 03 14:44:31, 3:29:24, 4016939
[mtradm@dc1hana01]# echo $?
3
A return code of 3 means that the HANA instance has been successfully configured and is running.
If the database is not running, it can be started with the command below:
(mtradm)% sapcontrol -nr ${TINSTANCE} -function StartSystem HDB
The database on the primary and secondary site can be started by enabling the SAPHana resource. This will prompt the cluster to start the databases.
7.4. Checking the System Replication Status
The correct System Replication Status can be checked on the primary node as sidadm
with the command below:
python /usr/sap/$SAPSYSTEMNAME/HDB${TINSTANCE}/exe/python_support/systemReplicationStatus.py ; echo Status $?
The expected results for the return code are:
- NoHSR = 10
- Error = 11
- Unknown = 12
- Initializing = 13
- Syncing = 14
- Active = 15
In most cases the command will return code 15 (Active).
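As a sketch, the return code can be mapped to a readable status directly in the shell (run as the sidadm user; the path is the one used above):
python /usr/sap/$SAPSYSTEMNAME/HDB${TINSTANCE}/exe/python_support/systemReplicationStatus.py > /dev/null; rc=$?
case $rc in 10) echo NoHSR;; 11) echo Error;; 12) echo Unknown;; 13) echo Initializing;; 14) echo Syncing;; 15) echo Active;; *) echo "Status $rc";; esac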
7.5. Check Cluster Consistency
During the installation, the resources are sometimes started before the configuration is fully completed. This can leave stale entries in the cluster information base (CIB), which can cause incorrect behavior of the cluster.
Such entries can easily be detected and removed after the configuration has been completed.
If the name of the third site is for example DC3
you can check the CIB with the command:
[root@dc1hana01: ~]# cibadmin --query |grep 'DC3"'
The command can be executed on any node in the cluster.
Usually the output of the command is empty. If there are still stale entries in the configuration, the output could look like this:
<nvpair id="SAPHanaSR-hana_rh1_glob_sec" name="hana_rh1_glob_sec" value="DC3"/>
<nvpair id="SAPHanaSR-hana_rh1_site_lss_DC3" name="hana_rh1_site_lss_DC3" value="4"/>
<nvpair id="SAPHanaSR-hana_rh1_site_srr_DC3" name="hana_rh1_site_srr_DC3" value="S"/>
<nvpair id="SAPHanaSR-hana_rh1_site_lpt_DC3" name="hana_rh1_site_lpt_DC3" value="30"/>
<nvpair id="SAPHanaSR-hana_rh1_site_mns_DC3" name="hana_rh1_site_mns_DC3" value="dc3hana01"/>
<nvpair id="nodes-7-hana_rh1_site" name="hana_rh1_site" value="DC3"/>
<nvpair id="nodes-8-hana_rh1_site" name="hana_rh1_site" value="DC3"/>
<nvpair id="nodes-9-hana_rh1_site" name="hana_rh1_site" value="DC3"/>
These entries can be removed with the following command:
cibadmin --delete --xml-text '<...>'
Please use single quotes ('') around the XML text, because the values contain double quotes ("").
To remove the entries of the example above, enter for example:
cibadmin --delete --xml-text ' <nvpair id="SAPHanaSR-hana_rh1_glob_sec" name="hana_rh1_glob_sec" value="DC3"/>'
cibadmin --delete --xml-text ' <nvpair id="SAPHanaSR-hana_rh1_site_lss_DC3" name="hana_rh1_site_lss_DC3" value="4"/>'
cibadmin --delete --xml-text ' <nvpair id="SAPHanaSR-hana_rh1_site_srr_DC3" name="hana_rh1_site_srr_DC3" value="S"/>'
cibadmin --delete --xml-text ' <nvpair id="SAPHanaSR-hana_rh1_site_lpt_DC3" name="hana_rh1_site_lpt_DC3" value="30"/>'
cibadmin --delete --xml-text ' <nvpair id="SAPHanaSR-hana_rh1_site_mns_DC3" name="hana_rh1_site_mns_DC3" value="dc3hana01"/>'
cibadmin --delete --xml-text ' <nvpair id="nodes-7-hana_rh1_site" name="hana_rh1_site" value="DC3"/>'
cibadmin --delete --xml-text ' <nvpair id="nodes-8-hana_rh1_site" name="hana_rh1_site" value="DC3"/>'
cibadmin --delete --xml-text ' <nvpair id="nodes-9-hana_rh1_site" name="hana_rh1_site" value="DC3"/>'
After deleting all the entries you can check it with cibadmin --query |grep 'DC3"'
.
The response should be empty.
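Instead of deleting every entry by hand, the query output can be fed back into cibadmin in a loop. This is only a sketch; review the matches before deleting them:
cibadmin --query | grep 'DC3"' | while read -r entry; do
    cibadmin --delete --xml-text "$entry"
done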
7.6. Failover Testing
Before failover testing, it is recommended to check the consistency of the cluster as described above.
A basic test is to stop the primary HANA database.
This can be done with sapcontrol -nr ${TINSTANCE} -function StopSystem HDB
.
The command returns immediately, but it will take some time until the database is stopped.
The primary will switch to the 2nd site, upon which the instance on the 3rd site will be automatically reregistered to the new primary.
If everything is up and running again, clean up the cluster and terminate the new primary node on the 2nd site. The primary will then be switched to the 1st site and the instance on the 3rd site will be re-registered to the 1st site.
7.7. Recovering from Failover Examples
In some cases, the failed database system comes back as the old primary while the replication relationships have become inconsistent, and the environment needs to be reconfigured. This can occur even if the cluster has switched the primary and the 3rd site has already been re-registered to the new primary database server.
One way of solving this is to re-register the failed site as a new secondary.
A potential scenario is that there are still open replication relations on the failed site, which cannot easily be removed. If this is the case, follow the steps below:
- Remove the replication relations on the former primary with hdbnsutil -sr_disable --force
- Register this node as a secondary with hdbnsutil -sr_register --name=DC1 --remoteHost=dc2hana01 --remoteInstance=00 --replicationMode=sync --operationMode=logreplay --online
- Check/start the database
- Check the systemReplicationStatus
If fixing a single relationship doesn't work, put the cluster in maintenance mode with pcs property set maintenance-mode=true and rebuild the replication relationships from scratch, starting with the primary node.
Sometimes a secondary instance will still not come up. Should this occur, follow the steps below:
- Unregister the site with hdbnsutil -sr_unregister --name=DCN
- Start the unregistered database
- Re-register the site to the primary database server with hdbnsutil -sr_register --name=DC1 --remoteHost=dc2hana01 --remoteInstance=00 --replicationMode=sync --operationMode=logreplay --online
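A consolidated sketch of re-registering a failed former primary as a new secondary, using the example hostnames, SID and site names of this document:
# on the failed former primary, as the sidadm user
mtradm% hdbnsutil -sr_disable --force
mtradm% hdbnsutil -sr_register --name=DC1 --remoteHost=dc2hana01 --remoteInstance=00 --replicationMode=sync --operationMode=logreplay --online
mtradm% sapcontrol -nr ${TINSTANCE} -function GetProcessList   # verify that the database comes up
# on the current primary, check the replication status
mtradm% python systemReplicationStatus.py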
8. Useful commands
For HANA:
sidadm% sapcontrol -nr 00 -function GetProcessList
# displays on all of the HANA nodes whether HANA is running ($?=3)
sidadm% python systemReplicationStatus.py
# on the master node
sidadm% hdbnsutil -sr_state
# Lists the system replication state, on the master and on all the clients
For the cluster:
root# crm_mon -1Arf
# Provides an overview
root# pcs resource
# Lists all resources and shows if they are running
root# pcs constraint --full
# Lists all constraints with their ids, which can be used to remove them
root# pcs cluster start --all
# This will start the cluster on all nodes
root# pcs cluster stop --all
# This will stop the cluster on all nodes
root# pcs status --full
root# pcs node attribute
# Lists node attributes
root# SAPHanaSR-monitor
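# Shows an overview of the SAPHanaSR status attributes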
2 Comments
Hi Team, I have a similar setup: System A & System B at the primary side, both configured with a Pacemaker cluster in a Scale-Up setup. DR site: System C & System D, both in a cluster. Replication is configured A to B, A to C and C to D. How should the configuration be done? Shall I keep the DR cluster in maintenance mode the whole time until the primary site fails?
I am not getting those parts of your comment. You say: "Both are in Cluster". Do you mean that NodeA and NodeB are in a cluster, and that NodeC and NodeD are in a different one? If that's the case, then you have 2 distinct clusters, and the document does not apply to your case. And if indeed you have 2 separate clusters, this would be a very strange design, since the whole point of a cluster is to automatically fail over and fail back the DB between nodes that are part of the same cluster.
You say: "Replication configured A to B, A to C and C to D". Why would you replicate from NodeC to NodeD if they are in the same data center? It is unusual to see a replication from a "DR" towards another "DR".