Automating SAP HANA Scale-Up System Replication using the RHEL HA Add-On
Note: For guidelines on how to set up a RHEL HA Add-On based cluster for managing SAP HANA Scale-Up System Replication on RHEL 8, please use the version of the documentation available in the RHEL 8 for SAP Solutions product documentation: Automating SAP HANA Scale-Up System Replication using the RHEL HA Add-On.
Note: For guidelines on how to set up a RHEL HA Add-On based cluster for managing SAP HANA Scale-Up System Replication on RHEL 9, please use the version of the documentation available in the RHEL 9 for SAP Solutions product documentation: Automating SAP HANA Scale-Up System Replication using the RHEL HA Add-On.
Contents
- 1. Overview
- 2. SAP HANA System Replication
- 3. Configuring monitoring account in SAP HANA for cluster resource agents (SAP HANA 1.0 SPS12 and earlier)
- 4. Configuring SAP HANA in a pacemaker cluster
- 4.1. Install resource agents and other components required for managing SAP HANA Scale-Up System Replication using the RHEL HA Add-On
- 4.2. Enable the SAP HANA srConnectionChanged() hook
- 4.3. Configure general cluster properties
- 4.4. Create cloned SAPHanaTopology resource
- 4.5. Create Master/Slave SAPHana resource
- 4.6. Create Virtual IP address resource
- 4.7. Create constraints
- 4.8. Adding a secondary virtual IP address for an Active/Active (Read-Enabled) HANA System Replication setup
- 4.9. Testing the manual move of SAPHana resource to another node (SAP HANA takeover by cluster)
1. Overview
This article describes how to configure Automated HANA System Replication in Scale-Up in a Pacemaker cluster on supported RHEL releases.
This article does NOT cover the preparation of a RHEL system for SAP HANA installation or the SAP HANA installation procedure itself. For more details on these topics, refer to SAP Note 2009879 - SAP HANA Guidelines for RedHat Enterprise Linux (RHEL).
1.1. Supported scenarios
See: Support Policies for RHEL High Availability Clusters - Management of SAP HANA in a Cluster
1.2. Subscription and Repos
The following repos are required:
RHEL 7.x
- RHEL Server: provides the RHEL kernel packages
- RHEL HA Add-On: provides the Pacemaker framework
- RHEL for SAP HANA: provides the resource agents for the automation of HANA System Replication in Scale-Up
1.2.1. On-Premise or Bring Your Own Subscription through Cloud Access
For on-premise or Bring Your Own Subscription through Red Hat Cloud Access, the subscription to use is RHEL for SAP Solutions.
RHEL 7.x: below is an example of the repos enabled with RHEL for SAP Solutions 7.6, on-premise or through Cloud Access:
# yum repolist
repo id repo name status
rhel-7-server-e4s-rpms/7Server/x86_64 Red Hat Enterprise Linux 7 Server - Update Services for SAP Solutions (RPMs) 18,929
rhel-ha-for-rhel-7-server-e4s-rpms/7Server/x86_64 Red Hat Enterprise Linux High Availability (for RHEL 7 Server) Update Services for SAP Solutions (RPMs) 437
rhel-sap-hana-for-rhel-7-server-e4s-rpms/7Server/x86_64 RHEL for SAP HANA (for RHEL 7 Server) Update Services for SAP Solutions (RPMs) 38
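If these repos are not yet enabled on the system, they can usually be enabled with subscription-manager. Below is a minimal sketch using the repo IDs from the listing above (the exact repo IDs depend on the attached subscription, RHEL minor release, and architecture):
[root]# subscription-manager repos --enable="rhel-7-server-e4s-rpms" --enable="rhel-ha-for-rhel-7-server-e4s-rpms" --enable="rhel-sap-hana-for-rhel-7-server-e4s-rpms"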
1.2.2. On-Demand on Public Clouds through RHUI
For deployment in on-demand images on public clouds, the software packages are delivered in Red Hat Enterprise Linux for SAP with High Availability and Update Services, a variant of RHEL for SAP Solutions customized for public clouds and available through RHUI.
Below is an example of the repos enabled on a RHUI system with RHEL for SAP with High Availability and Update Services 7.5. For the configuration of Automated HANA System Replication in Scale-Up, the following repos must be present:
# yum repolist
repo id repo name status
rhui-rhel-7-server-rhui-eus-rpms/7.5/x86_64 Red Hat Enterprise Linux 7 Server - Extended Update Support (RPMs) from RH 21,199
rhui-rhel-ha-for-rhel-7-server-eus-rhui-rpms/7.5/x86_64 Red Hat Enterprise Linux High Availability from RHUI (for RHEL 7 Server) - 501
rhui-rhel-sap-hana-for-rhel-7-server-eus-rhui-rpms/7.5/x86_64 RHEL for SAP HANA (for RHEL 7 Server) Extended Update Support (RPMs) from 43
2. SAP HANA System Replication
The following example shows how to set up system replication between 2 nodes running SAP HANA.
Configuration used in the example:
SID: RH2
Instance Number: 02
node1 FQDN: node1.example.com
node2 FQDN: node2.example.com
node1 HANA site name: DC1
node2 HANA site name: DC2
SAP HANA 'SYSTEM' user password: <HANA_SYSTEM_PASSWORD>
SAP HANA administrative user: rh2adm
Ensure that both systems can resolve the FQDN of both systems without issues. To ensure that the FQDNs can be resolved even without DNS, you can place them into /etc/hosts as in the example below.
# /etc/hosts
192.168.0.11 node1.example.com node1
192.168.0.12 node2.example.com node2
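To quickly check name resolution on each node, you can for example use getent (the output shown here corresponds to the example /etc/hosts entries above):
[root]# getent hosts node1.example.com
192.168.0.11    node1.example.com node1
[root]# getent hosts node2.example.com
192.168.0.12    node2.example.com node2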
For the system replication to work, the SAP HANA log_mode variable must be set to normal. This can be verified on both nodes using the command below (run as the SAP HANA administrative user, connecting as the SYSTEM database user).
[rh2adm]# hdbsql -u system -p <HANA_SYSTEM_PASSWORD> -i 02 "select value from \"SYS\".\"M_INIFILE_CONTENTS\" where key='log_mode'"
VALUE
"normal"
1 row selected
Note that the following designation of primary and secondary nodes applies only during the initial setup. The roles (primary/secondary) may change during cluster operation based on the cluster configuration.
Many of the configuration steps are performed as the SAP HANA administrative user, whose name was chosen during the installation. In the examples we will use rh2adm, since the SID is RH2. To become the SAP HANA administrative user, you can use the command below.
[root]# sudo -i -u rh2adm
[rh2adm]#
2.1. Configure HANA primary node
SAP HANA system replication will only work after an initial backup has been performed. The following command will create an initial backup in the /tmp/foo directory. Please note that the size of the backup depends on the database size and that it may take some time to complete. The directory in which the backup will be placed must be writable by the SAP HANA administrative user.
a) On single container systems, the following command can be used for the backup:
[rh2adm]# hdbsql -i 02 -u system -p <HANA_SYSTEM_PASSWORD> "BACKUP DATA USING FILE ('/tmp/foo')"
0 rows affected (overall time xx.xxx sec; server time xx.xxx sec)
b) On multiple container systems (MDC), the SYSTEMDB and all tenant databases need to be backed up.
The example below shows the backup of SYSTEMDB and the RH2 tenant database. Please check the SAP documentation for details on how to back up tenant databases.
[rh2adm]# hdbsql -i 02 -u system -p <HANA_SYSTEM_PASSWORD> -d SYSTEMDB "BACKUP DATA USING FILE ('/tmp/foo')"
0 rows affected (overall time xx.xxx sec; server time xx.xxx sec)
[rh2adm]# hdbsql -i 02 -u system -p <HANA_SYSTEM_PASSWORD> -d SYSTEMDB "BACKUP DATA FOR RH2 USING FILE ('/tmp/foo-RH2')"
0 rows affected (overall time xx.xxx sec; server time xx.xxx sec)
After the initial backup, initialize the replication using the command below.
[rh2adm]# hdbnsutil -sr_enable --name=DC1
checking for active nameserver ...
nameserver is active, proceeding ...
successfully enabled system as system replication source site
done.
Verify that the initialization shows the current node as 'primary' and that SAP HANA is running on it.
[rh2adm]# hdbnsutil -sr_state
checking for active or inactive nameserver ...
System Replication State
~~~~~~~~~~~~~~~~~~~~~~~~
mode: primary
site id: 1
site name: DC1
Host Mappings:
2.2. Configure HANA secondary node
The secondary node needs to be registered with the now running primary node. SAP HANA on the secondary node must be shut down before running the command below.
[rh2adm]# HDB stop
(SAP HANA 2.0 only) Copy the SAP HANA system PKI files SSFS_RH2.KEY and SSFS_RH2.DAT from the primary node to the secondary node.
[rh2adm]# scp root@node1:/usr/sap/RH2/SYS/global/security/rsecssfs/key/SSFS_RH2.KEY /usr/sap/RH2/SYS/global/security/rsecssfs/key/SSFS_RH2.KEY
[rh2adm]# scp root@node1:/usr/sap/RH2/SYS/global/security/rsecssfs/data/SSFS_RH2.DAT /usr/sap/RH2/SYS/global/security/rsecssfs/data/SSFS_RH2.DAT
To register the secondary node, use the command below.
[rh2adm]# hdbnsutil -sr_register --remoteHost=node1 --remoteInstance=02 --replicationMode=syncmem --name=DC2
adding site ...
checking for inactive nameserver ...
nameserver node2:30201 not responding.
collecting information ...
updating local ini files ...
done.
Start SAP HANA on the secondary node.
[rh2adm]# HDB start
Verify that the secondary node is running and that 'mode' is syncmem. The output should look similar to the output below.
[rh2adm]# hdbnsutil -sr_state
checking for active or inactive nameserver ...
System Replication State
~~~~~~~~~~~~~~~~~~~~~~~~
mode: syncmem
site id: 2
site name: DC2
active primary site: 1
Host Mappings:
~~~~~~~~~~~~~~
node2 -> [DC1] node1
node2 -> [DC2] node2
2.3. Testing SAP HANA System Replication
To manually test the SAP HANA System Replication setup, you can follow the procedures described in the following SAP documents (a minimal manual takeover example follows the list):
- SAP HANA 1.0: chapter "8. Testing" - How to Perform System Replication for SAP HANA 1.0 guide
- SAP HANA 2.0: chapter "9. Testing" - How to Perform System Replication for SAP HANA 2.0 guide
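As a minimal illustration of one such test (a sketch only, based on the example configuration in this article; refer to the SAP guides above for the complete test procedures), a takeover can be triggered manually on the current secondary node as the SAP HANA administrative user:
[rh2adm]# hdbnsutil -sr_takeover
Afterwards, the replication state can be checked again with hdbnsutil -sr_state, as shown in the previous sections.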
2.4. Checking SAP HANA System Replication state
To check the current state of SAP HANA System Replication, you can execute the following command as the SAP HANA administrative user on the current primary SAP HANA node.
On a single_container system:
[rh2adm]# python /usr/sap/RH2/HDB02/exe/python_support/systemReplicationStatus.py
| Host | Port | Service Name | Volume ID | Site ID | Site Name | Secondary | Secondary | Secondary | Secondary | Secondary | Replication | Replication | Replication |
| | | | | | | Host | Port | Site ID | Site Name | Active Status | Mode | Status | Status Details |
| ----- | ----- | ------------ | --------- | ------- | --------- | --------- | --------- | --------- | --------- | ------------- | ----------- | ----------- | -------------- |
| node1 | 30201 | nameserver | 1 | 1 | DC1 | node2 | 30201 | 2 | DC2 | YES | SYNCMEM | ACTIVE | |
| node1 | 30207 | xsengine | 2 | 1 | DC1 | node2 | 30207 | 2 | DC2 | YES | SYNCMEM | ACTIVE | |
| node1 | 30203 | indexserver | 3 | 1 | DC1 | node2 | 30203 | 2 | DC2 | YES | SYNCMEM | ACTIVE | |
status system replication site "2": ACTIVE
overall system replication status: ACTIVE
Local System Replication State
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
mode: PRIMARY
site id: 1
site name: DC1
On a multiple_containers system (MDC):
[rh2adm]# python /usr/sap/RH2/HDB02/exe/python_support/systemReplicationStatus.py
| Database | Host | Port | Service Name | Volume ID | Site ID | Site Name | Secondary | Secondary | Secondary | Secondary | Secondary | Replication | Replication | Replication |
| | | | | | | | Host | Port | Site ID | Site Name | Active Status | Mode | Status | Status Details |
| -------- | ----- | ----- | ------------ | --------- | ------- | --------- | ----------| --------- | --------- | --------- | ------------- | ----------- | ----------- | -------------- |
| SYSTEMDB | node1 | 30201 | nameserver | 1 | 1 | DC1 | node2 | 30201 | 2 | DC2 | YES | SYNCMEM | ACTIVE | |
| RH2 | node1 | 30207 | xsengine | 2 | 1 | DC1 | node2 | 30207 | 2 | DC2 | YES | SYNCMEM | ACTIVE | |
| RH2 | node1 | 30203 | indexserver | 3 | 1 | DC1 | node2 | 30203 | 2 | DC2 | YES | SYNCMEM | ACTIVE | |
status system replication site "2": ACTIVE
overall system replication status: ACTIVE
Local System Replication State
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
mode: PRIMARY
site id: 1
site name: DC1
3. Configuring monitoring account in SAP HANA for cluster resource agents (SAP HANA 1.0 SPS12 and earlier)
Starting with SAP HANA 2.0 SPS0, a monitoring account is no longer needed.
A technical user with CATALOG READ and MONITOR ADMIN privileges must exist in SAP HANA for the resource agents to be able to run queries on the system replication status. The example below shows how to create such a user, assign it the correct permissions, and disable password expiration for this user.
monitoring user username: rhelhasync
monitoring user password: <MONITORING_USER_PASSWORD>
3.1. Creating monitoring user
When SAP HANA System Replication is active, only the primary system is able to access the database; accessing the secondary system will fail.
On the primary system run the following commands to create the monitoring user.
[rh2adm]# hdbsql -i 02 -u system -p <HANA_SYSTEM_PASSWORD> "create user rhelhasync password \"<MONITORING_USER_PASSWORD>\""
[rh2adm]# hdbsql -i 02 -u system -p <HANA_SYSTEM_PASSWORD> "grant CATALOG READ to rhelhasync"
[rh2adm]# hdbsql -i 02 -u system -p <HANA_SYSTEM_PASSWORD> "grant MONITOR ADMIN to rhelhasync"
[rh2adm]# hdbsql -i 02 -u system -p <HANA_SYSTEM_PASSWORD> "ALTER USER rhelhasync DISABLE PASSWORD LIFETIME"
3.2. Store monitoring user credentials on all nodes
The SAP HANA userkey allows the "root" user on the OS level to access SAP HANA via the monitoring user without being asked for a password. This is needed by the resource agents so that they can run queries on the HANA System Replication status.
[root]# /usr/sap/RH2/HDB02/exe/hdbuserstore SET SAPHANARH2SR localhost:30215 rhelhasync "<MONITORING_USER_PASSWORD>"
To verify that the userkey has been created correctly in root's userstore, you can run the hdbuserstore list command on each node and check that the monitoring account is present in the output, as shown below:
[root]# /usr/sap/RH2/HDB02/exe/hdbuserstore list
DATA FILE : /root/.hdb/node1/SSFS_HDB.DAT
KEY FILE : /root/.hdb/node1/SSFS_HDB.KEY
KEY SAPHANARH2SR
ENV : localhost:30215
USER: rhelhasync
Please also verify that it is possible to run hdbsql commands as root using the SAPHANARH2SR userkey:
[root]# /usr/sap/RH2/HDB02/exe/hdbsql -U SAPHANARH2SR -i 02 "select distinct REPLICATION_STATUS from SYS.M_SERVICE_REPLICATION"
REPLICATION_STATUS
"ACTIVE"
1 row selected
If you get an error message about issues with the password, or if you are prompted for a password, please verify with the hdbsql command or HANA Studio that the password for the user created with the hdbsql commands above is not configured to be changed on first login and that the password has not expired. You can use the command below.
(Note: be sure to use the name of the monitoring user in capital letters.)
[root]# /usr/sap/RH2/HDB02/exe/hdbsql -i 02 -u system -p <HANA_SYSTEM_PASSWORD> "select * from sys.users where USER_NAME='RHELHASYNC'"
USER_NAME,USER_ID,USER_MODE,EXTERNAL_IDENTITY,CREATOR,CREATE_TIME,VALID_FROM,VALID_UNTIL,LAST_SUCCESSFUL_CONNECT,LAST_INVALID_CONNECT_ATTEMPT,INVALID_CONNECT_A
TTEMPTS,ADMIN_GIVEN_PASSWORD,LAST_PASSWORD_CHANGE_TIME,PASSWORD_CHANGE_NEEDED,IS_PASSWORD_LIFETIME_CHECK_ENABLED,USER_DEACTIVATED,DEACTIVATION_TIME,IS_PASSWORD
_ENABLED,IS_KERBEROS_ENABLED,IS_SAML_ENABLED,IS_X509_ENABLED,IS_SAP_LOGON_TICKET_ENABLED,IS_SAP_ASSERTION_TICKET_ENABLED,IS_RESTRICTED,IS_CLIENT_CONNECT_ENABLE
D,HAS_REMOTE_USERS,PASSWORD_CHANGE_TIME
"RHELHASYNC",156529,"LOCAL",?,"SYSTEM","2017-05-12 15:10:49.971000000","2017-05-12 15:10:49.971000000",?,"2017-05-12 15:21:12.117000000",?,0,"TRUE","2017-05-12
15:10:49.971000000","FALSE","FALSE","FALSE",?,"TRUE","FALSE","FALSE","FALSE","FALSE","FALSE","FALSE","TRUE","FALSE",?
1 row selected
4. Configuring SAP HANA in a pacemaker cluster
Please refer to the following documentation to first set up a pacemaker cluster. Note that the cluster must conform to the article Support Policies for RHEL High Availability Clusters - General Requirements for Fencing/STONITH.
- Reference Document for the High Availability Add-On for Red Hat Enterprise Linux 7
- How can I configure power fencing for the IBM POWER platform using an HMC in a RHEL High Availability cluster?
This guide assumes that the following things are working properly:
- The pacemaker cluster is configured according to the documentation and has proper, working fencing
- SAP HANA startup on boot is disabled on all cluster nodes, as the start and stop will be managed by the cluster (a sketch for verifying this follows this list)
- SAP HANA system replication and takeover using tools from SAP are working properly between the cluster nodes
- Both nodes are subscribed to the required channels:
- RHEL 7: 'High-availability' and 'RHEL for SAP HANA' (https://access.redhat.com/solutions/2334521) channels
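Below is a sketch for verifying that SAP HANA does not start automatically at boot, assuming the default SAP instance profile location and the example SID, instance number, and hostname used in this article (if the Autostart parameter is present in the instance profile, it should be set to 0):
[root]# grep -i '^Autostart' /usr/sap/RH2/SYS/profile/RH2_HDB02_node1
Autostart = 0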
4.1. Install resource agents and other components required for managing SAP HANA Scale-Up System Replication using the RHEL HA Add-On
[root]# yum install resource-agents-sap-hana
Note: this will only install the resource agents and additional components required to set up this HA solution. The configuration steps documented in the following sections must still be carried out for a fully operable setup that is supported by Red Hat.
4.2. Enable the SAP HANA srConnectionChanged() hook
As documented in SAP's Implementing a HA/DR Provider, recent versions of SAP HANA provide so-called "hooks" that allow SAP HANA to send out notifications for certain events. The srConnectionChanged() hook can be used to improve the ability of the cluster to detect when a change in the status of the HANA System Replication occurs that requires the cluster to take action, and to avoid data loss or data corruption by preventing accidental takeovers from being triggered in situations where this should be avoided. When using SAP HANA 2.0 SPS0 or later and a version of the resource-agents-sap-hana package that provides the components for supporting the srConnectionChanged() hook, it is required to enable the hook before proceeding with the cluster setup.
4.2.1. Verify that a version of the resource-agents-sap-hana package is installed that provides the components to enable the srConnectionChanged() hook
Please verify that the correct version of the resource-agents-sap-hana package, providing the components required to enable the srConnectionChanged() hook for your version of RHEL, is installed as documented in the following article: Is the srConnectionChanged() hook supported with the Red Hat High Availability solution for SAP HANA Scale-up System Replication?
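For example, the currently installed package version can be checked with rpm and compared against the versions listed in the article above:
[root]# rpm -q resource-agents-sap-hana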
4.2.2. Activate the srConnectionChanged() hook on all SAP HANA instances
Note: the steps to activate the srConnectionChanged() hook need to be performed for each SAP HANA instance.
- Stop the cluster on both nodes and verify that the HANA instances are stopped completely.
[root]# pcs cluster stop --all
- Install the hook script into the /hana/shared/myHooks directory for each HANA instance and make sure it has the correct ownership on all nodes (replace rh2adm with the username of the admin user of the HANA instances).
[root]# mkdir -p /hana/shared/myHooks
[root]# cp /usr/share/SAPHanaSR/srHook/SAPHanaSR.py /hana/shared/myHooks
[root]# chown -R rh2adm:sapsys /hana/shared/myHooks
- Update the global.ini file on each node to enable use of the hook script by both HANA instances (e.g., in file /hana/shared/RH2/global/hdb/custom/config/global.ini):
[ha_dr_provider_SAPHanaSR]
provider = SAPHanaSR
path = /hana/shared/myHooks
execution_order = 1

[trace]
ha_dr_saphanasr = info
- On each cluster node, create the file /etc/sudoers.d/20-saphana by running sudo visudo -f /etc/sudoers.d/20-saphana and add the contents below to allow the hook script to update the node attributes when the srConnectionChanged() hook is called.
Replace rh2 with the lowercase SID of your HANA installation and replace DC1 and DC2 with your HANA site names.
Cmnd_Alias DC1_SOK = /usr/sbin/crm_attribute -n hana_rh2_site_srHook_DC1 -v SOK -t crm_config -s SAPHanaSR
Cmnd_Alias DC1_SFAIL = /usr/sbin/crm_attribute -n hana_rh2_site_srHook_DC1 -v SFAIL -t crm_config -s SAPHanaSR
Cmnd_Alias DC2_SOK = /usr/sbin/crm_attribute -n hana_rh2_site_srHook_DC2 -v SOK -t crm_config -s SAPHanaSR
Cmnd_Alias DC2_SFAIL = /usr/sbin/crm_attribute -n hana_rh2_site_srHook_DC2 -v SFAIL -t crm_config -s SAPHanaSR
rh2adm ALL=(ALL) NOPASSWD: DC1_SOK, DC1_SFAIL, DC2_SOK, DC2_SFAIL
Defaults!DC1_SOK, DC1_SFAIL, DC2_SOK, DC2_SFAIL !requiretty
For further information on why the Defaults setting is needed, see The srHook attribute is set to SFAIL in a Pacemaker cluster managing SAP HANA system replication, even though replication is in a healthy state.
- Start both HANA instances manually without starting the cluster.
- Verify that the hook script is working as expected. Perform some action to trigger the hook, such as stopping a HANA instance. Then check whether the hook logged anything using a method such as the one below.
[rh2adm]# cdtrace
[rh2adm]# awk '/ha_dr_SAPHanaSR.*crm_attribute/ { printf "%s %s %s %s\n",$2,$3,$5,$16 }' nameserver_*
2018-05-04 12:34:04.476445 ha_dr_SAPHanaSR SFAIL
2018-05-04 12:53:06.316973 ha_dr_SAPHanaSR SOK
[rh2adm]# grep ha_dr_ *
Note: For more information please check SAP doc Install and Configure a HA/DR Provider Script.
- When the functionality of the hook has been verified, the cluster can be started again.
[root]# pcs cluster start --all
4.3. Configure general cluster properties
To avoid unnecessary failovers of the resources during initial testing and in production, set the following default values for the resource-stickiness and migration-threshold parameters. Note that defaults do not apply to resources which override them with their own defined values.
[root]# pcs resource defaults resource-stickiness=1000
[root]# pcs resource defaults migration-threshold=5000
Notes:
1. It is sufficient to run the commands above on one node of the cluster.
2. Previous versions of this document recommended setting these defaults for the initial testing of the cluster setup, but removing them after production. Due to customer feedback and additional testing, it has been determined that it is beneficial to use these defaults for production cluster setups as well.
3. Setting resource-stickiness=1000 encourages the resource to stay running where it is, while migration-threshold=5000 causes the resource to move to a new node only after 5000 failures. 5000 is generally sufficient to prevent the resource from prematurely failing over to another node. This also ensures that the resource failover time stays within a controllable limit.
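To confirm the values have been applied, the currently configured defaults can be listed, for example:
[root]# pcs resource defaults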
Previous versions of this guide recommended setting the no-quorum-policy to ignore, which is currently NOT supported. In the default configuration, the no-quorum-policy property of the cluster does not need to be modified. To achieve the behavior provided by this option, see Can I configure pacemaker to continue to manage resources after a loss of quorum in RHEL 6 or 7?
4.4. Create cloned SAPHanaTopology resource
The SAPHanaTopology resource gathers the status and configuration of SAP HANA System Replication on each node. In addition, it starts and monitors the local SAP HostAgent, which is required for starting, stopping, and monitoring the SAP HANA instances. It has the following attributes:
| Attribute Name | Required? | Default value | Description |
|---|---|---|---|
| SID | yes | null | The SAP System Identifier (SID) of the SAP HANA installation (must be identical for all nodes). Example: RH2 |
| InstanceNumber | yes | null | The Instance Number of the SAP HANA installation (must be identical for all nodes). Example: 02 |
Below is an example command to create the SAPHanaTopology cloned resource.
Note: the timeouts shown below for the resource operations are only examples and may need to be adjusted depending on the actual SAP HANA setup (for example, large HANA databases can take longer to start up, so the start timeout may have to be increased).
[root]# pcs resource create SAPHanaTopology_RH2_02 SAPHanaTopology SID=RH2 InstanceNumber=02 \
op start timeout=600 \
op stop timeout=300 \
op monitor interval=10 timeout=600 \
clone clone-max=2 clone-node-max=1 interleave=true
The resulting resource should look like the following.
[root]# pcs resource show SAPHanaTopology_RH2_02-clone
Clone: SAPHanaTopology_RH2_02-clone
Meta Attrs: clone-max=2 clone-node-max=1 interleave=true
Resource: SAPHanaTopology_RH2_02 (class=ocf provider=heartbeat type=SAPHanaTopology)
Attributes: SID=RH2 InstanceNumber=02
Operations: start interval=0s timeout=600 (SAPHanaTopology_RH2_02-start-interval-0s)
stop interval=0s timeout=300 (SAPHanaTopology_RH2_02-stop-interval-0s)
monitor interval=10 timeout=600 (SAPHanaTopology_RH2_02-monitor-interval-10s)
Once the resource is started, you will see the collected information stored in the form of node attributes that can be viewed with the command crm_mon -A1. Below is an example of what the attributes can look like when only SAPHanaTopology is started.
[root]# crm_mon -A1
...
Node Attributes:
* Node node1:
+ hana_rh2_remoteHost : node2
+ hana_rh2_roles : 1:P:master1::worker:
+ hana_rh2_site : DC1
+ hana_rh2_srmode : syncmem
+ hana_rh2_vhost : node1
* Node node2:
+ hana_rh2_remoteHost : node1
+ hana_rh2_roles : 1:S:master1::worker:
+ hana_rh2_site : DC2
+ hana_rh2_srmode : syncmem
+ hana_rh2_vhost : node2
...
4.5. Create Master/Slave SAPHana resource
The SAPHana resource agent manages two SAP HANA instances (databases) that are configured in HANA System Replication.
| Attribute Name | Required? | Default value | Description |
|---|---|---|---|
| SID | yes | null | The SAP System Identifier (SID) of the SAP HANA installation (must be identical for all nodes). Example: RH2 |
| InstanceNumber | yes | null | The Instance Number of the SAP HANA installation (must be identical for all nodes). Example: 02 |
| PREFER_SITE_TAKEOVER | no | null | Should the resource agent prefer to switch over to the secondary instance instead of restarting the primary locally? true: do prefer takeover to the secondary site; false: do prefer restart locally; never: under no circumstances do a takeover to the other node |
| AUTOMATED_REGISTER | no | false | If a takeover event has occurred, and the DUPLICATE_PRIMARY_TIMEOUT has expired, should the former primary instance be registered as secondary? ("false": no, manual intervention will be needed; "true": yes, the former primary will be registered by the resource agent as secondary) [1] |
| DUPLICATE_PRIMARY_TIMEOUT | no | 7200 | The time difference (in seconds) needed between two primary time stamps if a dual-primary situation occurs. If the time difference is less than the time gap, the cluster will hold one or both instances in a "WAITING" status. This is to give the system admin a chance to react to a takeover. After the time difference has passed, if AUTOMATED_REGISTER is set to true, the failed former primary will be registered as secondary. After the registration to the new primary, all data on the former primary will be overwritten by the system replication. |
[1] - As a good practice for testing and PoC, we recommend leaving AUTOMATED_REGISTER at its default value (AUTOMATED_REGISTER="false") to prevent a failed primary instance from automatically registering as a secondary instance. After testing, if the failover scenarios work as expected, especially for a production environment, we recommend setting AUTOMATED_REGISTER="true", so that after a takeover the system replication resumes in a timely manner, avoiding disruption. When AUTOMATED_REGISTER="false", in case of a failure on the primary node, after investigation you will need to manually register it as the secondary HANA System Replication node (see the sketch below).
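As an illustration of this manual step (a sketch only, assuming a takeover has made node2/DC2 the new primary and the failure on the former primary node1 has been investigated), the re-registration would look similar to the registration step from section 2.2, executed on the former primary node:
[rh2adm]# HDB stop
[rh2adm]# hdbnsutil -sr_register --remoteHost=node2 --remoteInstance=02 --replicationMode=syncmem --name=DC1
Afterwards, the cluster can be allowed to start the instance as the new secondary again (for example after cleaning up any failed resource actions with pcs resource cleanup).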
Note: the timeouts shown below for the resource operations are only examples and may need to be adjusted depending on the actual SAP HANA setup (for example, large HANA databases can take longer to start up, so the start timeout may have to be increased).
4.5.1. RHEL 7.x
Below is an example command to create the SAPHana Master/Slave resource.
[root]# pcs resource create SAPHana_RH2_02 SAPHana SID=RH2 InstanceNumber=02 \
PREFER_SITE_TAKEOVER=true DUPLICATE_PRIMARY_TIMEOUT=7200 AUTOMATED_REGISTER=false \
op start timeout=3600 \
op stop timeout=3600 \
op monitor interval=61 role="Slave" timeout=700 \
op monitor interval=59 role="Master" timeout=700 \
op promote timeout=3600 \
op demote timeout=3600 \
master meta notify=true clone-max=2 clone-node-max=1 interleave=true
On RHEL 7.x, when running pcs-0.9.158-6.el7 or newer, use the command below to avoid a deprecation warning. More information about the change is explained in What are differences between master and --master option in pcs resource create command?
[root]# pcs resource create SAPHana_RH2_02 SAPHana SID=RH2 InstanceNumber=02 \
PREFER_SITE_TAKEOVER=true DUPLICATE_PRIMARY_TIMEOUT=7200 AUTOMATED_REGISTER=false \
op start timeout=3600 \
op stop timeout=3600 \
op monitor interval=61 role="Slave" timeout=700 \
op monitor interval=59 role="Master" timeout=700 \
op promote timeout=3600 \
op demote timeout=3600 \
master notify=true clone-max=2 clone-node-max=1 interleave=true
The resulting resource should look like the following.
[root]# pcs resource show SAPHana_RH2_02-master
Master: SAPHana_RH2_02-master
Meta Attrs: clone-max=2 clone-node-max=1 interleave=true notify=true
Resource: SAPHana_RH2_02 (class=ocf provider=heartbeat type=SAPHana)
Attributes: AUTOMATED_REGISTER=false DUPLICATE_PRIMARY_TIMEOUT=7200 InstanceNumber=02 PREFER_SITE_TAKEOVER=true SID=RH2
Operations: demote interval=0s timeout=3600 (SAPHana_RH2_02-demote-interval-0s)
methods interval=0s timeout=5 (SAPHana_RH2_02-methods-interval-0s)
monitor interval=61 role=Slave timeout=700 (SAPHana_RH2_02-monitor-interval-61)
monitor interval=59 role=Master timeout=700 (SAPHana_RH2_02-monitor-interval-59)
promote interval=0s timeout=3600 (SAPHana_RH2_02-promote-interval-0s)
start interval=0s timeout=3600 (SAPHana_RH2_02-start-interval-0s)
stop interval=0s timeout=3600 (SAPHana_RH2_02-stop-interval-0s)
Once the resource is started, it will add additional node attributes describing the current state of the SAP HANA databases on the nodes, as seen below.
[root]# crm_mon -A1
...
Node Attributes:
* Node node1:
+ hana_rh2_clone_state : PROMOTED
+ hana_rh2_op_mode : delta_datashipping
+ hana_rh2_remoteHost : node2
+ hana_rh2_roles : 4:P:master1:master:worker:master
+ hana_rh2_site : DC1
+ hana_rh2_sync_state : PRIM
+ hana_rh2_srmode : syncmem
+ hana_rh2_vhost : node1
+ lpa_rh2_lpt : 1495204085
+ master-hana : 150
* Node node2:
+ hana_rh2_clone_state : DEMOTED
+ hana_rh2_remoteHost : node1
+ hana_rh2_roles : 4:S:master1:master:worker:master
+ hana_rh2_site : DC2
+ hana_rh2_srmode : syncmem
+ hana_rh2_sync_state : SOK
+ hana_rh2_vhost : node2
+ lpa_rh2_lpt : 30
+ master-hana : 100
...
4.6. Create Virtual IP address resource
The cluster will contain a virtual IP address in order to reach the Master instance of SAP HANA. Below is an example command to create an IPaddr2 resource with the IP 192.168.0.15.
[root]# pcs resource create vip_RH2_02 IPaddr2 ip="192.168.0.15"
The resulting resource should look like the one below.
[root]# pcs resource show vip_RH2_02
Resource: vip_RH2_02 (class=ocf provider=heartbeat type=IPaddr2)
Attributes: ip=192.168.0.15
Operations: start interval=0s timeout=20s (vip_RH2_02-start-interval-0s)
stop interval=0s timeout=20s (vip_RH2_02-stop-interval-0s)
monitor interval=10s timeout=20s (vip_RH2_02-monitor-interval-10s)
4.7. Create constraints
For correct operation we need to ensure that the SAPHanaTopology resources are started before starting the SAPHana resources, and also that the virtual IP address is present on the node where the Master resource of SAPHana is running. To achieve this, the following 2 constraints need to be created.
4.7.1 RHEL 7.x
4.7.1.1 constraint - start SAPHanaTopology before SAPHana
The example command below will create the constraint that mandates the start order of these resources. There are 2 things worth mentioning here:
- The symmetrical=false attribute defines that we care only about the start of the resources; they do not need to be stopped in reverse order.
- Both resources (SAPHana and SAPHanaTopology) have the attribute interleave=true, which allows parallel starts of these resources on the nodes. This means that, despite the ordering constraint, we do not have to wait for all nodes to start SAPHanaTopology; the SAPHana resource can start on any node as soon as SAPHanaTopology is running there.
Command for creating the constraint:
[root]# pcs constraint order SAPHanaTopology_RH2_02-clone then SAPHana_RH2_02-master symmetrical=false
The resulting constraint should look like the one in the example below.
[root]# pcs constraint
...
Ordering Constraints:
start SAPHanaTopology_RH2_02-clone then start SAPHana_RH2_02-master (kind:Mandatory) (non-symmetrical)
...
4.7.1.2 constraint - colocate the IPaddr2 resource with the Master of the SAPHana resource
Below is an example command that will colocate the IPaddr2 resource with the SAPHana resource that was promoted as Master.
[root]# pcs constraint colocation add vip_RH2_02 with master SAPHana_RH2_02-master 2000
Note that the constraint uses a score of 2000 instead of the default INFINITY. This allows the IPaddr2 resource to be taken down by the cluster in case there is no Master promoted in the SAPHana resource, so it is still possible to use this address with tools like SAP Management Console or SAP LVM that can use this address to query the status information about the SAP instance.
The resulting constraint should look like one in the example below.
[root]# pcs constraint
...
Colocation Constraints:
vip_RH2_02 with SAPHana_RH2_02-master (score:2000) (rsc-role:Started) (with-rsc-role:Master)
...
4.8. Adding a secondary virtual IP address for an Active/Active (Read-Enabled) HANA System Replication setup
Starting with SAP HANA 2.0 SPS1, SAP enables 'Active/Active (Read Enabled)' setups for SAP HANA System Replication, where the secondary systems of SAP HANA system replication can be used actively for read-intensive workloads. To be able to support such setups, a second virtual IP address is required, which enables clients to access the secondary SAP HANA database. To ensure that the secondary replication site can still be accessed after a takeover has occurred, the cluster needs to move the virtual IP address around with the slave of the master/slave SAPHana resource.
Note that when establishing HSR for the read-enabled secondary configuration, the operationMode should be set to logreplay_readaccess (see the registration sketch below).
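For example, registering the secondary for a read-enabled setup could look like the following variation of the registration command from section 2.2 (a sketch only; adjust the remote host, instance number, replication mode and site name to your environment):
[rh2adm]# hdbnsutil -sr_register --remoteHost=node1 --remoteInstance=02 --replicationMode=syncmem --operationMode=logreplay_readaccess --name=DC2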
4.8.1. Creating the resource for managing the secondary virtual IP address
[root]# pcs resource create vip2_RH2_02 IPaddr2 ip="192.168.1.11"
Please use the appropriate resource agent for managing the IP address based on the platform on which the cluster is running.
4.8.2. Creating location constraints to ensure that the secondary virtual IP address is placed on the right cluster node
[root]# pcs constraint location vip2_RH2_02 rule score=INFINITY hana_rh2_sync_state eq SOK and hana_rh2_roles eq 4:S:master1:master:worker:master
[root]# pcs constraint location vip2_RH2_02 rule score=2000 hana_rh2_sync_state eq PRIM and hana_rh2_roles eq 4:P:master1:master:worker:master
These location constraints ensure that the second virtual IP resource will have the following behavior:
- If there is a Master/PRIMARY node and a Slave/SECONDARY node, both available, with HANA System Replication in "SOK" state, the second virtual IP will run on the Slave/SECONDARY node.
- If the Slave/SECONDARY node is not available or the HANA System Replication is not "SOK", the second virtual IP will run on the Master/PRIMARY node. When the Slave/SECONDARY node becomes available and the HANA System Replication is "SOK" again, the second virtual IP will move back to the Slave/SECONDARY node.
- If the Master/PRIMARY node is not available or the HANA instance running there has a problem, then when the Slave/SECONDARY takes over the Master/PRIMARY role, the second virtual IP will continue running on the same node until the other node takes the Slave/SECONDARY role and the HANA System Replication is "SOK" again.
This maximizes the time that the second virtual IP resource will be assigned to a node where a healthy SAP HANA instance is running.
4.9. Testing the manual move of SAPHana resource to another node (SAP HANA takeover by cluster)
Test moving the SAPHana resource from one node to another.
4.9.1. Moving SAPHana resource on RHEL 7
Use the command below on RHEL 7. Note that the option --master should not be used when running the command below, due to the way the SAPHana resource works internally.
[root]# pcs resource move SAPHana_RH2_02-master
With each pcs resource move command invocation, the cluster creates location constraints to cause the resource to move. These constraints must be removed after it has been verified that the HANA System Replication takeover has completed, in order to allow the cluster to manage the former primary HANA instance again. To remove the constraints created by the move, run the command below.
[root]# pcs resource clear SAPHana_RH2_02-master
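Before and after clearing, the constraints created by the move and the current resource state can be reviewed, for example, with the commands already used earlier in this article:
[root]# pcs constraint --full
[root]# crm_mon -A1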