Following Failover of SAPHana resource, the new secondary takes a long time to start

Solution In Progress - Updated -

Issue

After the primary node fails and takes over, the secondary node takes some time to start and continuously reports monitor failures. These monitor failures trigger repeated restarts of SAPHana on the new secondary node, until we finally report "Register successful":

$ grep "Mar  9 07.*SAPHana_RH1_00" /var/log/messages
-----------------------------------------8<----------------------------------------- 

                                      1st loop, resource start

Mar  9 07:42:11 rhel8-node1 SAPHana(SAPHana_RH1_00)[11118]: INFO: RA ==== begin action start_clone (0.162.1) ====
Mar  9 07:42:13 rhel8-node1 SAPHana(SAPHana_RH1_00)[11118]: WARNING: RA: HANA_CALL stderr from command 'su - rh1adm -c' is 'logname: no login name', stderr from command 'python landscapeHostConfiguration.py' is ''
Mar  9 07:42:13 rhel8-node1 SAPHana(SAPHana_RH1_00)[11118]: INFO: DEC: hana_rh1_site_srHook_RH1HA00A=PRIM
Mar  9 07:42:13 rhel8-node1 SAPHana(SAPHana_RH1_00)[11118]: INFO: DEC: Finally get_SRHOOK()=PRIM
Mar  9 07:42:13 rhel8-node1 SAPHana(SAPHana_RH1_00)[11118]: INFO: LPA: landscape: DOWN, LPA: register ==> try to register
Mar  9 07:42:13 rhel8-node1 SAPHana(SAPHana_RH1_00)[11118]: INFO: DEC: ANOTHER HANA IS AVAILABLE ==> LETS REGISTER
Mar  9 07:42:13 rhel8-node1 pacemaker-attrd[9854]: notice: Setting master-SAPHana_RH1_00[rhel8-node1]: 5 -> 0
Mar  9 07:42:33 rhel8-node1 SAPHana(SAPHana_RH1_00)[11118]: INFO: RA ==== end action start_clone with rc=0 (0.162.1) (22s)====
Mar  9 07:42:33 rhel8-node1 pacemaker-controld[9856]: notice: Result of start operation for SAPHana_RH1_00 on rhel8-node1: ok
Mar  9 07:42:33 rhel8-node1 pacemaker-controld[9856]: notice: Requesting local execution of monitor operation for SAPHana_RH1_00 on rhel8-node1

                                1st loop, resource monitor fails and we restart:

Mar  9 07:42:33 rhel8-node1 SAPHana(SAPHana_RH1_00)[12139]: INFO: RA ==== begin action monitor_clone (0.162.1) ====
Mar  9 07:42:33 rhel8-node1 SAPHana(SAPHana_RH1_00)[12139]: INFO: DEC: hana_rh1_site_srHook_RH1HA00A=PRIM
Mar  9 07:42:33 rhel8-node1 SAPHana(SAPHana_RH1_00)[12139]: INFO: DEC: Finally get_SRHOOK()=PRIM
Mar  9 07:42:35 rhel8-node1 SAPHana(SAPHana_RH1_00)[12139]: INFO: ACT: systemd service SAPRH1_00.service is active
Mar  9 07:42:35 rhel8-node1 SAPHana(SAPHana_RH1_00)[12139]: ERROR: SAP instance service hdbdaemon status color is GRAY !
Mar  9 07:42:35 rhel8-node1 SAPHana(SAPHana_RH1_00)[12139]: WARNING: RA: HANA_CALL stderr from command 'su - rh1adm -c' is 'logname: no login name', stderr from command 'python landscapeHostConfiguration.py' is ''
Mar  9 07:42:35 rhel8-node1 SAPHana(SAPHana_RH1_00)[12139]: INFO: DEC: PRIMDEFECT (in DEMOTED status)
Mar  9 07:42:35 rhel8-node1 SAPHana(SAPHana_RH1_00)[12139]: INFO: RA ==== end action monitor_clone with rc=7 (0.162.1) (2s)====
Mar  9 07:42:35 rhel8-node1 pacemaker-controld[9856]: notice: Result of monitor operation for SAPHana_RH1_00 on rhel8-node1: not running <--- reported "not running"


-----------------------------------------8<----------------------------------------- 
                                        Above pattern repeats until,
                                Last loop and reported successful register:

Mar  9 07:51:30 rhel8-node1 SAPHana(SAPHana_RH1_00)[46280]: INFO: RA ==== begin action start_clone (0.162.1) ====
Mar  9 07:51:32 rhel8-node1 SAPHana(SAPHana_RH1_00)[46280]: WARNING: RA: HANA_CALL stderr from command 'su - rh1adm -c' is 'logname: no login name'
, stderr from command 'python landscapeHostConfiguration.py' is ''
Mar  9 07:51:32 rhel8-node1 SAPHana(SAPHana_RH1_00)[46280]: INFO: DEC: hana_rh1_site_srHook_RH1HA00A=PRIM
Mar  9 07:51:32 rhel8-node1 SAPHana(SAPHana_RH1_00)[46280]: INFO: DEC: Finally get_SRHOOK()=PRIM
Mar  9 07:51:32 rhel8-node1 SAPHana(SAPHana_RH1_00)[46280]: INFO: LPA: landscape: DOWN, LPA: register ==> try to register
Mar  9 07:51:32 rhel8-node1 SAPHana(SAPHana_RH1_00)[46280]: INFO: DEC: ANOTHER HANA IS AVAILABLE ==> LETS REGISTER
Mar  9 07:51:32 rhel8-node1 SAPHana(SAPHana_RH1_00)[46280]: INFO: ACT: REGISTER: hdbnsutil -sr_register --remoteHost=rhel8-node2 --remoteInstance=00 --replicationMode=syncmem --operationMode=logreplay --name=RH1HA00A
Mar  9 07:51:35 rhel8-node1 SAPHana(SAPHana_RH1_00)[46280]: INFO: ACT: Register successful <---
Mar  9 07:51:36 rhel8-node1 SAPHana(SAPHana_RH1_00)[46280]: INFO: ACT: systemd service SAPRH1_00.service is active
Mar  9 07:51:36 rhel8-node1 SAPHana(SAPHana_RH1_00)[46280]: INFO: DEC: hana_rh1_site_srHook_RH1HA00A=PRIM
Mar  9 07:51:36 rhel8-node1 SAPHana(SAPHana_RH1_00)[46280]: INFO: ACT: Starting SAPHANA Instance RH1-HDB00: #01209.09.2024 07:51:36#012Start#012OK
Mar  9 07:52:28 rhel8-node1 SAPHana(SAPHana_RH1_00)[46280]: INFO: ACT: SAPHANA Instance RH1-HDB00 started: #01209.09.2024 07:52:28#012WaitforStarted#012OK
Mar  9 07:52:28 rhel8-node1 SAPHana(SAPHana_RH1_00)[46280]: INFO: RA ==== end action start_clone with rc=0 (0.162.1) (58s)====
Mar  9 07:52:28 rhel8-node1 pacemaker-controld[9856]: notice: Result of start operation for SAPHana_RH1_00 on rhel8-node1: ok
Mar  9 07:52:28 rhel8-node1 pacemaker-controld[9856]: notice: Requesting local execution of monitor operation for SAPHana_RH1_00 on rhel8-node1


Mar  9 07:52:28 rhel8-node1 SAPHana(SAPHana_RH1_00)[51687]: INFO: RA ==== begin action monitor_clone (0.162.1) ====
Mar  9 07:52:28 rhel8-node1 SAPHana(SAPHana_RH1_00)[51687]: INFO: DEC: hana_rh1_site_srHook_RH1HA00A=SOK
Mar  9 07:52:28 rhel8-node1 SAPHana(SAPHana_RH1_00)[51687]: INFO: DEC: Finally get_SRHOOK()=SOK
Mar  9 07:52:29 rhel8-node1 SAPHana(SAPHana_RH1_00)[51687]: INFO: ACT: systemd service SAPRH1_00.service is active
Mar  9 07:52:31 rhel8-node1 SAPHana(SAPHana_RH1_00)[51687]: INFO: DEC: hana_rh1_site_srHook_RH1HA00A=SOK
Mar  9 07:52:31 rhel8-node1 SAPHana(SAPHana_RH1_00)[51687]: INFO: DEC: Finally get_SRHOOK()=SOK
Mar  9 07:52:31 rhel8-node1 SAPHana(SAPHana_RH1_00)[51687]: INFO: DEC: secondary with sync status SOK ==> possible takeover node
Mar  9 07:52:31 rhel8-node1 SAPHana(SAPHana_RH1_00)[51687]: INFO: DEC: hana_rh1_site_srHook_RH1HA00A=SOK
Mar  9 07:52:31 rhel8-node1 SAPHana(SAPHana_RH1_00)[51687]: INFO: DEC: Finally get_SRHOOK()=SOK
Mar  9 07:52:31 rhel8-node1 SAPHana(SAPHana_RH1_00)[51687]: INFO: DEC: saphana_monitor_secondary: scoring_crm_master(4:S:master1:master:worker:master,SOK)
Mar  9 07:52:31 rhel8-node1 SAPHana(SAPHana_RH1_00)[51687]: INFO: DEC: scoring_crm_master: roles(4:S:master1:master:worker:master) are matching pattern ([234]*:S:[^:]*:master)
Mar  9 07:52:31 rhel8-node1 SAPHana(SAPHana_RH1_00)[51687]: INFO: DEC: scoring_crm_master: sync(SOK) is matching syncPattern (SOK)
Mar  9 07:52:31 rhel8-node1 SAPHana(SAPHana_RH1_00)[51687]: INFO: DEC: scoring_crm_master: set score 100
Mar  9 07:52:32 rhel8-node1 pacemaker-attrd[9854]: notice: Setting master-SAPHana_RH1_00[rhel8-node1]: 0 -> 100
Mar  9 07:52:32 rhel8-node1 SAPHana(SAPHana_RH1_00)[51687]: INFO: RA ==== end action monitor_clone with rc=0 (0.162.1) (4s)====

Environment

  • Red Hat Enterprise Linux 7, 8, 9 with High Availability Add-On
  • SAPHana resource agent

Subscriber exclusive content

A Red Hat subscription provides unlimited access to our knowledgebase, tools, and much more.

Current Customers and Partners

Log in for full access

Log In

New to Red Hat?

Learn more about Red Hat subscriptions

Using a Red Hat product through a public cloud?

How to access this content