Server rebooted abruptly on IBM's Reliable Scalable Cluster Technology (RSCT)

Solution Unverified

Issue

  • Why did the server reboot on IBM's Reliable Scalable Cluster Technology (RSCT)?
  • The following messages were logged immediately before the reboot:
02:55:37  ConfigRM[4334]: (Recorded using libct_ffdc.a cv 2):::Error ID: :::Reference ID:
  :::Template ID: 0:::Details File:  :::Location: RSCT,ConfigRMGroup.C,1.305,770
  :::CONFIGRM_MERGE_ST The sub-domain containing the local node is being dissolved because another
  sub-domain has been detected that takes precedence over it.  Group services  will be ended on each
  node of the local sub-domain which will cause the  configuration manager daemon (IBM.ConfigRMd) to
  force the node offline and  then bring it back online in the surviving domain.

02:55:37 cthags[4660]: (Recorded using libct_ffdc.a cv 2):::Error ID:825....7CyjG/jjX.DuhHu.
:::Reference ID:  :::Template ID: 0:::Details File:  :::Location: RSCT,NS.C,1.XX.1.VV,4755 
:::GS_DOM_MERGE_ER Group Services daemon exit to merge domains DIAGNOSTIC EXPLANATION NS::Ack():
 The master requests to dissolve my domain because of the merge with other domain 1.49 

02:55:37  RMCdaemon[4286]: (Recorded using libct_ffdc.a cv 2):::Error ID: 822....7CyjG/QeK
/DuhHu....................:::Reference ID:  :::Template ID: 0:::Details File:  :::Location:
RSCT,rmcd_gsi.c,1.50,1048                     :::RMCD_2610_101_ER Internal error. 
Error data 1 00000001 Error data 2 00000000 Error data 3 dispatch_gs 

02:55:37  ConfigRM[4334]: (Recorded using libct_ffdc.a cv 2):::Error ID: :::Reference ID:
:::Template ID: 0:::Details File:  :::Location: RSCT,ConfigRMGroup.C,1.305,5264 
:::CONFIGRM_EXIT_GS_ER The peer domain configuration manager daemon (IBM.ConfigRMd) is exiting 
due to  the Group Services subsystem terminating.  The configuration manager daemon will  restart
 automatically, synchronize the nodes configuration with the domain and  rejoin the domain if possible.

02:55:37  ConfigRM[4334]: (Recorded using libct_ffdc.a cv 2):::Error ID: :::Reference ID:
  :::Template ID: 0:::Details File:  :::Location: RSCT,PeerDomain.C,1.RR.22.XX,20865   :::CONFIGRM_REBOOTOS_ER
 The operating system is being rebooted to ensure that critical resources are  stopped so that another 
sub-domain that has operational quorum may recover  these resources without causing corruption or conflict. 

03:01:00 syslogd 1.4.1: restart.
03:01:01  kernel: klogd 1.4.1, log source = /proc/kmsg started.
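
The entries above record a sequence: a sub-domain merge (CONFIGRM_MERGE_ST), Group Services exiting (GS_DOM_MERGE_ER), the configuration manager exiting (CONFIGRM_EXIT_GS_ER), and finally the forced reboot (CONFIGRM_REBOOTOS_ER). As a rough aid for spotting that sequence in a larger log, the sketch below pulls the RSCT message labels out of log text in order of appearance. The function names are hypothetical, and the label pattern (a ":::" prefix followed by an upper-case identifier ending in _ER or _ST) is an assumption based only on the labels visible in this excerpt, so it may miss other label forms.

```python
import re

# Assumption: a message label follows ":::" and is an upper-case identifier
# ending in _ER or _ST, as in the excerpt above. Header fields such as
# ":::Error ID:" or ":::Location:" do not match because they contain
# lower-case letters.
LABEL_RE = re.compile(r":::([A-Z][A-Z0-9_]*_(?:ER|ST))\b")


def rsct_labels(log_text):
    """Return the RSCT message labels found in log_text, in order of appearance."""
    return LABEL_RE.findall(log_text)


def reboot_forced_by_rsct(log_text):
    """Return True if the log contains the CONFIGRM_REBOOTOS_ER label."""
    return "CONFIGRM_REBOOTOS_ER" in rsct_labels(log_text)


if __name__ == "__main__":
    # Two abbreviated entries taken from the excerpt above.
    sample = (
        "02:55:37  ConfigRM[4334]: (Recorded using libct_ffdc.a cv 2)"
        ":::Error ID: :::Reference ID: :::Template ID: 0:::Details File:  "
        ":::Location: RSCT,ConfigRMGroup.C,1.305,770 "
        ":::CONFIGRM_MERGE_ST The sub-domain containing the local node "
        "is being dissolved\n"
        "02:55:37  ConfigRM[4334]: (Recorded using libct_ffdc.a cv 2)"
        ":::Error ID: :::Reference ID: :::Template ID: 0:::Details File:  "
        ":::Location: RSCT,PeerDomain.C,1.RR.22.XX,20865   "
        ":::CONFIGRM_REBOOTOS_ER\n"
        " The operating system is being rebooted\n"
    )
    print(rsct_labels(sample))
    print(reboot_forced_by_rsct(sample))
```

If CONFIGRM_REBOOTOS_ER appears shortly before the syslogd restart line, as it does here, the log itself attributes the reboot to RSCT: the node was rebooted so that a sub-domain with operational quorum could recover critical resources. It does not indicate a kernel crash.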

Environment

  • Red Hat Enterprise Linux
  • IBM's Reliable Scalable Cluster Technology
