The service failure time reported by clustat differs from the failure time logged in /var/log/messages on Red Hat Enterprise Linux
Issue
Output of the command clustat -i 1:
Wed Jun 13 03:45:17 KST 2012
Service Name Owner (Last) State
------- ---- ----- ------ -----
service:was-tomcat5 (node1.rhcs.com) recoverable
Wed Jun 13 03:45:28 2012
service:was-tomcat5 node1.rhcs.com starting
Wed Jun 13 03:45:31 KST 2012
service:was-tomcat5 node1.rhcs.com started
In the file /var/log/messages:
Service failure time:
Jun 13 03:45:16 node1 clurgmgrd: [20561]: <err> script:tomcat5: status of /data/nimadmin/tomcat/bin/tomcat5-rhcs.sh failed (returned 1)
Jun 13 03:45:16 node1 clurgmgrd: [20561]: <err> script:tomcat5: status of /data/nimadmin/tomcat/bin/tomcat5-rhcs.sh failed (returned 1)
Jun 13 03:45:16 node1 clurgmgrd[20561]: <notice> status on script "tomcat5" returned 1 (generic error)
Jun 13 03:45:16 node1 clurgmgrd[20561]: <notice> status on script "tomcat5" returned 1 (generic error)
Jun 13 03:45:16 node1 clurgmgrd[20561]: <notice> Stopping service service:was-tomcat5
Jun 13 03:45:16 node1 clurgmgrd[20561]: <notice> Stopping service service:was-tomcat5
Jun 13 03:45:16 node1 su: PAM unable to dlopen(/lib/security/pam_wheel.so)
Jun 13 03:45:16 node1 su: PAM [error: /lib/security/pam_wheel.so: wrong ELF class: ELFCLASS32]
Jun 13 03:45:16 node1 su: PAM adding faulty module: /lib/security/pam_wheel.so
Jun 13 03:45:17 node1 logger: Still WAS Process is running.
Jun 13 03:45:17 node1 logger: Still WAS Process is running.
Service recovery time:
Jun 13 03:45:27 node1 clurgmgrd[20561]: <notice> Service service:was-tomcat5 is recovering
Jun 13 03:45:27 node1 clurgmgrd[20561]: <notice> Service service:was-tomcat5 is recovering
Jun 13 03:45:27 node1 clurgmgrd[20561]: <notice> Recovering failed service service:was-tomcat5
Jun 13 03:45:27 node1 clurgmgrd[20561]: <notice> Recovering failed service service:was-tomcat5
Service start time:
Jun 13 03:45:30 node1 logger: WAS Process is started.
Jun 13 03:45:30 node1 logger: WAS Process is started.
Jun 13 03:45:30 node1 clurgmgrd[20561]: <notice> Service service:was-tomcat5 started
Jun 13 03:45:30 node1 clurgmgrd[20561]: <notice> Service service:was-tomcat5 started
For each state change, the timestamp in the log is 1 second earlier than the time shown by the command clustat -i 1.
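The 1-second offset can be checked with a short calculation. The following is a minimal sketch, not part of the original report: the timestamps are copied from the clustat output and log excerpts above, the year 2012 is added to the syslog entries (syslog does not record it), and the pairing of each clustat state with a log message is an assumption made for illustration.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"

# Times shown by "clustat -i 1" for each service state (from the output above).
CLUSTAT = {
    "recoverable": "2012-06-13 03:45:17",
    "starting": "2012-06-13 03:45:28",
    "started": "2012-06-13 03:45:31",
}

# Times of the matching /var/log/messages entries.
# Assumed pairing: status failure -> recoverable,
# "is recovering" -> starting, "started" -> started.
SYSLOG = {
    "recoverable": "2012-06-13 03:45:16",
    "starting": "2012-06-13 03:45:27",
    "started": "2012-06-13 03:45:30",
}


def lag_seconds(clustat_ts, syslog_ts):
    """Return how many seconds the clustat time is later than the log time."""
    delta = datetime.strptime(clustat_ts, FMT) - datetime.strptime(syslog_ts, FMT)
    return int(delta.total_seconds())


# Compute the offset for every state change.
lags = {state: lag_seconds(CLUSTAT[state], SYSLOG[state]) for state in CLUSTAT}

for state, lag in lags.items():
    print(f"{state}: clustat is {lag} s later than the log")
```

With the values above, every state change shows the same offset of 1 second.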
Environment
- Red Hat Enterprise Linux (RHEL), including:
  - Red Hat Enterprise Linux Server 5 (with the High Availability Add-On)
  - Red Hat Enterprise Linux Server 6 (with the High Availability Add-On)
- Red Hat Cluster with 2 or more nodes
- Services managed by rgmanager (clurgmgrd)