Cluster service containing a script on shared storage fails but does not recover or relocate in RHEL 5
Issue
- We have a cluster service with a script resource whose script resides on shared storage. If the service fails, is stopped as a result, and then fails again when rgmanager attempts to recover it by restarting it, the service never relocates to another node.
- Service fails on recovery when attempting to stop a script resource:
  Dec 16 10:39:48 node1 clurgmgrd[29921]: <notice> Recovering failed service service:script-test
  Dec 16 10:39:48 node1 clurgmgrd[29921]: <notice> start on ip "192.168.122.233" returned 1 (generic error)
  Dec 16 10:39:48 node1 clurgmgrd[29921]: <warning> #68: Failed to start service:script-test; return value: 1
  Dec 16 10:39:48 node1 clurgmgrd[29921]: <notice> Stopping service service:script-test
  Dec 16 10:39:48 node1 clurgmgrd[29921]: <notice> stop on script "script" returned 5 (program not installed)
  Dec 16 10:39:48 node1 clurgmgrd: [29921]: <info> /dev/mapper/mpath1p1 is not mounted
  Dec 16 10:39:48 node1 clurgmgrd[29921]: <crit> #12: RG service:script-test failed to stop; intervention required
  Dec 16 10:39:48 node1 clurgmgrd[29921]: <notice> Service service:script-test is failed
  Dec 16 10:39:48 node1 clurgmgrd[29921]: <crit> #13: Service service:script-test failed to stop cleanly
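To check whether a node has hit the same symptom, one option is to scan the system log for the "#12 ... failed to stop; intervention required" message shown above. The following is a minimal sketch, not a supported tool: the function name services_failed_to_stop is hypothetical, the pattern is derived only from the excerpt in this article, and it assumes rgmanager messages are written to /var/log/messages.

```python
import re

# Pattern taken from the "#12" message in the log excerpt above.
# Assumption: other rgmanager builds may word this message differently.
STOP_FAILED = re.compile(
    r"clurgmgrd\[\d+\]: <crit> #12: RG (service:\S+) failed to stop; "
    r"intervention required"
)


def services_failed_to_stop(lines):
    """Return service names, in first-seen order, that logged a #12 stop failure."""
    seen = []
    for line in lines:
        match = STOP_FAILED.search(line)
        if match and match.group(1) not in seen:
            seen.append(match.group(1))
    return seen


if __name__ == "__main__":
    # Assumption: syslog writes rgmanager messages to this file.
    log = open("/var/log/messages")
    try:
        for name in services_failed_to_stop(log):
            print(name)
    finally:
        log.close()
```

A service listed by this script was left in the failed state after a stop error, which matches the behavior described in this issue.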
Environment
- Red Hat Enterprise Linux (RHEL) 5 Advanced Platform with Clustering
- rgmanager prior to release 2.0.52-28.el5
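Whether a node falls in the affected range can be checked by comparing its installed rgmanager version-release string against the 2.0.52-28.el5 boundary listed above. The sketch below is a simplified comparison, not RPM's full version-comparison algorithm: it compares only the dotted numeric version and the leading integer of the release, and the names FIXED, _key, and is_affected are hypothetical.

```python
import re

# Boundary from the Environment list: releases prior to this one are affected.
FIXED = "2.0.52-28.el5"


def _key(version_release):
    """Split 'version-release' into a comparable (version tuple, release int) pair.

    Simplification: only the leading integer of the release is used, so
    suffixes such as '.el5' are ignored. Non-numeric version fields are
    not handled and raise ValueError.
    """
    version, _, release = version_release.partition("-")
    version_parts = tuple(int(part) for part in version.split("."))
    match = re.match(r"\d+", release)
    release_number = int(match.group()) if match else 0
    return version_parts, release_number


def is_affected(version_release):
    """Return True if the given rgmanager version-release is prior to FIXED."""
    return _key(version_release) < _key(FIXED)
```

The input string can be obtained on the node with `rpm -q --qf '%{VERSION}-%{RELEASE}\n' rgmanager`; for example, is_affected("2.0.52-21.el5") returns True under this comparison.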