Cluster service fails when resources using __enforce_timeouts encounter their timeout in a RHEL 5 or 6 High Availability cluster with rgmanager
Issue
- A cluster service times out while stopping, is then listed as "Failed", and doesn't relocate to another node
- The status check on the file system timed out, and after this the application was stopped:
Apr 08 17:08:12 rgmanager status on fs:myFS timed out after 30 seconds
Apr 08 17:08:13 rgmanager Stopping service service:myService
Apr 08 17:08:13 rgmanager [script] Executing /usr/local/bin/myApp.sh stop
Apr 08 17:10:13 rgmanager stop on script:myApp timed out after 120 seconds
Apr 08 17:10:55 rgmanager stop on fs:myFS timed out after 30 seconds
Apr 08 17:12:29 rgmanager #12: RG service:myService failed to stop; intervention required
- Resources configured with __enforce_timeouts don't relocate when they reach a timeout
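To confirm which operations hit their timeout, the rgmanager messages can be filtered for "timed out" entries. The following is a minimal Python sketch, not a supported tool. It assumes the message layout shown in the excerpt above; real syslog lines may carry a hostname or process ID that would need a looser pattern. The names TIMEOUT_RE and timed_out_operations are illustrative only.

```python
import re

# Pattern for messages such as:
#   rgmanager stop on script:myApp timed out after 120 seconds
# Assumption: the layout matches the excerpt in this article.
TIMEOUT_RE = re.compile(
    r"rgmanager (?P<op>\w+) on (?P<res>[\w-]+:\S+) "
    r"timed out after (?P<secs>\d+) seconds"
)

def timed_out_operations(lines):
    """Return (operation, resource, seconds) for each timeout message."""
    hits = []
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            hits.append(
                (match.group("op"), match.group("res"), int(match.group("secs")))
            )
    return hits

# Log excerpt copied from the Issue section of this article.
LOG = """\
Apr 08 17:08:12 rgmanager status on fs:myFS timed out after 30 seconds
Apr 08 17:08:13 rgmanager Stopping service service:myService
Apr 08 17:08:13 rgmanager [script] Executing /usr/local/bin/myApp.sh stop
Apr 08 17:10:13 rgmanager stop on script:myApp timed out after 120 seconds
Apr 08 17:10:55 rgmanager stop on fs:myFS timed out after 30 seconds
Apr 08 17:12:29 rgmanager #12: RG service:myService failed to stop; intervention required
""".splitlines()

if __name__ == "__main__":
    for op, res, secs in timed_out_operations(LOG):
        print(f"{op} on {res} exceeded {secs}s")
```

In this excerpt the status timeout on fs:myFS comes first, followed by stop timeouts on script:myApp and fs:myFS, which matches the sequence described in the Issue list.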
Environment
- Red Hat Enterprise Linux (RHEL) 6 with the High Availability Add On
- rgmanager
- One or more resources in /etc/cluster/cluster.conf configured with __enforce_timeouts="1"
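To check whether a configuration matches this environment, a copy of cluster.conf can be scanned for resources carrying __enforce_timeouts="1". The sketch below is illustrative and makes several assumptions: it is run with Python 3 on a copy of the file rather than on a cluster node, the function name find_enforced_timeouts is hypothetical, and the sample fragment is invented for demonstration. The resource and service names mirror the log excerpt, while the device, mount point, file system type, and cluster name are placeholders.

```python
import xml.etree.ElementTree as ET

def find_enforced_timeouts(xml_text):
    """Return (tag, name) for every element with __enforce_timeouts="1"."""
    root = ET.fromstring(xml_text)
    hits = []
    for elem in root.iter():
        if elem.get("__enforce_timeouts") == "1":
            # Resource references use "ref"; inline definitions use "name".
            hits.append((elem.tag, elem.get("ref") or elem.get("name")))
    return hits

# Hypothetical cluster.conf fragment for illustration only.
# Device, mountpoint, fstype, and cluster name are placeholders.
SAMPLE = """
<cluster name="example" config_version="1">
  <rm>
    <resources>
      <fs name="myFS" device="/dev/example" mountpoint="/mnt/example" fstype="ext4"/>
      <script name="myApp" file="/usr/local/bin/myApp.sh"/>
    </resources>
    <service name="myService" recovery="relocate">
      <fs ref="myFS" __enforce_timeouts="1">
        <script ref="myApp" __enforce_timeouts="1"/>
      </fs>
    </service>
  </rm>
</cluster>
"""

if __name__ == "__main__":
    for tag, name in find_enforced_timeouts(SAMPLE):
        print(f"{tag}:{name} has __enforce_timeouts enabled")
```

To inspect a real configuration, the contents of a copied cluster.conf could be read into a string and passed to the same function in place of SAMPLE.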