Appendix E. Cluster Service Resource Check and Failover Timeout

This appendix describes how rgmanager monitors the status of cluster resources, and how to modify the status check interval. The appendix also describes the __enforce_timeouts service parameter, which indicates that a timeout for an operation should cause a service to fail.

Note

To fully comprehend the information in this appendix, you may require detailed understanding of resource agents and the cluster configuration file, /etc/cluster/cluster.conf. For a comprehensive list and description of cluster.conf elements and attributes, refer to the cluster schema at /usr/share/system-config-cluster/misc/cluster.ng, and the annotated schema at /usr/share/doc/system-config-cluster-X.Y.ZZ/cluster_conf.html (for example /usr/share/doc/system-config-cluster-1.0.57/cluster_conf.html).

E.1. Modifying the Resource Status Check Interval

rgmanager checks the status of individual resources, not whole services. (This is a change from clumanager on Red Hat Enterprise Linux 3, which periodically checked the status of the whole service.) Every 10 seconds, rgmanager scans the resource tree, looking for resources that have passed their "status check" interval.
Each resource agent specifies the amount of time between periodic status checks. Each resource utilizes these timeout values unless explicitly overridden in the cluster.conf file using the special <action> tag:
<action name="status" depth="*" interval="10" />
This tag is a special child of the resource itself in the cluster.conf file. For example, if you had a file system resource for which you wanted to override the status check interval you could specify the file system resource in the cluster.conf file as follows:

  <fs name="test" device="/dev/sdb3">
    <action name="status" depth="*" interval="10" />
    <nfsexport...>
    </nfsexport>
  </fs>
Some agents provide multiple "depths" of checking. For example, a normal file system status check (depth 0) checks whether the file system is mounted in the correct place. A more intensive check is depth 10, which checks whether you can read a file from the file system. A status check of depth 20 checks whether you can write to the file system. In the example given here, the depth is set to *, which indicates that these values should be used for all depths. The result is that the test file system is checked at the highest-defined depth provided by the resource-agent (in this case, 20) every 10 seconds.