Appendix A. OCF Return Codes

This appendix describes the OCF return codes and how they are interpreted by Pacemaker.
The first thing the cluster does when an agent returns a code is to check the return code against the expected result. If the result does not match the expected value, then the operation is considered to have failed, and recovery action is initiated.
For any invocation, resource agents must exit with a defined return code that informs the caller of the outcome of the invoked action.
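As a rough illustration of that contract, the sketch below shows how an agent written in Python might translate each invoked action into an exit code. The marker file, the `run_action` helper, and the small set of handled actions are assumptions made for this example only; the numeric values are the ones listed in Table A.2.

```python
import os
import tempfile

# Numeric values taken from Table A.2.
OCF_SUCCESS = 0
OCF_ERR_UNIMPLEMENTED = 3
OCF_NOT_RUNNING = 7

# Hypothetical marker file standing in for a real service's state.
STATE_FILE = os.path.join(tempfile.gettempdir(), "example-agent.state")


def run_action(action: str, state_file: str = STATE_FILE) -> int:
    """Perform one action and return the code the agent should exit with."""
    if action == "start":
        with open(state_file, "w") as handle:
            handle.write("running\n")
        return OCF_SUCCESS
    if action == "stop":
        # Stopping an already stopped resource still counts as success.
        if os.path.exists(state_file):
            os.remove(state_file)
        return OCF_SUCCESS
    if action == "monitor":
        return OCF_SUCCESS if os.path.exists(state_file) else OCF_NOT_RUNNING
    # Any action this sketch does not handle is reported as unimplemented.
    return OCF_ERR_UNIMPLEMENTED
```

A script built this way could end with `sys.exit(run_action(sys.argv[1]))` so that the caller receives the code as the process exit status.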
There are three types of failure recovery, as described in Table A.1, “Types of Recovery Performed by the Cluster”.

Table A.1. Types of Recovery Performed by the Cluster

| Type | Description | Action Taken by the Cluster |
|------|-------------|-----------------------------|
| soft | A transient error occurred. | Restart the resource or move it to a new location. |
| hard | A non-transient error that may be specific to the current node occurred. | Move the resource elsewhere and prevent it from being retried on the current node. |
| fatal | A non-transient error that will be common to all cluster nodes occurred (for example, a bad configuration was specified). | Stop the resource and prevent it from being started on any cluster node. |
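One way to read Table A.1 is as a rule for which nodes remain candidates for the resource after a failure. The sketch below is a simplified model of that rule, not Pacemaker's actual placement logic; the `Recovery` enum, the `eligible_nodes` function, and the node names are hypothetical.

```python
from enum import Enum
from typing import List


class Recovery(Enum):
    SOFT = "soft"
    HARD = "hard"
    FATAL = "fatal"


def eligible_nodes(recovery: Recovery, current: str, nodes: List[str]) -> List[str]:
    """Return the nodes on which the resource may still be started."""
    if recovery is Recovery.SOFT:
        # Restart in place or move: every node remains a candidate.
        return list(nodes)
    if recovery is Recovery.HARD:
        # Do not retry on the node where the failure occurred.
        return [node for node in nodes if node != current]
    # FATAL: the resource is stopped and not started on any node.
    return []


cluster = ["node1", "node2", "node3"]
print(eligible_nodes(Recovery.HARD, "node1", cluster))  # ['node2', 'node3']
```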
Table A.2, “OCF Return Codes” lists the OCF return codes and the type of recovery the cluster initiates when a failure code is received. Note that even an action that returns 0 (OCF alias OCF_SUCCESS) can be considered to have failed if 0 was not the expected return value.
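The point about 0 being a possible failure can be made concrete with a small comparison. In this sketch the `operation_failed` helper is hypothetical, and the scenario (a caller that expected the resource to report master mode) is an assumed example; the numeric values come from Table A.2.

```python
OCF_SUCCESS = 0
OCF_RUNNING_MASTER = 8


def operation_failed(actual: int, expected: int) -> bool:
    """An operation has failed whenever its return code differs from the expected one."""
    return actual != expected


# Hypothetical case: the caller expected the resource to report master mode.
print(operation_failed(OCF_SUCCESS, expected=OCF_RUNNING_MASTER))  # True
# The same return code is not a failure when it was the expected value.
print(operation_failed(OCF_SUCCESS, expected=OCF_SUCCESS))  # False
```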

Table A.2. OCF Return Codes

| Return Code | OCF Label | Description |
|-------------|-----------|-------------|
| 0 | OCF_SUCCESS | The action completed successfully. This is the expected return code for any successful start, stop, promote, and demote command. Type if unexpected: soft |
| 1 | OCF_ERR_GENERIC | The action returned a generic error. Type: soft. The resource manager will attempt to recover the resource or move it to a new location. |
| 2 | OCF_ERR_ARGS | The resource’s configuration is not valid on this machine. For example, it refers to a location not found on the node. Type: hard. The resource manager will move the resource elsewhere and prevent it from being retried on the current node. |
| 3 | OCF_ERR_UNIMPLEMENTED | The requested action is not implemented. Type: hard |
| 4 | OCF_ERR_PERM | The resource agent does not have sufficient privileges to complete the task. This may be due, for example, to the agent not being able to open a certain file, to listen on a specific socket, or to write to a directory. Type: hard. Unless specifically configured otherwise, the resource manager will attempt to recover a resource that failed with this error by restarting the resource on a different node (where the permission problem may not exist). |
| 5 | OCF_ERR_INSTALLED | A required component is missing on the node where the action was executed. This may be due to a required binary not being executable, or a vital configuration file being unreadable. Type: hard. Unless specifically configured otherwise, the resource manager will attempt to recover a resource that failed with this error by restarting the resource on a different node (where the required files or binaries may be present). |
| 6 | OCF_ERR_CONFIGURED | The resource’s configuration on the local node is invalid. Type: fatal. When this code is returned, Pacemaker will prevent the resource from running on any node in the cluster, even if the service configuration is valid on some other node. |
| 7 | OCF_NOT_RUNNING | The resource is safely stopped. This implies that the resource has either gracefully shut down, or has never been started. Type if unexpected: soft. The cluster will not attempt to stop a resource that returns this for any action. |
| 8 | OCF_RUNNING_MASTER | The resource is running in master mode. Type if unexpected: soft |
| 9 | OCF_FAILED_MASTER | The resource is in master mode but has failed. Type: soft. The resource will be demoted, stopped, and then started (and possibly promoted) again. |
| other | N/A | Custom error code. |
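For quick reference, the table can be encoded as a lookup from return code to recovery type. The following is a sketch under stated assumptions: the `OcfCode` enum, the `RECOVERY_TYPE` mapping, and the `recovery_for` function are illustrative names, not part of Pacemaker. Because the table lists no recovery type for custom codes, the sketch reports them as "unspecified".

```python
from enum import IntEnum
from typing import Optional


class OcfCode(IntEnum):
    OCF_SUCCESS = 0
    OCF_ERR_GENERIC = 1
    OCF_ERR_ARGS = 2
    OCF_ERR_UNIMPLEMENTED = 3
    OCF_ERR_PERM = 4
    OCF_ERR_INSTALLED = 5
    OCF_ERR_CONFIGURED = 6
    OCF_NOT_RUNNING = 7
    OCF_RUNNING_MASTER = 8
    OCF_FAILED_MASTER = 9


# Recovery type per code, as listed in Table A.2. For codes 0, 7, and 8
# the type applies only when the code was not the expected result.
RECOVERY_TYPE = {
    OcfCode.OCF_SUCCESS: "soft",
    OcfCode.OCF_ERR_GENERIC: "soft",
    OcfCode.OCF_ERR_ARGS: "hard",
    OcfCode.OCF_ERR_UNIMPLEMENTED: "hard",
    OcfCode.OCF_ERR_PERM: "hard",
    OcfCode.OCF_ERR_INSTALLED: "hard",
    OcfCode.OCF_ERR_CONFIGURED: "fatal",
    OcfCode.OCF_NOT_RUNNING: "soft",
    OcfCode.OCF_RUNNING_MASTER: "soft",
    OcfCode.OCF_FAILED_MASTER: "soft",
}


def recovery_for(actual: int, expected: int) -> Optional[str]:
    """Return the recovery type for a result, or None if it matched expectations."""
    if actual == expected:
        return None
    # Assumption: custom codes have no recovery type in the table.
    return RECOVERY_TYPE.get(actual, "unspecified")


print(recovery_for(OcfCode.OCF_ERR_CONFIGURED, expected=OcfCode.OCF_SUCCESS))  # fatal
```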