11.2. Event Notification with Monitoring Resources

The ocf:pacemaker:ClusterMon resource can monitor the cluster status and trigger alerts on each cluster event. This resource runs the crm_mon command in the background at regular intervals.
By default, the crm_mon command listens for resource events only; to enable listening for fencing events, you can provide the --watch-fencing option to the command when you configure the ClusterMon resource. The crm_mon command does not monitor for membership issues, but it prints a message when fencing is started and when monitoring is started for a node, which implies that a member has just joined the cluster.
The ClusterMon resource can execute an external program to determine what to do with cluster notifications by means of the extra_options parameter. Table 11.3, “Environment Variables Passed to the External Monitor Program” lists the environment variables that are passed to that program, which describe the type of cluster event that occurred.

Table 11.3. Environment Variables Passed to the External Monitor Program

Environment Variable     Description
CRM_notify_recipient     The static external-recipient from the resource definition
CRM_notify_node          The node on which the status change happened
CRM_notify_rsc           The name of the resource that changed the status
CRM_notify_task          The operation that caused the status change
CRM_notify_desc          The textual output relevant to the error code of the operation (if any) that caused the status change
CRM_notify_rc            The return code of the operation
CRM_notify_target_rc     The expected return code of the operation
CRM_notify_status        The numerical representation of the status of the operation
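As an illustration of how an external program might use these variables, the following sketch reports an event only when the actual return code differs from the expected one. The function names are hypothetical, and the sketch assumes that the expected return code is passed as CRM_notify_target_rc, the name used by the crm_logger.sh script in this section. Fencing notifications may populate these variables differently, so treat this as a starting point rather than a complete handler.

```shell
#!/bin/sh
# Hypothetical handler sketch: summarize only operations whose return code
# differs from the expected return code.

# format_event prints a one-line summary built from the CRM_notify_* variables.
# Missing variables fall back to placeholder text.
format_event() {
    printf '%s: %s on %s rc=%s expected=%s (%s)\n' \
        "${CRM_notify_rsc:-unknown}" \
        "${CRM_notify_task:-unknown}" \
        "${CRM_notify_node:-unknown}" \
        "${CRM_notify_rc:-?}" \
        "${CRM_notify_target_rc:-?}" \
        "${CRM_notify_desc:-no description}"
}

# is_unexpected succeeds when the actual and expected return codes differ.
is_unexpected() {
    [ "${CRM_notify_rc:-0}" != "${CRM_notify_target_rc:-0}" ]
}

# handle_event prints a summary for unexpected results and nothing otherwise.
handle_event() {
    if is_unexpected; then
        format_event
    fi
}
```

A handler built this way could end with a line such as `handle_event | logger -t "ClusterMon-External"`, so that only unexpected results reach the system log.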
The example in this section configures a ClusterMon resource that executes the external program crm_logger.sh, which logs the event notifications specified in the program.
The following procedure creates the crm_logger.sh program that this resource will use.
  1. On one node of the cluster, create the program that will log the event notifications.
    # cat <<-'END' >/usr/local/bin/crm_logger.sh
    #!/bin/sh
    logger -t "ClusterMon-External" "${CRM_notify_node} ${CRM_notify_rsc} \
    ${CRM_notify_task} ${CRM_notify_desc} ${CRM_notify_rc} \
    ${CRM_notify_target_rc} ${CRM_notify_status} ${CRM_notify_recipient}";
    exit;
    END
  2. Set the ownership and permissions for the program.
    # chmod 700 /usr/local/bin/crm_logger.sh
    # chown root:root /usr/local/bin/crm_logger.sh
  3. Use the scp command to copy the crm_logger.sh program to the other nodes of the cluster, putting the program in the same location on those nodes and setting the same ownership and permissions for the program.
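The copy step can be scripted. The following is a minimal sketch, assuming root SSH access between the nodes; the function name distribute_script and the DRY_RUN switch are hypothetical, and the node names you pass in are your own. By default the function only prints the commands it would run, so that you can review them first.

```shell
#!/bin/sh
# Sketch: copy the handler to the remaining nodes, then reapply ownership
# and permissions there. Node names are supplied by the caller.

SCRIPT=/usr/local/bin/crm_logger.sh

# distribute_script NODE...
# With DRY_RUN=1 (the default in this sketch) the commands are only printed.
# Set DRY_RUN=0 to run scp and ssh for real.
distribute_script() {
    for node in "$@"; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "scp -p $SCRIPT root@$node:$SCRIPT"
            echo "ssh root@$node 'chown root:root $SCRIPT && chmod 700 $SCRIPT'"
        else
            scp -p "$SCRIPT" "root@$node:$SCRIPT" &&
            ssh "root@$node" "chown root:root $SCRIPT && chmod 700 $SCRIPT"
        fi
    done
}
```

For example, `distribute_script rh6node2pcmk.examplerh.com` prints the two commands for the second node used in this section.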
The following example configures a ClusterMon resource named ClusterMon-External that runs the program /usr/local/bin/crm_logger.sh. The ClusterMon resource writes the cluster status to an HTML file, which is /var/www/html/cluster_mon.html in this example. The pidfile parameter names the file used to detect whether ClusterMon is already running; in this example that file is /var/run/crm_mon-external.pid. This resource is created as a clone so that it runs on every node in the cluster. The --watch-fencing option is specified to enable monitoring of fencing events in addition to resource events, including the start, stop, and monitor operations of the fencing resource.
# pcs resource create ClusterMon-External ClusterMon user=root \
update=10 extra_options="-E /usr/local/bin/crm_logger.sh --watch-fencing" \
htmlfile=/var/www/html/cluster_mon.html \
pidfile=/var/run/crm_mon-external.pid clone

Note

The resource executes a crm_mon command similar to the following, which you can also run manually. This manual example uses its own pidfile and HTML file and a 5-second update interval:
# /usr/sbin/crm_mon -p /var/run/crm_mon-manual.pid -d -i 5 \
-h /var/www/html/crm_mon-manual.html -E "/usr/local/bin/crm_logger.sh" \
--watch-fencing
The following example shows the format of the monitoring notifications that this configuration logs.
Aug  7 11:31:32 rh6node1pcmk ClusterMon-External: rh6node2pcmk.examplerh.com ClusterIP st_notify_fence Operation st_notify_fence requested by rh6node1pcmk.examplerh.com for peer rh6node2pcmk.examplerh.com: OK (ref=b206b618-e532-42a5-92eb-44d363ac848e) 0 0 0 #177
Aug  7 11:31:32 rh6node1pcmk ClusterMon-External: rh6node1pcmk.examplerh.com ClusterIP start OK 0 0 0
Aug  7 11:31:32 rh6node1pcmk ClusterMon-External: rh6node1pcmk.examplerh.com ClusterIP monitor OK 0 0 0
Aug  7 11:33:59 rh6node1pcmk ClusterMon-External: rh6node1pcmk.examplerh.com fence_xvms monitor OK 0 0 0
Aug  7 11:33:59 rh6node1pcmk ClusterMon-External: rh6node1pcmk.examplerh.com ClusterIP monitor OK 0 0 0
Aug  7 11:33:59 rh6node1pcmk ClusterMon-External: rh6node1pcmk.examplerh.com ClusterMon-External start OK 0 0 0
Aug  7 11:33:59 rh6node1pcmk ClusterMon-External: rh6node1pcmk.examplerh.com fence_xvms start OK 0 0 0
Aug  7 11:33:59 rh6node1pcmk ClusterMon-External: rh6node1pcmk.examplerh.com ClusterIP start OK 0 0 0
Aug  7 11:33:59 rh6node1pcmk ClusterMon-External: rh6node1pcmk.examplerh.com ClusterMon-External monitor OK 0 0 0
Aug  7 11:34:00 rh6node1pcmk crmd[2887]:   notice: te_rsc_command: Initiating action 8: monitor ClusterMon-External:1_monitor_0 on rh6node2pcmk.examplerh.com
Aug  7 11:34:00 rh6node1pcmk crmd[2887]:   notice: te_rsc_command: Initiating action 16: start ClusterMon-External:1_start_0 on rh6node2pcmk.examplerh.com
Aug  7 11:34:00 rh6node1pcmk ClusterMon-External: rh6node1pcmk.examplerh.com ClusterIP stop OK 0 0 0
Aug  7 11:34:00 rh6node1pcmk crmd[2887]:   notice: te_rsc_command: Initiating action 15: monitor ClusterMon-External_monitor_10000 on rh6node2pcmk.examplerh.com
Aug  7 11:34:00 rh6node1pcmk ClusterMon-External: rh6node2pcmk.examplerh.com ClusterMon-External start OK 0 0 0
Aug  7 11:34:00 rh6node1pcmk ClusterMon-External: rh6node2pcmk.examplerh.com ClusterMon-External monitor OK 0 0 0
Aug  7 11:34:00 rh6node1pcmk ClusterMon-External: rh6node2pcmk.examplerh.com ClusterIP start OK 0 0 0
Aug  7 11:34:00 rh6node1pcmk ClusterMon-External: rh6node2pcmk.examplerh.com ClusterIP monitor OK 0 0 0
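To get an overview of a log like this, the notification lines can be counted per resource and operation. The following sketch assumes the syslog layout shown above, in which the fifth whitespace-separated field is the ClusterMon-External: tag and the seventh and eighth fields are the resource and the operation. The function name summarize_events is hypothetical, and the log file location depends on your syslog configuration.

```shell
#!/bin/sh
# Sketch: count ClusterMon-External notifications per resource and operation.

# summarize_events [FILE...]
# Reads the named log files (or standard input) and prints lines of the form
# "<count> <resource> <operation>", sorted by resource and operation.
# Lines from other daemons, such as crmd, are ignored.
summarize_events() {
    awk '$5 == "ClusterMon-External:" { count[$7 " " $8]++ }
         END { for (key in count) print count[key], key }' "$@" | sort -k2
}
```

For example, `summarize_events /var/log/messages` would summarize the notifications recorded on a node that logs to that file.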