Chapter 25. Clustering

Pacemaker correctly implements fencing and unfencing for Pacemaker remote nodes

Previously, Pacemaker did not implement unfencing for Pacemaker remote nodes. As a consequence, Pacemaker remote nodes remained fenced even if a fence device required unfencing. With this update, Pacemaker correctly implements both fencing and unfencing for Pacemaker remote nodes, and the described problem no longer occurs. (BZ#1394418)

Pacemaker now probes guest nodes

Important update for users of guest nodes.
Pacemaker now probes guest nodes, which are Pacemaker remote nodes created using the remote-node parameter of a resource such as VirtualDomain. If users were previously relying on the fact that probes were not done, the probes may fail, potentially causing fencing of the guest node. If a guest node cannot run a probe of a resource (for example, if the software is not even installed on the guest), then the location constraint banning the resource from the guest node should have the resource-discovery option set to never, the same as would be required with a cluster node or remote node in the same situation. (BZ#1489728)

The pcs resource cleanup command no longer generates unnecessary cluster load

The pcs resource cleanup command cleans the records of failed resource operations that have been resolved. Previously, the command probed all resources on all nodes, generating an unnecessary load on cluster operation. With this fix, the command probes only the resources for which a resource operation failed. The previous functionality of the pcs resource cleanup command has been replaced by the new pcs resource refresh command, which probes all resources on all nodes. For information on cluster resource cleanup, see https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html-single/high_availability_add-on_reference/#s1-resource_cleanup-HAAR. (BZ#1508351)

Warning generated when user specifies action attribute for stonith device

Previously, it was possible for a user to set an action attribute for a stonith device, even though this option is deprecated and is not recommended as it can cause unexpected fencing. The following fixes have been implemented:
  • When a user tries to set an action option of a stonith device with the CLI, this generates a warning message along with the instructions to use the --force flag to set this attribute.
  • The pcsd Web UI now displays a warning message next to action option field.
  • The output of the pcs status command displays a warning when a stonith device has the action option set. (BZ#1421702)

It is now possible to enable stonith agent debugging without specifying the --force flag

Previously, attempting to enable debugging of a stonith agent by setting the debug or verbose parameters required that the user specify the --force flag. With this fix, using the --force flag is no longer necessary. (BZ#1432283)

The fence_ilo3 resource agent no longer has a default value of cycle for the action parameter

Previously, the fence_ilo3 resource agent had a default value of cycle for the action parameter. This value is unsupported, as it may cause data corruption. The default value for this parameter is now onoff. Additionally, a warning is now displayed in the output of the pcs status command and the web UI if a stonith device has its method option set to cycle. (BZ#1519370, BZ#1523378)

Pacemaker no longer starts up when sbd is enabled but not started successfully by systemd

Previously, if sbd did not start properly, systemd would still start Pacemaker. This would lead to sbd poison pill triggered reboots not being performed without this being detected by fence_sbd and, in the case of quorum-based watchdog fencing, the nodes losing quorum would not self-fence either. With this fix, if sbd does not come up properly Pacemaker is not started. This should prevent all sources of data curruption due to sbd not coming up. (BZ#1525981)

A fenced node in an ‘sbd’ setup now shuts down reliably

Previously, when a node received an ‘off’ via the poison pill mechanism used by ‘sbd’ on a shared disk, the node would be likely to reboot instead of powering off. With this fix, receiving an ‘off’ will power off the node. Receiving a ‘reset’ will reboot the node. If the node is not able to perform the software-driven reboot or power off properly, the watchdog is going to trigger and the action performed is what the watchdog device is configured to. A fenced node in an ‘sbd’ setup now shuts down reliably if the watchdog device is configured to power off the node, and fencing is requesting ‘off’ via the poison pill mechanism on a shared disk. (BZ#1468580)