Support Policies for RHEL High Availability Clusters - sbd and fence_sbd

Overview

Applicable Environments

  • Red Hat Enterprise Linux (RHEL) with the High Availability Add-On

Useful References and Guides

Introduction

This guide offers Red Hat's policies and requirements around the usage of the fencing components sbd and fence_sbd for RHEL High Availability clusters. Users of RHEL High Availability clusters should adhere to these policies in order to be eligible for support from Red Hat with the appropriate product support subscriptions.

Policies

Available/Supported Releases: sbd is available and supported for usage in RHEL HA clusters under the following conditions:

  • RHEL 8: Supported by Red Hat
  • RHEL 7: Supported by Red Hat with RHEL 7.1 or later - using pacemaker-1.1.12-22.el7 or later and sbd-1.2.1-3 or later.
  • RHEL 6: Supported by Red Hat with RHEL 6.8 or later - using pacemaker-1.1.14-8.el6 or later and sbd-1.2.1-9.el6 or later.
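
The release minimums listed above can be expressed as a small lookup. The following Python sketch is illustrative only: the function name `sbd_release_supported` is hypothetical, and it checks only the RHEL release, not the pacemaker and sbd package versions that the RHEL 6 and RHEL 7 entries also require.

```python
def sbd_release_supported(major: int, minor: int) -> bool:
    """Return True if the RHEL release meets the minimum listed in this policy.

    Only the RHEL release is checked. The pacemaker and sbd package
    minimums for RHEL 6 and RHEL 7 must be verified separately.
    """
    # Minimum minor release per major version, taken from the list above.
    # RHEL 8 is listed without a minimum minor release, so 0 is used here.
    minimum_minor = {6: 8, 7: 1, 8: 0}
    if major not in minimum_minor:
        # Releases not named in this policy are outside the scope of the sketch.
        raise ValueError(f"RHEL {major} is not covered by this sketch")
    return minor >= minimum_minor[major]
```

For example, `sbd_release_supported(6, 7)` evaluates to `False` because RHEL 6 requires 6.8 or later.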

Storage compatibility for sbd poison-pill fencing via block-device: Red Hat's typical storage compatibility policies for RHEL High Availability apply to sbd, in that Red Hat does not certify sbd with specific storage solutions. It is up to customer organizations using sbd poison-pill fencing via block-device to ensure their storage solution is adequate for shared access across cluster members through all relevant failure scenarios the cluster may need to endure.


Storage device requirements for sbd poison-pill fencing via block-device:

  • A block device must be shared by all nodes of the cluster.
  • All nodes must have write access to the shared device.
  • The shared block device cannot host a file system or be used by any other component.
  • The shared block device cannot be managed by LVM or incorporated into an LVM volume group.
  • The shared block device should not be managed by any RAID, mirroring, or replication that is administered by the cluster members themselves. Any replication employed must be transparent to the hosts.
  • 4 Megabytes in size is sufficient for the shared block device.
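
Two of the requirements above (the device is a block device, and it is large enough) can be checked mechanically. The Python sketch below is a minimal illustration, not a Red Hat tool: the function names are hypothetical, and "4 Megabytes" is assumed to mean 4 MiB. It does not verify shared access from every node, the absence of a file system, or the absence of LVM, RAID, or host-managed replication; those still need to be confirmed separately.

```python
import os
import stat
from typing import List

# Assumption: "4 Megabytes" is interpreted as 4 MiB.
MIN_SBD_DEVICE_BYTES = 4 * 1024 * 1024


def basic_device_problems(mode: int, size_bytes: int) -> List[str]:
    """Return the problems found from a device's st_mode and size in bytes."""
    problems = []
    if not stat.S_ISBLK(mode):
        problems.append("not a block device")
    if size_bytes < MIN_SBD_DEVICE_BYTES:
        problems.append("smaller than 4 MiB")
    return problems


def inspect_device(path: str) -> List[str]:
    """Collect mode and size for `path` and run the basic checks.

    Requires read access to the device node.
    """
    mode = os.stat(path).st_mode
    fd = os.open(path, os.O_RDONLY)
    try:
        # Seeking to the end of a block device yields its size in bytes.
        size_bytes = os.lseek(fd, 0, os.SEEK_END)
    finally:
        os.close(fd)
    return basic_device_problems(mode, size_bytes)
```

An empty list from `inspect_device` only means these two basic checks passed; it does not establish that the device is suitable for sbd.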

Storage replication technologies with sbd poison-pill fencing via block-device: Red Hat does not test specific vendor storage replication technologies that may facilitate cluster-wide access to the block device used for sbd poison-pill fencing. While Red Hat does not prevent organizations from using such technologies, it cannot guarantee that sbd will function as expected with all of them. It is important for organizations using these technologies to thoroughly test sbd in conjunction with their chosen solutions, especially through all potential failure scenarios that may create a split between nodes or storage arrays. Storage vendors should be consulted for input on the capabilities of their solutions and how they may perform in connection with RHEL High Availability or sbd.


Support with pacemaker only: sbd is only available and supported for usage when pacemaker is in use. sbd is suitable for usage in RHEL 6 clusters with cman and pacemaker, as well as RHEL 7 and 8 pacemaker clusters. sbd is not supported for usage with RHEL 6 pure-cman clusters that do not use pacemaker.


No support for sbd on pacemaker_remote nodes: sbd is not supported on pacemaker remote nodes, whether baremetal or virtual-machine nodes that utilize pacemaker_remote. Prior to RHEL 8.5, a watchdog-only SBD configuration required all nodes in the cluster to use SBD. That prevented using SBD in a cluster where some nodes supported it but other nodes (often remote nodes) required some other form of fencing. On RHEL 8.5 or later, a cluster that has pacemaker remote nodes can utilize sbd on the cluster nodes (not the remote nodes).


Suitable watchdog timer (WDT) devices: Red Hat does not maintain a list of supported hardware watchdog timer devices that can be used with sbd. Red Hat considers a device to be suitable for usage with sbd if:

  • The device is not amongst those specifically highlighted in this policy guide as unsupported.
  • The device's driver implements the structures, functions, and functionality that are expected of a watchdog timer device as explained within the kernel documentation. Red Hat would consider this true for drivers shipped within a RHEL-supplied kernel, other than softdog.
  • The device is guaranteed to abruptly halt the system if the device is not updated within the timeout period specified to the watchdog device using the kernel's watchdog API.
  • The device can be demonstrated to carry out this halting action under the expected circumstances.

Support for virtualization-emulated watchdogs: Red Hat supports usage of sbd with a libvirt/KVM-provided hardware watchdog using emulated model i6300esb. Red Hat does not provide support or testing for use of any other emulated watchdog devices (e.g. VMware). However, if there is interest in another model or device on a different virtualization platform, please contact Red Hat Support to communicate this interest.

  • SBD is not supported on VMware because of the potential for data corruption if a watchdog does not fire. Under certain circumstances (live migration, disk IO, other software issues) the VMware watchdog may not fire. This can cause a split-brain situation in which a node incorrectly determines that the other nodes have been powered off and attempts to recover services, leading to data corruption or other undesirable behavior.

No Support for Software-Emulated Watchdog: The watchdog timer device used by sbd cannot be emulated by software, such as is done by the softdog driver. Such software operates within the limitations and available resources provided by the kernel, and thus cannot be guaranteed to carry out the necessary halting action if the operating system is malfunctioning or starved of resources.
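
As a first sanity check for this policy, the watchdog devices registered on a node can be listed and screened for softdog. The Python sketch below is a hedged illustration under two assumptions that should be verified on the system in question: that the kernel exposes an `identity` attribute under `/sys/class/watchdog/<device>/`, and that softdog reports the identity string "Software Watchdog". The function names are hypothetical. A device that passes this screen is not thereby shown to be suitable; the criteria in this guide still apply.

```python
import os


def read_watchdog_identities(sysfs_root: str = "/sys/class/watchdog") -> dict:
    """Map each watchdog device name to its identity string, where exposed."""
    identities = {}
    if not os.path.isdir(sysfs_root):
        return identities
    for name in sorted(os.listdir(sysfs_root)):
        identity_path = os.path.join(sysfs_root, name, "identity")
        try:
            with open(identity_path) as handle:
                identities[name] = handle.read().strip()
        except OSError:
            # Skip entries whose identity attribute is missing or unreadable.
            continue
    return identities


def looks_like_softdog(identity: str) -> bool:
    """Heuristic: softdog is assumed to report 'Software Watchdog'."""
    return "software watchdog" in identity.lower()
```

A possible use is `[name for name, ident in read_watchdog_identities().items() if looks_like_softdog(ident)]`, which lists devices that appear to be software-emulated.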


Investigations may require confirmation of a suitable device: Because Red Hat does not maintain a list of suitable devices for usage with sbd and thus may not have extensive knowledge of every device type, questions may arise in the course of investigations as to the suitability of the device in use. If unexpected behavior is observed with sbd, Red Hat may request and require demonstration of the functionality of the watchdog device in use before it can deeply investigate. In situations where the functionality of the device itself is questionable, Red Hat may require the concerning behavior from sbd be reproduced on known hardware, or may require investigation from the vendor of the device.


RHEL 6 with QDisk: RHEL 6 clusters utilizing QDisk are not compatible with or supported with sbd.


sbd compatibility with cloud platforms: Red Hat does not provide support for sbd or fence_sbd on any of the various cloud platforms where RHEL High Availability is available.

As noted in other policies above, sbd is only supported with a suitable hardware watchdog timer - which none of the supported cloud platforms offer at this time - and sbd is not supported with softdog.

The sole exception is SAP HANA on Azure (Large Instances). These are not cloud VMs but rather bare-metal systems managed through the Azure interface. For sbd to be supported on this platform, an IPMI watchdog must be configured and enabled on each system as documented in the Configure Watchdog section of the Microsoft documentation. See also:
