Fencing fails in a RHEL 7, 8, 9 High Availability cluster because systemd initiates a graceful shutdown

Solution Verified - Updated 2024-08-02T06:49:40+00:00

Issue

Fencing fails because systemd-logind handles the "power button" signal and initiates a graceful shutdown instead of power-cycling the system.
When one node fences another, the fenced node processes a power-button press and starts to shut down gracefully. Meanwhile, the fence operation fails on the fencing node, apparently because it takes too long.
Do we need to disable acpi / acpid in RHEL 7 clusters as we did in previous releases?
Do I need to do anything in addition to disabling ACPI on RHEL 7 cluster nodes to keep them from shutting down gracefully when fenced? For example:

Aug 13 21:07:22 node01 systemd-logind: Power key pressed. 
Aug 13 21:07:22 node01 systemd-logind: Powering Off...
Aug 13 21:07:22 node01 systemd-logind: System is powering down.
Aug 13 21:07:42 node02 stonith-ng[2803]: notice: log_operation: Operation 'reboot' [3114] for device 'node01-ilo' returned: -62 (Timer expired)

A cluster node gracefully rebooted instead of being hard killed on RHEL 7:

Nov  2 10:57:01 node41 stonith-ng[8161]:  notice: Operation reboot of node42 by node42 for crmd.20238@uxplpsgrd03.8b66209c: OK
Nov  2 10:57:01 node42 crmd[20238]:    crit: We were allegedly just fenced by node41 for node42!
Nov  2 10:57:01 node42 stonith-ng[20234]:  notice: Operation reboot of node42 by node41 for crmd.20238@node42.8b66209c: OK
Nov  2 10:57:01 node42 systemd-logind: Power key pressed.

A cluster node gracefully rebooted instead of being hard killed on RHEL 8:

Sep 18 16:19:11  rhel8-1 stonith-ng[8161]:  notice: Operation reboot of rhel8-1 by rhel8-2 for crmd.20238@uxplpsgrd03.8b66209c: OK
Sep 18 16:19:11  rhel8-1 crmd[20238]:    crit: We were allegedly just fenced by rhel8-1 for rhel8-2!
Sep 18 16:19:11 rhel8-1 systemd-logind[792]: Session 1 logged out. Waiting for processes to exit.
Sep 18 16:19:11 rhel8-1 systemd-logind[792]: Removed session 1.
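The symptom described above can be recognized by searching collected logs for the two messages shown in the first excerpt: systemd-logind reporting a power-key press on the fenced node, and stonith-ng reporting a timed-out fence operation on the fencing node. The following is a minimal Python sketch of such a check, not a supported tool. The function name classify, the regular expressions, and the SAMPLE list are illustrative assumptions derived only from the log lines quoted in this article; message formats may differ between releases.

```python
import re

# Patterns derived from the log excerpts in this article (assumption:
# other releases may word these messages differently).
POWER_KEY = re.compile(r"systemd-logind(?:\[\d+\])?: Power key pressed")
FENCE_TIMEOUT = re.compile(
    r"stonith-ng\[\d+\]:\s+notice: .*Operation '(?P<action>\w+)' \[\d+\] "
    r"for device '(?P<device>[^']+)' returned: -62 \(Timer expired\)"
)


def classify(lines):
    """Scan syslog-style lines and collect the two symptom messages.

    Returns a tuple (power_key_lines, fence_timeouts), where
    fence_timeouts is a list of (action, device) pairs.
    """
    power_key, timeouts = [], []
    for line in lines:
        if POWER_KEY.search(line):
            power_key.append(line.rstrip())
        match = FENCE_TIMEOUT.search(line)
        if match:
            timeouts.append((match.group("action"), match.group("device")))
    return power_key, timeouts


# Sample input copied from the first excerpt in the Issue section.
SAMPLE = [
    "Aug 13 21:07:22 node01 systemd-logind: Power key pressed. ",
    "Aug 13 21:07:22 node01 systemd-logind: Powering Off...",
    "Aug 13 21:07:42 node02 stonith-ng[2803]: notice: log_operation: "
    "Operation 'reboot' [3114] for device 'node01-ilo' returned: -62 (Timer expired)",
]

if __name__ == "__main__":
    pressed, timed_out = classify(SAMPLE)
    if pressed and timed_out:
        print("Power-key press and fence timeout both present:")
        for entry in pressed:
            print("  logind:", entry)
        for action, device in timed_out:
            print("  fence :", action, "via", device, "timed out")
```

Finding both messages close together in time is consistent with the behavior described in this article, but it is only an indicator; the sketch does not compare timestamps or confirm which node initiated the fence action.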

Environment

Red Hat Enterprise Linux (RHEL) 7, 8, 9 with the High Availability Add-On
One or more pacemaker cluster nodes (or pacemaker_remote nodes) associated with a stonith device that uses a power method which connects to a BMC or system-management controller such as an iLO, RSA, DRAC, iDRAC, etc.
