- Fencing fails because `systemd-logind` handles the "power button" signal and initiates a graceful shutdown instead of power-cycling the system.
- When one node fences another, the fenced node processes a power-button press and starts to shut down. Meanwhile, the fence operation fails on the node that initiated it, seemingly because it takes too long.
- Do we need to disable ACPI / acpid in RHEL 7 clusters as we did in previous releases?
- Do I need to do anything in addition to disabling ACPI on RHEL 7 cluster nodes to keep them from shutting down softly? For example:
    Aug 13 21:07:22 node01 systemd-logind: Power key pressed.
    Aug 13 21:07:22 node01 systemd-logind: Powering Off...
    Aug 13 21:07:22 node01 systemd-logind: System is powering down.
    Aug 13 21:07:42 node02 stonith-ng: notice: log_operation: Operation 'reboot'  for device 'node01-ilo' returned: -62 (Timer expired)
- A cluster node gracefully rebooted instead of being hard killed:
    Nov 2 10:57:01 node41 stonith-ng: notice: Operation reboot of node42 by uxplpsgrd01 for firstname.lastname@example.org: OK
    Nov 2 10:57:01 node42 crmd: crit: We were allegedly just fenced by node41 for node42!
    Nov 2 10:57:01 node42 stonith-ng: notice: Operation reboot of node42 by node41 for email@example.com: OK
    Nov 2 10:57:01 node42 systemd-logind: Power key pressed.
- Red Hat Enterprise Linux (RHEL) 7 with the High Availability Add-On
- One or more `pacemaker` cluster nodes (or `pacemaker` remote nodes) associated with a `stonith` device that uses a power method that connects to a BMC or system-management controller such as an iLO, RSA, DRAC, or iDRAC.
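
To check collected logs for this symptom, a minimal Python sketch like the following could scan syslog-style lines for the `systemd-logind: Power key pressed` message shown in the excerpts above. The function name `hosts_with_power_key_events` and the regular expression are illustrative assumptions, not part of any Red Hat or Pacemaker tooling, and the pattern assumes the traditional syslog timestamp format seen in those excerpts.

```python
import re

# Assumed pattern: "<Mon> <day> <HH:MM:SS> <host> systemd-logind[: or [pid]:] Power key pressed"
# This mirrors the log excerpts above; other syslog formats would need a different pattern.
POWER_KEY = re.compile(
    r"^\w{3}\s+\d+\s+[\d:]+\s+(?P<host>\S+)\s+"
    r"systemd-logind(?:\[\d+\])?:\s+Power key pressed"
)


def hosts_with_power_key_events(lines):
    """Return a sorted list of hostnames whose log lines show
    systemd-logind handling a power-key press."""
    hosts = set()
    for line in lines:
        match = POWER_KEY.match(line.strip())
        if match:
            hosts.add(match.group("host"))
    return sorted(hosts)


if __name__ == "__main__":
    sample = [
        "Aug 13 21:07:22 node01 systemd-logind: Power key pressed.",
        "Aug 13 21:07:22 node01 systemd-logind: Powering Off...",
        "Aug 13 21:07:42 node02 stonith-ng: notice: log_operation: "
        "Operation 'reboot'  for device 'node01-ilo' returned: -62 (Timer expired)",
    ]
    print(hosts_with_power_key_events(sample))
```

A host appearing in the result processed the fence device's power action as a button press, which matches the graceful-shutdown behavior described in this article. In practice the lines would come from each node's system log rather than an inline sample.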