Fencing fails in RHEL 7, 8, and 9 High Availability clusters because systemd initiates a graceful shutdown

Solution In Progress - Updated -

Issue

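  • Fencing fails because systemd-logind handles the "power button" signal and starts a graceful shutdown instead of an immediate power-off.
  • When one node fences another, the fenced node handles the power-button press and begins shutting down. Meanwhile, fencing fails on the other node, apparently because the operation takes too long.
  • Do we need to disable acpi / acpid in a RHEL 7 cluster, as in previous releases?
  • Besides disabling ACPI on RHEL 7 cluster nodes, is anything else required to avoid a soft shutdown? For example: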
Aug 13 21:07:22 node01 systemd-logind: Power key pressed. 
Aug 13 21:07:22 node01 systemd-logind: Powering Off...
Aug 13 21:07:22 node01 systemd-logind: System is powering down.
Aug 13 21:07:42 node02 stonith-ng[2803]: notice: log_operation: Operation 'reboot' [3114] for device 'node01-ilo' returned: -62 (Timer expired)
  • On RHEL 7, a cluster node is rebooted gracefully instead of being hard-killed:
Nov  2 10:57:01 node41 stonith-ng[8161]:  notice: Operation reboot of node42 by node42 for crmd.20238@uxplpsgrd03.8b66209c: OK
Nov 2 10:57:01 node42 crmd[20238]: crit: We were allegedly just fenced by node41 for node42!
Nov 2 10:57:01 node42 stonith-ng[20234]: notice: Operation reboot of node42 by node41 for crmd.20238@node42.8b66209c: OK
Nov 2 10:57:01 node42 systemd-logind: Power key pressed.
  • On RHEL 8, a cluster node is rebooted gracefully instead of being hard-killed:
Sep 18 16:19:11  rhel8-1 stonith-ng[8161]:  notice: Operation reboot of rhel8-1 by rhel8-2 for crmd.20238@uxplpsgrd03.8b66209c: OK
Sep 18 16:19:11 rhel8-1 crmd[20238]: crit: We were allegedly just fenced by rhel8-1 for rhel8-2!
Sep 18 16:19:11 rhel8-1 systemd-logind[792]: Session 1 logged out. Waiting for processes to exit.
Sep 18 16:19:11 rhel8-1 systemd-logind[792]: Removed session 1.
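
The symptoms in the log excerpts above can be checked for mechanically. The following is a minimal sketch, not an official Red Hat tool: it scans syslog-style lines for a systemd-logind "Power key pressed" message and for a stonith-ng "Timer expired" result. The function name find_graceful_shutdown_symptoms and the assumed line layout (month, day, time, host, process) are illustrative assumptions based only on the excerpts shown here.

```python
import re

# Assumed syslog layout, inferred from the excerpts above:
#   <Mon> <day> <HH:MM:SS> <host> <process>[<pid>]: <message>
LINE_RE = re.compile(
    r"^(?P<month>\w{3})\s+(?P<day>\d{1,2})\s+(?P<time>\d{2}:\d{2}:\d{2})\s+"
    r"(?P<host>\S+)\s+(?P<proc>[\w-]+)(?:\[\d+\])?:\s*(?P<msg>.*)$"
)


def find_graceful_shutdown_symptoms(lines):
    """Return (power_key_hosts, fence_timeouts) found in the given log lines.

    power_key_hosts: set of hosts where systemd-logind logged a power key press.
    fence_timeouts:  list of (reporting_host, message) for stonith-ng timeouts.
    """
    power_key_hosts = set()
    fence_timeouts = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip lines that do not follow the assumed layout
        host, proc, msg = match["host"], match["proc"], match["msg"]
        if proc == "systemd-logind" and "Power key pressed" in msg:
            power_key_hosts.add(host)
        elif proc == "stonith-ng" and "Timer expired" in msg:
            fence_timeouts.append((host, msg))
    return power_key_hosts, fence_timeouts


if __name__ == "__main__":
    # Sample lines copied from the first excerpt in this article.
    sample = [
        "Aug 13 21:07:22 node01 systemd-logind: Power key pressed. ",
        "Aug 13 21:07:22 node01 systemd-logind: Powering Off...",
        "Aug 13 21:07:42 node02 stonith-ng[2803]: notice: log_operation: "
        "Operation 'reboot' [3114] for device 'node01-ilo' returned: -62 (Timer expired)",
    ]
    hosts, timeouts = find_graceful_shutdown_symptoms(sample)
    print("Power key handled on:", sorted(hosts))
    print("Fence timeouts reported by:", [h for h, _ in timeouts])
```

A host that appears in the first result shortly before another node reports a fence timeout matches the pattern described in this Issue section. The check only matches the message strings shown above and would need adjusting for other log formats.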

Environment

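  • Red Hat Enterprise Linux (RHEL) 7, 8, and 9 with the High Availability Add-On
  • One or more pacemaker cluster nodes (or pacemaker remote nodes) associated with a stonith device that uses a power-based method to connect to a BMC or system management controller (such as iLO, RSA, DRAC, or iDRAC).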
