fence_aws fence action fails with "Timed out waiting to power OFF" and then "Unable to obtain correct plug status or plug is not available" when a node is panicked in a High Availability cluster

Solution Verified - Updated 2025-11-19T14:47:10+00:00

Issue

When an AWS Pacemaker cluster node experiences a kernel panic or is crashed by running echo c > /proc/sysrq-trigger, the fence action against it fails with "Timed out waiting to power OFF". When the action is retried, it fails repeatedly with "Unable to obtain correct plug status or plug is not available".
Fencing the node manually with pcs stonith fence succeeds.
The issue may be intermittent.

Apr  2 22:18:04 ip-10-0-0-17 corosync[14259]: [TOTEM ] A processor failed, forming new configuration.
Apr  2 22:18:05 ip-10-0-0-17 corosync[14259]: [TOTEM ] A new membership (10.0.0.17:177) was formed. Members left: 2
Apr  2 22:18:05 ip-10-0-0-17 corosync[14259]: [TOTEM ] Failed to receive the leave message. failed: 2
Apr  2 22:18:05 ip-10-0-0-17 corosync[14259]: [CPG   ] downlist left_list: 1 received
Apr  2 22:18:05 ip-10-0-0-17 corosync[14259]: [QUORUM] Members[1]: 1
Apr  2 22:18:05 ip-10-0-0-17 corosync[14259]: [MAIN  ] Completed service synchronization, ready to provide service.
Apr  2 22:18:05 ip-10-0-0-17 pacemakerd[14283]:  notice: Node node2 state is now lost
...
Apr  2 22:18:06 ip-10-0-0-17 crmd[14289]:  notice: Requesting fencing (reboot) of node node2
...
Apr  2 22:19:11 ip-10-0-0-17 fence_aws: Failed: Timed out waiting to power OFF
...
Apr  2 22:19:11 ip-10-0-0-17 stonith-ng[14285]:   error: Operation 'reboot' [18866] (call 42 from crmd.14289) for host 'node2' with device 'aws_fence' returned: -62 (Timer expired)
...
Apr  2 22:19:11 ip-10-0-0-17 crmd[14289]:  notice: Peer node2 was not terminated (reboot) by node1 on behalf of crmd.14289: Timer expired
...
Apr  2 22:19:11 ip-10-0-0-17 pengine[14288]: warning: Cluster node node2 will be fenced: peer is no longer part of the cluster
...
Apr  2 22:19:11 ip-10-0-0-17 crmd[14289]:  notice: Requesting fencing (reboot) of node node2
...
Apr  2 22:19:12 ip-10-0-0-17 fence_aws: Failed: Unable to obtain correct plug status or plug is not available
...
Apr  2 22:19:14 ip-10-0-0-17 fence_aws: Failed: Unable to obtain correct plug status or plug is not available
...
Apr  2 22:19:14 ip-10-0-0-17 stonith-ng[14285]:   error: Operation 'reboot' [19126] (call 43 from crmd.14289) for host 'node2' with device 'aws_fence' returned: -201 (Generic Pacemaker error)
Apr  2 22:19:14 ip-10-0-0-17 stonith-ng[14285]:  notice: Couldn't find anyone to fence (reboot) node2 with any device
Apr  2 22:19:14 ip-10-0-0-17 stonith-ng[14285]:   error: Operation reboot of node2 by <no-one> for crmd.14289@node1.5abdec11: No route to host
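
To check whether logs from another cluster show the same sequence, the excerpt above can be reduced to a simple pattern: a fence_aws power-off timeout followed later by a fence_aws plug status failure. The following is a minimal Python sketch of that check, not a supported tool. The function name matches_issue and the sample list are hypothetical; the two message strings are copied from the excerpt.

```python
# Messages copied verbatim from the log excerpt above.
TIMEOUT_MSG = "Failed: Timed out waiting to power OFF"
PLUG_MSG = "Failed: Unable to obtain correct plug status or plug is not available"


def matches_issue(lines):
    """Return True if a fence_aws power-off timeout is later followed
    by a fence_aws plug status failure (hypothetical helper)."""
    seen_timeout = False
    for line in lines:
        # Only messages logged by the fence agent itself are considered.
        if "fence_aws" not in line:
            continue
        if TIMEOUT_MSG in line:
            seen_timeout = True
        elif seen_timeout and PLUG_MSG in line:
            return True
    return False


# Sample lines taken from the excerpt above.
sample = [
    "Apr  2 22:18:06 ip-10-0-0-17 crmd[14289]:  notice: Requesting fencing (reboot) of node node2",
    "Apr  2 22:19:11 ip-10-0-0-17 fence_aws: Failed: Timed out waiting to power OFF",
    "Apr  2 22:19:12 ip-10-0-0-17 fence_aws: Failed: Unable to obtain correct plug status or plug is not available",
]

print(matches_issue(sample))
```

In practice the lines would come from the system log on the node that attempted the fence action. The log location is an assumption about the local syslog configuration (for example /var/log/messages), so adjust the input source as needed.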

Environment

Red Hat Enterprise Linux 7, 8, 9, and 10 (with the High Availability Add-on)
Amazon Web Services (AWS) EC2 instances as cluster nodes
