Fencing reports success, but the node does not reboot; instead, Pacemaker shuts down and the node remains up and running.

Solution Verified - Updated -

Issue

  • Fencing is reported as successful, but the node does not reboot

Fencing is initiated from node01:

node01 stonith-ng[132799]:  notice: Client stonith_admin.189382.9771f5e7 wants to fence (reboot) 'node02' with device '(any)'
node01 stonith-ng[132799]:  notice: Requesting peer fencing (reboot) of node02
node01 stonith-ng[132799]:  notice: fence_hb2_ipmi can fence (reboot) node02: static-list
node01 stonith-ng[132799]:  notice: fence_hb1_ipmi can not fence (reboot) node02: static-list

The fencing acknowledgment is received on node01:

node01 stonith-ng[132799]:  notice: Operation 'reboot' [189383] (call 2 from stonith_admin.189382) for host 'node02' with device 'fence_hb2_ipmi' returned: 0 (OK)
node01 stonith-ng[132799]:  notice: Call to fence_hb2_ipmi for 'node02 reboot' on behalf of stonith_admin.189382@node01-hb: OK (0)
node01 stonith-ng[132799]:  notice: Operation reboot of node02 by node01-hb for stonith_admin.189382@node01-hb.cf540f32: OK
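When reviewing logs like these, it can help to pull the target host, fence device, and return code out of the stonith-ng result line. The following is a minimal sketch, assuming the message layout matches the excerpt above; the function name `parse_fence_result` and the regular expression are illustrative and not part of Pacemaker. Note that a return code of 0 only records what the fence agent reported, which, as this issue shows, does not by itself prove that the node was power cycled.

```python
import re

# Pattern modelled on the "Operation 'reboot' ... returned: 0 (OK)" line above.
# The layout is an assumption based on that single excerpt.
RESULT_RE = re.compile(
    r"Operation '(?P<action>\w+)' \[\d+\] \(call \d+ from [^)]+\) "
    r"for host '(?P<host>[^']+)' with device '(?P<device>[^']+)' "
    r"returned: (?P<rc>-?\d+) \((?P<status>[^)]*)\)"
)

def parse_fence_result(line):
    """Return a dict describing a fence result line, or None if it does not match."""
    match = RESULT_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["rc"] = int(fields["rc"])  # agent return code as an integer
    return fields
```

Calling `parse_fence_result` on each line of a log file and keeping the non-`None` results gives a compact list of which device was used for which host.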

On node02, the fencing request is received:

node02 stonith-ng[347026]:  notice: fence_hb2_ipmi can fence (reboot) node02: static-list
node02 stonith-ng[347026]:  notice: fence_hb1_ipmi can not fence (reboot) node02: static-list

Instead of rebooting, node02 shuts down Pacemaker:

node02 stonith-ng[347026]:  notice: Operation reboot of node02 by node01-hb for stonith_admin.189382@node01-hb.cf540f32: OK
node02 crmd[347030]:    crit: We were allegedly just fenced by node01-hb for node01-hb!
node02 pacemakerd[347024]: warning: The crmd process (347030) can no longer be respawned, shutting the cluster down.
node02 lrmd[347027]: warning: new_event_notification (347027-347030-8): Bad file descriptor (9)
node02 lrmd[347027]: warning: Notification of client crmd/c192fc35-f930-4869-ac66-bdc68ca81745 failed
node02 pacemakerd[347024]:  notice: Shutting down Pacemaker                                  <<-----------------------------
node02 pacemakerd[347024]:  notice: Stopping pengine
node02 pengine[347029]:  notice: Caught 'Terminated' signal
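The symptom above has a recognizable signature: the fenced node is still alive to log "We were allegedly just fenced", and pacemakerd then shuts the cluster stack down. The sketch below checks a sequence of log lines for that order of events. It assumes the message wording shown in the excerpts; the function name `survived_fencing` and the patterns are hypothetical helpers, not a Pacemaker tool.

```python
import re

# Patterns based on the node02 log excerpt above (wording assumed to be stable).
SELF_FENCED = re.compile(r"crmd\[\d+\]:\s+crit: We were allegedly just fenced by")
SHUTDOWN = re.compile(r"pacemakerd\[\d+\]:\s+notice: Shutting down Pacemaker")

def survived_fencing(lines):
    """Return True if a node logged its own fencing and then a Pacemaker shutdown."""
    saw_self_fenced = False
    for line in lines:
        if SELF_FENCED.search(line):
            # A node that was really power cycled could not have written this message.
            saw_self_fenced = True
        elif saw_self_fenced and SHUTDOWN.search(line):
            return True
    return False
```

A `True` result for a node's log indicates that the node stayed up after a fence operation that was reported as successful, which matches the behavior described in this issue.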

Environment

  • Red Hat Enterprise Linux (RHEL) 7 with the High Availability Add On
  • Cluster nodes with hardware that supports IPMI power management via fence_ipmilan
