Why does a fencing operation complete successfully, yet the node is not rebooted and Pacemaker is instead shut down on that node?


Issue

  • When performing a manual fence operation with pcs stonith fence node2, the command completes without error, yet node2 is not actually rebooted: it remains up and running, and only the Pacemaker stack is shut down on node2 (see the verification sketches after this list):

    [root@node1 ~]# pcs stonith fence node2
    Node: node2 fenced
    

    Logs from node1:

    stonith-ng[1485]:  notice: Client stonith_admin.6884.651b3028 wants to fence (reboot) 'node2' with device '(any)'
    stonith-ng[1485]:  notice: Requesting peer fencing (reboot) of node2
    stonith-ng[1485]:  notice: fence_node1 can not fence (reboot) node2: static-list
    stonith-ng[1485]:  notice: fence_node2 can fence (reboot) node2 (aka. 'vm-node2'): static-list
    stonith-ng[1485]:  notice: fence_node1 can not fence (reboot) node2: static-list
    stonith-ng[1485]:  notice: fence_node2 can fence (reboot) node2 (aka. 'vm-node2'): static-list
    stonith-ng[1485]:  notice: Operation 'reboot' [6885] (call 2 from stonith_admin.6884) for host 'node2' with device 'fence_node2' returned: 0 (OK)
    stonith-ng[1485]:  notice: Operation reboot of node2 by node1 for stonith_admin.6884@node1.671eb0a4: OK
    crmd[1490]:  notice: Peer node2 was terminated (reboot) by node1 on behalf of stonith_admin.6884: OK
    

    Logs from node2:

    stonith-ng[3038]:  notice: fence_node1 can not fence (reboot) node2: static-list
    stonith-ng[3038]:  notice: fence_node2 can fence (reboot) node2 (aka. 'vm-node2'): static-list
    stonith-ng[3038]:  notice: Operation reboot of node2 by node1 for stonith_admin.6884@node1.671eb0a4: OK
    crmd[3042]:    crit: We were allegedly just fenced by node1 for node1!
    cib[3037]: warning: new_event_notification (3037-3042-13): Broken pipe (32)
    cib[3037]: warning: Notification of client crmd/0fff530d-0e73-42f0-bbac-c0100a4ab62b failed
    pacemakerd[3036]: warning: The crmd process (3042) can no longer be respawned, shutting the cluster down.
    lrmd[3039]: warning: new_event_notification (3039-3042-8): Bad file descriptor (9)
    lrmd[3039]: warning: Could not notify client crmd/e36d493d-4cd9-4e96-946d-cf1165dbfe2c: Bad file descriptor
    pacemakerd[3036]:  notice: Shutting down Pacemaker
    pacemakerd[3036]:  notice: Stopping pengine
    pengine[3041]:  notice: Caught 'Terminated' signal
    pacemakerd[3036]:  notice: Stopping attrd
    attrd[3040]:  notice: Caught 'Terminated' signal
    lrmd[3039]:  notice: Caught 'Terminated' signal
    pacemakerd[3036]:  notice: Stopping lrmd
    pacemakerd[3036]:  notice: Stopping stonith-ng
    stonith-ng[3038]:  notice: Caught 'Terminated' signal
    cib[3037]: warning: new_event_notification (3037-3038-11): Broken pipe (32)
    cib[3037]: warning: Notification of client stonithd/7c8647e4-b100-4d2b-b9bf-b4c94fdc6e80 failed
    cib[3037]: warning: new_event_notification (3037-3040-12): Broken pipe (32)
    cib[3037]: warning: Notification of client attrd/0bcd95ee-116f-4423-8b89-666d97583822 failed
    pacemakerd[3036]:  notice: Stopping cib
    cib[3037]:  notice: Caught 'Terminated' signal
    cib[3037]:  notice: Disconnected from Corosync
    cib[3037]:  notice: Disconnected from Corosync
    pacemakerd[3036]:  notice: Shutdown complete
    pacemakerd[3036]:  notice: Attempting to inhibit respawning after fatal error
    
  • The same behavior is seen during stonith operations issued by the cluster itself, but it cannot be reproduced when the operation is run manually with the fence_vmware_soap or fence_vmware_rest command (see the sketch after this list).

  • The hypervisor may report either of the following tasks in the logs of the virtual machine that was fenced:

    Task: Reset virtual machine
    

    or:

    Task: Reconfigure virtual machine
    
  • From the point of view of the other nodes in the cluster, the fenced node appears as Pending.
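
For reference, the device-to-node mapping that stonith-ng consults when choosing a fence device can be inspected from any cluster node. The commands below are only a sketch, reusing the node and device names from the example above; their output depends on the cluster configuration.

    # Show the full definition of every stonith device, including the
    # pcmk_host_list and port (plug) attributes that tie each cluster
    # node to a specific virtual machine on the hypervisor
    pcs stonith show --full

    # Ask stonith-ng which registered devices claim to be able to fence node2
    stonith_admin --list node2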
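
The fence agent can also be run by hand to query or reboot the virtual machine directly on the hypervisor, which is the manual test referred to in the second bullet. This is a minimal sketch: the vCenter address, credentials, and SSL options are placeholders and must match the attributes configured on the stonith device; vm-node2 is the plug (VM) name seen in the logs above. fence_vmware_soap accepts the same options.

    # Query the power state of the VM backing node2 straight from the hypervisor
    fence_vmware_rest --ip vcenter.example.com --username admin --password secret \
        --ssl --ssl-insecure --plug vm-node2 --action status

    # Requesting a reboot the same way should visibly reset the VM on the hypervisor
    fence_vmware_rest --ip vcenter.example.com --username admin --password secret \
        --ssl --ssl-insecure --plug vm-node2 --action reboot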

Environment

Red Hat Enterprise Linux 7 with High-Availability or Resilient Storage Add-Ons
Pacemaker cluster
VMware hypervisor
