- Red Hat Enterprise Linux (RHEL) 5, 6, 7, 8 with the High Availability Add On
- One or more stonith or fence devices configured to use an IPMI-based agent such as fence_ipmilan or fence_idrac
- Or a fencing device configured with method="cycle", whether in /etc/cluster/cluster.conf for cman-based clusters or in the CIB for pacemaker-based clusters
- A node had trouble communicating, so the cluster decided to fence it and take over its resources, but another node mounted file system resources before the fenced node actually powered off, and data was corrupted.
- fence_ipmilan returns success before a node actually gets powered off
- A node failed to stop a resource and so needed to be fenced, and somehow that node was still alive to log the completion of that fence action from another node. How can this be possible if the node should have powered off before fencing completed?
Aug 17 08:33:08 node2 stonith-ng: notice: remote_op_done: Operation reboot of node2 by node1 for email@example.com: OK
- When a node is fenced in my pacemaker cluster due to a resource stop timeout, the rest of the cluster logs "telling cman to remove nodeid 9 from cluster" and the membership changes, but GFS2 access stays blocked. All nodes log "Trying to acquire journal lock" but nothing else happens. We only see this behavior with method=cycle.
IMPORTANT: Configure all IPMI-based fencing agents, such as fence_idrac devices, to use method=onoff (the default in most cases) instead of cycle, and make sure that cluster nodes are configured to power off immediately (the procedure for this differs between RHEL 5/6 cluster nodes and RHEL 7 cluster nodes).
If you have declared the method attribute with a value of cycle for any fence-agent, modify it so that the method attribute has a value of onoff.
There are multiple fence-agents whose default for the method attribute is cycle. If you are using one of those fence-agents, explicitly add method=onoff to its configuration.
Update the stonith device configurations to not specify a method, or to use method=onoff instead. Note that leaving the value off of an attribute when updating (for example, method=) un-sets it and falls back to the agent's default, which is not what we want here when that default is cycle.
# pcs stonith update node1-ipmi method=onoff
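Before updating, it can help to audit the existing configuration for devices that still carry method="cycle". The snippet below is only an illustrative sketch, not a supported tool: it runs against an inline sample CIB fragment, and on a live cluster node you would instead feed it a dump of the CIB (for example, from cibadmin -Q). The device name node1-ipmi is the same placeholder used above.

```shell
# Sketch: find stonith devices whose CIB entry still sets method="cycle".
# On a live cluster node, generate the input with:  cibadmin -Q > cib.xml
CIB=$(mktemp)
cat > "$CIB" <<'EOF'
<primitive id="node1-ipmi" class="stonith" type="fence_ipmilan">
  <instance_attributes id="node1-ipmi-attrs">
    <nvpair id="node1-ipmi-method" name="method" value="cycle"/>
  </instance_attributes>
</primitive>
EOF
# Print the primitive id for each cycle-method nvpair found
grep -B2 'name="method" value="cycle"' "$CIB" | grep -o 'primitive id="[^"]*"'
rm -f "$CIB"
```

Any device reported by this check is a candidate for the pcs stonith update command above.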
Purely cman-based clusters
Update the fencedevice definitions in /etc/cluster/cluster.conf to use method="onoff":
<fencedevice name="node1-ipmi" agent="fence_ipmilan" ipaddr="node1-ipmi.example.com" userid="myuser" password="StrongPassword" lanplus="1" method="onoff"/>
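A similar quick check for cman clusters is to scan cluster.conf for fencedevice entries that still request the cycle method. This is a sketch only; it uses an inline sample, and on a real node you would point it at /etc/cluster/cluster.conf instead.

```shell
# Sketch: list fencedevice definitions that use method="cycle".
# On a real node:  CONF=/etc/cluster/cluster.conf
CONF=$(mktemp)
cat > "$CONF" <<'EOF'
<fencedevices>
  <fencedevice name="node1-ipmi" agent="fence_ipmilan" method="cycle"/>
  <fencedevice name="node2-ipmi" agent="fence_ipmilan" method="onoff"/>
</fencedevices>
EOF
# Print the name of each device still using the cycle method
grep 'fencedevice' "$CONF" | grep 'method="cycle"' | grep -o 'name="[^"]*"'
rm -f "$CONF"
```

After editing cluster.conf, remember to propagate the updated configuration to all cluster nodes per the usual procedure for your release.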
RHEL 7 or later
The only fencing agent that defaults to method=cycle on RHEL 7 is fence_ilo3. There are two ways to change this:
- Update the fence-agents packages with the errata RHBA-2018:0758 or later. The errata changes the default of the method attribute to onoff for the fence_ilo3 fencing agent.
- WORKAROUND: Update the stonith device configurations to not specify a method (unless the agent is fence_ilo3 and the errata above has not yet been installed), or use method=onoff instead. Leaving the value off of an attribute when updating causes it to be un-set, so the agent's default applies.
# pcs stonith update node1-ipmi method=onoff
NOTE: RHEL 8 or later defaults to onoff for the method attribute for all fence-agents that use the fence_ipmilan code base.
fence_ipmilan offers a special method attribute that controls how a
reboot operation is carried out. If using the default value of
onoff, then the agent sends a power-off command to the device, then sends a power-on, and evaluates the results of those and reports that back as the exit status. This ensures that no successful return code can be sent back to the cluster stack until a node is successfully powered off.
However, the alternate value of
cycle results in the agent issuing a single command to the hardware device telling it to cycle the node itself. This relies on the device firmware carrying out the action properly and reporting its status accurately: the server's status reads "on" both before and after the request, so there is no way to confirm that it actually powered off. Some server makes/models might actually return a successful status from this
cycle request before proceeding to power off the server. The end result is that the fence agent may believe the operation was a success several seconds or more before a node actually powers off.
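The difference between the two methods can be illustrated with a small shell model. This is a toy simulation, not the real fence_ipmilan code: the device_* functions stand in for IPMI responses, and the simulated firmware acknowledges a cycle request while the server is still powered on.

```shell
# Toy model of the two reboot methods (not the real fence_ipmilan implementation).
state=on

device_cycle()  { echo "cycle request acknowledged"; }  # firmware acks before acting
device_off()    { state=off; }                          # explicit power-off command
device_on()     { state=on; }
device_status() { echo "$state"; }

reboot_cycle() {
    device_cycle
    # Status reads "on" both before and after, so a success exit proves nothing:
    echo "cycle exit: success, node still reports: $(device_status)"
}

reboot_onoff() {
    device_off
    until [ "$(device_status)" = "off" ]; do sleep 1; done  # confirm power-off
    echo "onoff exit: success, node confirmed: off"
    device_on
}

reboot_cycle
reboot_onoff
```

The onoff path cannot report success until the status poll actually returns "off", which is exactly the guarantee the cycle path lacks.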
This can cause problems for the cluster stack in a few ways, the most significant of which being that the successful completion of fencing signals to the resource manager on other nodes to start recovering resources that were running on fenced nodes, meaning those resources have the potential to run on two nodes simultaneously. If one node thinks the other has powered off and, for example, takes over a file system resource, mounts it, and submits I/O to it, all while the other node is still issuing I/O to it itself, data corruption could ensue.
While this ultimately would be a problem on the IPMI-device firmware side, Red Hat is considering whether a change is necessary to prevent usage of the
cycle method within High Availability clusters, or whether there is some alternative solution that could prevent issues like this. This investigation is occurring in Red Hat Bugzilla #1271780.
This applies to all IPMI-based fencing agents, such as fence_ipmilan and fence_idrac.
- To demonstrate the nature of this problem, simply execute fence_ipmilan -o reboot -m cycle [...] from one node against another node's fence device, then interact with a console on that fenced node constantly while waiting for the fence_ipmilan operation to complete. If the node is still responsive on its console or ssh session after the fence_ipmilan command has exited with a success status, then the cluster is susceptible to unexpected behavior when using the cycle method, and it should be avoided.
This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.