Fencing fails with 'No such device' or a stonith device is listed as "Stopped" in a RHEL 6 or 7 High Availability cluster with symmetric-cluster=false set
Environment
- Red Hat Enterprise Linux (RHEL) 6 or 7 with the High Availability Add-On
- pacemaker
- Cluster property symmetric-cluster is set to false in the CIB
Issue
- Pacemaker cluster stonith resources show as stopped, and are unable to run anywhere:
pengine: info: native_print: node1_fence (stonith:fence_ilo4): Stopped
pengine: info: native_print: node2_fence (stonith:fence_ilo4): Stopped
pengine: info: native_print: node3_fence (stonith:fence_ilo4): Stopped
pengine: info: native_print: node4_fence (stonith:fence_ilo4): Stopped
pengine: info: native_color: Resource node1_fence cannot run anywhere
pengine: info: native_color: Resource node2_fence cannot run anywhere
pengine: info: native_color: Resource node3_fence cannot run anywhere
pengine: info: native_color: Resource node4_fence cannot run anywhere
- The pcs stonith fence command fails:
# pcs stonith fence node4
Error: unable to fence 'node4'
Command failed: No such device
- Fencing attempts fail with this message in the logs:
crmd: notice: tengine_stonith_notify: Peer node4 was not terminated (reboot) by node4 for node4: No such device (ref=50574ae6-54fa-4201-aac7-d9b20fa55377) by client stonith_admin.33126
- My stonith device won't start, and just stays listed as "Stopped"
- Why isn't the cluster starting my stonith device?
Resolution
- Use pcs property set symmetric-cluster=true to allow resources to run anywhere by default. You can then use constraints to limit where the resources are able to run, and control the priority of each node. Alternatively, if you need symmetric-cluster=false set, be sure to create location constraints that allow the fence devices to run.
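For the second option, the following is a minimal dry-run sketch rather than a definitive procedure. It only prints the pcs constraint location commands that would allow each fence device to run, and executes nothing. The node names and the nodeN_fence device names are taken from the log output in the Issue section. The policy of letting each device run on every node except the one it fences, the score of 100, and the helper name fence_constraint_cmds are assumptions made for illustration.

```shell
#!/bin/sh
# Dry-run sketch: print (do not execute) location constraints that would let
# each fence device run in a cluster with symmetric-cluster=false.
# Assumptions: devices are named <node>_fence, each device may run on any
# node other than the one it fences, and 100 is an arbitrary example score.

NODES="node1 node2 node3 node4"

# Hypothetical helper: emits one "pcs constraint location" command per
# (fence device, allowed host) pair.
fence_constraint_cmds() {
    for target in $NODES; do
        for host in $NODES; do
            # Assumed policy: do not place a device on the node it fences.
            if [ "$host" != "$target" ]; then
                echo "pcs constraint location ${target}_fence prefers ${host}=100"
            fi
        done
    done
}

fence_constraint_cmds
```

Review the printed commands and adjust the nodes and scores to match your own placement policy before running any of them on a cluster node. Afterwards, pcs constraint should list the new location constraints, and the stonith devices should no longer be reported as unable to run anywhere.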
Root Cause
If a cluster has symmetric-cluster=false set, then stonith devices must have a constraint assigning them to a specific node or nodes. Without such a constraint, the cluster reports that it has no fence devices with which to fence a node, and will not attempt to start monitoring those devices.
This is expected behavior, given that symmetric-cluster=false is intended to prevent resources from starting just anywhere, and instead requires them to be specifically designated to run on certain nodes. Because stonith devices are just a special class of resource, they are treated the same way.