A singular cluster-fence attempt never completes and cluster operations block during this time in a RHEL 6 Resilient Storage cluster with gfs2
Issue
- After a node left the cluster for unknown reasons, there was a GFS2 deadlock
- A node required fencing, but that fencing never completed and everything in the cluster blocked
- fence_ipmilan hung when trying to fence a node
- fence_ipmilan never returned when it was called to fence another node
Apr 13 10:48:27 node1 corosync[1984]: [TOTEM ] A processor failed, forming new configuration.
Apr 13 10:49:28 node1 corosync[1984]: [QUORUM] Members[8]: 1 2 3 5 6 7 8 9
Apr 13 10:49:28 node1 corosync[1984]: [TOTEM ] A processor joined or left the membership and a new membership was formed.
Apr 13 10:49:28 node1 kernel: dlm: closing connection to node 4
Apr 13 10:49:28 node1 corosync[1984]: [CPG ] chosen downlist: sender r(0) ip(10.0.0.9) ; members(old:9 left:1)
Apr 13 10:49:28 node1 corosync[1984]: [MAIN ] Completed service synchronization, ready to provide service.
Apr 13 10:49:28 node1 fenced[2061]: fencing node node2
Apr 13 10:49:28 node1 kernel: GFS2: fsid=cluster:lv1.8: jid=7: Trying to acquire journal lock...
Apr 13 10:49:28 node1 kernel: GFS2: fsid=cluster:lv2.8: jid=4: Trying to acquire journal lock...
Apr 13 10:52:08 node1 kernel: INFO: task kswapd0:100 blocked for more than 120 seconds.
[...]
Apr 13 10:52:08 node1 kernel: INFO: task glock_workqueue:2403 blocked for more than 120 seconds.
[...]
Apr 13 10:52:08 node1 kernel: INFO: task glock_workqueue:2404 blocked for more than 120 seconds.
[...]
Apr 13 10:52:08 node1 kernel: INFO: task glock_workqueue:2405 blocked for more than 120 seconds.
- fence_vmware_soap began fencing, but never finished. GFS2 throughout the cluster blocked that entire time and didn't recover on its own.
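To confirm this symptom in a saved copy of /var/log/messages, the logs can be checked for a fence attempt that starts but never reports a result. The following is a minimal Python 3 sketch, not a supported tool. It assumes that fenced logs "fencing node <name>" when an attempt begins (as in the excerpt above) and a line of the form "fence <name> success" or "fence <name> failed" when it ends. The function names are hypothetical, and the patterns may need adjusting if the messages differ on a given release.

```python
import re

# Assumed message formats (adjust if your fenced version logs differently):
#   start: "fenced[PID]: fencing node NAME"
#   end:   "fenced[PID]: fence NAME success" or "fence NAME failed"
START = re.compile(r"fenced\[\d+\]: fencing node (\S+)")
END = re.compile(r"fenced\[\d+\]: fence (\S+) (?:success|failed)")


def unfinished_fence_attempts(lines):
    """Return {victim: start_line} for fence attempts with no logged result."""
    pending = {}
    for line in lines:
        started = START.search(line)
        if started:
            # Remember the most recent attempt against this victim.
            pending[started.group(1)] = line.rstrip("\n")
            continue
        ended = END.search(line)
        if ended:
            # A result was logged, so this attempt did complete.
            pending.pop(ended.group(1), None)
    return pending


def scan_file(path):
    """Print every fence attempt in the log at `path` that never completed."""
    with open(path, errors="replace") as fh:
        for node, line in unfinished_fence_attempts(fh).items():
            print("no result logged for fence of {0}: {1}".format(node, line))
```

Calling scan_file() on a copy of the messages file from the node that logged "fencing node ..." prints one line per victim whose fence attempt has no recorded outcome. An empty result only means every logged attempt also logged a result under the assumed message formats.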
Environment
- Red Hat Enterprise Linux (RHEL) 6 w/ the Resilient Storage Add-On
- Red Hat Enterprise Linux (RHEL) 7 w/ the Resilient Storage Add-On
- One or more gfs2 file systems mounted within the cluster