A RHEL 7 Resilient Storage cluster node is fenced or reboots due to a kernel panic after a gfs2 filesystem withdraws, with error "kernel BUG at fs/gfs2/glock.c:546!"

Solution Unverified - Updated -

Issue

  • A kernel panic occurs after a gfs2 filesystem withdraws
  • A node in our cluster was fenced right after it logged a GFS2 withdrawal error
  • My cluster nodes using GFS2 frequently panic and dump core, and the logs show GFS2 metadata inconsistency errors just before each panic
  • A kernel panic occurs after a cluster node is fenced by fence_scsi
[...]
[947302.618149] GFS2: fsid=tsmcluster:archlogfs.1: telling LM to unmount
[947302.618382] GFS2: fsid=tsmcluster:archlogfs.1: withdrawn
[...]
[947302.618519] GFS2: fsid=tsmcluster:archlogfs.1: Error -5 writing to log
[947302.618661] GFS2: lm_lock ret -22
[947302.618665]  G:  s:SH n:5/102bb f:lDpIqL t:UN d:UN/0 a:0 v:0 r:3 m:200
[947302.618697] ------------[ cut here ]------------
[947302.618698] kernel BUG at fs/gfs2/glock.c:546!
[947302.618699] invalid opcode: 0000 [#1] SMP
[...]

[...]
[  428.566409] GFS2: fsid=cluster1:gfs2fs.0: about to withdraw this file system
[  428.573202] GFS2: fsid=cluster1:gfs2fs.0: telling LM to unmount
[  428.574268] GFS2: fsid=cluster1:gfs2fs.0: withdrawn
[...]
[  428.574916] GFS2: lm_lock ret -22
[  428.575409]  G:  s:SH n:5/1f0421 f:lDpIqLo t:UN d:UN/0 a:0 v:0 r:3 m:200
[  428.575913] ------------[ cut here ]------------
[  428.576383] kernel BUG at fs/gfs2/glock.c:546!
[  428.576875] invalid opcode: 0000 [#1] SMP
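
Both excerpts above share the same sequence: a "withdrawn" message for a GFS2 file system, followed by "kernel BUG at fs/gfs2/glock.c:546!". As a minimal sketch for checking a saved kernel log (for example, text extracted from a vmcore or a console capture) for that sequence, the Python below matches only the two message formats shown above. The function name `find_withdraw_then_bug` is hypothetical and is not part of any Red Hat tool.

```python
import re

# Message formats copied from the log excerpts above.
WITHDRAW_RE = re.compile(r"GFS2: fsid=(?P<fsid>\S+): withdrawn")
BUG_RE = re.compile(r"kernel BUG at fs/gfs2/glock\.c:546!")


def find_withdraw_then_bug(log_text):
    """Return the fsids withdrawn before the first glock.c:546 BUG line.

    Returns an empty list when no BUG line follows a withdrawal.
    (Hypothetical helper, for illustration only.)
    """
    withdrawn = []
    for line in log_text.splitlines():
        match = WITHDRAW_RE.search(line)
        if match:
            # Remember every file system that withdrew so far.
            withdrawn.append(match.group("fsid"))
        elif withdrawn and BUG_RE.search(line):
            # The BUG was hit after at least one withdrawal.
            return withdrawn
    return []
```

Calling `find_withdraw_then_bug(open("vmcore-dmesg.txt").read())` on a log with this signature would return the affected file system identifiers; the file name here is only an example.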

Environment

  • Red Hat Enterprise Linux (RHEL) 7 with the Resilient Storage Add-On
  • kernel releases prior to 3.10.0-514.el7
  • One or more gfs2 file systems
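
To check whether a node falls inside the kernel range listed above, the release string reported by `uname -r` can be compared against 3.10.0-514.el7. The sketch below is a rough illustration, not an official check: it assumes the usual RHEL 7 release format (such as `3.10.0-327.el7.x86_64`), compares only the first build number after the dash, and uses hypothetical helper names.

```python
import re

# Upper bound taken from "kernel releases prior to 3.10.0-514.el7" above.
FIRST_UNAFFECTED = (3, 10, 0, 514)


def parse_release(release):
    """Extract (major, minor, patch, build) from a string like 3.10.0-327.el7.x86_64."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)-(\d+)", release)
    if match is None:
        raise ValueError("unrecognized kernel release: %r" % release)
    return tuple(int(part) for part in match.groups())


def is_in_affected_range(release):
    """True for an el7 kernel release earlier than 3.10.0-514 (hypothetical helper)."""
    # Only RHEL 7 kernels are in scope for this article.
    if ".el7" not in release:
        return False
    return parse_release(release) < FIRST_UNAFFECTED
```

On a running node, `is_in_affected_range(platform.release())` (after `import platform`) would apply the same comparison to the booted kernel.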
