A RHEL 7 Resilient Storage cluster node is fenced or reboots due to a kernel panic after a gfs2 filesystem withdraws, with error "kernel BUG at fs/gfs2/glock.c:546!"
Issue
- A kernel panic occurs after a gfs2 filesystem withdraws
- A node in our cluster was fenced right after it logged a GFS2 withdrawal error
- My cluster nodes using GFS2 are frequently panicking and dumping cores, with GFS2 metadata inconsistency errors in the logs just before each panic
- A kernel panic occurs after a cluster node was fenced by fence_scsi
[...]
[947302.618149] GFS2: fsid=tsmcluster:archlogfs.1: telling LM to unmount
[947302.618382] GFS2: fsid=tsmcluster:archlogfs.1: withdrawn
[...]
[947302.618519] GFS2: fsid=tsmcluster:archlogfs.1: Error -5 writing to log
[947302.618661] GFS2: lm_lock ret -22
[947302.618665] G: s:SH n:5/102bb f:lDpIqL t:UN d:UN/0 a:0 v:0 r:3 m:200
[947302.618697] ------------[ cut here ]------------
[947302.618698] kernel BUG at fs/gfs2/glock.c:546!
[947302.618699] invalid opcode: 0000 [#1] SMP
[...]
[...]
[ 428.566409] GFS2: fsid=cluster1:gfs2fs.0: about to withdraw this file system
[ 428.573202] GFS2: fsid=cluster1:gfs2fs.0: telling LM to unmount
[ 428.574268] GFS2: fsid=cluster1:gfs2fs.0: withdrawn
[...]
[ 428.574916] GFS2: lm_lock ret -22
[ 428.575409] G: s:SH n:5/1f0421 f:lDpIqLo t:UN d:UN/0 a:0 v:0 r:3 m:200
[ 428.575913] ------------[ cut here ]------------
[ 428.576383] kernel BUG at fs/gfs2/glock.c:546!
[ 428.576875] invalid opcode: 0000 [#1] SMP
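Both excerpts above follow the same sequence: a "withdrawn" message for a file system, followed shortly by the BUG at fs/gfs2/glock.c:546. The following is a minimal sketch, not a supported tool, of how that sequence could be spotted in saved kernel log text. The function name and the sample lines are illustrative assumptions; the two patterns are copied from the messages shown above.

```python
import re

# Patterns copied from the log excerpts in this article.
WITHDRAWN = re.compile(r"GFS2: fsid=(\S+): withdrawn")
GLOCK_BUG = re.compile(r"kernel BUG at fs/gfs2/glock\.c:546!")


def find_withdraw_then_bug(lines):
    """Return the fsids that logged 'withdrawn' before a glock.c:546 BUG.

    Hypothetical helper: returns an empty list when no BUG line follows
    a withdrawal in the given lines.
    """
    withdrawn = []
    for line in lines:
        match = WITHDRAWN.search(line)
        if match:
            withdrawn.append(match.group(1))
        elif GLOCK_BUG.search(line) and withdrawn:
            return withdrawn
    return []


# Sample lines taken from the first excerpt above.
sample = [
    "[947302.618149] GFS2: fsid=tsmcluster:archlogfs.1: telling LM to unmount",
    "[947302.618382] GFS2: fsid=tsmcluster:archlogfs.1: withdrawn",
    "[947302.618661] GFS2: lm_lock ret -22",
    "[947302.618698] kernel BUG at fs/gfs2/glock.c:546!",
]

if __name__ == "__main__":
    print(find_withdraw_then_bug(sample))
```

A match only indicates that the log has the same shape as the excerpts in this article; it does not by itself confirm the root cause.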
Environment
- Red Hat Enterprise Linux (RHEL) 7 with the Resilient Storage Add-On
- kernel releases prior to 3.10.0-514.el7
- One or more gfs2 file systems
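As a rough illustration of the kernel condition above, the sketch below compares a kernel release string (such as the output of `uname -r`) against 3.10.0-514.el7. It is a simplification that compares only the leading numeric fields and does not implement full RPM version comparison; the function names are hypothetical. It says nothing about the other conditions in this list, such as whether gfs2 file systems are in use.

```python
import re

# Threshold release taken from the Environment list above.
FIXED_RELEASE = "3.10.0-514.el7"


def release_key(release):
    """Turn the leading numeric fields of a release string into a tuple.

    Parsing stops at the first non-numeric field (for example 'el7').
    """
    key = []
    for field in re.split(r"[.-]", release):
        if not field.isdigit():
            break
        key.append(int(field))
    return tuple(key)


def is_prior_release(release):
    """Return True if the release sorts before 3.10.0-514.el7 (simplified)."""
    return release_key(release) < release_key(FIXED_RELEASE)


if __name__ == "__main__":
    for rel in ("3.10.0-327.36.3.el7.x86_64", "3.10.0-514.el7.x86_64"):
        print(rel, is_prior_release(rel))
```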