Mounting a GFS2 file system blocks after a reboot of one node, and one or more other nodes do not show the gfs mountgroup for that file system in 'cman_tool services' in RHEL 6
Issue
- mount.gfs2 blocked for more than 120 seconds and showed a backtrace in the logs:
Mar 7 23:49:32 node2 kernel: INFO: task mount.gfs2:8129 blocked for more than 120 seconds.
Mar 7 23:49:32 node2 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Mar 7 23:49:32 node2 kernel: mount.gfs2 D 0000000000000000 0 8129 1 0x00000080
Mar 7 23:49:32 node2 kernel: ffff880229ebd9d8 0000000000000086 ffff880200000008 ffffffffa0490520
Mar 7 23:49:32 node2 kernel: ffff88023a385950 ffffffffa04904d0 ffff880200000005 ffff88023a385a00
Mar 7 23:49:32 node2 kernel: ffff88023b7efaf8 ffff880229ebdfd8 000000000000fb88 ffff88023b7efaf8
Mar 7 23:49:32 node2 kernel: Call Trace:
Mar 7 23:49:32 node2 kernel: [<ffffffffa0490520>] ? gdlm_ast+0x0/0x210 [gfs2]
Mar 7 23:49:32 node2 kernel: [<ffffffffa04904d0>] ? gdlm_bast+0x0/0x50 [gfs2]
Mar 7 23:49:32 node2 kernel: [<ffffffffa0470870>] ? gfs2_glock_holder_wait+0x0/0x20 [gfs2]
Mar 7 23:49:32 node2 kernel: [<ffffffffa047087e>] gfs2_glock_holder_wait+0xe/0x20 [gfs2]
Mar 7 23:49:32 node2 kernel: [<ffffffff8150e82f>] __wait_on_bit+0x5f/0x90
Mar 7 23:49:32 node2 kernel: [<ffffffffa0470870>] ? gfs2_glock_holder_wait+0x0/0x20 [gfs2]
Mar 7 23:49:32 node2 kernel: [<ffffffff8150e8d8>] out_of_line_wait_on_bit+0x78/0x90
Mar 7 23:49:32 node2 kernel: [<ffffffff81096cc0>] ? wake_bit_function+0x0/0x50
Mar 7 23:49:32 node2 kernel: [<ffffffffa0472ae5>] gfs2_glock_wait+0x45/0x90 [gfs2]
Mar 7 23:49:32 node2 kernel: [<ffffffffa0473f83>] gfs2_glock_nq+0x2d3/0x3e0 [gfs2]
Mar 7 23:49:32 node2 kernel: [<ffffffffa0474271>] gfs2_glock_nq_num+0x61/0xa0 [gfs2]
Mar 7 23:49:32 node2 kernel: [<ffffffffa047fbff>] init_journal+0x14f/0x4d0 [gfs2]
Mar 7 23:49:32 node2 kernel: [<ffffffffa047fa24>] ? gfs2_jindex_hold+0x1a4/0x230 [gfs2]
Mar 7 23:49:32 node2 kernel: [<ffffffffa047ffb7>] init_inodes+0x37/0x170 [gfs2]
Mar 7 23:49:32 node2 kernel: [<ffffffffa0480b98>] gfs2_get_sb+0x828/0xa00 [gfs2]
Mar 7 23:49:32 node2 kernel: [<ffffffffa0474269>] ? gfs2_glock_nq_num+0x59/0xa0 [gfs2]
Mar 7 23:49:32 node2 kernel: [<ffffffff8116087a>] ? alloc_pages_current+0xaa/0x110
Mar 7 23:49:32 node2 kernel: [<ffffffff8118381b>] vfs_kern_mount+0x7b/0x1b0
Mar 7 23:49:32 node2 kernel: [<ffffffff811839c2>] do_kern_mount+0x52/0x130
Mar 7 23:49:32 node2 kernel: [<ffffffff811a3c12>] do_mount+0x2d2/0x8d0
Mar 7 23:49:32 node2 kernel: [<ffffffff811a42a0>] sys_mount+0x90/0xe0
Mar 7 23:49:32 node2 kernel: [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
- After a hard reboot, a node rejoins the cluster and then hangs while trying to mount a GFS2 file system. The last message in the log is:
Mar 7 23:46:50 node2 kernel: GFS2: fsid=myCluster:fs1.0: Joined cluster. Now mounting FS...
- One or more nodes in the cluster have a GFS2 file system mounted and show an entry for it under "dlm lockspaces" in the cman_tool services output, but do not show an entry under "gfs mountgroups". In the output below, fs1 has a lockspace but no mountgroup.
# cman_tool services
fence domain
member count 2
victim count 0
victim now 0
master nodeid 1
wait state none
members 1 2
dlm lockspaces
name clvmd
id 0x4104eefa
flags 0x00000000
change member 2 joined 1 remove 0 failed 0 seq 1,1
members 1 2
name fs1
id 0x58aa977e
flags 0x00000008 fs_reg
change member 2 joined 1 remove 0 failed 0 seq 4,4
members 1 2
name fs2
id 0xe22d3136
flags 0x00000000
change member 2 joined 1 remove 0 failed 0 seq 4,4
members 1 2
gfs mountgroups
name fs2
id 0x9661e92d
flags 0x00000048 mounted
change member 2 joined 1 remove 0 failed 0 seq 4,4
members 1 2
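The mismatch described above can be checked mechanically. The following is a minimal sketch, not a supported tool: it assumes the cman_tool services output has already been captured as text and that it follows the layout shown above (a section header line followed by "name <value>" lines). The function names and the default ignore list (clvmd and rgmanager, which use DLM lockspaces without being GFS2 file systems) are illustrative assumptions.

```python
# Sketch: list DLM lockspaces that have no matching GFS mountgroup.
# Assumes the section headers and "name <value>" lines look like the
# cman_tool services output shown in this article.

SECTION_HEADERS = ("fence domain", "dlm lockspaces", "gfs mountgroups")

# Abbreviated copy of the output above, used only as sample input.
SAMPLE = """\
dlm lockspaces
name clvmd
id 0x4104eefa
name fs1
id 0x58aa977e
flags 0x00000008 fs_reg
name fs2
id 0xe22d3136
gfs mountgroups
name fs2
id 0x9661e92d
flags 0x00000048 mounted
"""


def parse_group_names(services_output):
    """Map each section header to the list of group names found under it."""
    names = dict((header, []) for header in SECTION_HEADERS)
    current = None
    for raw in services_output.splitlines():
        line = raw.strip()
        if line in SECTION_HEADERS:
            current = line
            continue
        fields = line.split()
        # A group entry starts with a two-field "name <value>" line.
        if current and len(fields) == 2 and fields[0] == "name":
            names[current].append(fields[1])
    return names


def lockspaces_without_mountgroup(services_output,
                                  ignore=("clvmd", "rgmanager")):
    """Return lockspace names absent from "gfs mountgroups".

    `ignore` is an assumed list of lockspaces that are not GFS2 file
    systems; adjust it for the cluster being examined.
    """
    names = parse_group_names(services_output)
    mounted = set(names["gfs mountgroups"])
    return [n for n in names["dlm lockspaces"]
            if n not in mounted and n not in ignore]


if __name__ == "__main__":
    print(lockspaces_without_mountgroup(SAMPLE))
```

Running the function against output saved from each node gives a quick way to see which nodes show the symptom. An empty list means every file system lockspace on that node has a corresponding mountgroup.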
Environment
- Red Hat Enterprise Linux (RHEL) 6 with the Resilient Storage Add On
- GFS2
- gfs2-utils and cman releases prior to 3.0.12.1-59.el6_5.3 in RHEL 6 Update 5, or prior to 3.0.12.1-68.el6 in other RHEL 6 updates
