Cluster node kernel panics with "kernel BUG at fs/locks.c:2037!" in locks_remove_flock after a GFS2 withdrawal in RHEL 6
Issue
One of the cluster nodes panics after a GFS2 withdraw caused by I/O errors:
lost page write due to I/O error on dm-5
GFS2: fsid=cluster_emsp1v:emslv.0: fatal: I/O error
GFS2: fsid=cluster_emsp1v:emslv.0: block = 21676
GFS2: fsid=cluster_emsp1v:emslv.0: function = log_write_header, file = fs/gfs2/log.c, line = 616
GFS2: fsid=cluster_emsp1v:emslv.0: about to withdraw this file system
GFS2: fsid=cluster_emsp1v:emslv.0: telling LM to unmount
GFS2: fsid=cluster_emsp1v:emslv.0: withdrawn
Pid: 3785, comm: tibemsd64 Not tainted 2.6.32-279.el6.x86_64 #1
Call Trace:
[<ffffffffa02a9062>] ? gfs2_lm_withdraw+0x102/0x130 [gfs2]
[<ffffffff814fea28>] ? out_of_line_wait_on_bit+0x78/0x90
[<ffffffff81092110>] ? wake_bit_function+0x0/0x50
[<ffffffffa02a90d0>] ? gfs2_io_error_bh_i+0x40/0x50 [gfs2]
[<ffffffff811adfb6>] ? __wait_on_buffer+0x26/0x30
[<ffffffffa0291288>] ? log_write_header+0x3a8/0x490 [gfs2]
[<ffffffffa0291951>] ? gfs2_log_flush+0x301/0x6f0 [gfs2]
[<ffffffff810920d0>] ? autoremove_wake_function+0x0/0x40
[<ffffffffa028b593>] ? gfs2_holder_uninit+0x23/0x40 [gfs2]
[<ffffffffa02a6282>] ? gfs2_write_inode+0x252/0x2f0 [gfs2]
[<ffffffff811a5134>] ? writeback_single_inode+0x204/0x2c0
[<ffffffff811a5223>] ? sync_inode+0x33/0x50
[<ffffffff811a5274>] ? sync_inode_metadata+0x34/0x40
[<ffffffff81114978>] ? filemap_write_and_wait_range+0x78/0x90
[<ffffffffa0297415>] ? gfs2_fsync+0x55/0xa0 [gfs2]
[<ffffffff811a9fd1>] ? vfs_fsync_range+0xa1/0xe0
[<ffffffff811aa05b>] ? generic_write_sync+0x4b/0x50
[<ffffffff8111673e>] ? generic_file_aio_write+0xbe/0xe0
[<ffffffffa02978be>] ? gfs2_file_aio_write+0x7e/0xb0 [gfs2]
[<ffffffff810a43fe>] ? futex_wake+0x10e/0x120
[<ffffffff8117ad6a>] ? do_sync_write+0xfa/0x140
[<ffffffff810920d0>] ? autoremove_wake_function+0x0/0x40
[<ffffffffa0297e73>] ? gfs2_llseek+0x33/0xb0 [gfs2]
[<ffffffff81213136>] ? security_file_permission+0x16/0x20
[<ffffffff8117b068>] ? vfs_write+0xb8/0x1a0
[<ffffffff810d69e2>] ? audit_syscall_entry+0x272/0x2a0
[<ffffffff8117ba81>] ? sys_write+0x51/0x90
[<ffffffff8100b0f2>] ? system_call_fastpath+0x16/0x1b
------------[ cut here ]------------
kernel BUG at fs/locks.c:2037!
invalid opcode: 0000 [#1] SMP
last sysfs file: /sys/kernel/dlm/emslv/event_done
CPU 0
Modules linked in: ext2 raid456 async_raid6_recov async_pq raid6_pq async_xor xor async_memcpy async_tx raid1 raid0 linear vfat msdos fat autofs4 gfs2 sunrpc dlm configfs ipv6 uinput ppdev parport_pc parport sg microcode vmware_balloon vmxnet3 i2c_piix4 i2c_core shpchp ext3 jbd mbcache sd_mod crc_t10dif sr_mod cdrom vmw_pvscsi pata_acpi ata_generic ata_piix dm_mirror dm_region_hash dm_log dm_mod [last unloaded: speedstep_lib]
Pid: 3779, comm: tibemsd64 Not tainted 2.6.32-279.el6.x86_64 #1 VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform
RIP: 0010:[<ffffffff811c668d>] [<ffffffff811c668d>] locks_remove_flock+0xfd/0x120
RSP: 0018:ffff88013a087af8 EFLAGS: 00010246
RAX: 0000000000000001 RBX: ffff88013a6e4bc0 RCX: 0000000000000000
RDX: ffff88013a69cae0 RSI: 0000000000000002 RDI: ffff880137af66c0
RBP: ffff88013a087bc8 R08: 0000000000000001 R09: 0000000000000002
R10: 0000000000000000 R11: 0000000000000000 R12: ffff880124113600
R13: ffff88013a087af8 R14: ffff880124119c80 R15: ffff88013936be80
FS: 0000000000000000(0000) GS:ffff880028200000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000003f02001dde CR3: 0000000001a85000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process tibemsd64 (pid: 3779, threadinfo ffff88013a086000, task ffff88013a69cae0)
Stack:
0000000000000000 0000000000000000 0000000000000000 0000000000000000
<d> 0000000000000000 0000000000000000 00000ebd00000202 0000000000000000
<d> 0000000000000000 0000000000000000 0000000000000000 ffff88013a6e4bc0
Call Trace:
[<ffffffff8117c910>] __fput+0xd0/0x210
[<ffffffff8117ca75>] fput+0x25/0x30
[<ffffffff8117849d>] filp_close+0x5d/0x90
[<ffffffff8106e3bf>] put_files_struct+0x7f/0xf0
[<ffffffff8106e483>] exit_files+0x53/0x70
[<ffffffff810704fd>] do_exit+0x18d/0x870
[<ffffffff81070c38>] do_group_exit+0x58/0xd0
[<ffffffff81085866>] get_signal_to_deliver+0x1f6/0x460
[<ffffffff81191e08>] ? d_free+0x58/0x60
[<ffffffff8100a2d5>] do_signal+0x75/0x800
[<ffffffff8117ca75>] ? fput+0x25/0x30
[<ffffffff8119a330>] ? mntput_no_expire+0x30/0x110
[<ffffffff81179183>] ? sys_fchmodat+0x73/0x100
[<ffffffff8100aaf0>] do_notify_resume+0x90/0xc0
[<ffffffff8100b3c1>] int_signal+0x12/0x17
Code: 49 89 c4 49 8b 04 24 48 85 c0 75 ee e8 ad 9c 33 00 48 81 c4 b8 00 00 00 5b 41 5c 41 5d c9 c3 0f b6 40 30 a8 02 75 09 a8 20 75 0f <0f> 0b 90 eb fd 4c 89 e7 e8 e6 fc ff ff eb b7 be 02 00 00 00 4c
RIP [<ffffffff811c668d>] locks_remove_flock+0xfd/0x120
RSP <ffff88013a087af8>
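To check whether other nodes logged the same sequence, the signature above (a GFS2 "withdrawn" message followed by a kernel BUG and its RIP line) can be searched for in a saved console or messages log. The following is only a minimal sketch: the function name and regular expressions are illustrative assumptions derived from this one excerpt, not a Red Hat tool, and other kernel versions may format these lines differently.

```python
import re

# Patterns are assumptions based on the log excerpt in this article.
WITHDRAW_RE = re.compile(r"GFS2: fsid=(?P<fsid>\S+): withdrawn")
BUG_RE = re.compile(r"kernel BUG at (?P<file>[\w/.\-]+):(?P<line>\d+)!")
RIP_RE = re.compile(r"RIP\b.*?\]\s+(?P<func>\w+)\+0x[0-9a-f]+/0x[0-9a-f]+")


def find_withdraw_then_bug(log_text):
    """Hypothetical helper: return (fsid, bug_location, function) when a
    GFS2 withdraw is followed by a kernel BUG and a RIP line, else None."""
    fsid = None
    bug = None
    for line in log_text.splitlines():
        if fsid is None:
            # Stage 1: wait for the "withdrawn" message.
            m = WITHDRAW_RE.search(line)
            if m:
                fsid = m.group("fsid")
            continue
        if bug is None:
            # Stage 2: wait for the "kernel BUG at file:line!" message.
            m = BUG_RE.search(line)
            if m:
                bug = f"{m.group('file')}:{m.group('line')}"
            continue
        # Stage 3: take the faulting function from the first RIP line.
        m = RIP_RE.search(line)
        if m:
            return fsid, bug, m.group("func")
    return None
```

Calling `find_withdraw_then_bug(open(path).read())` on a captured log (where `path` is whatever file holds the console output) returns `None` unless all three markers appear in that order, so a withdraw without a subsequent panic is not reported.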
Environment
- Red Hat Enterprise Linux 6 with the Resilient Storage Add-On
- GFS2