Kernel panic in gfs2_inplace_reserve on Red Hat Enterprise Linux 6.4
Issue
- Running "service netbackup start" causes a kernel panic while "NetBackup Audit Manager" is starting:
kernel: BUG: unable to handle kernel NULL pointer dereference at 0000000000000060
kernel: IP: [<ffffffffa05e226a>] gfs2_inplace_reserve+0xca/0x7e0 [gfs2]
kernel: PGD bf2ae3067 PUD bf2ae4067 PMD 0
kernel: Oops: 0002 [#1] SMP
kernel: last sysfs file: <random sysfs file>
kernel: CPU 2
kernel: Modules linked in: ebtable_nat ebtables ipt_MASQUERADE iptable_nat nf_nat xt_CHECKSUM iptable_mangle autofs4 gfs2 dlm
configfs sunrpc bridge bonding 8021q garp stp llc xt_physdev ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter
ip_tables ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 vhost_net macvtap
macvlan tun kvm_intel kvm hpwdt hpilo bnx2 ch osst st sg microcode serio_raw iTCO_wdt iTCO_vendor_support i5000_edac edac_core
i5k_amb shpchp sd_mod crc_t10dif ext4 mbcache jbd2 hpsa cciss qla2xxx scsi_transport_fc scsi_tgt radeon ttm drm_kms_helper drm
i2c_algo_bit i2c_core dm_multipath dm_mirror dm_region_hash dm_log dm_mod [last unloaded: speedstep_lib]
kernel:
kernel: Pid: 7453, comm: nbaudit Not tainted 2.6.32-358.el6.x86_64 #1 HP ProLiant BL460c G1
kernel: RIP: 0010:[<ffffffffa05e226a>] [<ffffffffa05e226a>] gfs2_inplace_reserve+0xca/0x7e0 [gfs2]
kernel: RSP: 0018:ffff880bf2b6dc58 EFLAGS: 00010202
kernel: RAX: 0000000000000001 RBX: ffff880bf8f0c040 RCX: ffff880c14fcd328
kernel: RDX: 000000000036ff01 RSI: 000000000036be8c RDI: ffff880bf8f0c040
kernel: RBP: ffff880bf2b6dd18 R08: 9050000000000000 R09: f3ef4c1ee58f720a
kernel: R10: 00000000000007de R11: 0000000000000004 R12: 0000000000002000
kernel: R13: 0000000000000000 R14: 0000000000000001 R15: 0000000000000000
kernel: FS: 00007ff94f3ea720(0000) GS:ffff880028280000(0000) knlGS:0000000000000000
kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
kernel: CR2: 0000000000000060 CR3: 0000000bf2ae2000 CR4: 00000000000007e0
kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
kernel: Process nbaudit (pid: 7453, threadinfo ffff880bf2b6c000, task ffff880bf29a2ae0)
kernel: Stack:
kernel: ffff880bf2b6ddc8 000000000000000a ffff880bf2b6dc88 ffffffff81096c6f
kernel: <d> ffff880bf2b6dd98 00000007f93f1818 ffff880bf2b6dc98 ffff880c126a8000
kernel: <d> ffff880bf2b6dce8 ffffffffa05ca2a8 ffff880c14fcd328 0000000000000000
kernel: Call Trace:
kernel: [<ffffffff81096c6f>] ? wake_up_bit+0x2f/0x40
kernel: [<ffffffffa05ca2a8>] ? do_promote+0x208/0x330 [gfs2]
kernel: [<ffffffffa05bd06e>] gfs2_setattr_size+0xce/0x210 [gfs2]
kernel: [<ffffffffa05d9534>] gfs2_setattr+0x214/0x330 [gfs2]
kernel: [<ffffffffa05d9366>] ? gfs2_setattr+0x46/0x330 [gfs2]
kernel: [<ffffffff8119e688>] notify_change+0x168/0x340
kernel: [<ffffffff8117f104>] do_truncate+0x64/0xa0
kernel: [<ffffffff81228aeb>] ? selinux_path_truncate+0x7b/0xb0
kernel: [<ffffffff8117f28c>] vfs_truncate+0x14c/0x170
kernel: [<ffffffff8117f30b>] sys_truncate+0x5b/0x70
kernel: [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
kernel: Code: 8b 4d 90 48 8b b2 10 03 00 00 8b 51 34 48 8b 41 28 48 01 c2 48 39 d6 0f 92 c2 48 39 c6 0f 93 c0 0f b6 c0 85 d0 0f
84 70 04 00 00 <49> 89 4d 60 48 89 c8 c7 45 9c 01 00 00 00 49 8d 75 08 49 8d 55
kernel: RIP [<ffffffffa05e226a>] gfs2_inplace_reserve+0xca/0x7e0 [gfs2]
kernel: RSP <ffff880bf2b6dc58>
kernel: CR2: 0000000000000060
kernel: ---[ end trace 2136841198e5d521 ]---
- Our HA cluster started crashing after we applied a batch of updates last night. We have traced the problem to the courier-imap packages installed on our servers: whenever the service is started on top of the GFS file systems, the kernel crashes as shown below. The setup has been running for about 2 years, so we are confident that one of the updates broke this. An extract from a crash report follows.
BUG: unable to handle kernel NULL pointer dereference at 00000034
IP: [<f8625d75>] gfs2_inplace_reserve+0x605/0x8f0 [gfs2]
*pdpt = 000000002acce001 *pde = 00000000b06de067
Oops: 0002 [#1] SMP
last sysfs file: <random sysfs file>
Modules linked in: gfs2 dlm configfs iptable_filter ip_tables autofs4 sunrpc dm_round_robin be2iscsi iscsi_boot_sysfs bnx2i cnic
uio cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp
libiscsi scsi_transport_iscsi cpufreq_ondemand acpi_cpufreq mperf bonding 8021q garp stp llc ipv6 dm_multipath ipmi_devintf igb
ptp pps_core e1000e raid1 microcode serio_raw i2c_i801 i2c_core sg iTCO_wdt iTCO_vendor_support ioatdma dca i7core_edac edac_core
ext4 mbcache jbd2 raid10 sd_mod crc_t10dif ahci dm_mirror dm_region_hash dm_log dm_mod [last unloaded: configfs]
Pid: 9098, comm: maildrop Not tainted 2.6.32-358.el6.i686 #1 Supermicro X8DTT-H/X8DTT-H
EIP: 0060:[<f8625d75>] EFLAGS: 00210282 CPU: 7
EIP is at gfs2_inplace_reserve+0x605/0x8f0 [gfs2]
EAX: efe514b8 EBX: 00000000 ECX: 00000000 EDX: 00f106dd
ESI: 00000001 EDI: 00f0073b EBP: 00000000 ESP: ea3dde08
DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
Process maildrop (pid: 9098, ti=ea3dc000 task=c1591000 task.ti=ea3dc000)
Stack:
00000001 e8f0f0b8 e8f0f0b8 00000000 00000f50 e8695f50 f86025b7 00001000
<0> e8f0f0b8 00001000 e8695000 e8695ed0 e8f0f0b8 efe514b8 eb8708f0 f24fe000
<0> 00000002 f8603a16 f8602300 00000115 00000000 00000000 f8602300 e7c44b70
Call Trace:
[<f86025b7>] ? gfs2_dirent_scan+0xf7/0x160 [gfs2]
[<f8603a16>] ? gfs2_dirent_search+0xf6/0x1c0 [gfs2]
[<f8602300>] ? gfs2_dirent_find_space+0x0/0x50 [gfs2]
[<f8602300>] ? gfs2_dirent_find_space+0x0/0x50 [gfs2]
[<f861bc7b>] ? gfs2_link+0x1cb/0x290 [gfs2]
[<f861bb0a>] ? gfs2_link+0x5a/0x290 [gfs2]
[<f861bb23>] ? gfs2_link+0x73/0x290 [gfs2]
[<c053d688>] ? vfs_link+0xe8/0x160
[<c0540216>] ? sys_linkat+0xe6/0x110
[<c05475ec>] ? dput+0x9c/0x110
[<c0532b94>] ? __fput+0x164/0x1f0
[<c054cb45>] ? mntput_no_expire+0x15/0xd0
[<c054026f>] ? sys_link+0x2f/0x40
[<c04099bf>] ? sysenter_do_call+0x12/0x28
Code: ff 8b 44 24 20 8b b8 d8 01 00 00 8b 88 dc 01 00 00 8b 44 24 3c 89 fa c7 04 24 01 00 00 00 e8 73 d3 ff ff 8b 5c 24 0c 89 44
24 34 <89> 43 34 8b 44 24 34 c7 44 24 30 01 00 00 00 e9 36 fa ff ff 8d
Environment
- Red Hat Enterprise Linux (RHEL) 6 with the Resilient Storage Add-On
- Issue occurs on kernel-2.6.32-358.el6 and kernel-2.6.32-358.0.1.el6 only.
- See this solution if a very similar issue occurs on kernel-2.6.32-358.2.1.el6
- GFS2 filesystems
- Bug is triggered by truncating files on the filesystem.
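To check quickly whether a node is running one of the kernel builds listed above, a minimal sketch such as the following may help. The helper name is_affected and the prefix-matching approach are illustrative assumptions, not part of any Red Hat tool; the script only compares the running kernel release string against the two builds named in this section and does not check whether GFS2 is actually in use.

```python
import platform

# Kernel builds this article lists as affected (architecture suffix omitted).
AFFECTED_KERNELS = ("2.6.32-358.el6", "2.6.32-358.0.1.el6")


def is_affected(release):
    """Return True if a `uname -r` style string names an affected build.

    Hypothetical helper: matches the exact build string, optionally followed
    by an architecture suffix such as ".x86_64" or ".i686".
    """
    return any(
        release == kernel or release.startswith(kernel + ".")
        for kernel in AFFECTED_KERNELS
    )


if __name__ == "__main__":
    # platform.release() reports the same string as `uname -r`.
    release = platform.release()
    state = "listed as affected" if is_affected(release) else "not listed as affected"
    print(f"{release}: {state}")
```

Matching on the build string plus a trailing dot keeps later errata kernels such as 2.6.32-358.2.1.el6 from being reported as affected by this particular issue.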