Kernel panic in gfs2_inplace_reserve on Red Hat Enterprise Linux 6.4

Solution Unverified - Updated -

Issue

  • Running "service netbackup start" causes a kernel panic when the "NetBackup Audit Manager" starts:
kernel: BUG: unable to handle kernel NULL pointer dereference at 0000000000000060
kernel: IP: [<ffffffffa05e226a>] gfs2_inplace_reserve+0xca/0x7e0 [gfs2]
kernel: PGD bf2ae3067 PUD bf2ae4067 PMD 0
kernel: Oops: 0002 [#1] SMP
kernel: last sysfs file: <random sysfs file>
kernel: CPU 2
kernel: Modules linked in: ebtable_nat ebtables ipt_MASQUERADE iptable_nat nf_nat xt_CHECKSUM iptable_mangle autofs4 gfs2 dlm
configfs sunrpc bridge bonding 8021q garp stp llc xt_physdev ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter
ip_tables ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 vhost_net macvtap
macvlan tun kvm_intel kvm hpwdt hpilo bnx2 ch osst st sg microcode serio_raw iTCO_wdt iTCO_vendor_support i5000_edac edac_core
i5k_amb shpchp sd_mod crc_t10dif ext4 mbcache jbd2 hpsa cciss qla2xxx scsi_transport_fc scsi_tgt radeon ttm drm_kms_helper drm
i2c_algo_bit i2c_core dm_multipath dm_mirror dm_region_hash dm_log dm_mod [last unloaded: speedstep_lib]
kernel:
kernel: Pid: 7453, comm: nbaudit Not tainted 2.6.32-358.el6.x86_64 #1 HP ProLiant BL460c G1
kernel: RIP: 0010:[<ffffffffa05e226a>]  [<ffffffffa05e226a>] gfs2_inplace_reserve+0xca/0x7e0 [gfs2]
kernel: RSP: 0018:ffff880bf2b6dc58  EFLAGS: 00010202
kernel: RAX: 0000000000000001 RBX: ffff880bf8f0c040 RCX: ffff880c14fcd328
kernel: RDX: 000000000036ff01 RSI: 000000000036be8c RDI: ffff880bf8f0c040
kernel: RBP: ffff880bf2b6dd18 R08: 9050000000000000 R09: f3ef4c1ee58f720a
kernel: R10: 00000000000007de R11: 0000000000000004 R12: 0000000000002000
kernel: R13: 0000000000000000 R14: 0000000000000001 R15: 0000000000000000
kernel: FS:  00007ff94f3ea720(0000) GS:ffff880028280000(0000) knlGS:0000000000000000
kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
kernel: CR2: 0000000000000060 CR3: 0000000bf2ae2000 CR4: 00000000000007e0
kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
kernel: Process nbaudit (pid: 7453, threadinfo ffff880bf2b6c000, task ffff880bf29a2ae0)
kernel: Stack:
kernel: ffff880bf2b6ddc8 000000000000000a ffff880bf2b6dc88 ffffffff81096c6f
kernel: <d> ffff880bf2b6dd98 00000007f93f1818 ffff880bf2b6dc98 ffff880c126a8000
kernel: <d> ffff880bf2b6dce8 ffffffffa05ca2a8 ffff880c14fcd328 0000000000000000
kernel: Call Trace:
kernel: [<ffffffff81096c6f>] ? wake_up_bit+0x2f/0x40
kernel: [<ffffffffa05ca2a8>] ? do_promote+0x208/0x330 [gfs2]
kernel: [<ffffffffa05bd06e>] gfs2_setattr_size+0xce/0x210 [gfs2]
kernel: [<ffffffffa05d9534>] gfs2_setattr+0x214/0x330 [gfs2]
kernel: [<ffffffffa05d9366>] ? gfs2_setattr+0x46/0x330 [gfs2]
kernel: [<ffffffff8119e688>] notify_change+0x168/0x340
kernel: [<ffffffff8117f104>] do_truncate+0x64/0xa0
kernel: [<ffffffff81228aeb>] ? selinux_path_truncate+0x7b/0xb0
kernel: [<ffffffff8117f28c>] vfs_truncate+0x14c/0x170
kernel: [<ffffffff8117f30b>] sys_truncate+0x5b/0x70
kernel: [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
kernel: Code: 8b 4d 90 48 8b b2 10 03 00 00 8b 51 34 48 8b 41 28 48 01 c2 48 39 d6 0f 92 c2 48 39 c6 0f 93 c0 0f b6 c0 85 d0 0f
84 70 04 00 00 <49> 89 4d 60 48 89 c8 c7 45 9c 01 00 00 00 49 8d 75 08 49 8d 55
kernel: RIP  [<ffffffffa05e226a>] gfs2_inplace_reserve+0xca/0x7e0 [gfs2]
kernel: RSP <ffff880bf2b6dc58>
kernel: CR2: 0000000000000060
kernel: ---[ end trace 2136841198e5d521 ]---
  • Our HA cluster started crashing after we applied a batch of updates last night. We have traced the problem to the courier-imap packages installed on our servers: whenever the service is started on top of the GFS file systems, the node crashes. The setup has been running for about two years, so we are confident that something in an update broke it. An extract from the crash report is pasted below.
BUG: unable to handle kernel NULL pointer dereference at 00000034
IP: [<f8625d75>] gfs2_inplace_reserve+0x605/0x8f0 [gfs2]
*pdpt = 000000002acce001 *pde = 00000000b06de067 
Oops: 0002 [#1] SMP 
last sysfs file: <random sysfs file>
Modules linked in: gfs2 dlm configfs iptable_filter ip_tables autofs4 sunrpc dm_round_robin be2iscsi iscsi_boot_sysfs bnx2i cnic
uio cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp
libiscsi scsi_transport_iscsi cpufreq_ondemand acpi_cpufreq mperf bonding 8021q garp stp llc ipv6 dm_multipath ipmi_devintf igb
ptp pps_core e1000e raid1 microcode serio_raw i2c_i801 i2c_core sg iTCO_wdt iTCO_vendor_support ioatdma dca i7core_edac edac_core
ext4 mbcache jbd2 raid10 sd_mod crc_t10dif ahci dm_mirror dm_region_hash dm_log dm_mod [last unloaded: configfs]

Pid: 9098, comm: maildrop Not tainted 2.6.32-358.el6.i686 #1 Supermicro X8DTT-H/X8DTT-H
EIP: 0060:[<f8625d75>] EFLAGS: 00210282 CPU: 7
EIP is at gfs2_inplace_reserve+0x605/0x8f0 [gfs2]
EAX: efe514b8 EBX: 00000000 ECX: 00000000 EDX: 00f106dd
ESI: 00000001 EDI: 00f0073b EBP: 00000000 ESP: ea3dde08
 DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
Process maildrop (pid: 9098, ti=ea3dc000 task=c1591000 task.ti=ea3dc000)
Stack:
 00000001 e8f0f0b8 e8f0f0b8 00000000 00000f50 e8695f50 f86025b7 00001000
<0> e8f0f0b8 00001000 e8695000 e8695ed0 e8f0f0b8 efe514b8 eb8708f0 f24fe000
<0> 00000002 f8603a16 f8602300 00000115 00000000 00000000 f8602300 e7c44b70
Call Trace:
 [<f86025b7>] ? gfs2_dirent_scan+0xf7/0x160 [gfs2]
 [<f8603a16>] ? gfs2_dirent_search+0xf6/0x1c0 [gfs2]
 [<f8602300>] ? gfs2_dirent_find_space+0x0/0x50 [gfs2]
 [<f8602300>] ? gfs2_dirent_find_space+0x0/0x50 [gfs2]
 [<f861bc7b>] ? gfs2_link+0x1cb/0x290 [gfs2]
 [<f861bb0a>] ? gfs2_link+0x5a/0x290 [gfs2]
 [<f861bb23>] ? gfs2_link+0x73/0x290 [gfs2]
 [<c053d688>] ? vfs_link+0xe8/0x160
 [<c0540216>] ? sys_linkat+0xe6/0x110
 [<c05475ec>] ? dput+0x9c/0x110
 [<c0532b94>] ? __fput+0x164/0x1f0
 [<c054cb45>] ? mntput_no_expire+0x15/0xd0
 [<c054026f>] ? sys_link+0x2f/0x40
 [<c04099bf>] ? sysenter_do_call+0x12/0x28
Code: ff 8b 44 24 20 8b b8 d8 01 00 00 8b 88 dc 01 00 00 8b 44 24 3c 89 fa c7 04 24 01 00 00 00 e8 73 d3 ff ff 8b 5c 24 0c 89 44
24 34 <89> 43 34 8b 44 24 34 c7 44 24 30 01 00 00 00 e9 36 fa ff ff 8d
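
Both traces above share the same signature: a NULL pointer dereference whose faulting instruction pointer (the "IP:" line) lies in gfs2_inplace_reserve. As a rough aid for checking saved console or /var/log/messages output against that signature, the following Python sketch may help. The function name matches_signature and the three-line look-ahead window are illustrative assumptions, not part of any Red Hat tool.

```python
import re

# Text of the BUG line as it appears in both traces in this article.
NULL_DEREF = "unable to handle kernel NULL pointer dereference"

# Matches the faulting-instruction line, for example:
#   IP: [<ffffffffa05e226a>] gfs2_inplace_reserve+0xca/0x7e0 [gfs2]
# The leading \b keeps "RIP:" and "EIP" lines from matching.
FAULT_IP = re.compile(r"\bIP: \[<[0-9a-f]+>\] (\w+)\+0x[0-9a-f]+/0x[0-9a-f]+")


def matches_signature(log_text):
    """Return True if log_text contains a NULL dereference in gfs2_inplace_reserve.

    Hypothetical helper: it only looks at the few lines that follow each
    BUG line, which is where the "IP:" line sits in the traces above.
    """
    lines = log_text.splitlines()
    for i, line in enumerate(lines):
        if NULL_DEREF not in line:
            continue
        for follow in lines[i + 1:i + 4]:
            m = FAULT_IP.search(follow)
            if m and m.group(1) == "gfs2_inplace_reserve":
                return True
    return False


if __name__ == "__main__":
    import sys
    # Usage: python check_oops.py < saved_console_log.txt
    print(matches_signature(sys.stdin.read()))
```

A match only indicates that the log resembles the traces in this article; the kernel version listed under Environment still needs to be confirmed separately.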

Environment

  • Red Hat Enterprise Linux (RHEL) 6 with the Resilient Storage Add On
    • Issue occurs on kernel-2.6.32-358.el6 and kernel-2.6.32-358.0.1.el6 only.
      • See this solution if a very similar issue occurs on kernel-2.6.32-358.2.1.el6
  • GFS2 filesystems
    • Bug is triggered by truncating files on the filesystem.
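
To check whether a node falls within the environment described above, the running kernel release can be compared against the two affected builds and the mount table scanned for GFS2 filesystems. The Python sketch below is a minimal illustration under stated assumptions: the helper names is_affected_kernel and gfs2_mount_points are hypothetical, and the mount check assumes the usual /proc/mounts layout in which the third whitespace-separated field is the filesystem type.

```python
import platform

# Kernel builds this article lists as affected (see the Environment section).
AFFECTED_BUILDS = ("2.6.32-358.el6", "2.6.32-358.0.1.el6")


def is_affected_kernel(release):
    """Return True if a `uname -r` style string names an affected build.

    `uname -r` appends the architecture (for example ".x86_64" or ".i686"),
    so a release matches when it equals a build or starts with "<build>.".
    """
    return any(
        release == build or release.startswith(build + ".")
        for build in AFFECTED_BUILDS
    )


def gfs2_mount_points(mounts_text):
    """Return mount points of GFS2 filesystems from /proc/mounts style text."""
    points = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # Assumed layout: device, mount point, filesystem type, options, ...
        if len(fields) >= 3 and fields[2] == "gfs2":
            points.append(fields[1])
    return points


if __name__ == "__main__":
    release = platform.release()
    with open("/proc/mounts") as f:
        mounts = gfs2_mount_points(f.read())
    print("kernel:", release, "affected build:", is_affected_kernel(release))
    print("GFS2 mounts:", mounts)
```

A node is only within the scope of this article when both checks are positive; kernel-2.6.32-358.2.1.el6 is deliberately not matched, since the article points to a separate solution for that build.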
