RHEL 6.3: Dell PowerEdge crashes with RIP dio_new_bio during direct I/O write on ext4 filesystem

Solution Unverified - Updated -

Issue

  • Five or six servers rebooted unexpectedly. All affected nodes run the same kernel but are in different locations: two in the same datacenter and the others at another site.
  • The kernel crashes in dio_new_bio with the following message:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
IP: [<ffffffff811b58ff>] dio_new_bio+0x6f/0x130
PGD 869b76067 PUD afac3f067 PMD 0 
Oops: 0002 [#1] SMP 
last sysfs file: /sys/devices/pci0000:00/0000:00:09.0/0000:07:00.0/host2/rport-2:0-1/target2:0:0/2:0:0:10/state
CPU 0 
Modules linked in: dm_round_robin dm_multipath ipmi_si mpt2sas scsi_transport_sas raid_class mptctl mptbase dell_rbu ipmi_devintf ipmi_msghandler bonding 8021q garp stp llc ipv6 power_meter dcdbas microcode serio_raw iTCO_wdt iTCO_vendor_support i7core_edac edac_core ses enclosure sg bnx2 ext4 mbcache jbd2 sr_mod cdrom sd_mod crc_t10dif pata_acpi ata_generic ata_piix qla2xxx scsi_transport_fc scsi_tgt megaraid_sas dm_mirror dm_region_hash dm_log dm_mod [last unloaded: ipmi_si]

Pid: 14578, comm: db2sysc Not tainted 2.6.32-279.2.1.el6.x86_64 #1 Dell Inc. PowerEdge R710/00NH4P
RIP: 0010:[<ffffffff811b58ff>]  [<ffffffff811b58ff>] dio_new_bio+0x6f/0x130
RSP: 0018:ffff88080203d958  EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff8809d103d400 RCX: 0000000000000003
RDX: ffffffff811b5f70 RSI: ffff88084f68dc80 RDI: 0000000000000282
RBP: ffff88080203d988 R08: 00000000000e6d33 R09: 00000000000e6d32
R10: 0000000000000000 R11: 0000000000000000 R12: 00000000000e6d32
R13: 000000000000000c R14: ffff88054e7300c0 R15: 00000000000e6d33
FS:  00007fff9ebfe700(0000) GS:ffff880645400000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000010 CR3: 0000000885f93000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process db2sysc (pid: 14578, threadinfo ffff88080203c000, task ffff880b066e0ae0)
Stack:
 0000000000000008 ffff8809d103d400 ffffea001602a880 0000000000001000
<d> 0000000000000000 00000000000e6d33 ffff88080203d9a8 ffffffff811b5c3f
<d> ffff8809d103d498 ffff8809d103d400 ffff88080203d9f8 ffffffff811b5ce1
Call Trace:
 [<ffffffff811b5c3f>] dio_send_cur_page+0x7f/0xc0
 [<ffffffff811b5ce1>] submit_page_section+0x61/0x140
 [<ffffffffa01433fb>] ? ext4_get_block_dio_write+0xab/0xd0 [ext4]
 [<ffffffff811b65ae>] __blockdev_direct_IO_newtrunc+0x53e/0xb90
 [<ffffffff811b6c5e>] __blockdev_direct_IO+0x5e/0xd0
 [<ffffffffa0143350>] ? ext4_get_block_dio_write+0x0/0xd0 [ext4]
 [<ffffffffa013f8f0>] ? ext4_end_io_dio+0x0/0x120 [ext4]
 [<ffffffffa010f9f6>] ? jbd2_journal_stop+0x1e6/0x2b0 [jbd2]
 [<ffffffffa014244e>] ext4_direct_IO+0x11e/0x310 [ext4]
 [<ffffffffa0143350>] ? ext4_get_block_dio_write+0x0/0xd0 [ext4]
 [<ffffffffa013f8f0>] ? ext4_end_io_dio+0x0/0x120 [ext4]
 [<ffffffff81114e62>] generic_file_direct_write+0xc2/0x190
 [<ffffffff8125a8a7>] ? __blkdev_issue_flush+0xd7/0xe0
 [<ffffffff81116675>] __generic_file_aio_write+0x345/0x480
 [<ffffffffa013c3cd>] ? ext4_sync_file+0x11d/0x260 [ext4]
 [<ffffffff811aa117>] ? vfs_fsync_range+0xb7/0xe0
 [<ffffffff8111681f>] generic_file_aio_write+0x6f/0xe0
 [<ffffffffa013c131>] ext4_file_write+0x61/0x1e0 [ext4]
 [<ffffffff8117ae9a>] do_sync_write+0xfa/0x140
 [<ffffffff810920d0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff81144ad0>] ? unmap_region+0x110/0x130
 [<ffffffff81213266>] ? security_file_permission+0x16/0x20
 [<ffffffff8117b198>] vfs_write+0xb8/0x1a0
 [<ffffffff810d6b12>] ? audit_syscall_entry+0x272/0x2a0
 [<ffffffff8117bc72>] sys_pwrite64+0x82/0xa0
 [<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b
Code: f6 44 39 f0 0f 4e f0 85 f6 0f 8e d0 00 00 00 bf d0 00 00 00 4c 8b b3 c8 00 00 00 e8 8c cd ff ff 41 8d 4d f7 48 c7 c2 70 5f 1b 81 <4c> 89 70 10 49 d3 e4 48 c7 c1 c0 59 1b 81 4c 89 20 44 8b 83 48 
RIP  [<ffffffff811b58ff>] dio_new_bio+0x6f/0x130
 RSP <ffff88080203d958>
CR2: 0000000000000010
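
The "Oops: 0002" line above carries the x86 page-fault error code, which can be read bit by bit. The following is a minimal sketch of that decoding, not part of the original report; the function name decode_oops_code and the constant names are hypothetical and chosen only for illustration.

```python
# Bit meanings of the x86 page-fault error code printed after "Oops:".
# Constant names are illustrative, not taken from the original report.
PF_PROT = 0x01   # 0 = page not present, 1 = protection violation
PF_WRITE = 0x02  # 0 = read access, 1 = write access
PF_USER = 0x04   # 0 = fault in kernel mode, 1 = fault in user mode
PF_RSVD = 0x08   # 1 = reserved bit set in a page-table entry
PF_INSTR = 0x10  # 1 = fault during an instruction fetch


def decode_oops_code(code_hex):
    """Decode the hexadecimal error code from an 'Oops: NNNN' line."""
    code = int(code_hex, 16)
    return {
        "cause": "protection violation" if code & PF_PROT else "page not present",
        "access": "write" if code & PF_WRITE else "read",
        "mode": "user" if code & PF_USER else "kernel",
        "reserved_bit": bool(code & PF_RSVD),
        "instruction_fetch": bool(code & PF_INSTR),
    }


if __name__ == "__main__":
    # Error code taken from the oops message in this report.
    print(decode_oops_code("0002"))
```

For this report the code decodes to a kernel-mode write to a page that is not present. That matches the rest of the dump: RAX is 0, CR2 is 0x10, and the faulting bytes marked in the Code: line (4c 89 70 10) encode a store of R14 to offset 0x10 from RAX.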

Environment

  • Red Hat Enterprise Linux 6.3
    • kernel 2.6.32-279.2.1.el6
  • ext4 filesystem
  • Hardware
    • Mixture of Dell systems: PowerEdge R710, R620, and a 1950.
    • All systems are database servers connected to back-end storage through QLogic HBAs (a mix of 4 Gb and 8 Gb cards).
