Kernel Panic: unable to handle kernel NULL pointer dereference caused by "vxdmp_submit_bio"

Solution Unverified - Updated -

Environment

  • Red Hat Enterprise Linux 6.2
  • kernel-2.6.32-279.el6
  • kernel-2.6.32-642.15.1.el6
  • Veritas Cluster
  • HP - ProLiant DL560 Gen8

Issue

  • The kernel panics with the following call trace:
BUG: unable to handle kernel NULL pointer dereference at 00000000000002c0
IP: [<ffffffffa08ecc24>] vxdmp_submit_bio+0x24/0x2f0 [vxdmp]
PGD 2fd942c067 PUD 3007b2c067 PMD 3006052067 PTE 0
Oops: 0000 [#1] SMP

Resolution

  • The kernel panic occurred in the "vxdmp" module, which is provided by Veritas.
    Red Hat does not have the source code of the unsigned (U) kernel module (vxdmp), so engage the module vendor for further investigation.

Root Cause

  • The panic occurred when vxdmp_submit_bio() attempted to read from address 0x2c0(%rax). The RAX register held 0x0 at that moment, so the access resolved to address 0x2c0 (the value reported in CR2), which is not mapped, and the kernel panicked.
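
The arithmetic behind this root cause can be checked with a minimal sketch. The register values below are copied from the exception frame in the diagnostic steps; the variable names are only illustrative.

```python
# Effective address of "mov 0x2c0(%rax),%rax" = RAX + 0x2c0 (64-bit wraparound).
MASK64 = (1 << 64) - 1

rax = 0x0000000000000000   # RAX from the exception frame
displacement = 0x2c0       # displacement encoded in the faulting instruction
cr2 = 0x00000000000002c0   # faulting address reported in CR2

effective_address = (rax + displacement) & MASK64
matches_cr2 = effective_address == cr2

print(hex(effective_address))  # 0x2c0
print(matches_cr2)             # True
```

Because RAX is NULL, the faulting address equals the displacement itself, which is why the oops message reports a NULL pointer dereference at 00000000000002c0.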

Diagnostic Steps

  • System information:
crash> sys
        CPUS: 64
        DATE: Sun Jun 18 15:42:29 2017
      UPTIME: 42 days, 07:49:15
LOAD AVERAGE: 3.14, 3.65, 3.52    
       TASKS: 6915
    NODENAME:  server1.com      <<<---
     RELEASE: 2.6.32-642.15.1.el6.x86_64
     VERSION: #1 SMP Mon Feb 20 02:26:38 EST 2017
     MACHINE: x86_64  (2394 Mhz)
      MEMORY: 512 GB
       PANIC: "BUG: unable to handle kernel NULL pointer dereference at 00000000000002c0"   <<<----  
  • The following error messages are seen in the kernel ring buffer:
[..]
VxVM vxdmp V-5-0-0 [Error] i/o error occurred (errno=0x6) on dmpnode 201/0x1950
VxVM vxdmp V-5-0-0 [Error] i/o error occurred (errno=0x6) on dmpnode 201/0x1950
VxVM vxdmp V-5-0-0 [Error] i/o error occurred (errno=0x6) on dmpnode 201/0x1950
VxVM vxdmp V-5-0-0 [Error] i/o error occurred (errno=0x6) on dmpnode 201/0x1950
VxVM vxdmp V-5-0-0 [Error] i/o error occurred (errno=0x6) on dmpnode 201/0x1950
VxVM vxdmp V-5-0-0 [Error] i/o error occurred (errno=0x6) on dmpnode 201/0x1950
VxVM vxdmp V-5-0-0 [Error] i/o error occurred (errno=0x6) on dmpnode 201/0x1950
VxVM vxdmp V-5-0-0 [Error] i/o error occurred (errno=0x6) on dmpnode 201/0x1950
VxVM vxdmp V-5-0-0 [Error] i/o error occurred (errno=0x6) on dmpnode 201/0x1950
VxVM vxdmp V-5-0-0 [Error] i/o error occurred (errno=0x6) on dmpnode 201/0x1950
VxVM vxdmp V-5-0-0 [Error] i/o error occurred (errno=0x6) on dmpnode 201/0x1950
VxVM vxdmp V-5-0-0 [Error] i/o error occurred (errno=0x6) on dmpnode 201/0x1950
VxVM vxdmp V-5-0-0 [Error] i/o error occurred (errno=0x6) on dmpnode 201/0x1950
VxVM vxdmp V-5-0-0 [Error] i/o error occurred (errno=0x6) on dmpnode 201/0x1950
BUG: unable to handle kernel NULL pointer dereference at 00000000000002c0
IP: [<ffffffffa0145a94>] vxdmp_submit_bio+0x24/0x2f0 [vxdmp]     <<<----
PGD 0 
Oops: 0000 [#1] SMP 
last sysfs file: /sys/devices/system/cpu/online
CPU 56 
Modules linked in: oracleacfs(P)(U) oracleadvm(P)(U) oracleoks(P)(U) dm_snapshot dm_bufio mptctl mptbase autofs4 pcc_cpufreq nfs lockd fscache auth_rpcgss nfs_acl sunrpc 8021q garp stp llc bonding ipv6 iTCO_wdt iTCO_vendor_support microcode serio_raw hpilo hpwdt lpc_ich mfd_core ioatdma ixgbe dca ptp pps_core mdio sg power_meter acpi_ipmi ipmi_si ipmi_msghandler shpchp ext4 jbd2 mbcache dmpjbod(P)(U) dmpap(P)(U) dmpaa(P)(U) vxspec(P)(U) vxio(P)(U) vxdmp(P)(U) sd_mod hpsa lpfc scsi_transport_fc scsi_tgt crc_t10dif dm_mirror dm_region_hash dm_log dm_mod [last unloaded: oracleoks]
Pid: 98283, comm: flush-253:19 Tainted: P           -- ------------    2.6.32-642.15.1.el6.x86_64 #1 HP ProLiant DL560 Gen8
RIP: 0010:[<ffffffffa0145a94>]  [<ffffffffa0145a94>] vxdmp_submit_bio+0x24/0x2f0 [vxdmp]
RSP: 0018:ffff8860125a7450  EFLAGS: 00010286
RAX: 0000000000000000 RBX: ffff88685dd9f680 RCX: ffff88685dd9f680
RDX: ffff886008501c00 RSI: 0000000000000286 RDI: ffff88685dd9f680
RBP: ffff8860125a74e0 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000001 R12: ffff88800a45a600
R13: 0000000000000038 R14: ffff88400eb54c00 R15: 0000000000000000
FS:  0000000000000000(0000) GS:ffff8860b0d00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 00000000000002c0 CR3: 0000000001a8d000 CR4: 00000000000407e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process flush-253:19 (pid: 98283, threadinfo ffff8860125a4000, task ffff8849a5148040)
Stack:
 ffff88400eb54c00 ffffffff811d6560 ffff8860125a74b0 ffffffff811b8cc9
<d> ffffffffa015d570 ffffffff811d6580 000000000c901910 ffff88685dd9f680
<d> 0000000000000000 0000000000000038 ffff88400eb54c00 ffff884517f61818
Call Trace:
 [<ffffffff811d6560>] ? bdev_test+0x0/0x20
 [<ffffffff811b8cc9>] ? iget5_locked+0x59/0x1b0
 [<ffffffffa015d570>] ? gen_mq_select_path+0x0/0x20 [vxdmp]
 [<ffffffff811d6580>] ? bdev_set+0x0/0x20
 [<ffffffffa0151c9a>] gendmpstrategy+0x35a/0x430 [vxdmp]
 [<ffffffffa0152cbb>] dmpstrategy+0x1b/0x30 [vxdmp]
 [<ffffffff8127dff0>] generic_make_request+0x240/0x5a0
 [<ffffffff81130d85>] ? mempool_alloc_slab+0x15/0x20
 [<ffffffff81130f23>] ? mempool_alloc+0x63/0x140
 [<ffffffff81278240>] ? __elv_add_request+0x40/0x90
 [<ffffffff8127e3c0>] submit_bio+0x70/0x120
 [<ffffffff811cfead>] submit_bh+0x11d/0x1f0
 [<ffffffff811d2678>] __block_write_full_page+0x1c8/0x330
 [<ffffffff811d1630>] ? end_buffer_async_write+0x0/0x190
 [<ffffffffa06032c0>] ? noalloc_get_block_write+0x0/0x60 [ext4]
 [<ffffffffa06032c0>] ? noalloc_get_block_write+0x0/0x60 [ext4]
 [<ffffffff811d28c0>] block_write_full_page_endio+0xe0/0x120
 [<ffffffffa05fea30>] ? ext4_bh_delay_or_unwritten+0x0/0x30 [ext4]
 [<ffffffff811d2915>] block_write_full_page+0x15/0x20
 [<ffffffffa0604b52>] ext4_writepage+0x172/0x450 [ext4]
 [<ffffffffa0604f77>] mpage_da_submit_io+0x147/0x1d0 [ext4]
 [<ffffffff811d1630>] ? end_buffer_async_write+0x0/0x190
 [<ffffffffa06073ee>] mpage_da_map_and_submit+0x17e/0x470 [ext4]
 [<ffffffff812a1095>] ? radix_tree_gang_lookup_tag_slot+0x95/0xe0
 [<ffffffff811d0f86>] ? __set_page_dirty_buffers+0x46/0xc0
 [<ffffffff811513be>] ? __dec_zone_page_state+0x2e/0x30
 [<ffffffffa0607ba8>] write_cache_pages_da+0x3d8/0x470 [ext4]
 [<ffffffffa05d93d5>] ? jbd2_journal_start+0xb5/0x100 [jbd2]
 [<ffffffffa0607f12>] ext4_da_writepages+0x2d2/0x620 [ext4]
 [<ffffffff811439f1>] do_writepages+0x21/0x40
 [<ffffffff811c705d>] writeback_single_inode+0xdd/0x290
 [<ffffffff811c745d>] writeback_sb_inodes+0xbd/0x170
 [<ffffffff811c75bb>] writeback_inodes_wb+0xab/0x1b0
 [<ffffffff811c79b3>] wb_writeback+0x2f3/0x410
 [<ffffffff8108fbb2>] ? del_timer_sync+0x22/0x30
 [<ffffffff811c7c75>] wb_do_writeback+0x1a5/0x240
 [<ffffffff811c7d73>] bdi_writeback_task+0x63/0x1b0
 [<ffffffff810a6727>] ? bit_waitqueue+0x17/0xd0
 [<ffffffff81152bd0>] ? bdi_start_fn+0x0/0x100
 [<ffffffff81152c56>] bdi_start_fn+0x86/0x100
 [<ffffffff81152bd0>] ? bdi_start_fn+0x0/0x100
 [<ffffffff810a640e>] kthread+0x9e/0xc0
 [<ffffffff8100c28a>] child_rip+0xa/0x20
 [<ffffffff810a6370>] ? kthread+0x0/0xc0
 [<ffffffff8100c280>] ? child_rip+0x0/0x20
Code: 1f 84 00 00 00 00 00 55 48 89 e5 41 57 41 56 41 55 41 54 53 48 83 ec 68 0f 1f 44 00 00 48 8b 47 10 48 89 f9 48 8b 80 90 00 00 00 <48> 8b 80 c0 02 00 00 44 8b b0 34 04 00 00 41 c1 e6 09 45 85 f6 
RIP  [<ffffffffa0145a94>] vxdmp_submit_bio+0x24/0x2f0 [vxdmp]   <<<<----
 RSP <ffff8860125a7450>
CR2: 00000000000002c0
crash> 
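
The repeated "i/o error occurred (errno=0x6)" messages can be decoded with a short sketch. This assumes that vxdmp reports standard Linux errno values, which the log itself does not confirm; only the module vendor can say for certain.

```python
import errno
import os

# Value taken from the "i/o error occurred (errno=0x6)" messages.
# Assumption: vxdmp uses the standard Linux errno numbering.
VXDMP_ERRNO = 0x6

name = errno.errorcode.get(VXDMP_ERRNO, "unknown")
print(name, "-", os.strerror(VXDMP_ERRNO))
```

Under that assumption the value maps to ENXIO, which would suggest that the device behind dmpnode 201/0x1950 was not reachable shortly before the panic.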
  • List of third-party modules:
crash> mod -t | grep -i vx
vxdmp       P(U)
vxio        P(U)
vxspec      P(U)
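
When the module list is long, the taint flags can be summarized with a small sketch like the one below. The helper name parse_mod_taints and the flag descriptions are illustrative assumptions: "P" is read as proprietary and "(U)" as unsigned, matching the wording in the Resolution section.

```python
import re

# Output of "mod -t | grep -i vx" copied from the crash session above.
MOD_T_OUTPUT = """\
vxdmp       P(U)
vxio        P(U)
vxspec      P(U)
"""

# Assumed meanings of the flags as used in this article.
FLAG_MEANINGS = {"P": "proprietary", "U": "unsigned"}


def parse_mod_taints(text):
    """Map each module name to a list of human-readable taint descriptions."""
    result = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or unexpected lines
        name, flags = parts
        letters = re.findall(r"[A-Z]", flags)
        result[name] = [FLAG_MEANINGS.get(c, "unknown") for c in letters]
    return result


taints = parse_mod_taints(MOD_T_OUTPUT)
for name, meanings in taints.items():
    print(f"{name}: {', '.join(meanings)}")
```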
  • Backtrace of the panic task:
crash> bt
PID: 98283  TASK: ffff8849a5148040  CPU: 56  COMMAND: "flush-253:19"
 #0 [ffff8860125a7040] machine_kexec at ffffffff8103fdcb
 #1 [ffff8860125a70a0] crash_kexec at ffffffff810d1dc2
 #2 [ffff8860125a7170] oops_end at ffffffff8154d340
 #3 [ffff8860125a71a0] no_context at ffffffff810518cb
 #4 [ffff8860125a71f0] __bad_area_nosemaphore at ffffffff81051b55
 #5 [ffff8860125a7240] bad_area_nosemaphore at ffffffff81051c23
 #6 [ffff8860125a7250] __do_page_fault at ffffffff8105231c
 #7 [ffff8860125a7370] do_page_fault at ffffffff8154f2ce
 #8 [ffff8860125a73a0] page_fault at ffffffff8154c5d5
    [exception RIP: vxdmp_submit_bio+36]      <<<------------ 
    RIP: ffffffffa0145a94  RSP: ffff8860125a7450  RFLAGS: 00010286
    RAX: 0000000000000000  RBX: ffff88685dd9f680  RCX: ffff88685dd9f680
    RDX: ffff886008501c00  RSI: 0000000000000286  RDI: ffff88685dd9f680
    RBP: ffff8860125a74e0   R8: 0000000000000000   R9: 0000000000000000
    R10: 0000000000000000  R11: 0000000000000001  R12: ffff88800a45a600
    R13: 0000000000000038  R14: ffff88400eb54c00  R15: 0000000000000000
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #9 [ffff8860125a74e8] gendmpstrategy at ffffffffa0151c9a [vxdmp]    <<<-----  
#10 [ffff8860125a7538] dmpstrategy at ffffffffa0152cbb [vxdmp]       <<<----
#11 [ffff8860125a7548] generic_make_request at ffffffff8127dff0
#12 [ffff8860125a7628] submit_bio at ffffffff8127e3c0
#13 [ffff8860125a7678] submit_bh at ffffffff811cfead
#14 [ffff8860125a76a8] __block_write_full_page at ffffffff811d2678
#15 [ffff8860125a7728] block_write_full_page_endio at ffffffff811d28c0
#16 [ffff8860125a7778] block_write_full_page at ffffffff811d2915
#17 [ffff8860125a7788] ext4_writepage at ffffffffa0604b52 [ext4]
#18 [ffff8860125a77d8] mpage_da_submit_io at ffffffffa0604f77 [ext4]
#19 [ffff8860125a78c8] mpage_da_map_and_submit at ffffffffa06073ee [ext4]
#20 [ffff8860125a79a8] write_cache_pages_da at ffffffffa0607ba8 [ext4]
#21 [ffff8860125a7ac8] ext4_da_writepages at ffffffffa0607f12 [ext4]
#22 [ffff8860125a7bc8] do_writepages at ffffffff811439f1
#23 [ffff8860125a7bd8] writeback_single_inode at ffffffff811c705d
#24 [ffff8860125a7c18] writeback_sb_inodes at ffffffff811c745d
#25 [ffff8860125a7c78] writeback_inodes_wb at ffffffff811c75bb
#26 [ffff8860125a7cd8] wb_writeback at ffffffff811c79b3
#27 [ffff8860125a7dd8] wb_do_writeback at ffffffff811c7c75
#28 [ffff8860125a7e68] bdi_writeback_task at ffffffff811c7d73
#29 [ffff8860125a7eb8] bdi_start_fn at ffffffff81152c56
#30 [ffff8860125a7ee8] kthread at ffffffff810a640e
#31 [ffff8860125a7f48] kernel_thread at ffffffff8100c28a
  • Alternatively, the following call trace is also seen in the backtrace of the panic task:
PID: 23022 TASK: ffff882ff77b7500 CPU: 0 COMMAND: "vxconfigd" 
#0 [ffff882fd7531440] machine_kexec at ffffffff8103281b 
#1 [ffff882fd75314a0] crash_kexec at ffffffff810ba662 
#2 [ffff882fd7531570] oops_end at ffffffff81501290 
#3 [ffff882fd75315a0] no_context at ffffffff81043bab 
#4 [ffff882fd75315f0] __bad_area_nosemaphore at ffffffff81043e35 
#5 [ffff882fd7531640] bad_area at ffffffff81043f5e 
#6 [ffff882fd7531670] __do_page_fault at ffffffff81044710 
#7 [ffff882fd7531790] do_page_fault at ffffffff8150326e 
#8 [ffff882fd75317c0] page_fault at ffffffff81500625
     [exception RIP: vxdmp_submit_bio+0x24] 
RIP: ffffffffa08ecc24 RSP: ffff882fd7531878 RFLAGS: 00010296 
RAX: 0000000000000000 RBX: ffff883004dd7800 RCX: ffff883004dd7800 
RDX: ffff885fde900000 RSI: 0000000000000286 RDI: ffff883004dd7800 
RBP: ffff882fd7531908 R8: 0000000000000000 R9: ffff886026fe2400 
R10: 0000000000000100 R11: 0000000051fe600f R12: ffff8830259dce00 
R13: 0000000000000000 R14: ffff882fd9c43800 R15: 00000000000000e0 
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
#9 [ffff882fd7531910] gendmpstrategy at ffffffffa08f8abf [vxdmp] 
#10 [ffff882fd7531960] dmpstrategy at ffffffffa08f9adb [vxdmp]
#11 [ffff882fd7531970] generic_make_request at ffffffff81256dbe 
#12 [ffff882fd7531a40] vxvm_submit_diskio at ffffffffa0a104c4 [vxio] 
#13 [ffff882fd7531ae0] voldmp_strategy at ffffffffa0a0850b [vxio] 
#14 [ffff882fd7531af0] vol_dev_strategy at ffffffffa0a0852b [vxio] 
#15 [ffff882fd7531b00] voldiosio_start at ffffffffa0a1205f [vxio] 
#16 [ffff882fd7531b70] volkcontext_process at ffffffffa0a3092a [vxio] 
#17 [ffff882fd7531bc0] volsiowait at ffffffffa0a63802 [vxio] 
#18 [ffff882fd7531c50] voldio at ffffffffa0aa53c7 [vxio]
#19 [ffff882fd7531d60] vol_voldio_write at ffffffffa0aa5709 [vxio] 
#20 [ffff882fd7531d70] volconfig_ioctl at ffffffffa0aa635f [vxio]
#21 [ffff882fd7531db0] volsioctl_real at ffffffffa0aaea18 [vxio]
#22 [ffff882fd7531e90] vols_ioctl at ffffffffa0137126 [vxspec] 
#23 [ffff882fd7531eb0] vols_compat_ioctl at ffffffffa013734d [vxspec] 
#24 [ffff882fd7531ee0] compat_sys_ioctl at ffffffff811cddad 
#25 [ffff882fd7531f80] sysenter_dispatch at ffffffff8104a820
  • Disassembly up to the exception RIP:
crash> dis -lr vxdmp_submit_bio+0x24 | tail -20
0xffffffffa0145a70 <vxdmp_submit_bio>:  push   %rbp
0xffffffffa0145a71 <vxdmp_submit_bio+1>:    mov    %rsp,%rbp
0xffffffffa0145a74 <vxdmp_submit_bio+4>:    push   %r15
0xffffffffa0145a76 <vxdmp_submit_bio+6>:    push   %r14
0xffffffffa0145a78 <vxdmp_submit_bio+8>:    push   %r13
0xffffffffa0145a7a <vxdmp_submit_bio+10>:   push   %r12
0xffffffffa0145a7c <vxdmp_submit_bio+12>:   push   %rbx
0xffffffffa0145a7d <vxdmp_submit_bio+13>:   sub    $0x68,%rsp
0xffffffffa0145a81 <vxdmp_submit_bio+17>:   nopl   0x0(%rax,%rax,1)
0xffffffffa0145a86 <vxdmp_submit_bio+22>:   mov    0x10(%rdi),%rax
0xffffffffa0145a8a <vxdmp_submit_bio+26>:   mov    %rdi,%rcx
0xffffffffa0145a8d <vxdmp_submit_bio+29>:   mov    0x90(%rax),%rax
0xffffffffa0145a94 <vxdmp_submit_bio+36>:   mov    0x2c0(%rax),%rax
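
As a cross-check, the faulting instruction can also be decoded by hand from the "Code:" line of the oops, where the byte marked <48> is the one RIP pointed at. The sketch below decodes only this single instruction form (REX prefix, opcode, ModRM byte, 32-bit displacement); it is not a general x86-64 disassembler, and the variable names are illustrative.

```python
import struct

# Bytes starting at the <48> marker in the "Code:" line of the oops.
insn = bytes.fromhex("48 8b 80 c0 02 00 00")

rex, opcode, modrm = insn[0], insn[1], insn[2]
rex_w = bool(rex & 0x08)   # REX.W set: 64-bit operand size
mod = modrm >> 6           # 0b10: base register plus 32-bit displacement
reg = (modrm >> 3) & 0x7   # destination register number (0 = rax)
rm = modrm & 0x7           # base register number (0 = rax)
(disp32,) = struct.unpack_from("<i", insn, 3)  # little-endian displacement

# Addresses from the disassembly and the exception frame.
func_start = 0xffffffffa0145a70   # <vxdmp_submit_bio>
rip = 0xffffffffa0145a94          # exception RIP
offset = rip - func_start

print(f"opcode={opcode:#x} mod={mod:#b} reg={reg} rm={rm} disp32={disp32:#x}")
print(f"offset into function: {offset:#x} ({offset})")
```

The decoded fields agree with the disassembly: opcode 0x8b is a 64-bit register load, both register fields select RAX, the displacement is 0x2c0, and the offset 0x24 (36 decimal) matches both vxdmp_submit_bio+0x24 in the oops and vxdmp_submit_bio+36 in the crash output. The two preceding loads, 0x10(%rdi) and then 0x90(%rax), show that the NULL value in RAX was itself read through a pointer chain starting from the first argument in RDI.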

This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.