Kernel panic: "Unable to handle kernel paging request at 0000000000200200" with RIP in list_del+0xb/0x6b
Environment
- Red Hat Enterprise Linux 5.X
- VMware
Issue
- The server reboots with the panic message "Unable to handle kernel paging request at 0000000000XXXXXX RIP: ".
- The RIP is in list_del+0xb/0x6b, called from the vmxnet3_rq_destroy_all function of the vmxnet3 module.
Resolution
The exception occurred in the unsigned vmxnet3 kernel module. In RHEL 5.X, the vmxnet3 module is provided by VMware. Contact the VMware technical team for further investigation.
Root Cause
The panic is caused by list_del() dereferencing an invalid address (0x200200) when it is called from vmxnet3_rq_destroy_all(), a function of the vmxnet3 module.
Diagnostic Steps
- Analysis of vmcore

    crash> sys | grep -e RELEASE -e PANIC
    RELEASE: 2.6.18-426.el5
    PANIC: "Unable to handle kernel paging request at 0000000000200200"

    crash> sys -i |head -n 5
    dmi_ident[1]: Phoenix Technologies LTD
    dmi_ident[2]: 6.00
    dmi_ident[3]: 09/21/2015
    dmi_ident[4]: VMware, Inc.
    dmi_ident[5]: VMware Virtual Platform
- Kernel ring buffer

    Unable to handle kernel paging request at 0000000000200200 RIP: [<ffffffff80162d7a>] list_del+0xb/0x6b <<<<<<<
    PGD 8000000383e03067 PUD 5a3573067 PMD 0
    Oops: 0000 [1] SMP
    last sysfs file: /devices/pci0000:00/0000:00:00.0/irq
    CPU 1
    Modules linked in: tcp_diag inet_diag nfs nfs_acl netconsole lockd sunrpc be2iscsi ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp bnx2i cnic ipv6 xfrm_nalgo crypto_api uio cxgb3i libcxgbi cxgb3 8021q libiscsi_tcp libiscsi2 scsi_transport_iscsi2 scsi_transport_iscsi vsock(U) vmmemctl(U) acpiphp dm_multipath scsi_dh video backlight sbs power_meter hwmon i2c_ec dell_wmi wmi button battery asus_acpi acpi_memhotplug ac lp pvscsi(U) sg pcspkr ide_cd serio_raw tpm_tis i2c_piix4 parport_pc floppy parport tpm cdrom tpm_bios i2c_core vmci(U) vmxnet3(U) dm_raid45 dm_message dm_region_hash dm_mem_cache dm_snapshot dm_zero dm_mirror dm_log dm_mod ahci ata_piix libata shpchp mptspi mptscsih mptbase scsi_transport_spi sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd
    Pid: 51, comm: events/1 Tainted: G -------------------- 2.6.18-426.el5 #1
    RIP: 0010:[<ffffffff80162d7a>] [<ffffffff80162d7a>] list_del+0xb/0x6b <<<<<<<
    RSP: 0018:ffff81303fb59ad0 EFLAGS: 00010082
    RAX: 0000000000000013 RBX: 0000000000000087 RCX: 0000000000200200
    RDX: 0000000000000001 RSI: ffff81303d1c2180 RDI: ffff81303d1c2180
    RBP: ffff81303d1c2000 R08: 0000000000000000 R09: 0000000000000000
    R10: 00000000aba51f9d R11: 0000000000000006 R12: ffff81303fb59b34
    R13: ffff81303d1c2500 R14: 0000000000000032 R15: ffff81303f79dcc0
    FS: 0000000000000000(0000) GS:ffffffff8043d080(0000) knlGS:0000000000000000
    CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
    CR2: 0000000000200200 CR3: 0000001474426000 CR4: 00000000000006a0
    Process events/1 (pid: 51, threadinfo ffff81303fb58000, task ffff81183fa6d830)
    Stack: ffff81303d1c2500 ffffffff88202e80 ffff81303d1c2500 ffff81303d1c2000
     ffff81303f79dcc0 ffff811c4b5a2bc0 ffffffff80511885 ffffffff8024d007
     0000000000000019 000000000000002f 0000000000000000 ffffffff80511885
    Call Trace:
     [<ffffffff88202e80>] :vmxnet3:vmxnet3_rq_destroy_all+0x824/0x1138
     [<ffffffff8024d007>] netpoll_poll_dev+0xa2/0x36c
     [<ffffffff8024d3ab>] netpoll_send_skb_on_dev+0xda/0xef
     [<ffffffff8868f0e1>] :netconsole:write_msg+0x49/0x60
     [<ffffffff8009a372>] __call_console_drivers+0x5b/0x69
     [<ffffffff800193d4>] release_console_sem+0x143/0x205
     [<ffffffff8009ab67>] vprintk+0x2b2/0x317
     [<ffffffff80115697>] proc_mkdir_mode+0x4c/0x63
     [<ffffffff800c67ee>] register_handler_proc+0x9e/0xb0
     [<ffffffff8009ac1e>] printk+0x52/0xbd
     [<ffffffff800c541e>] setup_irq+0x186/0x1cf
     [<ffffffff882030c0>] :vmxnet3:vmxnet3_rq_destroy_all+0xa64/0x1138
     [<ffffffff800c5517>] request_irq+0xb0/0xd6
     [<ffffffff882034ea>] :vmxnet3:vmxnet3_rq_destroy_all+0xe8e/0x1138
     [<ffffffff88202825>] :vmxnet3:vmxnet3_rq_destroy_all+0x1c9/0x1138
     [<ffffffff88203e80>] :vmxnet3:vmxnet3_activate_dev+0x3c/0x1cc
     [<ffffffff8820491c>] :vmxnet3:vmxnet3_set_ringsize+0xf8/0x1068
     [<ffffffff88204c68>] :vmxnet3:vmxnet3_set_ringsize+0x444/0x1068
     [<ffffffff8004fd93>] run_workqueue+0x9e/0xfb
     [<ffffffff8004c5e6>] worker_thread+0x0/0x122
     [<ffffffff8004c6d6>] worker_thread+0xf0/0x122
     [<ffffffff80095673>] default_wake_function+0x0/0xe
     [<ffffffff80034f33>] kthread+0xfe/0x132
     [<ffffffff8006bd41>] child_rip+0xa/0x11
     [<ffffffff80034e35>] kthread+0x0/0x132
     [<ffffffff8006bd37>] child_rip+0x0/0x11
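When many console logs have to be triaged for this signature, the first line of the oops can be matched mechanically. The following is a minimal Python sketch, not a supported tool; the function name parse_oops_header and the regular expression are assumptions written against the message format shown in this article only.

```python
import re

# Matches the first line of the oops as it appears in this article.
# The pattern is an assumption based on that one sample.
OOPS_HEADER = re.compile(
    r"Unable to handle kernel paging request at (?P<addr>[0-9a-f]+)\s+RIP:\s+"
    r"\[<(?P<rip>[0-9a-f]+)>\]\s+(?P<symbol>\w+)\+0x(?P<offset>[0-9a-f]+)"
    r"/0x(?P<size>[0-9a-f]+)"
)


def parse_oops_header(text):
    """Return fault address, RIP, and symbol details, or None if no match."""
    match = OOPS_HEADER.search(text)
    if match is None:
        return None
    return {
        "fault_address": int(match.group("addr"), 16),
        "rip": int(match.group("rip"), 16),
        "symbol": match.group("symbol"),
        "offset": int(match.group("offset"), 16),
        "size": int(match.group("size"), 16),
    }


if __name__ == "__main__":
    sample = (
        "Unable to handle kernel paging request at 0000000000200200 RIP: "
        "[<ffffffff80162d7a>] list_del+0xb/0x6b"
    )
    print(parse_oops_header(sample))
```

A log matches this article's signature when the returned symbol is list_del and the fault address is 0x200200.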
- Backtrace of the panic task indicates that the panic occurred in the function list_del()

    crash> bt
    PID: 51 TASK: ffff81183fa6d830 CPU: 1 COMMAND: "events/1"
     #0 [ffff81303fb59830] crash_kexec at ffffffff800b76e9
     #1 [ffff81303fb598f0] __die at ffffffff80066eb7
     #2 [ffff81303fb59930] do_page_fault at ffffffff80069425
     #3 [ffff81303fb59a20] error_exit at ffffffff800668d8
        [exception RIP: list_del+11] <<<<<<<<
        RIP: ffffffff80162d7a RSP: ffff81303fb59ad0 RFLAGS: 00010082
        RAX: 0000000000000013 RBX: 0000000000000087 RCX: 0000000000200200
        RDX: 0000000000000001 RSI: ffff81303d1c2180 RDI: ffff81303d1c2180
        RBP: ffff81303d1c2000 R8: 0000000000000000 R9: 0000000000000000
        R10: 00000000aba51f9d R11: 0000000000000006 R12: ffff81303fb59b34
        R13: ffff81303d1c2500 R14: 0000000000000032 R15: ffff81303f79dcc0
        ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
     #4 [ffff81303fb59ad8] vmxnet3_rq_destroy_all at ffffffff88202e80 [vmxnet3] <<<<<<<<
     #5 [ffff81303fb59bb8] __call_console_drivers at ffffffff8009a372
     #6 [ffff81303fb59bd8] release_console_sem at ffffffff800193d4
     #7 [ffff81303fb59c08] vprintk at ffffffff8009ab67
     #8 [ffff81303fb59c88] printk at ffffffff8009ac1e
     #9 [ffff81303fb59d78] vmxnet3_rq_destroy_all at ffffffff882034ea [vmxnet3]
    #10 [ffff81303fb59f48] kernel_thread at ffffffff8006bd41
- Disassembly of exception RIP: list_del+11

    crash> dis -rl ffffffff80162d7a
    /usr/src/debug/kernel-2.6.18/linux-2.6.18-426.el5.x86_64/lib/list_debug.c: 61
    0xffffffff80162d6f <list_del>: sub $0x8,%rsp
    /usr/src/debug/kernel-2.6.18/linux-2.6.18-426.el5.x86_64/lib/list_debug.c: 62
    0xffffffff80162d73 <list_del+4>: mov 0x8(%rdi),%rcx
    /usr/src/debug/kernel-2.6.18/linux-2.6.18-426.el5.x86_64/lib/list_debug.c: 61
    0xffffffff80162d77 <list_del+8>: mov %rdi,%rsi
    /usr/src/debug/kernel-2.6.18/linux-2.6.18-426.el5.x86_64/lib/list_debug.c: 62
    0xffffffff80162d7a <list_del+11>: mov (%rcx),%rdx <<<<<<< Kernel crashed here
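The oops prints symbol offsets in hexadecimal (list_del+0xb/0x6b), while crash prints them in decimal (list_del+11), so it can help to confirm that both notations point at the same instruction. The sketch below is illustrative Python; the helper name resolve and the SYMBOLS table are hypothetical, and the start addresses are the ones printed by the dis and sym commands in this article.

```python
# Symbol start addresses as printed in this article's crash session.
SYMBOLS = {
    "list_del": 0xffffffff80162d6f,
    "vmxnet3_rq_destroy_all": 0xffffffff8820265c,
}


def resolve(expr):
    """Resolve 'symbol+offset' or 'symbol+offset/size' to an address.

    The offset may be decimal (crash style) or 0x-prefixed hex (oops style).
    """
    name, _, rest = expr.partition("+")
    offset = rest.split("/")[0]  # drop the trailing /size part if present
    return SYMBOLS[name] + (int(offset, 0) if offset else 0)


if __name__ == "__main__":
    for expr in ("list_del+0xb/0x6b", "list_del+11",
                 "vmxnet3_rq_destroy_all+0x824/0x1138",
                 "vmxnet3_rq_destroy_all+2084"):
        print(expr, "->", hex(resolve(expr)))
```

Both list_del expressions resolve to the exception RIP, and both vmxnet3_rq_destroy_all expressions resolve to the return address of the callq to list_del.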
- The corresponding kernel source lib/list_debug.c

    ...
    54 /**
    55  * list_del - deletes entry from list.
    56  * @entry: the element to delete from the list.
    57  * Note: list_empty on entry does not return true after this, the entry is
    58  * in an undefined state.
    59  */
    60 void list_del(struct list_head *entry)
    61 {
    62         if (unlikely(entry->prev->next != entry)) {
    ....                               ^
                                       |_____kernel crashed here
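For background, upstream Linux defines LIST_POISON1 (0x00100100) and LIST_POISON2 (0x00200200) in include/linux/poison.h, and list_del() writes them into entry->next and entry->prev after unlinking an entry. A fault at 0x200200 in the check on line 62 is therefore consistent with list_del() being called on an entry that had already been deleted, although the vmcore alone does not prove that this is what the driver did. The Python model below is only a sketch of that scenario: the class and function names are hypothetical, and the poison writes are assumed from upstream rather than taken from the source excerpt above.

```python
LIST_POISON1 = 0x00100100  # upstream value stored in entry->next after list_del()
LIST_POISON2 = 0x00200200  # upstream value stored in entry->prev after list_del()


class PageFault(Exception):
    """Stands in for 'Unable to handle kernel paging request'."""


class ListHead:
    """Toy model of struct list_head; an empty list points at itself."""

    def __init__(self):
        self.next = self
        self.prev = self


def deref(ptr):
    """Follow a pointer; integer poison values are not mapped memory."""
    if not isinstance(ptr, ListHead):
        raise PageFault("Unable to handle kernel paging request at %016x" % ptr)
    return ptr


def list_add(new, head):
    """Insert new right after head."""
    nxt = head.next
    nxt.prev = new
    new.next = nxt
    new.prev = head
    head.next = new


def list_del(entry):
    """Model of the debug list_del(): check, unlink, then poison."""
    if deref(entry.prev).next is not entry:  # the check on source line 62
        raise RuntimeError("list_del corruption")
    entry.next.prev = entry.prev
    entry.prev.next = entry.next
    entry.next = LIST_POISON1  # assumed from upstream
    entry.prev = LIST_POISON2  # assumed from upstream


if __name__ == "__main__":
    head, entry = ListHead(), ListHead()
    list_add(entry, head)
    list_del(entry)        # first delete succeeds and poisons the entry
    try:
        list_del(entry)    # second delete dereferences the poisoned prev
    except PageFault as fault:
        print(fault)
```

In this model the second list_del() faults while reading entry->prev->next, which is the same access that the disassembly marks as the crash point.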
- The panic occurred while dereferencing the address stored in the register %rcx.

    0xffffffff80162d7a <list_del+11>: mov (%rcx),%rdx

- The address stored in the register %rcx at <list_del+11> is 0000000000200200.

- The address in register %rcx is loaded at <list_del+4> from offset 0x8 of %rdi.

    0xffffffff80162d73 <list_del+4>: mov 0x8(%rdi),%rcx
- The address in register %rdi is passed from the function vmxnet3_rq_destroy_all().

    crash> dis -rl ffffffff88202e80 | tail -n 3
    0xffffffff88202e74 <vmxnet3_rq_destroy_all+2072>: lea 0x180(%rbp),%rdi
    0xffffffff88202e7b <vmxnet3_rq_destroy_all+2079>: callq 0xffffffff80162d6f <list_del>
    0xffffffff88202e80 <vmxnet3_rq_destroy_all+2084>: lock btrl $0x5,0x40(%rbp)

    crash> px list_del
    list_del = $1 =
     {void (struct list_head *)} 0xffffffff80162d6f <list_del>

    crash> struct list_head 0xffff81303d1c2180
    struct list_head {
      next = 0x100100,
      prev = 0x200200 <<<<<<< /* Invalid address */
    }

    crash> px (0xffff81303d1c2180+0x8)
    $2 = 0xffff81303d1c2188

    crash> rd 0xffff81303d1c2188
    ffff81303d1c2188: 0000000000200200
                      ^
                      |_____Invalid address
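The register chain in the steps above can be cross-checked with simple address arithmetic. The following Python sketch assumes the x86_64 layout of struct list_head (two 8-byte pointers, next then prev); the class and variable names are illustrative, and the register values are copied from the exception frame in the backtrace.

```python
import ctypes


class ListHead(ctypes.Structure):
    """Assumed x86_64 layout of struct list_head: two 8-byte pointers."""
    _fields_ = [("next", ctypes.c_uint64), ("prev", ctypes.c_uint64)]


RBP = 0xffff81303d1c2000  # from the exception frame
RDI = 0xffff81303d1c2180  # from the exception frame

# lea 0x180(%rbp),%rdi in vmxnet3_rq_destroy_all()
computed_rdi = RBP + 0x180

# mov 0x8(%rdi),%rcx in list_del() reads entry->prev
prev_address = computed_rdi + ListHead.prev.offset

if __name__ == "__main__":
    print("rdi from lea:      ", hex(computed_rdi))
    print("matches frame RDI: ", computed_rdi == RDI)
    print("address of ->prev: ", hex(prev_address))
```

The computed address of entry->prev is the same location that the rd command reads, where the vmcore holds 0x200200.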
- The function vmxnet3_rq_destroy_all() is part of the unsigned (U) module [vmxnet3].

    crash> sym vmxnet3_rq_destroy_all
    ffffffff8820265c (t) vmxnet3_rq_destroy_all [vmxnet3]

    crash> mod -t | grep -e NAME -e vmxnet3
    NAME LICENSE_GPLOK
    vmxnet3 40(U)

    crash> module.state,name,version,srcversion,gpgsig_ok ffffffff8820bc00
      state = MODULE_STATE_LIVE
      name = "vmxnet3\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\..."
      version = 0xffff81303f780440 "1.4.2.0"
      srcversion = 0x0
      gpgsig_ok = 0
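The same (U) flag appears in the "Modules linked in:" line of the oops, so the unsigned modules can be listed from a console log even when no vmcore is available. This is a minimal Python sketch; the function name unsigned_modules is hypothetical, and the sample string is an abridged copy of the module list from the kernel ring buffer.

```python
import re


def unsigned_modules(modules_line):
    """Return the names of modules that carry the (U) flag."""
    return re.findall(r"(\w+)\(U\)", modules_line)


if __name__ == "__main__":
    # Abridged "Modules linked in:" line from the kernel ring buffer.
    abridged = (
        "scsi_transport_iscsi vsock(U) vmmemctl(U) acpiphp lp pvscsi(U) sg "
        "i2c_core vmci(U) vmxnet3(U) dm_raid45"
    )
    print(unsigned_modules(abridged))
```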