"BUG: unable to handle kernel paging request at 0000000108aab700" during scsi error handling
Issue
- SAN issues cause the kernel to crash during SCSI error handling, with the following stack trace in the vmcore:
crash> bt
PID: 615 TASK: ffff89ddf4bd4f10 CPU: 0 COMMAND: "scsi_eh_14"
#0 [ffff89be356d3978] machine_kexec at ffffffffae460b2a
#1 [ffff89be356d39d8] __crash_kexec at ffffffffae513402
#2 [ffff89be356d3aa8] crash_kexec at ffffffffae5134f0
#3 [ffff89be356d3ac0] oops_end at ffffffffaeb17768
#4 [ffff89be356d3ae8] no_context at ffffffffaeb06f98
#5 [ffff89be356d3b38] __bad_area_nosemaphore at ffffffffaeb0702f
#6 [ffff89be356d3b88] bad_area_nosemaphore at ffffffffaeb071a0
#7 [ffff89be356d3b98] __do_page_fault at ffffffffaeb1a720
#8 [ffff89be356d3c00] do_page_fault at ffffffffaeb1a915
#9 [ffff89be356d3c30] page_fault at ffffffffaeb16768
[exception RIP: qla2x00_eh_wait_on_command+26]
RIP: ffffffffc04ec76a RSP: ffff89be356d3ce8 RFLAGS: 00010286
RAX: 0000000108aab700 RBX: ffff89d065228000 RCX: ffff89ddfbd7b740
RDX: 00000000000004e3 RSI: ffff89d065228a80 RDI: ffff89d065228000
RBP: ffff89be356d3cf0 R8: ffff89da09b6d080 R9: 0000000000004000
R10: ffff89ddf4857740 R11: 00000000000007d3 R12: 0000000000000000
R13: ffff89ddfbd7b740 R14: 0000000000000000 R15: ffff89d065228000
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
#10 [ffff89be356d3cf8] qla2x00_eh_wait_for_pending_commands at ffffffffc04f027a [qla2xxx]
#11 [ffff89be356d3d40] qla2xxx_eh_bus_reset at ffffffffc04f768a [qla2xxx]
#12 [ffff89be356d3d80] scsi_try_bus_reset at ffffffffae89d406
#13 [ffff89be356d3da8] scsi_eh_ready_devs at ffffffffae89f2f1
#14 [ffff89be356d3e30] scsi_error_handler at ffffffffae8a0acc
#15 [ffff89be356d3ec8] kthread at ffffffffae4bae31
Or, in another crash caused by a similar race condition, the stack trace was:
crash> bt
PID: 618 TASK: ffff9f81fa340000 CPU: 0 COMMAND: "scsi_eh_14"
#0 [ffff9f81fbd03968] machine_kexec at ffffffffa3060b2a
#1 [ffff9f81fbd039c8] __crash_kexec at ffffffffa3113402
#2 [ffff9f81fbd03a98] crash_kexec at ffffffffa31134f0
#3 [ffff9f81fbd03ab0] oops_end at ffffffffa3717768
#4 [ffff9f81fbd03ad8] no_context at ffffffffa3706f98
#5 [ffff9f81fbd03b28] __bad_area_nosemaphore at ffffffffa370702f
#6 [ffff9f81fbd03b78] bad_area_nosemaphore at ffffffffa37071a0
#7 [ffff9f81fbd03b88] __do_page_fault at ffffffffa371a720
#8 [ffff9f81fbd03bf0] do_page_fault at ffffffffa371a915
#9 [ffff9f81fbd03c20] page_fault at ffffffffa3716768
[exception RIP: qla2x00_eh_wait_for_pending_commands+170]
RIP: ffffffffc038324a RSP: ffff9f81fbd03cd8 RFLAGS: 00010046
RAX: 0000000000000286 RBX: ffff9f61f492b040 RCX: 0000000000000000
RDX: 0000000000000338 RSI: 00000000000007d3 RDI: 0000000000000000
RBP: ffff9f81fbd03d10 R8: ffff9f61fc4f7780 R9: ffffffffa31969c7
R10: ffff9f61f450c000 R11: 00000000000007d3 R12: 0000000000000003
R13: ffff9f61f492d740 R14: 0000000000000002 R15: ffff9f5f3ab68a80
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
#10 [ffff9f81fbd03d18] __qla2xxx_eh_generic_reset at ffffffffc038375b [qla2xxx]
#11 [ffff9f81fbd03d80] qla2xxx_eh_device_reset at ffffffffc038390f [qla2xxx]
#12 [ffff9f81fbd03d90] scsi_try_bus_device_reset at ffffffffa349ce6d
#13 [ffff9f81fbd03da8] scsi_eh_ready_devs at ffffffffa349f06f
#14 [ffff9f81fbd03e30] scsi_error_handler at ffffffffa34a0acc
#15 [ffff9f81fbd03ec8] kthread at ffffffffa30bae31
crash>
Environment
- Red Hat Enterprise Linux 7
- kernel - 3.10.0-862.el7
- QLogic HBA (qla2xxx driver)