System crash in qla2x00_status_entry or blk_{requeue,start,finish}_request after qla2xxx connection issues or aborts
Issue
- System crash in qla2x00_status_entry or blk_{requeue,start,finish}_request() after qla2xxx connection issues or aborts.
- Connection and abort issues, for example:
Abort command issued nexus
remote port time out: removing target and saving binding
- The RIP symbol reported in the panic backtrace is one of the following:
qla2x00_status_entry
blk_requeue_request
blk_start_request
blk_finish_request
- Example 1 - Completing I/O panic in qla2x00_status_entry post abort
[ 67.904651] qla2xxx [0000:19:00.0]-801c:2: Abort command issued nexus=2:1:0 -- 2002.
....
[ 341.677920] BUG: unable to handle kernel NULL pointer dereference at 0000000000000084
[ 341.678018] IP: [ ] qla2x00_status_entry+0x4f7/0x1900 [qla2xxx]
[ 341.678136] PGD 0
[ 341.678166] Oops: 0000 [#1] SMP
....
[ 341.678749] RIP: 0010:[ ] [ ] qla2x00_status_entry+0x4f7/0x1900 [qla2xxx]
...
[ 341.689522] Call Trace:
[ 341.690306]
[ 341.690318] [ ] ? ttwu_do_wakeup+0x19/0xe0
[ 341.691887] [ ] qla24xx_process_response_queue+0x4b6/0x8d0 [qla2xxx]
[ 341.692691] [ ] ? __wake_up_common_lock+0x91/0xc0
[ 341.693475] [ ] ? task_rq_unlock+0x20/0x20
[ 341.694264] [ ] qla24xx_msix_rsp_q+0x4b/0xc0 [qla2xxx]
[ 341.695046] [ ] __handle_irq_event_percpu+0x44/0x1c0
[ 341.695887] [ ] handle_irq_event_percpu+0x32/0x80
[ 341.696696] [ ] handle_irq_event+0x3c/0x60
[ 341.697444] [ ] handle_edge_irq+0x7f/0x150
[ 341.698187] [ ] handle_irq+0xe4/0x1a0
[ 341.698939] [ ] ? tick_check_idle+0x8c/0xd0
[ 341.699674] [ ] do_IRQ+0x4d/0xf0
- Example 2 - Completing I/O panic in qla2x00_status_entry post connection loss
[66744.939657] rport-13:0-0: blocked FC remote port time out: removing target and saving binding
[67695.275972] rport-11:0-30: blocked FC remote port time out: removing target and saving binding
...
[68270.351475] kobject_add_internal failed for 11:0:30:0 (error: -2 parent: target11:0:30)
[68270.351902] scsi 11:0:30:0: failed to add device: -2
[68272.291547] BUG: unable to handle kernel NULL pointer dereference at 0000000000000084
[68272.291977] IP: [ ] qla2x00_status_entry+0x4f7/0x1900 [qla2xxx]
...
[68272.298823] RIP: 0010:[ ] [ ] qla2x00_status_entry+0x4f7/0x1900 [qla2xxx]
...
[68272.306618] Call Trace:
[68272.307323]
[68272.307340] [ ] ? qla2x00_async_event+0x24f/0x1b80 [qla2xxx]
[68272.308754] [ ] qla24xx_process_response_queue+0x4b6/0x8d0 [qla2xxx]
[68272.309488] [ ] ? scsi_io_completion+0x168/0x720
[68272.310235] [ ] qla24xx_msix_rsp_q+0x4b/0xc0 [qla2xxx]
[68272.310989] [ ] __handle_irq_event_percpu+0x44/0x1c0
[68272.311751] [ ] handle_irq_event_percpu+0x32/0x80
[68272.312521] [ ] handle_irq_event+0x3c/0x60
[68272.313291] [ ] handle_edge_irq+0x7f/0x150
[68272.314066] [ ] handle_irq+0xe4/0x1a0
[68272.314845] [ ] ? tick_check_idle+0x8c/0xd0
[68272.315633] [ ] do_IRQ+0x4d/0xf0
[68272.316425] [ ] common_interrupt+0x16a/0x16a
- Example 3 - Requeue I/O panic in blk_requeue_request post abort
[11559.256216] qla2xxx [0000:81:00.0]-801c:13:Abort command issued nexus=13:0:3 -- 2002.
[11559.256541] qla2xxx [0000:81:00.0]-801c:13: Abort command issued nexus=13:0:3 -- 2002.
....
[11559.257229] sd 13:0:0:3: [sdw] tag#7 FAILED Result: hostbyte=DID_TRANSPORT_DISRUPTED driverbyte=DRIVER_OK cmd_age=20s
[11559.257230] sd 13:0:0:3: [sdw] tag#7 CDB: Test Unit Ready 00 00 00 00 00 00
[11559.257237] ------------[ cut here ]------------
[11559.257258] kernel BUG at block/blk-core.c:1559!
[11559.257272] invalid opcode: 0000 [#1] SMP
....
[11559.257605] RIP: 0010:[ ] [ ] blk_requeue_request+0x90/0xa0
...
[11559.257789] Call Trace:
[11559.257798]
[11559.257808] [ ] __scsi_queue_insert+0xb6/0x100
[11559.257826] [ ] scsi_softirq_done+0xda/0x160
[11559.258554] [ ] blk_done_softirq+0x96/0xc0
[11559.259238] [ ] __do_softirq+0xf5/0x280
[11559.259910] [ ] call_softirq+0x1c/0x30
[11559.260574] [ ] do_softirq+0x65/0xa0
....
[11559.268980] RIP [ ] blk_requeue_request+0x90/0xa0
[11559.269620] RSP
- Example 4 - Requeue I/O panic in blk_start_request post abort
[7698525.714021] qla2xxx [0000:05:00.1]-801c:14: Abort command issued nexus=14:0:1 -- 2002.
[7698525.714039] qla2xxx [0000:05:00.1]-801c:14: Abort command issued nexus=14:0:26 -- 2002.
....
[7698525.714763] sd 14:0:0:26: [sdbb] tag#30 FAILED Result: hostbyte=DID_TIME_OUT driverbyte=DRIVER_OK cmd_age=30s
[7698525.714764] sd 14:0:0:26: [sdbb] tag#30 CDB: Write(16) 8a 00 00 00 00 00 89 02 99 30 00 00 01 00 00 00
[7698525.714765] blk_update_request: I/O error, dev sdbb, sector 2298648880
[7698525.721041] ------------[ cut here ]------------
[7698525.721422] kernel BUG at block/blk-core.c:2781!
[7698525.721722] invalid opcode: 0000 [#1] SMP
....
[7698525.727048] Workqueue: kblockd scsi_requeue_run_queue
[7698525.727562] task: ffff9169bc74b180 ti: ffff916a33ba0000 task.ti: ffff916a33ba0000
[7698525.728096] RIP: 0010:[ ] [ ] blk_start_request+0x45/0x50
....
[7698525.734001] Call Trace:
[7698525.734618] [ ] blk_queue_start_tag+0x11e/0x1d0
[7698525.735302] [ ] ? __scsi_queue_insert+0xd2/0x100
[7698525.735944] [ ] scsi_request_fn+0x23b/0x680
[7698525.736591] [ ] __blk_run_queue+0x39/0x50
[7698525.737243] [ ] blk_run_queue+0x26/0x40
[7698525.737913] [ ] scsi_run_queue+0x258/0x2f0
[7698525.738587] [ ] scsi_requeue_run_queue+0x15/0x20
[7698525.739256] [ ] process_one_work+0x17f/0x440
[7698525.739934] [ ] worker_thread+0x126/0x3c0
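The symptom pattern described in this section (a qla2xxx abort or remote port timeout message, followed later by a panic whose RIP is one of the listed symbols) can be checked against a saved kernel log with a short script. The following is a minimal sketch only: the names matches_signature, PRECURSOR_PATTERNS, RIP_SYMBOLS and RIP_RE are hypothetical, and the regular expressions are assumptions derived from the example logs in this article, not an official detection tool.

```python
import re

# Messages this article lists as preceding the crash (patterns are assumptions
# based on the example logs above).
PRECURSOR_PATTERNS = [
    re.compile(r"qla2xxx .*Abort command issued nexus"),
    re.compile(r"blocked FC remote port time out: removing target and saving binding"),
]

# Symbols this article lists for the RIP line.
RIP_SYMBOLS = (
    "qla2x00_status_entry",
    "blk_requeue_request",
    "blk_start_request",
    "blk_finish_request",
)

# Matches a RIP line that names one of the listed symbols, e.g.
# "RIP: 0010:[ ] [ ] blk_requeue_request+0x90/0xa0".
RIP_RE = re.compile(r"\bRIP\b.*?\b(" + "|".join(RIP_SYMBOLS) + r")\+0x")


def matches_signature(log_text):
    """Return the RIP symbol if a precursor message is later followed by a
    RIP line naming one of the listed symbols; otherwise return None."""
    seen_precursor = False
    for line in log_text.splitlines():
        if any(p.search(line) for p in PRECURSOR_PATTERNS):
            seen_precursor = True
            continue
        match = RIP_RE.search(line)
        if match and seen_precursor:
            return match.group(1)
    return None
```

Here log_text would be the text of a kernel log captured around the crash. A non-None result only indicates that the log resembles the examples in this article; it does not confirm the root cause.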
Environment
- Red Hat Enterprise Linux 7
- kernel-3.10.0-1127.el7 or newer