System crash in qla2x00_status_entry or blk_{requeue,start,finish}_request after qla2xxx connection issues or aborts

Solution Verified

Issue

  • System crash in qla2x00_status_entry or blk_{requeue,start,finish}_request() after qla2xxx connection issues or aborts.

    • The kernel log shows connection loss or command abort messages before the crash, for example:
      • Abort command issued nexus
      • remote port time out: removing target and saving binding
    • The RIP symbol reported in the panic call trace is one of the following:
      • qla2x00_status_entry
      • blk_requeue_request
      • blk_start_request
      • blk_finish_request
    • Example 1 - Panic in qla2x00_status_entry while completing I/O after an abort
    
    [   67.904651] qla2xxx [0000:19:00.0]-801c:2: Abort command issued nexus=2:1:0 -- 2002.
    ....
    [  341.677920] BUG: unable to handle kernel NULL pointer dereference at 0000000000000084
    [  341.678018] IP: [] qla2x00_status_entry+0x4f7/0x1900 [qla2xxx]
    [  341.678136] PGD 0 
    [  341.678166] Oops: 0000 [#1] SMP 
    ....
    [  341.678749] RIP: 0010:[]  [] qla2x00_status_entry+0x4f7/0x1900 [qla2xxx]
    ...
    [  341.689522] Call Trace:
    [  341.690306]   
    [  341.690318]  [] ? ttwu_do_wakeup+0x19/0xe0
    [  341.691887]  [] qla24xx_process_response_queue+0x4b6/0x8d0 [qla2xxx]
    [  341.692691]  [] ? __wake_up_common_lock+0x91/0xc0
    [  341.693475]  [] ? task_rq_unlock+0x20/0x20
    [  341.694264]  [] qla24xx_msix_rsp_q+0x4b/0xc0 [qla2xxx]
    [  341.695046]  [] __handle_irq_event_percpu+0x44/0x1c0
    [  341.695887]  [] handle_irq_event_percpu+0x32/0x80
    [  341.696696]  [] handle_irq_event+0x3c/0x60
    [  341.697444]  [] handle_edge_irq+0x7f/0x150
    [  341.698187]  [] handle_irq+0xe4/0x1a0
    [  341.698939]  [] ? tick_check_idle+0x8c/0xd0
    [  341.699674]  [] do_IRQ+0x4d/0xf0
    
    • Example 2 - Panic in qla2x00_status_entry while completing I/O after a connection loss
    
    [66744.939657]  rport-13:0-0: blocked FC remote port time out: removing target and saving binding
    [67695.275972]  rport-11:0-30: blocked FC remote port time out: removing target and saving binding
    ...
    [68270.351475] kobject_add_internal failed for 11:0:30:0 (error: -2 parent: target11:0:30)
    [68270.351902] scsi 11:0:30:0: failed to add device: -2
    [68272.291547] BUG: unable to handle kernel NULL pointer dereference at 0000000000000084
    [68272.291977] IP: [] qla2x00_status_entry+0x4f7/0x1900 [qla2xxx]
    ...
    [68272.298823] RIP: 0010:[]  [] qla2x00_status_entry+0x4f7/0x1900 [qla2xxx]
    ...
    [68272.306618] Call Trace:
    [68272.307323]  
    [68272.307340]  [] ? qla2x00_async_event+0x24f/0x1b80 [qla2xxx]
    [68272.308754]  [] qla24xx_process_response_queue+0x4b6/0x8d0 [qla2xxx]
    [68272.309488]  [] ? scsi_io_completion+0x168/0x720
    [68272.310235]  [] qla24xx_msix_rsp_q+0x4b/0xc0 [qla2xxx]
    [68272.310989]  [] __handle_irq_event_percpu+0x44/0x1c0
    [68272.311751]  [] handle_irq_event_percpu+0x32/0x80
    [68272.312521]  [] handle_irq_event+0x3c/0x60
    [68272.313291]  [] handle_edge_irq+0x7f/0x150
    [68272.314066]  [] handle_irq+0xe4/0x1a0
    [68272.314845]  [] ? tick_check_idle+0x8c/0xd0
    [68272.315633]  [] do_IRQ+0x4d/0xf0
    [68272.316425]  [] common_interrupt+0x16a/0x16a
    
    • Example 3 - Panic in blk_requeue_request while requeuing I/O after an abort
    
    [11559.256216] qla2xxx [0000:81:00.0]-801c:13:Abort command issued nexus=13:0:3 -- 2002.
    [11559.256541] qla2xxx [0000:81:00.0]-801c:13: Abort command issued nexus=13:0:3 -- 2002.
    ....
    [11559.257229] sd 13:0:0:3: [sdw] tag#7 FAILED Result: hostbyte=DID_TRANSPORT_DISRUPTED driverbyte=DRIVER_OK cmd_age=20s
    [11559.257230] sd 13:0:0:3: [sdw] tag#7 CDB: Test Unit Ready 00 00 00 00 00 00
    [11559.257237] ------------[ cut here ]------------
    [11559.257258] kernel BUG at block/blk-core.c:1559!
    [11559.257272] invalid opcode: 0000 [#1] SMP 
    ....
    [11559.257605] RIP: 0010:[]  [] blk_requeue_request+0x90/0xa0
    ...
    [11559.257789] Call Trace:
    [11559.257798]   
    [11559.257808]  [] __scsi_queue_insert+0xb6/0x100
    [11559.257826]  [] scsi_softirq_done+0xda/0x160
    [11559.258554]  [] blk_done_softirq+0x96/0xc0
    [11559.259238]  [] __do_softirq+0xf5/0x280
    [11559.259910]  [] call_softirq+0x1c/0x30
    [11559.260574]  [] do_softirq+0x65/0xa0
    ....
    [11559.268980] RIP  [] blk_requeue_request+0x90/0xa0
    [11559.269620]  RSP 
    
    • Example 4 - Panic in blk_start_request while requeuing I/O after an abort
    
    [7698525.714021] qla2xxx [0000:05:00.1]-801c:14: Abort command issued nexus=14:0:1 -- 2002.
    [7698525.714039] qla2xxx [0000:05:00.1]-801c:14: Abort command issued nexus=14:0:26 -- 2002.
    ....
    [7698525.714763] sd 14:0:0:26: [sdbb] tag#30 FAILED Result: hostbyte=DID_TIME_OUT driverbyte=DRIVER_OK cmd_age=30s
    [7698525.714764] sd 14:0:0:26: [sdbb] tag#30 CDB: Write(16) 8a 00 00 00 00 00 89 02 99 30 00 00 01 00 00 00
    [7698525.714765] blk_update_request: I/O error, dev sdbb, sector 2298648880
    [7698525.721041] ------------[ cut here ]------------
    [7698525.721422] kernel BUG at block/blk-core.c:2781!
    [7698525.721722] invalid opcode: 0000 [#1] SMP 
    ....
    [7698525.727048] Workqueue: kblockd scsi_requeue_run_queue
    [7698525.727562] task: ffff9169bc74b180 ti: ffff916a33ba0000 task.ti: ffff916a33ba0000
    [7698525.728096] RIP: 0010:[]  [] blk_start_request+0x45/0x50
    ....
    [7698525.734001] Call Trace:
    [7698525.734618]  [] blk_queue_start_tag+0x11e/0x1d0
    [7698525.735302]  [] ? __scsi_queue_insert+0xd2/0x100
    [7698525.735944]  [] scsi_request_fn+0x23b/0x680
    [7698525.736591]  [] __blk_run_queue+0x39/0x50
    [7698525.737243]  [] blk_run_queue+0x26/0x40
    [7698525.737913]  [] scsi_run_queue+0x258/0x2f0
    [7698525.738587]  [] scsi_requeue_run_queue+0x15/0x20
    [7698525.739256]  [] process_one_work+0x17f/0x440
    [7698525.739934]  [] worker_thread+0x126/0x3c0
    

Environment

  • Red Hat Enterprise Linux 7
  • kernel-3.10.0-1127.el7 or newer
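
Whether a given host falls within the listed environment can be checked by comparing its kernel release string against kernel-3.10.0-1127.el7. The sketch below is an assumption-laden illustration: parse_el7_release and in_affected_range are hypothetical helper names, and only the lower bound stated above is tested, because this article does not state which release, if any, contains a fix.

```python
import re

# Lower bound from the Environment section: kernel-3.10.0-1127.el7
MIN_BUILD = (3, 10, 0, 1127)


def parse_el7_release(release):
    """Parse e.g. '3.10.0-1160.11.1.el7.x86_64' into a tuple of ints.

    Returns None if the string is not a RHEL 7 (el7) kernel release.
    """
    m = re.match(r"(\d+)\.(\d+)\.(\d+)-(\d+)((?:\.\d+)*)\.el7", release)
    if not m:
        return None
    base = tuple(int(g) for g in m.groups()[:4])
    # Optional z-stream components, e.g. '.11.1' becomes (11, 1).
    zstream = tuple(int(p) for p in m.group(5).split(".") if p)
    return base + zstream


def in_affected_range(release):
    """True if the release is an el7 kernel at or above the stated lower bound."""
    parsed = parse_el7_release(release)
    return parsed is not None and parsed >= MIN_BUILD


if __name__ == "__main__":
    import platform
    print(platform.release(), in_affected_range(platform.release()))
```

Run on the host in question, the script prints the running kernel release and whether it is at or above the stated lower bound.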
