qla2xxx crash in qla2xxx_pci_slot_reset() handling recoverable PCI error

Solution Verified

Issue

A hardware problem with a Fibre Channel card controlled by the qla2xxx driver triggers a kernel panic in qla2xxx_pci_slot_reset() while the kernel is recovering from a recoverable PCIe error.

The console messages leading up to the panic:

[695071.936118] qla2xxx [0008:81:00.0]-015b:22: Disabling adapter.         <---|--2 adapters are being disabled
[695072.120124] qla2xxx [0008:81:00.1]-015b:23: Disabling adapter.          <---|
[695077.007624] {1}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 6
[695077.016964] {1}[Hardware Error]: event severity: recoverable
[695077.023383] {1}[Hardware Error]:  Error 0, type: recoverable
[695077.029802] {1}[Hardware Error]:   section_type: PCIe error
[695077.036124] {1}[Hardware Error]:   port_type: 4, root port
[695077.042349] {1}[Hardware Error]:   version: 3.0
[695077.047507] {1}[Hardware Error]:   command: 0x0547, status: 0x4010
[695077.054507] {1}[Hardware Error]:   device_id: 0008:80:00.0
[695077.060731] {1}[Hardware Error]:   slot: 3
[695077.065403] {1}[Hardware Error]:   secondary_bus: 0x81
[695077.071236] {1}[Hardware Error]:   vendor_id: 0x8086, device_id: 0x2030
[695077.078718] {1}[Hardware Error]:   class_code: 000406
[695077.084448] {1}[Hardware Error]:   bridge: secondary_status: 0x6000, control: 0x0003
[695077.095259] pcieport 0008:80:00.0: aer_status: 0x00000000, aer_mask: 0x00000000
[695077.103531] pcieport 0008:80:00.0: aer_layer=Transaction Layer, aer_agent=Receiver ID
[695077.112432] pcieport 0008:80:00.0: aer_uncor_severity: 0x000e7030
[695077.119355] pcieport 0008:80:00.0: broadcast error_detected message
[695077.120564] qla2xxx [0008:81:00.1]-ffff:23: PCI device is disabled,state 2
...
[695078.132086] pcieport 0008:80:00.0: downstream link has been reset
[695078.132091] pcieport 0008:80:00.0: broadcast slot_reset message
[695078.206048] qla2xxx 0008:81:00.0: Refused to change power state, currently in D3
[695078.206403] qla2xxx [0008:81:00.0]-00c7:22: MSI-X: Failed to enable support, giving   up -- 16/-22.
[695078.206405] qla2xxx [0008:81:00.0]-0037:22: Falling back-to MSI mode --22.
[695078.206406] qla2xxx [0008:81:00.0]-0039:22: Falling back-to INTa mode -- -22.
[695078.206755] qla2xxx 0008:81:00.0: cache line size of 64 is not supported
[695078.206765] qla2xxx [0008:81:00.0]-2009:22: Unable to get host loop ID.
[695078.279127] qla2xxx 0008:81:00.1: Refused to change power state, currently in D3
[695078.279398] BUG: unable to handle kernel NULL pointer dereference at           (null)

The kernel panic stack trace:

crash> bt
PID: 323078  TASK: ffff89d9dcac2080  CPU: 0   COMMAND: "kworker/0:5"
 #0 [ffff89da09cef970] machine_kexec at ffffffff97263b34
 #1 [ffff89da09cef9d0] __crash_kexec at ffffffff9731e242
 #2 [ffff89da09cefaa0] crash_kexec at ffffffff9731e330
 #3 [ffff89da09cefab8] oops_end at ffffffff9796e778
 #4 [ffff89da09cefae0] no_context at ffffffff9795cdfe
 #5 [ffff89da09cefb30] __bad_area_nosemaphore at ffffffff9795ce95
 #6 [ffff89da09cefb80] bad_area_nosemaphore at ffffffff9795d006
 #7 [ffff89da09cefb90] __do_page_fault at ffffffff979716d0
 #8 [ffff89da09cefc00] do_page_fault at ffffffff97971925
 #9 [ffff89da09cefc30] page_fault at ffffffff9796d768
    [exception RIP: qla2xxx_pci_slot_reset+0xd3]
    RIP: ffffffffc0d4c613  RSP: ffff89da09cefce0  RFLAGS: 00010246
    RAX: 0000000000000000  RBX: ffff8b5a13129000  RCX: 0000000000000000
    RDX: 00000000000000fe  RSI: 0000000000000286  RDI: ffff9466f946b000
    RBP: ffff89da09cefd00   R8: 0000000000000002   R9: 000000000000fbff
    R10: 0000000000000001  R11: fffffcd5513f9c80  R12: ffff9466f9469740
    R13: ffff9466f946b000  R14: ffff89da18129800  R15: ffff89da18129800
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
#10 [ffff89da09cefd08] report_slot_reset at ffffffff975d8f76
#11 [ffff89da09cefd30] pci_walk_bus at ffffffff975bb7cb
#12 [ffff89da09cefd68] broadcast_error_message at ffffffff975d86b0
#13 [ffff89da09cefda0] do_recovery at ffffffff975d8899
#14 [ffff89da09cefde0] aer_recover_work_func at ffffffff975d8a1c
#15 [ffff89da09cefe20] process_one_work at ffffffff972baf9f
#16 [ffff89da09cefe68] worker_thread at ffffffff972bc036
#17 [ffff89da09cefec8] kthread at ffffffff972c2e81
#18 [ffff89da09ceff50] ret_from_fork_nospec_begin at ffffffff97976c1d

Environment

  • Red Hat Enterprise Linux 7
  • kernel 3.10.0-957.27.2.el7.x86_64
  • QLogic based Fibre Channel interface controlled by qla2xxx driver
