System crashed in lpfc_abort_handler/lpfc_sli_flush_io_rings after HBA reset

Solution Verified

Issue

  • The system crashes with a NULL pointer dereference in lpfc_abort_handler after an HBA reset:
device-mapper: multipath: 253:104: Failing path 66:288.
device-mapper: multipath: 253:90: Failing path 65:336.
 rport-4:0-4: blocked FC remote port time out: removing target and saving binding
lpfc 0000:86:00.0: start 114 end 117 cnt 3
lpfc 0000:86:00.0: 114: [790296.164206] 2:3123 Report dump event to upper layer
lpfc 0000:86:00.0: 115: [790312.624519] 2:(0):3181 dev_loss_callbk x56d380, rport xffffa03da0f94000 flg x0 load_flag x4 refcnt 2 state 8 xpt x2
lpfc 0000:86:00.0: 116: [790312.624534] 2:(0):3182 lpfc_dev_loss_tmo_handler x56d380, nflag x800000 xflags x0 refcnt 3
lpfc 0000:86:00.0: 2:(0):0203 Devloss timeout on WWPN 50:06:0e:80:12:b3:bf:53 NPort x56d380 Data: x800000 x8 xffff refcnt 3
**** lpfc_rport_invalid: NULL ndlp on rport xffffa03da0f94000 SID x0
BUG: unable to handle kernel NULL pointer dereference at 00000000000000b8
PGD 0 P4D 0 
Oops: 0000 [#1] SMP NOPTI
CPU: 23 PID: 1587984 Comm: kworker/u64:1 Kdump: loaded Tainted: G           OE    --------- -  - 4.18.0-513.11.1.el8_9.x86_64 #1
Hardware name: HPE ProLiant DL380 Gen10/ProLiant DL380 Gen10, BIOS U30 07/14/2022
Workqueue: scsi_tmf_4 scmd_eh_abort_handler
RIP: 0010:lpfc_abort_handler+0x1e0/0x560 [lpfc]
....
Call Trace:
 ? __die_body+0x1a/0x60
 ? no_context+0x1ba/0x3f0
 ? __bad_area_nosemaphore+0x16c/0x1c0
 ? do_page_fault+0x37/0x130
 ? page_fault+0x1e/0x30
 ? lpfc_abort_handler+0x1e0/0x560 [lpfc]
 ? __switch_to_asm+0x43/0x80
 scmd_eh_abort_handler+0xcc/0x320
....
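
When comparing a crash against the signature above, it can help to pull the faulting function, offset, and fault address out of the console log automatically. The following is a minimal sketch, not part of the original solution: the helper name summarize_oops and the regular expressions are assumptions written for illustration, and they only cover the "RIP:" and "NULL pointer dereference" line formats shown in this excerpt.

```python
import re

# Assumed pattern for lines such as:
#   RIP: 0010:lpfc_abort_handler+0x1e0/0x560 [lpfc]
RIP_RE = re.compile(
    r"RIP:\s+[0-9a-f]{4}:(?P<func>\w+)\+(?P<off>0x[0-9a-f]+)/(?P<size>0x[0-9a-f]+)"
    r"(?:\s+\[(?P<module>\w+)\])?"
)

# Assumed pattern for lines such as:
#   BUG: unable to handle kernel NULL pointer dereference at 00000000000000b8
NULL_RE = re.compile(r"NULL pointer dereference at (?P<addr>[0-9a-f]+)")


def summarize_oops(log_text):
    """Hypothetical helper: return the faulting function, offset, module,
    and fault address found in an oops excerpt (missing fields are omitted)."""
    summary = {}
    rip = RIP_RE.search(log_text)
    if rip:
        summary["function"] = rip.group("func")
        summary["offset"] = int(rip.group("off"), 16)
        summary["module"] = rip.group("module")
    fault = NULL_RE.search(log_text)
    if fault:
        summary["fault_address"] = int(fault.group("addr"), 16)
    return summary


# Two lines copied from the log excerpt in this article.
sample = """\
BUG: unable to handle kernel NULL pointer dereference at 00000000000000b8
RIP: 0010:lpfc_abort_handler+0x1e0/0x560 [lpfc]
"""

if __name__ == "__main__":
    print(summarize_oops(sample))
```

A matching function name and module are only a first filter; the full backtrace and kernel version should still be compared before concluding that a crash is the same issue.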

Environment

  • Red Hat Enterprise Linux 8
