After qla2xxx "Mailbox command timeout occured" error, adapter fails to recover
Issue
- The first indication of the problem in the messages log: "qla2xxx 0000:0f:00.0: Mailbox command timeout occured. Scheduling ISP abort."
- After the above error, the system can no longer see its SAN connections, and the qla2xxx driver's own recovery process fails to recover the adapter.
- Unloading and reloading the driver normally works, but in a Xen environment the reload fails because of a memory allocation problem at load time (not enough contiguous memory is available within the kernel).
- When the driver can be unloaded and reloaded, the problem is fixed.
Jun 21 19:50:16 node kernel: qla2xxx 0000:0f:00.0: Mailbox command timeout occured. Scheduling ISP abort. eeh_busy: 0x0
Jun 21 19:50:16 node kernel: qla2xxx 0000:0f:00.0: Performing ISP error recovery - ha= ffff88012a1584f8.
Jun 21 19:50:17 node kernel: qla2xxx 0000:0f:00.0: LIP reset occured (f7f7).
Jun 21 19:50:17 node kernel: qla2xxx 0000:0f:00.0: LIP occured (f7f7).
Jun 21 19:50:17 node kernel: qla2xxx 0000:0f:00.0: LIP reset occured (f7f7).
Jun 21 19:50:17 node kernel: qla2xxx 0000:0f:00.0: LOOP UP detected (4 Gbps).
Jun 21 19:50:53 node kernel: qla2xxx 0000:0f:00.0: Mailbox command timeout occured. Issuing ISP abort.
Jun 21 19:50:53 node kernel: qla2xxx 0000:0f:00.0: Performing ISP error recovery - ha= ffff88012a1584f8.
Jun 21 19:51:23 node kernel: qla2xxx 0000:0f:00.0: Failed mailbox send register test
Jun 21 19:51:53 node kernel: rport-8:0-0: blocked FC remote port time out: saving binding
Jun 21 19:51:53 node kernel: qla2xxx 0000:0f:00.0: [ERROR] Failed to load segment 0 of firmware
Jun 21 19:52:23 node kernel: qla2xxx 0000:0f:00.0: [ERROR] Failed to load segment 0 of firmware
Jun 21 19:53:53 node kernel: qla2xxx 0000:0f:00.0: scsi(8:0:0): Abort command issued -- 0 1 2002.
Jun 21 19:53:53 node kernel: qla2xxx 0000:0f:00.0: scsi(8:0:0): DEVICE RESET ISSUED.
- During the QLogic firmware recovery process, a CPU hard lockup may be observed. On x86_64 machines the NMI watchdog may then fire, causing a system reset.
qla2xxx 0000:06:01.0: [ERROR] Failed to load segment 0 of firmware
qla2xxx 0000:06:01.0: qla2x00_abort_isp: **** FAILED ****
qla2xxx 0000:06:01.0: Performing ISP error recovery - ha= ffff8101fed3c4f8.
qla2xxx 0000:06:01.0: [ERROR] Failed to load segment 0 of firmware
qla2xxx 0000:06:01.0: [ERROR] Failed to load segment 0 of firmware
qla2xxx 0000:06:01.0: qla2x00_abort_isp: **** FAILED ****
qla2xxx 0000:06:01.0: Performing ISP error recovery - ha= ffff8101fed3c4f8.
qla2xxx 0000:06:01.0: scsi(2:0:3): DEVICE RESET ISSUED.
qla2xxx 0000:06:01.0: Failed mailbox send register test
qla2xxx 0000:06:01.0: qla2x00_abort_isp: **** FAILED ****
qla2xxx 0000:06:01.0: Performing ISP error recovery - ha= ffff8101fed3c4f8.
NMI Watchdog detected LOCKUP on CPU 1
PID: 602 TASK: ffff8103ffae9830 CPU: 1 COMMAND: "qla2xxx_2_dpc"
#0 [ffff81020710cdc0] crash_kexec at ffffffff800b127a
#1 [ffff81020710ce80] die_nmi at ffffffff80065285
#2 [ffff81020710cea0] nmi_watchdog_tick at ffffffff80065a66
#3 [ffff81020710cef0] default_do_nmi at ffffffff80065609
#4 [ffff81020710cf40] do_nmi at ffffffff800658f1
#5 [ffff81020710cf50] nmi at ffffffff80064ecf
[exception RIP: __delay+10]
RIP: ffffffff8000cb01 RSP: ffff8101feefddd8 RFLAGS: 00000002
RAX: 0000000000002201 RBX: ffff8101fed3c4f8 RCX: 00000000b99134b9
RDX: 0000000000164179 RSI: 0000000000000046 RDI: 000000000000329e
RBP: ffffc2000018c000 R8: 0000000000000002 R9: ffff8101feefdda4
R10: 0000000000000004 R11: ffffffff8022f8cf R12: 0000000000009411
R13: 0000000000000246 R14: ffff8101ffd81be8 R15: ffffffff800a3ab7
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
--- <NMI exception stack> ---
#6 [ffff8101feefddd8] __delay at ffffffff8000cb01
#7 [ffff8101feefddd8] qla24xx_reset_chip at ffffffff88153302 [qla2xxx]
#8 [ffff8101feefde18] qla2x00_abort_isp_cleanup at ffffffff8814f6b4 [qla2xxx]
#9 [ffff8101feefde38] qla2x00_abort_isp at ffffffff88152a34 [qla2xxx]
#10 [ffff8101feefde68] qla2x00_do_dpc at ffffffff8814c55a [qla2xxx]
#11 [ffff8101feefdee8] kthread at ffffffff80032c23
#12 [ffff8101feefdf48] kernel_thread at ffffffff8005dfc1
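The log signatures shown in this section can be checked for mechanically. The following is a minimal sketch, not a supported tool: it assumes the kernel messages are available as plain text lines, and the names `SIGNATURES`, `scan`, and `recovery_failed` are hypothetical helpers introduced only for this illustration. The message fragments are taken from the excerpts above; other driver versions may word them differently.

```python
import re
from collections import defaultdict

# Message fragments taken from the log excerpts in this article.
# "Mailbox command timeout" is matched without the following word so that
# either spelling of "occured"/"occurred" is accepted.
SIGNATURES = {
    "mailbox_timeout": "Mailbox command timeout",
    "register_test_failed": "Failed mailbox send register test",
    "firmware_load_failed": "Failed to load segment 0 of firmware",
    "abort_isp_failed": "qla2x00_abort_isp: **** FAILED ****",
}

# Matches "qla2xxx <pci-address>: <message>". A comma is tolerated in place
# of the dot before the PCI function digit, in case a copied log is damaged.
LINE_RE = re.compile(
    r"qla2xxx (?P<dev>[0-9a-fA-F]{4}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}[.,][0-9a-fA-F]): "
    r"(?P<msg>.*)"
)


def scan(lines):
    """Count each signature per PCI device across the given log lines."""
    counts = defaultdict(lambda: defaultdict(int))
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue  # not a qla2xxx message
        dev = match.group("dev").replace(",", ".")
        msg = match.group("msg")
        for name, fragment in SIGNATURES.items():
            if fragment in msg:
                counts[dev][name] += 1
    return {dev: dict(per_dev) for dev, per_dev in counts.items()}


def recovery_failed(per_dev):
    """Heuristic: a mailbox timeout followed by evidence that ISP recovery
    did not succeed (firmware load failure or a failed abort_isp)."""
    timed_out = per_dev.get("mailbox_timeout", 0) > 0
    not_recovered = (
        per_dev.get("firmware_load_failed", 0) > 0
        or per_dev.get("abort_isp_failed", 0) > 0
    )
    return timed_out and not_recovered
```

As a usage example, `scan(open("/var/log/messages"))` returns a dictionary keyed by PCI address, and passing one of its values to `recovery_failed` flags adapters that show the pattern described in this article. The check is only a heuristic; it does not consider timestamps or the order of the messages.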
Environment
- Red Hat Enterprise Linux
- QLogic qla2xxx driver