Emulex HBA unresponsive during error recovery
Issue
- The lpfc driver takes a very long time to complete error recovery.
- I/O loss and lpfc error events leave filesystems unusable, with messages like the following logged:
Oct 7 21:29:20 hostname kernel: lpfc 0000:04:00.3: 1:0338 IOCB wait timeout error - no wake response Data x3c
Oct 7 21:29:20 hostname kernel: lpfc 0000:04:00.3: 1:(0):0727 TMF FCP_LUN_RESET to TGT 0 LUN 0 failed (0, 0) iocb_flag x204
Oct 7 21:29:20 hostname kernel: lpfc 0000:04:00.3: 1:(0):0713 SCSI layer issued Device Reset (0, 0) return x2007
Oct 7 21:29:22 hostname kernel: lpfc 0000:04:00.2: 0:0338 IOCB wait timeout error - no wake response Data x3c
Oct 7 21:29:22 hostname kernel: lpfc 0000:04:00.2: 0:(0):0727 TMF FCP_LUN_RESET to TGT 0 LUN 0 failed (0, 0) iocb_flag x204
Oct 7 21:29:22 hostname kernel: lpfc 0000:04:00.2: 0:(0):0713 SCSI layer issued Device Reset (0, 0) return x2007
Oct 7 21:30:30 hostname kernel: lpfc 0000:04:00.3: 1:0338 IOCB wait timeout error - no wake response Data x3c
Oct 7 21:30:30 hostname kernel: lpfc 0000:04:00.3: 1:(0):0727 TMF FCP_LUN_RESET to TGT 0 LUN 0 failed (0, 0) iocb_flag x204
Oct 7 21:30:30 hostname kernel: lpfc 0000:04:00.3: 1:(0):0713 SCSI layer issued Device Reset (0, 0) return x2007
Oct 7 21:30:32 hostname kernel: lpfc 0000:04:00.2: 0:0338 IOCB wait timeout error - no wake response Data x3c
Oct 7 21:30:32 hostname kernel: lpfc 0000:04:00.2: 0:(0):0727 TMF FCP_LUN_RESET to TGT 0 LUN 0 failed (0, 0) iocb_flag x204
Oct 7 21:30:32 hostname kernel: lpfc 0000:04:00.2: 0:(0):0713 SCSI layer issued Device Reset (0, 0) return x2007
Oct 7 21:31:30 hostname kernel: lpfc 0000:04:00.3: 1:0338 IOCB wait timeout error - no wake response Data x3c
Oct 7 21:31:30 hostname kernel: lpfc 0000:04:00.3: 1:(0):0727 TMF FCP_LUN_RESET to TGT 0 LUN 17 failed (0, 0) iocb_flag x204
Oct 7 21:31:30 hostname kernel: lpfc 0000:04:00.3: 1:(0):0713 SCSI layer issued Device Reset (0, 17) return x2007
Oct 7 21:31:32 hostname kernel: lpfc 0000:04:00.2: 0:0338 IOCB wait timeout error - no wake response Data x3c
[.... ]
Oct 7 21:32:42 166-L-DB-1A kernel: lpfc 0000:04:00.2: 0:(0):0713 SCSI layer issued Device Reset (0, 0) return x2007
Oct 7 21:33:42 166-L-DB-1A multipathd: vg_oracle1_1: sda - tur checker reports path is down
Oct 7 21:33:42 166-L-DB-1A multipathd: checker failed path 8:0 in map vg_oracle1_1
Oct 7 21:33:42 166-L-DB-1A kernel: lpfc 0000:04:00.2: 0:0338 IOCB wait timeout error - no wake response Data x3c
Oct 7 21:33:42 166-L-DB-1A kernel: lpfc 0000:04:00.2: 0:(0):0727 TMF FCP_TARGET_RESET to TGT 0 LUN 0 failed (0, 0) iocb_flag x204
Oct 7 21:33:42 166-L-DB-1A kernel: lpfc 0000:04:00.2: 0:(0):0700 Bus Reset on target 0 failed
Oct 7 21:33:42 166-L-DB-1A kernel: lpfc 0000:04:00.2: 0:(0):0714 SCSI layer issued Bus Reset Data: x2003
Oct 7 21:33:42 166-L-DB-1A kernel: sd 0:0:0:0: Device offlined - not ready after error recovery <<<<<<<<<
Oct 7 21:33:42 166-L-DB-1A kernel: sd 0:0:0:0: Device offlined - not ready after error recovery
Oct 7 21:33:42 166-L-DB-1A kernel: sd 0:0:0:0: Device offlined - not ready after error recovery
Oct 7 21:33:42 166-L-DB-1A kernel: sd 0:0:0:0: [sda] Unhandled error code
Oct 7 21:33:42 166-L-DB-1A kernel: sd 0:0:0:0: [sda] Result: hostbyte=DID_REQUEUE driverbyte=DRIVER_OK
Oct 7 21:33:42 166-L-DB-1A kernel: sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 12 c7 f1 60 00 00 08 00
Oct 7 21:33:42 166-L-DB-1A kernel: end_request: I/O error, dev sda, sector 315093344
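To see how often each lpfc message repeats per HBA function in an excerpt like the one above, the log lines can be tallied with a short script. This is a minimal sketch, not a Red Hat or Emulex tool: the function name count_lpfc_messages and the regular expression are assumptions written against the line format shown in this article, where a four-digit message number (for example 0338, 0727, 0713) follows the PCI address. Other lpfc versions may format their messages differently.

```python
import re
from collections import Counter

# Assumed line shapes, taken from the excerpt in this article:
#   "... kernel: lpfc 0000:04:00.3: 1:0338 IOCB wait timeout error ..."
#   "... kernel: lpfc 0000:04:00.3: 1:(0):0727 TMF FCP_LUN_RESET ..."
LPFC_LINE = re.compile(
    r"kernel: lpfc (?P<pci>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]): "
    r"\d+:(?:\(\d+\):)?(?P<msg_id>\d{4}) "
)


def count_lpfc_messages(lines):
    """Return a Counter keyed by (pci_address, message_number).

    Hypothetical helper: lines that do not match the assumed lpfc
    format (sd, multipathd, and so on) are skipped.
    """
    counts = Counter()
    for line in lines:
        match = LPFC_LINE.search(line)
        if match:
            counts[(match.group("pci"), match.group("msg_id"))] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage: python lpfc_tally.py < /var/log/messages
    for (pci, msg_id), n in sorted(count_lpfc_messages(sys.stdin).items()):
        print(f"{pci} {msg_id} {n}")
```

A steadily growing count of 0338 and 0727 on the same PCI function, as in the excerpt, suggests that reset attempts are repeatedly timing out rather than completing.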
Environment
- Red Hat Enterprise Linux 6
- Red Hat Enterprise Linux 5.8
- lpfc driver
- FCoE, Fibre Channel