hpsa controller driver hangs during error handling
Issue
- hpsa controller driver error handler hangs while resetting the device.
- With the inbuilt hpsa driver, the SCSI host for the hpsa controller was seen stuck in the SHOST_RECOVERY state:
host0 hpsa 0xFFFF88145B5D6800 0xFFFF88145B5D6BA0 - 0x0000000000000000 0xFFFF88145B5D6DE0 0xFFFF88145DE88000
shost_state : <5> SHOST_RECOVERY <----------------
host_busy : 8 ; commands actually active on low-level device
host_failed : 8 ; commands that failed (in above count)
host_eh_scheduled : 0 ; EH scheduled without command
resetting : -1 ; if set, last_reset is a valid value
last_reset : 0000000000000000 ; jiffies?
host_self_blocked : 0 ; Host has requested no further requests come thru for now
host_blocked : 0 ; Host has rejected a command because it was busy
max_host_blocked : 7 ; Value host_block counts down from
tmf_in_progress : 0 ; task management function in progress, typ. due to recovery
async_scan : 0 ; async scan in progress
- With the third-party hpsa driver, devices were taken offline after the message 'Controller lockup detected: 0xffff0000 after 30' was logged:
EXT4-fs (dm-16): mounted filesystem with ordered data mode. Opts:
EXT4-fs (dm-17): mounted filesystem with ordered data mode. Opts:
hpsa 0000:08:00.0: logical_reset scsi 0:1:0:0: Direct-Access HP LOGICAL VOLUME RAID-1(1+0) SSDSmartPathCap- En- Exp=1 qd=1024
hpsa 0000:08:00.0: Controller lockup detected: 0xffff0000 after 30 <-------------
hpsa 0000:08:00.0: Telling controller to do an NMI
hpsa 0000:08:00.0: controller lockup detected: NULL_SDEV_PTR TAG:0x00000000:00003680 LUN:0000004000000000 CDB:01040000000000000000000000000000
hpsa 0000:08:00.0: failed 218 commands in fail_all
hpsa 0000:08:00.0: Controller Lockup detected, controller disabled.
hpsa 0000:08:00.0: resetting scsi 0:1:0:0 failed
hpsa 0000:08:00.0: scsi 0:1:0:1 RESET FAILED, lockup detected
sd 0:1:0:0: Device offlined - not ready after error recovery <--------- devices attached to hpsa controller were taken offline
sd 0:1:0:0: Device offlined - not ready after error recovery
sd 0:1:0:0: Device offlined - not ready after error recovery
[.... ]
sd 0:1:0:0: rejecting I/O to offline device
sd 0:1:0:0: rejecting I/O to offline device
[.... ]
Buffer I/O error on device dm-1, logical block 19931710
lost page write due to I/O error on dm-1 <---- filesystems went to read-only mode
sd 0:1:0:1: rejecting I/O to offline device
Buffer I/O error on device dm-1, logical block 20447266
lost page write due to I/O error on dm-1
sd 0:1:0:1: rejecting I/O to offline device
Buffer I/O error on device dm-1, logical block 20447270
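To check whether a saved kernel log shows the same sequence (controller lockup, failed reset, devices offlined), the messages quoted above can be counted with a short script. This is a minimal sketch, not a supported tool: the names `PATTERNS` and `summarize` are hypothetical, and the regular expressions are taken only from the message wording shown in this article, so other hpsa driver versions may log these events differently.

```python
import re
from collections import Counter

# Patterns are assumptions derived from the log excerpts in this article.
# Matching is case-insensitive because the driver logs both
# "Controller lockup detected" and "Controller Lockup detected".
PATTERNS = {
    "lockup": re.compile(r"hpsa \S+: controller lockup detected", re.IGNORECASE),
    "reset_failed": re.compile(
        r"hpsa \S+: .*(?:resetting .* failed|reset failed)", re.IGNORECASE
    ),
    "offlined": re.compile(r"sd (\d+:\d+:\d+:\d+): Device offlined"),
    "rejected_io": re.compile(
        r"sd (\d+:\d+:\d+:\d+): rejecting I/O to offline device"
    ),
}


def summarize(lines):
    """Count symptom messages and collect the SCSI addresses taken offline."""
    counts = Counter()
    offlined = set()
    for line in lines:
        for name, pattern in PATTERNS.items():
            match = pattern.search(line)
            if match:
                counts[name] += 1
                if name == "offlined":
                    offlined.add(match.group(1))
    return counts, sorted(offlined)


if __name__ == "__main__":
    import sys

    # Usage: python script.py /path/to/saved/messages
    with open(sys.argv[1], errors="replace") as log:
        counts, offlined = summarize(log)
    for name in PATTERNS:
        print(f"{name}: {counts[name]}")
    print("offlined devices:", ", ".join(offlined) or "none")
```

A non-zero `lockup` count followed by `offlined` entries for devices on the same host would be consistent with the third-party driver symptom described above; it does not by itself confirm the root cause.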
Environment
- Red Hat Enterprise Linux 6
- Inbuilt hpsa driver as well as third-party hpsa controller driver