Unresponsive storage device leads to excessive SCSI recovery and device-mapper-multipath failover times in RHEL
Issue
- My multipath device is taking a long time to switch to another path when a storage failure occurs
- How can I configure device-mapper-multipath and SCSI devices to fail over more quickly, so that there is minimal disruption to I/O during a path failure?
- A non-responsive SCSI target that returns no transport, link, or other errors and simply lets commands time out triggers SCSI error recovery, which can take a long time and blocks dm-multipath from failing over to another path. Such delays can render an expensive high-availability dual-fabric configuration ineffective and can trigger application timeouts.
- How can I prevent applications that place a timeout on disk I/O (such as Red Hat High Availability Cluster, Oracle RAC, etc.) from timing out while waiting for a SCSI or multipath device to fail?
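
As a starting point for diagnosing the behaviour described above, it can help to see which per-command SCSI timeout each path device is currently using. The following is a minimal sketch, not a fix: it only reads the `timeout` attribute that the kernel exposes under `/sys/block/<device>/device/`. The function name `show_scsi_timeouts` and its optional root-directory argument are hypothetical and introduced here for illustration. The command timeout is only one contributor to total failover time; the error recovery that follows a timeout adds to it.

```shell
#!/bin/sh
# Print "<device> <timeout-in-seconds>" for every sd* block device.
# The optional first argument is a hypothetical override of the sysfs root,
# added only so the function can be pointed at a test directory.
show_scsi_timeouts() {
    sysfs_root="${1:-/sys}"
    for f in "$sysfs_root"/block/sd*/device/timeout; do
        # Skip the unexpanded glob pattern when no sd* devices exist.
        [ -r "$f" ] || continue
        # Reduce ".../block/sda/device/timeout" to "sda".
        dev=${f#"$sysfs_root"/block/}
        dev=${dev%%/*}
        printf '%s %s\n' "$dev" "$(cat "$f")"
    done
}
```

Calling `show_scsi_timeouts` with no argument reads the live values from `/sys`. Comparing the output with the paths listed by `multipath -ll` shows which timeout applies to each path of a multipath device.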
Environment
- Red Hat Enterprise Linux (RHEL) 6
- Red Hat Enterprise Linux (RHEL) 5
- device-mapper-multipath