Unresponsive storage device leads to excessive SCSI recovery and device-mapper-multipath failover times in RHEL
Issue
- My multipath device is taking a long time to switch to another path when a storage failure occurs
- How can I configure device-mapper-multipath and SCSI devices to fail over more quickly, so that there is minimal disruption to I/O during a path failure?
- A non-responsive SCSI target that returns no transport, link, or other errors and simply times out commands triggers SCSI error recovery logic, which may take a long time and blocks dm-multipath from failing over to another path. The delay may render an expensive high-availability dual-fabric configuration ineffective and may trigger application timeouts.
- How can I prevent applications that place a timeout on disk I/O (such as Red Hat High Availability Cluster, Oracle RAC, etc.) from timing out while waiting for a SCSI or multipath device to fail?
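Because the delay described above starts with SCSI commands timing out, it can help to see which timeout values each path device currently uses. The following is a minimal diagnostic sketch, not a fix: it assumes the per-device `timeout` attribute (the command timeout, in seconds) is exposed under `/sys/block/sdX/device/`, and it treats `eh_timeout` as optional because that attribute may not exist on older kernels. The function name `read_scsi_timeouts` and its `sysfs_block` parameter are hypothetical names chosen for illustration, and the script assumes Python 2.6 or later.

```python
import glob
import os


def read_scsi_timeouts(sysfs_block="/sys/block"):
    """Return {device: {"timeout": int or None, "eh_timeout": int or None}}.

    Only sd* block devices are inspected. A value of None means the
    attribute is missing or could not be parsed (for example, eh_timeout
    on a kernel that does not provide it).
    """
    result = {}
    for dev_path in sorted(glob.glob(os.path.join(sysfs_block, "sd*"))):
        entry = {}
        for attr in ("timeout", "eh_timeout"):
            attr_path = os.path.join(dev_path, "device", attr)
            try:
                with open(attr_path) as handle:
                    entry[attr] = int(handle.read().strip())
            except (IOError, OSError, ValueError):
                # Attribute absent or unreadable: report it as unknown.
                entry[attr] = None
        result[os.path.basename(dev_path)] = entry
    return result


if __name__ == "__main__":
    for name, values in sorted(read_scsi_timeouts().items()):
        print("%s timeout=%s eh_timeout=%s"
              % (name, values["timeout"], values["eh_timeout"]))
```

Run on the affected host, this prints one line per SCSI disk. Comparing these values with the I/O timeout enforced by the application or cluster software gives a first indication of whether error recovery can outlast it.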
Environment
- Red Hat Enterprise Linux (RHEL) 6
- Red Hat Enterprise Linux (RHEL) 5
- device-mapper-multipath
