Unresponsive storage device leads to excessive SCSI recovery and device-mapper-multipath failover times in RHEL

Solution Unverified - Updated -

Issue

  • My multipath device takes a long time to switch to another path when a storage failure occurs.
  • How can I configure device-mapper-multipath and SCSI devices to fail over more quickly, so that a path failure causes minimal disruption to I/O?
  • A non-responsive SCSI target that returns no transport, link, or other errors and simply lets commands time out triggers SCSI error recovery. That recovery can take a long time, and while it runs dm-multipath cannot fail over to another path. The delay can render an expensive high-availability dual-fabric configuration ineffective and can trigger application timeouts.
  • How can I prevent applications that place a timeout on disk I/O (such as Red Hat High Availability Cluster, Oracle RAC, etc.) from timing out while waiting for a SCSI or multipath device to fail?
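
When investigating the symptoms above, it can help to see which command and error-handler timeouts are currently in effect on each SCSI disk. The following is a minimal sketch, not the resolution from this article: it assumes the standard sysfs attributes /sys/block/sdX/device/timeout and /sys/block/sdX/device/eh_timeout, and the function name read_scsi_timeouts and its sys_root parameter are hypothetical. The eh_timeout attribute may be absent on older kernels, in which case it is reported as None.

```python
import glob
import os


def read_scsi_timeouts(sys_root="/sys"):
    """Collect per-disk SCSI timeouts (in seconds) from sysfs.

    Returns a dict mapping each sd* device name to a dict with the keys
    "timeout" and "eh_timeout". A value is None when the attribute is
    missing or unreadable (assumption: eh_timeout is not present on
    every kernel).
    """
    result = {}
    pattern = os.path.join(sys_root, "block", "sd*")
    for dev_dir in sorted(glob.glob(pattern)):
        name = os.path.basename(dev_dir)
        values = {}
        for attr in ("timeout", "eh_timeout"):
            path = os.path.join(dev_dir, "device", attr)
            try:
                with open(path) as handle:
                    values[attr] = int(handle.read().strip())
            except (OSError, ValueError):
                # Attribute missing, unreadable, or not an integer.
                values[attr] = None
        result[name] = values
    return result
```

Calling read_scsi_timeouts() with no arguments reads the live /sys tree. The sys_root parameter exists only so the function can be pointed at a copy of the tree, for example one extracted from a diagnostic archive.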

Environment

  • Red Hat Enterprise Linux (RHEL) 6
  • Red Hat Enterprise Linux (RHEL) 5
  • device-mapper-multipath
