NetApp SAN in ALUA mode has intermittent storage hangs on Red Hat Enterprise Linux 5

Solution Verified

Issue

  • We are seeing a variety of errors and inconsistent results when configuring disks under device-mapper-multipath. Commands often time out after 300 seconds, or mpath_prio_alua returns errors. We have also tried configuring the disks in ontap mode, with similar results.

    • For example, from the messages log:
    kernel: sd 1:0:0:215: timing out command, waited 300s
    multipathd: /sbin/mpath_prio_alua exitted with 4
    multipathd: error calling out /sbin/mpath_prio_alua /dev/sdg
    
    • Sometimes multipath fails to create all paths in a device correctly at boot time:
    360a98000375374573624424e59726c49 dm-18 NETAPP,LUN
    [size=60G][features=1 queue_if_no_path][hwhandler=1 alua][rw]
    \_ round-robin 0 [prio=50][enabled]          
     \_ 2:0:0:211 sdm  8:192  [active][ready]      <--- Only 1 path per path-group, instead of the expected 2
    \_ round-robin 0 [prio=10][enabled]
     \_ 1:0:0:211 sdc  8:32   [active][ready]      <--- Only 1 path per path-group, instead of the expected 2
    
    • We have these disks aliased in /etc/multipath.conf, but the aliases are applied inconsistently: sometimes the disks show up with their aliases and sometimes they do not.
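
The missing-path symptom above can be checked mechanically. The following is a minimal sketch, not part of the original solution, that parses RHEL 5 style `multipath -ll` output and reports maps whose path groups hold fewer paths than expected. The function names and the `expected=2` default are assumptions chosen for illustration, and the parser assumes the output is passed in flush-left, exactly as multipath prints it.

```python
import re

# A path line is indented and carries an H:C:T:L address, e.g. " \_ 2:0:0:211 sdm ..."
PATH_RE = re.compile(r"^\s+\\_\s+(\d+:\d+:\d+:\d+)\s+(\S+)")
# A path-group line starts in column 0, e.g. "\_ round-robin 0 [prio=50][enabled]"
GROUP_RE = re.compile(r"^\\_\s+\S+")

# Sample taken from the output in this article (annotations removed).
SAMPLE = r"""360a98000375374573624424e59726c49 dm-18 NETAPP,LUN
[size=60G][features=1 queue_if_no_path][hwhandler=1 alua][rw]
\_ round-robin 0 [prio=50][enabled]
 \_ 2:0:0:211 sdm  8:192  [active][ready]
\_ round-robin 0 [prio=10][enabled]
 \_ 1:0:0:211 sdc  8:32   [active][ready]
"""


def count_paths_per_group(text):
    """Return {map_name: [number of paths in each path group]}."""
    result = {}
    current = None
    for line in text.splitlines():
        # Skip blank lines and the "[size=...]" attribute line.
        if not line.strip() or line.startswith("["):
            continue
        if PATH_RE.match(line):
            if current is not None and result[current]:
                result[current][-1] += 1
        elif GROUP_RE.match(line):
            if current is not None:
                result[current].append(0)
        else:
            # Anything else is treated as a map header (WWID or alias first).
            current = line.split()[0]
            result[current] = []
    return result


def short_groups(text, expected=2):
    """Return only the maps with a path group below the expected path count."""
    return {
        name: counts
        for name, counts in count_paths_per_group(text).items()
        if any(c < expected for c in counts)
    }


if __name__ == "__main__":
    for name, counts in short_groups(SAMPLE).items():
        print(name, counts)
```

Running such a check after each boot would show whether the same maps are affected every time or whether the missing paths vary between boots.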

Environment

  • Red Hat Enterprise Linux 5 (RHEL5)
    • Issue observed on kernel-2.6.18-238.1.1.el5 (RHEL5 update 6). Unknown if other versions are affected.
  • NetApp SAN (Vendor: NetApp; Model: SAN)
    • Configured in alua mode (default is ontap)
    • SAN Controller Firmware is prior to 1.11A5.
  • device-mapper-multipath
    • The default device-mapper-multipath configuration was modified as follows to support alua mode:
                getuid_callout          "/sbin/scsi_id -g -u -s /block/%n"
                prio_callout            "/sbin/mpath_prio_alua /dev/%n"
                hardware_handler        "1 alua"
  • Emulex HBA (but other HBA brands could also be affected).
    • Although there are 2 HBAs connected to the storage, all errors occur through only 1 HBA.
  • Issue may be seen on multiple Red Hat Enterprise Linux hosts connecting to the same SAN port.
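
To confirm that the errors are confined to a single HBA, the timeout messages can be grouped by SCSI host number, which is the first field of the H:C:T:L address in each kernel message. The sketch below is illustrative only: the function name is an assumption, and apart from the two lines quoted in the Issue section, the sample log entries are made up for the example.

```python
import re
from collections import Counter

# Matches messages such as: kernel: sd 1:0:0:215: timing out command, waited 300s
TIMEOUT_RE = re.compile(r"kernel: sd (\d+):(\d+):(\d+):(\d+): timing out command")

SAMPLE_LOG = [
    # Quoted from the Issue section of this article:
    "kernel: sd 1:0:0:215: timing out command, waited 300s",
    "multipathd: /sbin/mpath_prio_alua exitted with 4",
    # Hypothetical additional entry, for illustration only:
    "kernel: sd 1:0:0:211: timing out command, waited 300s",
]


def timeouts_per_host(lines):
    """Count command timeouts per SCSI host number (the H in H:C:T:L)."""
    counts = Counter()
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


if __name__ == "__main__":
    for host, count in sorted(timeouts_per_host(SAMPLE_LOG).items()):
        print("host%d: %d timeouts" % (host, count))
```

In practice the input would be the lines of /var/log/messages. If every timeout falls under one host number, that matches the single-HBA pattern described above.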
