Paths lost when an ALUA transition occurs on a NetApp SAN with device-mapper-multipath

Solution Unverified

Issue

  • Four nodes are connected to an active/active SAN array (NetApp).

    • When maintenance is performed on one controller of the pair, the partner controller takes over its service and this works fine. However, when the controller is brought back online after the maintenance, there is a short period during which all paths are dead (the time the system needs to reinstate the 2 original paths).
    • In this case, the LVM volumes on top of the multipath devices appear to fail, and the file system is remounted read-only.
    • The various timeout and retry parameters in the HBA firmware are set according to the SAN provider's recommendations, but the problem still occurs.
  • All 4 paths failed suddenly, the cluster services failed over to another host, and the master host was fenced. When the server was checked afterwards, all 4 paths were normal; some paths were seen to recover after about one minute.

    • No other Unix/Wintel servers connected to this NetApp storage were affected during the NetApp failover.

Environment

  • Red Hat Enterprise Linux 5 (RHEL5)
  • Device-mapper-multipath

    • Either using the default settings for the NETAPP LUN SAN (i.e. no ALUA configuration) or otherwise not configured to use the ALUA hardware handler:
    # multipath -ll mpath4
    mpath4 (360a980006465614d52346d347a617775) dm-11 NETAPP,LUN
    [size=70G][features=0][hwhandler=0][rw]           <-- hwhandler=0
    \_ round-robin 0 [prio=8][active]
     \_ 6:0:0:2 sde  8:64   [active][ready]
     \_ 5:0:0:2 sds  65:32  [active][ready]
    \_ round-robin 0 [prio=2][enabled]
     \_ 6:0:1:2 sdl  8:176  [active][ready]
     \_ 5:0:1:2 sdz  65:144 [active][ready]
    
  • NetApp LUN SAN:

    • The SAN is reported as NETAPP LUN in /proc/scsi/scsi:
    # cat /proc/scsi/scsi
    Host: scsi6 Channel: 00 Id: 00 Lun: 00
      Vendor: NETAPP   Model: LUN              Rev: 811a
      Type:   Direct-Access                    ANSI SCSI revision: 05
    Host: scsi6 Channel: 00 Id: 00 Lun: 01
      Vendor: NETAPP   Model: LUN              Rev: 811a
      Type:   Direct-Access                    ANSI SCSI revision: 05
    <.. some data omitted ...>
    
    • The SAN has been configured in ALUA mode.
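
The environment described above combines an array in ALUA mode with multipath maps that report hwhandler=0. The following is a minimal sketch for spotting such maps on a host. It assumes the RHEL 5 style `multipath -ll` layout shown earlier (a header line ending in NETAPP,LUN followed by a line containing [hwhandler=...]); the function name find_netapp_without_hwhandler is hypothetical and not part of any Red Hat or NetApp tool.

```shell
#!/bin/sh
# Hypothetical helper (not a shipped tool): print the names of NETAPP
# multipath maps whose feature line reports hwhandler=0.
# Input: RHEL 5 style `multipath -ll` output on stdin.
find_netapp_without_hwhandler() {
    awk '
        # Map header, e.g. "mpath4 (360a98...) dm-11 NETAPP,LUN":
        # remember the map name and move to the next line.
        /NETAPP,LUN/ { map = $1; next }

        # Feature line of a remembered NETAPP map with no hardware handler.
        /\[hwhandler=0\]/ && map != "" { print map }

        # Any feature line closes the current header, so later maps from
        # other vendors are not attributed to the remembered NETAPP map.
        /\[hwhandler=/ { map = "" }
    '
}
```

It could be run as `multipath -ll | find_netapp_without_hwhandler`; for the sample output above it would list mpath4. An empty result only means that no NETAPP map in the given output shows hwhandler=0; it does not by itself confirm that ALUA handling is configured correctly.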
