multipath is showing one faulty path for each LUN in RHEL 5

Solution Verified - Updated -

Environment

  • Red Hat Enterprise Linux (RHEL) 5.10

Issue

  • A RHEL 5.10 system attached to an HP P9500 Storage Array SAN shows 1 of 4 paths per LUN as faulty. For every LUN, it is the second path that has failed:
     2:0:1:0 sdae 65:224 0   [undef][faulty] HP,OPEN-V    
     2:0:1:1 sdaf 65:240 0   [undef][faulty] HP,OPEN-V    
     2:0:1:2 sdag 66:0   0   [undef][faulty] HP,OPEN-V    
     2:0:1:3 sdah 66:16  0   [undef][faulty] HP,OPEN-V    
     2:0:1:4 sdai 66:32  0   [undef][faulty] HP,OPEN-V    
     2:0:1:5 sdaj 66:48  0   [undef][faulty] HP,OPEN-V    
     2:0:1:6 sdak 66:64  0   [undef][faulty] HP,OPEN-V    
     2:0:1:7 sdal 66:80  0   [undef][faulty] HP,OPEN-V    
     2:0:1:8 sdam 66:96  0   [undef][faulty] HP,OPEN-V    
     2:0:1:9 sdan 66:112 0   [undef][faulty] HP,OPEN-V  

Resolution

  • If a closed ELS event was received, that would explain the lack of connectivity. To reinitialize the exchange, issue a LIP on the FC host that owns the faulty paths. For example, for host2:
# echo 1 > /sys/class/fc_host/host2/issue_lip

The above should reinitialize the login exchange and bring the paths back once a successful path_checker iteration completes.
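
The host number used in the example (host2) corresponds to the first field of the H:C:T:L address shown for each faulty path (2:0:1:0 and so on). The following is a minimal sketch of that mapping. The helper name lip_file_for_hctl is hypothetical and not part of any shipped tool. It only prints the control file and does not issue the LIP itself.

```shell
#!/bin/sh
# Sketch: derive the issue_lip control file from a path's H:C:T:L address.
# lip_file_for_hctl is a hypothetical helper name used for illustration.
lip_file_for_hctl() {
    hctl=$1
    host=${hctl%%:*}    # the first field of H:C:T:L is the SCSI host number
    case $host in
        ''|*[!0-9]*)
            echo "invalid H:C:T:L address: $hctl" >&2
            return 1
            ;;
    esac
    echo "/sys/class/fc_host/host${host}/issue_lip"
}

# Example using the first faulty path from the Issue section.
lip_file_for_hctl 2:0:1:0
```

Review the printed path before writing to it, since a LIP affects every remote port reached through that host.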

Root Cause

  • An FCP target port is in the 'Not Present' port_state. It was determined that a configuration change on the port caused the port to go down:
Class Device = "0-3"
  Class Device path = "/sys/class/fc_remote_ports/rport-2:0-3"
    dev_loss_tmo        = "30"
    fast_io_fail_tmo    = "off"
    maxframe_size       = "4294967295 bytes"
    node_name           = "0xffffffffffffffff"
    port_id             = "0xffffffff"
    port_name           = "0x50060e8016049100"
    port_state          = "Not Present"
    roles               = "unknown"
    scsi_target_id      = "1"

If an FCP target port does not respond within dev_loss_tmo, the port is placed in the 'Not Present' state and access to its targets is blocked. The port's signature is retained so that the SCSI target ID binding persists if the port is re-established. All I/O, including the multipathd path_checker, is blocked.
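
To spot such ports quickly, the port_state attribute can be read directly from sysfs. The following is a minimal sketch that lists every remote port whose port_state is not "Online". The helper name list_down_rports and its optional sysfs root argument are assumptions made for illustration. The attribute names are the ones shown in the output above.

```shell
#!/bin/sh
# Sketch: list FC remote ports whose port_state is not "Online".
# list_down_rports is a hypothetical helper. The optional argument is the
# sysfs mount point (default /sys), so it can also be run on a copied tree.
list_down_rports() {
    root=${1:-/sys}
    for dir in "$root"/class/fc_remote_ports/rport-*; do
        # Skip when there are no remote ports or the attribute is unreadable.
        [ -r "$dir/port_state" ] || continue
        state=$(cat "$dir/port_state")
        if [ "$state" != "Online" ]; then
            wwpn=$(cat "$dir/port_name" 2>/dev/null)
            echo "${dir##*/} port_name=${wwpn:-unknown} port_state=$state"
        fi
    done
}

list_down_rports
```

For the port shown above, this would be expected to report rport-2:0-3 with port_state "Not Present".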

Diagnostic Steps

Get the output of the following commands:

# systool -c fc_remote_ports -v 
# systool -c fc_host -v 
# for d in /sys/block/sd*; do echo "${d##*/} - $(cat "$d/device/state")"; done
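
On systems with many LUNs, the device state listing is long. The following is a minimal sketch that prints only the sd devices whose state is something other than "running" (for example offline or blocked). The helper name list_non_running_disks and its optional sysfs root argument are hypothetical.

```shell
#!/bin/sh
# Sketch: print only sd devices whose SCSI device state is not "running".
# list_non_running_disks is a hypothetical helper. The optional argument is
# the sysfs mount point (default /sys).
list_non_running_disks() {
    root=${1:-/sys}
    for dev in "$root"/block/sd*; do
        # Skip when no sd devices exist or the state file is unreadable.
        [ -r "$dev/device/state" ] || continue
        state=$(cat "$dev/device/state")
        if [ "$state" != "running" ]; then
            echo "${dev##*/} - $state"
        fi
    done
}

list_non_running_disks
```

An empty result means every sd device reports the running state, in which case the remote port output is the more useful place to look.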

