Multipath marks paths as "ghost" paths and "Buffer I/O error" messages appear in logs

Solution In Progress - Updated -

Environment

  • Red Hat Enterprise Linux 5
  • Red Hat Enterprise Linux 6
  • device-mapper-multipath
  • active/passive (or ALUA standby) storage array

Issue

  • Multipath marks paths as "ghost" paths and "Buffer I/O error" messages appear in logs
  • Why do I see active/ghost entries in multipath maps in RHEL5? What is the importance of it?
  • Why is a passive path showing the "ghost" flag in multipath -ll output?
  • How to get rid of 'active ghost running' state listed within multipath output in Red Hat Enterprise Linux?

    mpathb (123456789) dm-14 LSI,INF-01-00
    size=2.2T features='3 queue_if_no_path pg_init_retries 50' hwhandler='1 rdac' wp=rw
    |-+- policy='round-robin 0' prio=6 status=active
    | |- 7:0:0:1 sde 8:64  active ready running
    | `- 5:0:0:1 sdc 8:32  active ready running
    `-+- policy='round-robin 0' prio=1 status=enabled
      |- 4:0:0:1 sdh 8:112 active ghost running
      `- 6:0:0:1 sdd 8:48  active ghost running
                           ^^^^^^^^^^^^^^^^^^^^
    

Resolution

  • "Buffer I/O error" events to ghost paths can be safely ignored.
  • The active ghost running output from multipath -ll is normal for devices in the passive or standby state. Attempting a read from or write to a ghost path will result in a "Buffer I/O error".
  • If you have any concerns, feel free to open a support case with Red Hat Support.

Root Cause

  • The paths to devices in the "ghost" state are passive. Devices in the passive or standby state do not accept read or write I/O commands, so an I/O error occurs whenever an application issues read or write requests to them.

  • What are ghost paths? Is it safe to have ghost paths? Can you get rid of them?

    • "Ghost" paths indicate the device is in the standby or passive state. This means that the device will return valid responses for certain non-data movement SCSI commands such as TUR (Test Unit Ready), Inquiry, Read Capacity among others, but will fail all read and write I/O commands.

      • For example, if a device returns the ALUA state 2h (Standby), multipath reports the path to that device as "ghost".
      • Devices that use the hp_sw or rdac hardware handlers can also have "ghost" paths within multipath output. This is normal for such devices.
      # multipath -ll
      [...]
      mpathb (123456789) dm-14 LSI,INF-01-00
      size=2.2T features='3 queue_if_no_path pg_init_retries 50' hwhandler='1 rdac' wp=rw    <<--- 'rdac' hardware handler
      |-+- policy='round-robin 0' prio=6 status=active
      | |- 7:0:0:1 sde 8:64  active ready running
      | `- 5:0:0:1 sdc 8:32  active ready running
      `-+- policy='round-robin 0' prio=1 status=enabled   <<--- secondary (non-active/failover) path group
        |- 4:0:0:1 sdh 8:112 active ghost running
        `- 6:0:0:1 sdd 8:48  active ghost running
                                    ^^^^
                                      +------------------------ 'ghost' path state indicating passive path
      [...]
      
    • It is normal for one path group to consist entirely of "ghost" paths, since those paths lead to a device that is in the passive or standby state, as shown above.

      • That said, it is NOT expected to see ALL of the paths listed as "ghost" on all path groups. Only the secondary/failover path group with status=enabled should include ghost paths, so ghost paths are not an issue if they are confined to a single path group.
      • If all paths are in ghost state, then a storage side issue is likely present.
      • If only some devices from the same storage have "ghost" paths in the secondary ('enabled') path group while other devices show "ready" paths there, then a storage-side issue exists or existed that caused the storage to report a passive/not-ready status for those devices. If the hardware handler is 'alua', use the diagnostics below to check the current path state by querying the storage directly.
  • Since the storage hardware provides the information that determines if a path is a ghost path or not, there is no method available to users to get rid of them or change the path state.
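
The rule of thumb above (ghost paths confined to one path group are normal, all paths in ghost state points to the storage side) can be checked mechanically. The following is a minimal sketch, assuming the RHEL 6 style multipath -ll layout shown in this article; summarize_ghosts is a hypothetical helper name, not part of device-mapper-multipath, and the sample input is the map from the example above.

```shell
#!/bin/sh
# summarize_ghosts (hypothetical helper): reads `multipath -ll` output on
# stdin and prints one line per map with its ready/ghost path counts.
# Assumes the RHEL 6 output layout shown in this article.
summarize_ghosts() {
    awk '
        function report() {
            if (map == "") return
            verdict = (ready > 0) ? "ok" : (ghost > 0 ? "ALL-GHOST" : "no-usable-paths")
            printf "%s ready=%d ghost=%d %s\n", map, ready, ghost, verdict
        }
        # Map header, e.g. "mpathb (123456789) dm-14 LSI,INF-01-00"
        /^[^ |`]+ \(.*\) dm-[0-9]+/ { report(); map = $1; ready = 0; ghost = 0; next }
        # Path line: H:C:T:L followed by the sdX name and the three states
        / [0-9]+:[0-9]+:[0-9]+:[0-9]+ sd[a-z]+ / {
            if ($0 ~ / ghost /) ghost++
            else if ($0 ~ / ready /) ready++
        }
        END { report() }
    '
}

# Demo with the sample map from this article; on a live system one would
# instead run:  multipath -ll | summarize_ghosts
summarize_ghosts <<'EOF'
mpathb (123456789) dm-14 LSI,INF-01-00
size=2.2T features='3 queue_if_no_path pg_init_retries 50' hwhandler='1 rdac' wp=rw
|-+- policy='round-robin 0' prio=6 status=active
| |- 7:0:0:1 sde 8:64  active ready running
| `- 5:0:0:1 sdc 8:32  active ready running
`-+- policy='round-robin 0' prio=1 status=enabled
  |- 4:0:0:1 sdh 8:112 active ghost running
  `- 6:0:0:1 sdd 8:48  active ghost running
EOF
```

For the sample map this prints "mpathb ready=2 ghost=2 ok". A map reported as ALL-GHOST has no path that accepts read or write I/O and would be the case to raise with the storage vendor.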

Diagnostic Steps

$ grep "read failed" var/log/messages | awk -F'dev' '{print $2}' | awk '{print $1}' | sort | uniq
/sdb:
/sdd:
/sdf:
/sdg:
$ cat sos_commands/devicemapper/multipath_-v4_-ll
......
mpathb (3690b11c0000d4ed20000041f506ccea3) dm-1 DELL,MD36xxf
size=409G features='3 queue_if_no_path pg_init_retries 50' hwhandler='1 rdac' wp=rw   [1]
|-+- policy='round-robin 0' prio=6 status=active      <-+- 1st path group             [2]
| |- 2:0:1:0 sdc 8:32  active ready running             |
| |- 2:0:3:0 sde 8:64  active ready running             |
| |- 4:0:2:0 sdh 8:112 active ready running             |
| `- 4:0:3:0 sdi 8:128 active ready running           <-+
`-+- policy='round-robin 0' prio=1 status=enabled     <-+- 2nd path group             [3]
  |- 2:0:2:0 sdd 8:48  active ghost running             |
  |- 2:0:0:0 sdb 8:16  active ghost running             |
  |- 4:0:0:0 sdf 8:80  active ghost running             | 
  `- 4:0:1:0 sdg 8:96  active ghost running           <-+
                       [4]    [5]   [6]
......
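
The two outputs above can be cross-checked: every device that logged a read failure (sdb, sdd, sdf, sdg) is a ghost path in the second path group, which is the harmless case. The sketch below automates that comparison. It assumes log lines of the form "/dev/sdX: read failed ..." and the multipath layout shown above; the function names are hypothetical, and the file arguments are whatever log and saved multipath output are being examined.

```shell
#!/bin/sh
# ghost_devs (hypothetical helper): print sdX names of ghost paths found in
# a saved `multipath -ll` output file ($1).
ghost_devs() {
    awk '/ ghost / { for (i = 1; i <= NF; i++) if ($i ~ /^sd[a-z]+$/) print $i }' "$1" | sort -u
}

# failed_devs (hypothetical helper): print sdX names from "read failed"
# messages in a log file ($1). Assumes lines like "/dev/sdb: read failed ...".
failed_devs() {
    sed -n 's#.*/dev/\(sd[a-z]*\): read failed.*#\1#p' "$1" | sort -u
}

# classify_read_failures: $1 = messages file, $2 = multipath output file.
# Read failures on ghost paths are expected; anything else needs a closer look.
classify_read_failures() {
    ghosts=$(ghost_devs "$2")
    for dev in $(failed_devs "$1"); do
        if printf '%s\n' "$ghosts" | grep -qxF "$dev"; then
            echo "$dev: ghost path, read failure expected"
        else
            echo "$dev: NOT a ghost path, investigate"
        fi
    done
}

# Example invocation inside an extracted sosreport:
#   classify_read_failures var/log/messages sos_commands/devicemapper/multipath_-v4_-ll
```

A device flagged as "NOT a ghost path" is failing reads even though multipath considers it ready, which is outside the scope of this article's "safe to ignore" guidance.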

Notes:

  • [1] The presence of the 'rdac' hardware handler indicates an active/passive storage configuration. The expectation is that half of the paths will be in the ghost state.
  • [2] The first path group is the active path group; the paths in this group will be used for read and/or write i/o commands. Note that the path group that will be used will always have the higher prio (priority) value. In this case the priority for this path group is 6 versus 1 for the second path group.
  • [3] The second path group is enabled, indicating it is available for failover but will not be actively used to process read or write i/o commands. Path checking commands, such as TUR, can be used to ensure the transport path to the device is present. The paths within this group will only become ready after a storage controller failover event.
  • [4] Path state :: active | failed | undef
    Active indicates a transport connection between the host and device is available.
    Failed means basic device commands like TUR or Inquiry are failing on this device indicating a storage side issue is present.
  • [5] Checked state :: ready | faulty | ghost | undef | shaky | delayed
    Ready means the path is available for read and write i/o commands.
    Ghost indicates the device is in passive or standby state and cannot accept read or write i/o commands at the current time.
  • [6] Online state :: running | offline | unknown
    Running is the nominal state within the kernel.
    Offline indicates the device has had too many errors after recovery attempts. The device is not necessarily broken, since recovery efforts succeed, but read and/or write i/o commands still make insufficient progress after repeated recovery efforts.
    This state is from /sys/block/sd*/device/state, but /sys/block/sdN is a link to a much longer path. This longer path is seen in multipath output here:

    # multipath -ll -v4
    :
    Dec 09 11:21:14 | Discover device /sys/devices/pci0000:5b/0000:5b:02.0/0000:5d:00.2/host2/rport-2:0-11/target2:0:1/2:0:1:0/block/sdc
    Dec 09 11:21:14 | open '/sys/devices/pci0000:5b/0000:5b:02.0/0000:5d:00.2/host2/rport-2:0-11/target2:0:1/2:0:1:0/state'
    
  • When the hardware handler is 'alua', use the following commands from the sg3_utils package to determine the current path state. See "[Engineering Notes] scsi INQUIRY and REPORT TARGET PORT GROUPS commands with regards to path state and priority" for more detailed information. The SCSI INQUIRY page 0x83 (sg_inq -p 0x83) returns a target port group and a relative target port id. Use that pair of values to look up the path state within the corresponding SCSI REPORT TARGET PORT GROUPS (sg_rtpg) output.


device's Page 83                device's RTPG table
(group,relative ids) ------+    (g,r): path state
                           |    (g,r): path state
                           +--> (g,r): path state
                                (g,r): path state

SCSI INQUIRY page 0x83 from device:
# sg_inq -p 0x83 devicename
:
      Relative target port: 0x3
:
      Target port group: 0x1

>> Using target port group 0x1 and relative target port 0x3 from above, look up the current path state:

SCSI RTPG from device:    
# sg_rtpg devicename
:
    target port group id : 0x1 , Pref=0
    target port group asymmetric access state : 0x02  <== alua state is standby as defined by SCSI standard
    Relative target port ids:
      0x1
      0x2
      0x3
      0x3
  • Within the above, storage is indicating this is a ghost path by virtue of the returned state. If the path should not be a ghost path, contact storage vendor for further assistance to ascertain why the incorrect path state is being returned to the host.
  • If, per above diagnostic, the path state is currently returned as something other than standby, then you can manually force a path rescan to pick up the new path state. The kernel depends upon notification of changes to path state as it does not poll devices, and a notification event may not have been sent.

    # rescan-scsi-bus.sh     <<-- look for any added/removed device changes
    # multipath -r           <<-- force reload paths, will force inquiry of devices and update path state if needed

    # multipath -ll -v4      <<-- check updated path status, if any, with extra logging
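
The Page 83 to RTPG lookup described in the ALUA diagnostic above can be sketched as a small script. This is an illustration only: alua_state_name and rtpg_state are hypothetical helper names, the parsing assumes the sg_rtpg output layout shown in this article, and the state names follow the asymmetric access state codes defined by the SCSI standard (only the commonly seen codes are decoded).

```shell
#!/bin/sh
# alua_state_name (hypothetical helper): translate an asymmetric access
# state code into its name as defined by the SCSI standard.
alua_state_name() {
    case "$1" in
        0x0|0x00) echo "active/optimized" ;;
        0x1|0x01) echo "active/non-optimized" ;;
        0x2|0x02) echo "standby" ;;
        0x3|0x03) echo "unavailable" ;;
        0xe|0x0e) echo "offline" ;;
        0xf|0x0f) echo "transitioning" ;;
        *)        echo "unknown" ;;
    esac
}

# rtpg_state (hypothetical helper): read sg_rtpg output on stdin and print
# the access state code of the target port group given as $1 (e.g. 0x1).
rtpg_state() {
    awk -v want="$1" '
        # "target port group id : 0x1 , Pref=0" -> remember the group id
        /target port group id/ { split($0, a, ":"); split(a[2], b, " "); grp = b[1] }
        # "target port group asymmetric access state : 0x02" -> print if wanted
        /asymmetric access state/ && grp == want {
            split($0, a, ":"); split(a[2], b, " "); print b[1]; exit
        }
    '
}

# Example on a live system (devicename is a placeholder, as in the text above):
#   state=$(sg_rtpg devicename | rtpg_state 0x1)
#   echo "group 0x1 is $(alua_state_name "$state")"
```

With the sample sg_rtpg output above, group 0x1 decodes to "standby", which is the state multipath reports as ghost.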
    

This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.