DM-Multipath incorrectly grouped the paths from two different devices during storage-side reconfiguration
Issue
-
After a reconfiguration on the IBM SVC storage, paths from two different SAN devices were grouped under the same multipath device map.
Before the reconfiguration, the two devices appeared as separate multipath devices:
mpathb (3600bbbbbbbbbbbbbbbbbbbbbbbbbbbbb) dm-337 IBM,2145
size=480G features='1 queue_if_no_path' hwhandler='0' wp=rw
|-+- policy='round-robin 0' prio=50 status=active
| |- 1:0:10:108 sdbsa 67:1888 active ready running
| |- 4:0:8:108 sdbsi 67:2016 active ready running
| |- 1:0:9:108 sdbse 67:1952 active ready running
| `- 4:0:11:108 sdbsk 68:1792 active ready running
`-+- policy='round-robin 0' prio=10 status=enabled
  |- 1:0:17:108 sdbsx 68:2000 active ready running
  |- 4:0:15:108 sdbsm 68:1824 active ready running
  |- 1:0:16:108 sdbsc 67:1920 active ready running
  `- 4:0:16:108 sdbso 68:1856 active ready running
mpathc (3600ccccccccccccccccccccccccccccc) dm-346 IBM,2145
size=480G features='1 queue_if_no_path' hwhandler='0' wp=rw
|-+- policy='round-robin 0' prio=50 status=active
| |- 1:0:11:2 sdbgj 8:1648 active ready running
| |- 4:0:10:2 sdbqp 65:1808 active ready running
| |- 1:0:8:2 sdbqg 8:1920 active ready running
| `- 4:0:9:2 sdbrq 66:1984 active ready running
`-+- policy='round-robin 0' prio=10 status=enabled
  |- 1:0:14:2 sdbgs 65:1536 active ready running
  |- 4:0:12:2 sdbqy 65:1952 active ready running
  |- 1:0:15:2 sdbpx 135:1776 active ready running
  `- 4:0:17:2 sdbrh 66:1840 active ready running
-
After the reconfiguration on the storage side, the multipath -ll command showed paths from the above two devices grouped together under a single map. This resulted in I/O being sent to the wrong device, and the database crashed:
mpathc (3600ccccccccccccccccccccccccccccc) dm-346 IBM,2145
size=480G features='1 queue_if_no_path' hwhandler='0' wp=rw
|-+- policy='round-robin 0' prio=50 status=active
| |- 1:0:11:2 sdbgj 8:1648 active ready running
| |- 4:0:10:2 sdbqp 65:1808 active ready running
| |- 1:0:8:2 sdbqg 8:1920 active ready running
| |- 4:0:9:2 sdbrq 66:1984 active ready running
| |- 1:0:10:108 sdbsa 67:1888 active ready running <---problem
| |- 4:0:8:108 sdbsi 67:2016 active ready running <---problem
| |- 1:0:9:108 sdbse 67:1952 active ready running <---problem
| `- 4:0:11:108 sdbsk 68:1792 active ready running <---problem
`-+- policy='round-robin 0' prio=10 status=enabled
  |- 1:0:14:2 sdbgs 65:1536 active ready running
  |- 4:0:12:2 sdbqy 65:1952 active ready running
  |- 1:0:15:2 sdbpx 135:1776 active ready running
  |- 4:0:17:2 sdbrh 66:1840 active ready running
  |- 1:0:16:108 sdbsc 67:1920 active ready running <---problem
  |- 4:0:15:108 sdbsm 68:1824 active ready running <---problem
  `- 4:0:16:108 sdbso 68:1856 active ready running <---problem
-
The issue was resolved by flushing the affected multipath device with the following commands and then re-scanning:
$ multipath -f <multipath-device-name>
$ multipath -v2
$ multipath -ll
Environment
- Red Hat Enterprise Linux 6, 7
- device-mapper-multipath