Red Hat Training

A Red Hat training course is available for RHEL 8

Chapter 4. Enabling multipathing on NVMe devices

You can multipath NVMe devices that are connected to your system over a fabric transport, such as Fibre Channel (FC). You can select between multiple multipathing solutions.

4.1. Native NVMe multipathing and DM Multipath

NVMe devices support a native multipathing functionality. When configuring multipathing on NVMe, you can select between the standard DM Multipath framework and the native NVMe multipathing.

Both DM Multipath and native NVMe multipathing support the Asymmetric Namespace Access (ANA) multipathing scheme of NVMe devices. ANA identifies optimized paths between the target and the initiator and improves performance.

When native NVMe multipathing is enabled, it applies globally to all NVMe devices. It can provide higher performance, but does not contain all of the functionality that DM Multipath provides. For example, native NVMe multipathing supports only the failover and round-robin path selection methods.

Red Hat recommends that you use DM Multipath in RHEL 8 as your default multipathing solution.

4.2. Enabling native NVMe multipathing

This procedure enables multipathing on connected NVMe devices using the native NVMe multipathing solution.

Prerequisites

Procedure

  1. Check if native NVMe multipathing is enabled in the kernel:

    # cat /sys/module/nvme_core/parameters/multipath

    The command displays one of the following:

    N
    Native NVMe multipathing is disabled.
    Y
    Native NVMe multipathing is enabled.
  2. If native NVMe multipathing is disabled, enable it using one of the following methods:

    • Using a kernel option:

      1. Add the nvme_core.multipath=Y option on the kernel command line:

        # grubby --update-kernel=ALL --args="nvme_core.multipath=Y"
      2. On the 64-bit IBM Z architecture, update the boot menu:

        # zipl
      3. Reboot the system.
    • Using a kernel module configuration file:

      1. Create the /etc/modprobe.d/nvme_core.conf configuration file with the following content:

        options nvme_core multipath=Y
      2. Back up the initramfs file system:

        # cp /boot/initramfs-$(uname -r).img \
             /boot/initramfs-$(uname -r).bak.$(date +%m-%d-%H%M%S).img
      3. Rebuild the initramfs file system:

        # dracut --force --verbose
      4. Reboot the system.
  3. Optional: On the running system, change the I/O policy on NVMe devices to distribute the I/O on all available paths:

    # echo "round-robin" > /sys/class/nvme-subsystem/nvme-subsys0/iopolicy
  4. Optional: Set the I/O policy persistently using udev rules. Create the /etc/udev/rules.d/71-nvme-io-policy.rules file with the following content:

    ACTION=="add|change", SUBSYSTEM=="nvme-subsystem", ATTR{iopolicy}="round-robin"

Verification

  1. Check that your system recognizes the NVMe devices:

    # nvme list
    
    Node             SN                   Model                                    Namespace Usage                      Format           FW Rev
    ---------------- -------------------- ---------------------------------------- --------- -------------------------- ---------------- --------
    /dev/nvme0n1     a34c4f3a0d6f5cec     Linux                                    1         250.06  GB / 250.06  GB    512   B +  0 B   4.18.0-2
    /dev/nvme0n2     a34c4f3a0d6f5cec     Linux                                    2         250.06  GB / 250.06  GB    512   B +  0 B   4.18.0-2
  2. List all connected NVMe subsystems:

    # nvme list-subsys
    
    nvme-subsys0 - NQN=testnqn
    \
     +- nvme0 fc traddr=nn-0x20000090fadd597a:pn-0x10000090fadd597a host_traddr=nn-0x20000090fac7e1dd:pn-0x10000090fac7e1dd live
     +- nvme1 fc traddr=nn-0x20000090fadd5979:pn-0x10000090fadd5979 host_traddr=nn-0x20000090fac7e1dd:pn-0x10000090fac7e1dd live
     +- nvme2 fc traddr=nn-0x20000090fadd5979:pn-0x10000090fadd5979 host_traddr=nn-0x20000090fac7e1de:pn-0x10000090fac7e1de live
     +- nvme3 fc traddr=nn-0x20000090fadd597a:pn-0x10000090fadd597a host_traddr=nn-0x20000090fac7e1de:pn-0x10000090fac7e1de live

    Check the active transport type. For example, nvme0 fc indicates that the device is connected over the Fibre Channel transport, and nvme tcp indicates that the device is connected over TCP.

  3. If you edited the kernel options, check that native NVMe multipathing is enabled on the kernel command line:

    # cat /proc/cmdline
    
    BOOT_IMAGE=[...] nvme_core.multipath=Y
  4. Check that DM Multipath reports the NVMe namespaces as, for example, nvme0c0n1 through nvme0c3n1, and not as, for example, nvme0n1 through nvme3n1:

    # multipath -ll | grep -i nvme
    
    uuid.8ef20f70-f7d3-4f67-8d84-1bb16b2bfe03 [nvme]:nvme0n1 NVMe,Linux,4.18.0-2
    | `- 0:0:1    nvme0c0n1 0:0     n/a   optimized live
    | `- 0:1:1    nvme0c1n1 0:0     n/a   optimized live
    | `- 0:2:1    nvme0c2n1 0:0     n/a   optimized live
      `- 0:3:1    nvme0c3n1 0:0     n/a   optimized live
    
    uuid.44c782b4-4e72-4d9e-bc39-c7be0a409f22 [nvme]:nvme0n2 NVMe,Linux,4.18.0-2
    | `- 0:0:1    nvme0c0n1 0:0     n/a   optimized live
    | `- 0:1:1    nvme0c1n1 0:0     n/a   optimized live
    | `- 0:2:1    nvme0c2n1 0:0     n/a   optimized live
      `- 0:3:1    nvme0c3n1 0:0     n/a   optimized live
  5. If you changed the I/O policy, check that round-robin is the active I/O policy on NVMe devices:

    # cat /sys/class/nvme-subsystem/nvme-subsys0/iopolicy
    
    round-robin

Additional resources

4.3. Enabling DM Multipath on NVMe devices

This procedure enables multipathing on connected NVMe devices using the DM Multipath solution.

Prerequisites

Procedure

  1. Check that native NVMe multipathing is disabled:

    # cat /sys/module/nvme_core/parameters/multipath

    The command displays one of the following:

    N
    Native NVMe multipathing is disabled.
    Y
    Native NVMe multipathing is enabled.
  2. If native NVMe multipathing is enabled, disable it:

    1. Remove the nvme_core.multipath=Y option from the kernel command line:

      # grubby --update-kernel=ALL --remove-args="nvme_core.multipath=Y"
    2. On the 64-bit IBM Z architecture, update the boot menu:

      # zipl
    3. Remove the options nvme_core multipath=Y line from the /etc/modprobe.d/nvme_core.conf file, if it is present.
    4. Reboot the system.
  3. Make sure that DM Multipath is enabled:

    # systemctl enable --now multipathd.service
  4. Distribute I/O on all available paths. Add the following content in the /etc/multipath.conf file:

    device {
      vendor "NVME"
      product ".*"
      path_grouping_policy    group_by_prio
    }
    Note

    The /sys/class/nvme-subsystem/nvme-subsys0/iopolicy configuration file has no effect on the I/O distribution when DM Multipath manages the NVMe devices.

  5. Reload the multipathd service to apply the configuration changes:

    # multipath -r
  6. Back up the initramfs file system:

    # cp /boot/initramfs-$(uname -r).img \
         /boot/initramfs-$(uname -r).bak.$(date +%m-%d-%H%M%S).img
  7. Rebuild the initramfs file system:

    # dracut --force --verbose

Verification

  1. Check that your system recognizes the NVMe devices:

    # nvme list
    
    Node             SN                   Model                                    Namespace Usage                      Format           FW Rev
    ---------------- -------------------- ---------------------------------------- --------- -------------------------- ---------------- --------
    /dev/nvme0n1     a34c4f3a0d6f5cec     Linux                                    1         250.06  GB / 250.06  GB    512   B +  0 B   4.18.0-2
    /dev/nvme0n2     a34c4f3a0d6f5cec     Linux                                    2         250.06  GB / 250.06  GB    512   B +  0 B   4.18.0-2
    /dev/nvme1n1     a34c4f3a0d6f5cec     Linux                                    1         250.06  GB / 250.06  GB    512   B +  0 B   4.18.0-2
    /dev/nvme1n2     a34c4f3a0d6f5cec     Linux                                    2         250.06  GB / 250.06  GB    512   B +  0 B   4.18.0-2
    /dev/nvme2n1     a34c4f3a0d6f5cec     Linux                                    1         250.06  GB / 250.06  GB    512   B +  0 B   4.18.0-2
    /dev/nvme2n2     a34c4f3a0d6f5cec     Linux                                    2         250.06  GB / 250.06  GB    512   B +  0 B   4.18.0-2
    /dev/nvme3n1     a34c4f3a0d6f5cec     Linux                                    1         250.06  GB / 250.06  GB    512   B +  0 B   4.18.0-2
    /dev/nvme3n2     a34c4f3a0d6f5cec     Linux                                    2         250.06  GB / 250.06  GB    512   B +  0 B   4.18.0-2
  2. List all connected NVMe subsystems. Check that the command reports them as, for example, nvme0n1 through nvme3n2, and not as, for example, nvme0c0n1 through nvme0c3n1:

    # nvme list-subsys
    
    nvme-subsys0 - NQN=testnqn
    \
     +- nvme0 fc traddr=nn-0x20000090fadd5979:pn-0x10000090fadd5979 host_traddr=nn-0x20000090fac7e1dd:pn-0x10000090fac7e1dd live
     +- nvme1 fc traddr=nn-0x20000090fadd597a:pn-0x10000090fadd597a host_traddr=nn-0x20000090fac7e1dd:pn-0x10000090fac7e1dd live
     +- nvme2 fc traddr=nn-0x20000090fadd5979:pn-0x10000090fadd5979 host_traddr=nn-0x20000090fac7e1de:pn-0x10000090fac7e1de live
     +- nvme3 fc traddr=nn-0x20000090fadd597a:pn-0x10000090fadd597a host_traddr=nn-0x20000090fac7e1de:pn-0x10000090fac7e1de live
    # multipath -ll
    
    mpathae (uuid.8ef20f70-f7d3-4f67-8d84-1bb16b2bfe03) dm-36 NVME,Linux
    size=233G features='1 queue_if_no_path' hwhandler='0' wp=rw
    `-+- policy='service-time 0' prio=50 status=active
      |- 0:1:1:1  nvme0n1 259:0   active ready running
      |- 1:2:1:1  nvme1n1 259:2   active ready running
      |- 2:3:1:1  nvme2n1 259:4   active ready running
      `- 3:4:1:1  nvme3n1 259:6   active ready running
    
    mpathaf (uuid.44c782b4-4e72-4d9e-bc39-c7be0a409f22) dm-39 NVME,Linux
    size=233G features='1 queue_if_no_path' hwhandler='0' wp=rw
    `-+- policy='service-time 0' prio=50 status=active
      |- 0:1:2:2  nvme0n2 259:1   active ready running
      |- 1:2:2:2  nvme1n2 259:3   active ready running
      |- 2:3:2:2  nvme2n2 259:5   active ready running
      `- 3:4:2:2  nvme3n2 259:7   active ready running

Additional resources