Chapter 5. Configuring IPoIB

By default, InfiniBand does not use the Internet Protocol (IP) for communication. However, IP over InfiniBand (IPoIB) provides an IP network emulation layer on top of InfiniBand remote direct memory access (RDMA) networks. This allows existing, unmodified applications to transmit data over InfiniBand networks, but performance is lower than if the application used RDMA natively.

Note

Internet Wide Area RDMA Protocol (iWARP) and RoCE networks are already IP-based. Therefore, you cannot create an IPoIB device on top of iWARP or RoCE devices.

5.1. The IPoIB communication modes

You can configure an IPoIB device either in Datagram or Connected mode. The difference is what type of queue pair the IPoIB layer attempts to open with the machine at the other end of the communication:

  • In the Datagram mode, the system opens an unreliable, disconnected queue pair.

    This mode does not support packets larger than the InfiniBand link-layer Maximum Transmission Unit (MTU). The IPoIB layer adds a 4-byte IPoIB header on top of the IP packet being transmitted. As a result, the IPoIB MTU must be 4 bytes less than the InfiniBand link-layer MTU. Because 2048 bytes is a common InfiniBand link-layer MTU, the common IPoIB device MTU in Datagram mode is 2044 bytes.

  • In the Connected mode, the system opens a reliable, connected queue pair.

    This mode allows messages larger than the InfiniBand link-layer MTU, and the host adapter handles packet segmentation and reassembly. As a result, InfiniBand adapters impose no limit on the size of IPoIB messages sent in Connected mode. However, the size field in the IP header and the space needed for the TCP/IP headers still limit the size of IP packets. For this reason, the maximum IPoIB MTU in Connected mode is 65520 bytes.

    The Connected mode provides higher performance, but consumes more kernel memory.
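
The MTU arithmetic for Datagram mode can be sketched in a few lines of shell. This is only an illustration of the subtraction described above; the function name ipoib_datagram_mtu and the variable IPOIB_HEADER_BYTES are made up for this example and are not part of any tool.

```shell
#!/bin/sh
# Size of the IPoIB encapsulation header in bytes, as stated in the text.
IPOIB_HEADER_BYTES=4

# Hypothetical helper: print the largest IPoIB MTU usable in Datagram mode
# for a given InfiniBand link-layer MTU (first argument, in bytes).
ipoib_datagram_mtu() {
    echo $(( $1 - IPOIB_HEADER_BYTES ))
}

ipoib_datagram_mtu 2048   # prints 2044
ipoib_datagram_mtu 4096   # prints 4092
```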

If a system is configured to use the Connected mode, it still sends multicast traffic in the Datagram mode, because InfiniBand switches and fabric cannot pass multicast traffic in Connected mode. Additionally, the system falls back to Datagram mode when communicating with any host that is not configured in the Connected mode.

When you run an application that sends multicast data up to the maximum MTU of the interface, you must either configure the interface in Datagram mode or configure the application to cap its send size so that each message fits in a datagram-sized packet.
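
To decide whether the multicast caveat applies to a host, it helps to know which mode an interface is currently using. The following sketch reads the mode and MTU attributes that the Linux IPoIB driver exposes under /sys/class/net. The function name ipoib_show and the SYSFS_NET override are assumptions made for this example (the override exists only so that the function can be exercised without InfiniBand hardware).

```shell
#!/bin/sh
# Hypothetical helper: print the transport mode and MTU of an IPoIB interface.
# SYSFS_NET is an illustrative override of the sysfs directory; it is not a
# standard environment variable.
ipoib_show() {
    ifname=$1
    dir="${SYSFS_NET:-/sys/class/net}/$ifname"

    # Only IPoIB devices have a "mode" attribute.
    if [ ! -r "$dir/mode" ]; then
        echo "$ifname: no IPoIB mode attribute found" >&2
        return 1
    fi

    echo "$ifname mode=$(cat "$dir/mode") mtu=$(cat "$dir/mtu")"
}
```

For example, calling ipoib_show mlx4_ib0 on a host with an IPoIB interface of that name prints one line with the current mode (datagram or connected) and MTU.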

5.2. Understanding IPoIB hardware addresses

IPoIB devices have a 20-byte hardware address that consists of the following parts:

  • The first 4 bytes are flags and queue pair numbers.
  • The next 8 bytes are the subnet prefix.

    The default subnet prefix is 0xfe:80:00:00:00:00:00:00. After the device connects to the subnet manager, the device changes this prefix to match the one configured in the subnet manager.

  • The last 8 bytes are the Globally Unique Identifier (GUID) of the InfiniBand port that the IPoIB device is attached to.

Note

Because the first 12 bytes can change, do not use them in udev device manager rules. Match only on the last 8 bytes, which contain the port GUID.
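
The layout described above can be illustrated by splitting an address into its three parts. The address below is a made-up example value, and the function name split_ipoib_hwaddr is hypothetical; the sketch only assumes the colon-separated notation used for the default subnet prefix above.

```shell
#!/bin/sh
# Made-up example address; real values differ per port and fabric.
hwaddr="80:00:02:08:fe:80:00:00:00:00:00:00:f4:52:14:03:00:7b:cb:a1"

# Hypothetical helper: split a 20-byte IPoIB hardware address into
# flags/queue pair number (bytes 1-4), subnet prefix (bytes 5-12),
# and port GUID (bytes 13-20).
split_ipoib_hwaddr() {
    flags_qpn=$(echo "$1" | cut -d: -f1-4)
    subnet_prefix=$(echo "$1" | cut -d: -f5-12)
    port_guid=$(echo "$1" | cut -d: -f13-20)

    echo "flags and QPN: $flags_qpn"
    echo "subnet prefix: $subnet_prefix"
    echo "port GUID:     $port_guid"
}

split_ipoib_hwaddr "$hwaddr"
```

Only the last value, the port GUID, stays stable across subnet manager changes.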

5.3. Configuring an IPoIB connection using nmcli commands

This procedure describes how to configure an IPoIB connection using nmcli commands.

Prerequisites

  • An InfiniBand device is installed in the server, and the corresponding kernel module is loaded.

Procedure

  1. Create the InfiniBand connection. For example, to create a connection that uses the mlx4_ib0 interface in the Connected transport mode and the maximum MTU of 65520 bytes, enter:

    # nmcli connection add type infiniband con-name mlx4_ib0 ifname mlx4_ib0 transport-mode Connected mtu 65520
  2. Optional: Set a P_Key interface. For example, to set 0x8002 as the P_Key interface of the mlx4_ib0 connection, enter:

    # nmcli connection modify mlx4_ib0 infiniband.p-key 0x8002
  3. Configure the IPv4 settings. For example, to set a static IPv4 address, network mask, default gateway, and DNS server of the mlx4_ib0 connection, enter:

    # nmcli connection modify mlx4_ib0 ipv4.addresses '192.0.2.1/24'
    # nmcli connection modify mlx4_ib0 ipv4.gateway '192.0.2.254'
    # nmcli connection modify mlx4_ib0 ipv4.dns '192.0.2.253'
    # nmcli connection modify mlx4_ib0 ipv4.method manual
  4. Configure the IPv6 settings. For example, to set a static IPv6 address, network mask, default gateway, and DNS server of the mlx4_ib0 connection, enter:

    # nmcli connection modify mlx4_ib0 ipv6.addresses '2001:db8:1::1/32'
    # nmcli connection modify mlx4_ib0 ipv6.gateway '2001:db8:1::fffe'
    # nmcli connection modify mlx4_ib0 ipv6.dns '2001:db8:1::fffd'
    # nmcli connection modify mlx4_ib0 ipv6.method manual
  5. Activate the connection. For example, to activate the mlx4_ib0 connection, enter:

    # nmcli connection up mlx4_ib0
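
If you need to repeat this procedure on several hosts, the steps can be wrapped in a small script. The following is a minimal sketch, not a supported tool: the function names run and configure_ipoib and the DRY_RUN variable are invented for this example, and only the IPv4 steps are included. It relies on nmcli accepting several property and value pairs in a single connection modify call. With DRY_RUN=1, the script only prints the commands instead of running them.

```shell
#!/bin/sh
# Hypothetical wrapper: run nmcli, or only print the command if DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "nmcli $*"
    else
        nmcli "$@"
    fi
}

# Hypothetical helper mirroring steps 1, 3, and 5 of the procedure.
# Arguments: connection/interface name, IPv4 address with prefix,
# default gateway, DNS server.
configure_ipoib() {
    con=$1 addr=$2 gw=$3 dns=$4

    run connection add type infiniband con-name "$con" ifname "$con" \
        transport-mode Connected mtu 65520 &&
    run connection modify "$con" ipv4.addresses "$addr" ipv4.gateway "$gw" \
        ipv4.dns "$dns" ipv4.method manual &&
    run connection up "$con"
}

# Print the commands for the example values used in this section.
DRY_RUN=1
configure_ipoib mlx4_ib0 192.0.2.1/24 192.0.2.254 192.0.2.253
```

To apply the settings instead of printing them, unset DRY_RUN and run the script as root.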

5.4. Configuring an IPoIB connection using nm-connection-editor

This procedure describes how to configure an IPoIB connection using the nm-connection-editor application.

Prerequisites

  • An InfiniBand device is installed in the server, and the corresponding kernel module is loaded.
  • The nm-connection-editor package is installed.

Procedure

  1. Open a terminal, and enter:

    $ nm-connection-editor
  2. Click the + button to add a new connection.
  3. Select the InfiniBand connection type, and click Create.
  4. On the InfiniBand tab:

    1. Optional: Change the connection name.
    2. Select the transport mode.
    3. Select the device.
    4. Optional: Set an MTU.
  5. On the IPv4 Settings tab, configure the IPv4 settings. For example, set a static IPv4 address, network mask, default gateway, and DNS server.
  6. On the IPv6 Settings tab, configure the IPv6 settings. For example, set a static IPv6 address, network mask, default gateway, and DNS server.
  7. Click Save to save the InfiniBand connection.
  8. Close nm-connection-editor.
  9. Optional: Set a P_Key interface. You must set this parameter on the command line, because the setting is not available in nm-connection-editor.

    For example, to set 0x8002 as the P_Key interface of the mlx4_ib0 connection, enter:

    # nmcli connection modify mlx4_ib0 infiniband.p-key 0x8002
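
The P_Key value used in this example can be read as a 16-bit number whose highest bit marks full membership in the partition, with the remaining 15 bits identifying the partition itself. The following sketch decodes a value on that assumption; the function name describe_pkey is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: decode a 16-bit P_Key, assuming that bit 15 is the
# full-membership flag and the lower 15 bits are the partition number.
describe_pkey() {
    pkey=$(( $1 ))

    if [ $(( pkey & 0x8000 )) -ne 0 ]; then
        membership=full
    else
        membership=limited
    fi

    printf 'membership=%s base=0x%04x\n' "$membership" $(( pkey & 0x7fff ))
}

describe_pkey 0x8002   # prints: membership=full base=0x0002
```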