Red Hat Training

A Red Hat training course is available for Red Hat Enterprise Linux

13.7. Testing Early InfiniBand RDMA operation

Note

This section applies only to InfiniBand devices. Since iWARP and RoCE/IBoE devices are IP based devices, users should proceed to the section on testing RDMA operations once IPoIB has been configured and the devices have IP addresses.
Once the rdma service is enabled, and the opensm service (if needed) is enabled, and the proper user-space library for the specific hardware has been installed, user space rdma operation should be possible. Simple test programs from the libibverbs-utils package are helpful in determining that RDMA operations are working properly. The ibv_devices program will show which devices are present in the system and the ibv_devinfo command will give detailed information about each device. For example:
~]$ ibv_devices
    device                 node GUID
    ------              ----------------
    mlx4_0              0002c903003178f0
    mlx4_1              f4521403007bcba0
~]$ ibv_devinfo -d mlx4_1
hca_id: mlx4_1
        transport:                      InfiniBand (0)
        fw_ver:                         2.30.8000
        node_guid:                      f452:1403:007b:cba0
        sys_image_guid:                 f452:1403:007b:cba3
        vendor_id:                      0x02c9
        vendor_part_id:                 4099
        hw_ver:                         0x0
        board_id:                       MT_1090120019
        phys_port_cnt:                  2
                port:   1
                        state:                  PORT_ACTIVE (4)
                        max_mtu:                4096 (5)
                        active_mtu:             2048 (4)
                        sm_lid:                 2
                        port_lid:               2
                        port_lmc:               0x01
                        link_layer:             InfiniBand

                port:   2
                        state:                  PORT_ACTIVE (4)
                        max_mtu:                4096 (5)
                        active_mtu:             4096 (5)
                        sm_lid:                 0
                        port_lid:               0
                        port_lmc:               0x00
                        link_layer:             Ethernet
~]$ ibstat mlx4_1
CA 'mlx4_1'
        CA type: MT4099
        Number of ports: 2
        Firmware version: 2.30.8000
        Hardware version: 0
        Node GUID: 0xf4521403007bcba0
        System image GUID: 0xf4521403007bcba3
        Port 1:
                State: Active
                Physical state: LinkUp
                Rate: 56
                Base lid: 2
                LMC: 1
                SM lid: 2
                Capability mask: 0x0251486a
                Port GUID: 0xf4521403007bcba1
                Link layer: InfiniBand
        Port 2:
                State: Active
                Physical state: LinkUp
                Rate: 40
                Base lid: 0
                LMC: 0
                SM lid: 0
                Capability mask: 0x04010000
                Port GUID: 0xf65214fffe7bcba2
                Link layer: Ethernet
The ibv_devinfo and ibstat commands output slightly different information (such as port MTU exists in ibv_devinfo but not in ibstat output, and the Port GUID exists in ibstat output but not in ibv_devinfo output), and a few things are named differently (for example, the Base local identifier (LID) in ibstat output is the same as the port_lid output of ibv_devinfo)
Simple ping programs, such as ibping from the infiniband-diags package, can be used to test RDMA connectivity. The ibping program uses a client-server model. You must first start an ibping server on one machine, then run ibping as a client on another machine and tell it to connect to the ibping server. Since we are wanting to test the base RDMA capability, we need to use an RDMA specific address resolution method instead of IP addresses for specifying the server.
On the server machine, the user can use the ibv_devinfo and ibstat commands to print out the port_lid (or Base lid) and the Port GUID of the port they want to test (assuming port 1 of the above interface, the port_lid/Base LID is 2 and Port GUID is 0xf4521403007bcba1)). Then start ibping with the necessary options to bind specifically to the card and port to be tested, and also specifying ibping should run in server mode. You can see the available options to ibping by passing -? or --help, but in this instance we will need either the -S or --Server option and for binding to the specific card and port we will need either -C or --Ca and -P or --Port. Note: port in this instance does not denote a network port number, but denotes the physical port number on the card when using a multi-port card. To test connectivity to the RDMA fabric using, for example, the second port of a multi-port card, requires telling ibping to bind to port 2 on the card. When using a single port card, or testing the first port on a card, this option is not needed. For example:
~]$ ibping -S -C mlx4_1 -P 1
Then change to the client machine and run ibping. Make note of either the port GUID of the port the server ibping program is bound to, or the local identifier (LID) of the port the server ibping program is bound to. Also, take note which card and port in the client machine is physically connected to the same network as the card and port that was bound to on the server. For example, if the second port of the first card on the server was bound to, and that port is connected to a secondary RDMA fabric, then on the client specify whichever card and port are necessary to also be connected to that secondary fabric. Once these things are known, run the ibping program as a client and connect to the server using either the port LID or GUID that was collected on the server as the address to connect to. For example:
~]$ ibping -c 10000 -f -C mlx4_0 -P 1 -L 2
--- rdma-host.example.com.(none) (Lid 2) ibping statistics ---
10000 packets transmitted, 10000 received, 0% packet loss, time 816 ms
rtt min/avg/max = 0.032/0.081/0.446 ms
or
~]$ ibping -c 10000 -f -C mlx4_0 -P 1 -G 0xf4521403007bcba1 \
--- rdma-host.example.com.(none) (Lid 2) ibping statistics ---
10000 packets transmitted, 10000 received, 0% packet loss, time 769 ms
rtt min/avg/max = 0.027/0.076/0.278 ms
This outcome verifies that end to end RDMA communications are working for user space applications.
The following error may be encountered:
~]$ ibv_devinfo
libibverbs: Warning: no userspace device-specific driver found for /sys/class/infiniband_verbs/uverbs0
No IB devices found
This error indicates that the necessary user-space library is not installed. The administrator will need to install one of the user-space libraries (as appropriate for their hardware) listed in section Section 13.4, “InfiniBand and RDMA related software packages”. On rare occasions, this can happen if a user installs the wrong arch type for the driver or for libibverbs. For example, if libibverbs is of arch x86_64, and libmlx4 is installed but is of type i686, then this error can result.

Note

Many sample applications prefer to use host names or addresses instead of LIDs to open communication between the server and client. For those applications, it is necessary to set up IPoIB before attempting to test end-to-end RDMA communications. The ibping application is unusual in that it will accept simple LIDs as a form of addressing, and this allows it to be a simple test that eliminates possible problems with IPoIB addressing from the test scenario and therefore gives us a more isolated view of whether or not simple RDMA communications are working.