5.275. RDMA

Updated RDMA packages that fix various bugs and add various enhancements are now available for Red Hat Enterprise Linux 6.
Red Hat Enterprise Linux includes a collection of InfiniBand and iWARP utilities, libraries and development packages for writing applications that use Remote Direct Memory Access (RDMA) technology.

Note

The RDMA packages have been upgraded to the latest upstream versions which provide a number of bug fixes and enhancements over the previous versions (BZ#739138).
BZ#814845
The rdma_bw and rdma_lat utilities provided by the perftest package are now deprecated and will be removed from the perftest package in a future update. Users should use the following utilities instead: ib_write_bw, ib_write_lat, ib_read_bw, and ib_read_lat.

Bug Fixes

BZ#696019
Previously, the rping utility did not properly join threads on shutdown. Consequently, on iWARP connections in particular, a race condition was triggered that resulted in the rping utility terminating unexpectedly with a segmentation fault. This update modifies rping to properly handle thread teardown. As a result, rping no longer crashes on iWARP connections.
BZ#700289
Previously, the kernel RDMA Connection Manager (rdmacm) did not have an option to reuse a socket port before the timeout had expired on that port after the last close. Consequently, when trying to open and close large numbers of sockets rapidly, it was possible to run out of suitable sockets that were not waiting in the timewait state. This update improves the kernel rdmacm provider to implement the SO_REUSEADDR option available for TCP/UDP sockets, which allows a socket that is closed but still in the timewait state to be reused when needed. As a result, it is now much more difficult to run out of sockets because the rdmacm provider does not need to wait for them to expire from the timewait state before they can be reused.
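For readers unfamiliar with the socket option this fix mirrors, the following is a minimal Python sketch of SO_REUSEADDR on an ordinary TCP socket. It shows the TCP behavior that the rdmacm change is modeled on, not the rdmacm API itself. The helper name make_reusable_listener is hypothetical.

```python
import socket


def make_reusable_listener(host="127.0.0.1", port=0):
    """Return a listening TCP socket with SO_REUSEADDR enabled.

    Illustration only: this is the plain TCP option, not an rdmacm call.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # The option has to be set before bind() so that an address still held
    # by a connection in the timewait state can be bound again.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))  # port 0 lets the kernel choose a free port
    sock.listen(1)
    return sock


if __name__ == "__main__":
    listener = make_reusable_listener()
    print("listening on port", listener.getsockname()[1])
    listener.close()
```

Without the option, rebinding a recently closed address can fail until its timewait period expires, which is the exhaustion pattern described in this entry.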
BZ#735954
The framework of the MVAPICH2 process manager mpirun_rsh in the mvapich2 package was broken. Consequently, all attempts to use mpirun_rsh failed. This update upgrades mpirun_rsh to later MVAPICH2 upstream sources that resolve the problem. As a result, mpirun_rsh works as expected.
BZ#747406
Previously, the permissions on the /dev/ipath* files were not permissive enough for normal users to access. Consequently, when a normal user attempted to run a Message Passing Interface (MPI) application using the Performance Scaled Messaging (PSM) Byte Transfer Layer (BTL), it failed due to the inability to open files starting with /dev/ipath. This update makes sure that the files starting with /dev/ipath have the correct permissions to be opened in read-write mode by normal users. As a result, attempts to run an MPI application using the PSM BTL succeed.
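One way to confirm the corrected permissions from an unprivileged account is sketched below in Python. The function name check_rw_access is hypothetical, and the sketch only reports whether the current user may open each matching file in read-write mode; it does not open the devices.

```python
import glob
import os


def check_rw_access(pattern="/dev/ipath*"):
    """Map each path matching pattern to True if it is readable and writable
    by the current user, otherwise False."""
    return {
        path: os.access(path, os.R_OK | os.W_OK)
        for path in sorted(glob.glob(pattern))
    }


if __name__ == "__main__":
    results = check_rw_access()
    if not results:
        print("no matching device files found")
    for path, ok in results.items():
        print(path, "read-write OK" if ok else "NOT accessible")
```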
BZ#750609
Previously, mappings from InfiniBand bit values to link speeds only extended to Quad Data Rate (QDR). Consequently, attempts to use newer InfiniBand cards that supported speeds faster than QDR did not work because the stack did not understand the bit values in the link speed field. This update adds FDR (Fourteen Data Rate), FDR10, and EDR (Enhanced Data Rate) link speeds to the kernel and user space libraries. Users can now make use of newer InfiniBand cards at these higher speeds.
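The effect of the extended mapping can be sketched in Python as a lookup table. The bit values and nominal per-lane rates below are assumptions based on the active_speed encoding documented for ibv_query_port; verify them against the installed headers before relying on them. The names LINK_SPEEDS, PRE_UPDATE_SPEEDS, and decode_link_speed are hypothetical.

```python
# Assumed encoding of the port active_speed field:
# bit value -> (name, nominal per-lane rate in Gb/s).
LINK_SPEEDS = {
    1: ("SDR", 2.5),
    2: ("DDR", 5.0),
    4: ("QDR", 10.0),
    8: ("FDR10", 10.0),
    16: ("FDR", 14.0),
    32: ("EDR", 25.0),
}

# Model of the older behavior: only the values up to QDR were understood.
PRE_UPDATE_SPEEDS = {bits: v for bits, v in LINK_SPEEDS.items() if bits <= 4}


def decode_link_speed(bits, table=LINK_SPEEDS):
    """Return (name, rate) for a speed bit value, or ("unknown", None)."""
    return table.get(bits, ("unknown", None))


if __name__ == "__main__":
    for bits in (4, 8, 16, 32):
        print(bits, decode_link_speed(bits, PRE_UPDATE_SPEEDS), decode_link_speed(bits))
```

With the older table, the value for an FDR or EDR port decodes as unknown, which corresponds to the failure described in this entry.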
BZ#754196
OpenSM did not support the subnet_prefix option on the command line. Consequently, to run two instances of OpenSM on two different fabrics at the same time and on the same machine, the system administrator had to edit two different opensm.conf files and specify the subnet_prefix separately in each file so that the two subnets had different prefixes. With this update, OpenSM accepts a subnet_prefix option, and the OpenSM init script now starts OpenSM with this option when it is being started on multiple fabrics. As a result, a system administrator is no longer required to hand-edit multiple opensm.conf files to create otherwise identical configurations that differ only in the fabric they manage.
BZ#755459
Previously, ibv_devinfo (a program included in libibverbs-utils) did not catch bad port numbers on the command line and return an error code. Consequently, scripts could not reliably tell whether or not the command had succeeded or failed due to a bad port number. This update fixes ibv_devinfo so that it returns a non-zero error condition when a user attempts to run it on a non-existent InfiniBand device port. As a result, scripts can now tell for certain if the port value they pass to ibv_devinfo was a valid port or was out of range.
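A script can now rely on the exit status, as in the following Python 3 sketch. The helper names command_succeeded and port_is_valid are hypothetical, and the device name mlx4_0 in the usage example is only a placeholder. The sketch assumes ibv_devinfo accepts -d for the device and -i for the port.

```python
import subprocess


def command_succeeded(argv):
    """Run argv quietly and return True only if it exits with status 0.

    A command that is not installed is treated as a failure.
    """
    try:
        result = subprocess.run(
            argv, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )
    except FileNotFoundError:
        return False
    return result.returncode == 0


def port_is_valid(device, port):
    """Return True if ibv_devinfo reports success for this device port."""
    return command_succeeded(["ibv_devinfo", "-d", device, "-i", str(port)])


if __name__ == "__main__":
    # "mlx4_0" is a placeholder device name.
    for port in (1, 2, 99):
        state = "valid" if port_is_valid("mlx4_0", port) else "invalid"
        print("port", port, state)
```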
BZ#758498
Initialization of RDMA over Converged Ethernet (RoCE) based queue pairs (QPs) was not completed successfully when initialization was done through libibverbs and not through librdmacm. Consequently, attempting to open the connection failed and the following error message was displayed:
cannot transition QP to RTR state
This updated kernel stack provides a fix for the libibverbs based RoCE QP creation and now users can properly create QPs whether they use libibverbs or librdmacm as the connection initiation method.
BZ#768109
Previously, the openmpi library did not honor the tcp_port_range settings. Consequently, users who wanted to limit the TCP ports that openmpi used could not do so. This update upgrades openmpi to a later upstream version that does not have this problem, so users can now limit which TCP ports openmpi attempts to use.
BZ#768457
Previously, the shared OpenType font library libotf.so.0 was provided by both the openmpi package and the libotf package. Consequently, when an RPM spec file requested libotf.so.0 in order to operate properly, Yum could install either openmpi or libotf to satisfy the dependency. Because these two packages do not provide compatible libotf.so.0 libraries, the program might or might not work depending on which provider was selected. The libotf.so.0 in openmpi is an internal library and is not intended for other applications to link against. With this update, libotf.so.0 in openmpi is excluded from RPM's library identification searches. As a result, applications linking against libotf will get the right libotf, and openmpi will not accidentally be installed to satisfy the need for libotf.
BZ#773713
There was a race condition in the handling of completion events in the perftest programs. Under certain conditions, the perftest program being used would terminate unexpectedly with a segmentation fault. This update adds separate send and receive completion queues in place of the single completion queue for both send and receive operations. The race between the finish of a send and the finish of a receive is thereby avoided. As a result, the perftest applications no longer crash with a segmentation fault.
BZ#804002
The rds-ping tool did not check to make sure that a socket was available before sending the next ping packet. Consequently, when the timeout between packets was set very small by the user, packets could fill up all available sockets and then overwrite one of the sockets before any ping-packets were returned. This resulted in corruption in the rds-ping data structures and eventually rds-ping terminated unexpectedly with a segmentation fault. With this update, the rds-ping program stalls on sending any more packets if there are no sockets without outstanding packets. As a result, rds-ping no longer crashes with a segmentation fault when the timeout between packets is very small.
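The idea behind the fix can be modeled as a fixed pool of slots in which a slot is never reused while its packet is still outstanding. The Python sketch below is a conceptual illustration only, not the rds-ping source; the class SlotPool and its methods are hypothetical.

```python
class SlotPool:
    """Fixed set of slots; a slot is handed out only when nothing is
    outstanding on it."""

    def __init__(self, size):
        self._free = list(range(size))
        self._in_flight = set()

    def try_acquire(self):
        """Return a free slot, or None if every slot has a packet outstanding.

        On None the caller should stall instead of overwriting a busy slot.
        """
        if not self._free:
            return None
        slot = self._free.pop()
        self._in_flight.add(slot)
        return slot

    def release(self, slot):
        """Mark a slot free again once its reply has arrived."""
        if slot in self._in_flight:
            self._in_flight.remove(slot)
            self._free.append(slot)


if __name__ == "__main__":
    pool = SlotPool(2)
    first, second = pool.try_acquire(), pool.try_acquire()
    print("third attempt while both are busy:", pool.try_acquire())
    pool.release(first)
    print("after one reply:", pool.try_acquire())
```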
BZ#805129
Due to a bug in the libmlx4.conf modprobe configuration, usage of modprobe could result in an infinite loop of modprobe processes. If the bug was encountered, the processes would continually fork until no process was able to run, and the system would become unresponsive. This update improves the code. As a result, an incorrect configuration of options in /etc/modprobe.d/libmlx4.conf no longer leaves the system unresponsive and in need of a hard reboot to restore proper operation.
BZ#808673
The qperf application had an outdated constant for PF_RDS in its source code that did not match the officially assigned value for PF_RDS, so qperf was compiled with the wrong PF_RDS constant. Consequently, when it was run, it would mistakenly conclude that RDS (Reliable Datagram Sockets) was not supported on the machine even when it was, and would refuse to run any RDS tests. This update removes the PF_RDS constant from the qperf source code so that it picks up the correct constant from the system header files. As a result, qperf now properly runs RDS performance tests.
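The same principle, taking the address family constant from the platform instead of hard-coding it, can be sketched in Python, which exposes AF_RDS on Linux builds that support it. The function name rds_supported is hypothetical, and this is not how qperf itself performs the check.

```python
import socket


def rds_supported():
    """Return True if an RDS socket can be created on this machine."""
    # Take the constant from the platform's socket module; do not hard-code
    # a number that could drift from the officially assigned value.
    family = getattr(socket, "AF_RDS", None)
    if family is None:
        return False
    try:
        # Note: creating the socket may cause the kernel to autoload rds.
        sock = socket.socket(family, socket.SOCK_SEQPACKET, 0)
    except OSError:
        return False
    sock.close()
    return True


if __name__ == "__main__":
    print("RDS supported:", rds_supported())
```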
BZ#815215
The srptools RPM did not automatically add the SCSI RDMA Protocol daemon (srpd) to the service list. Consequently, the chkconfig --list command would not show the srpd service at all and the service could not be enabled. The srptools RPM now properly adds the srpd init script to the list of available services (it is disabled by default). Users can now see the srpd service using the chkconfig --list command and can enable it with the chkconfig --level 345 srpd on command.
BZ#815622
There was a bad test in the rdma init script. Consequently, the rds module would be loaded even if the user had configured it not to load. This update corrects the test in the init script so that all conditions must be met instead of just the first condition. As a result, the rds module is only loaded when the user has configured it to be loaded or if autoloaded by the kernel due to rds usage on the local machine.

Enhancements

BZ#700285
On large InfiniBand networks, Subnet Administration service lookups consumed a large amount of bandwidth. Consequently, it could take more than a minute to look up a route from one machine to another if the network's InfiniBand Subnet Manager (OpenSM) was heavily congested. This update adds the InfiniBand Communication Management Assistant (ibacm), which caches routes in a similar manner to the ARP cache for Ethernet. The ibacm program caches PathRecords from the Subnet Administration service (SA), which include information such as the MTU (Maximum Transmission Unit), SL (Service Level), SLID (Source Local Identifier), and DLID (Destination Local Identifier) for InfiniBand paths. This information is needed to set up QPs properly. As a result, large subnets with many nodes will have reduced overall SA query traffic and shorter route lookup times.
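The caching idea can be pictured with the following Python sketch of a time-limited PathRecord cache. It is a conceptual model under stated assumptions, not ibacm's implementation: the class names, the query_sa callback, and the 300-second lifetime are all hypothetical, and only the four fields named above are modeled.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class PathRecord:
    mtu: int   # Maximum Transmission Unit
    sl: int    # Service Level
    slid: int  # Source Local Identifier
    dlid: int  # Destination Local Identifier


class PathCache:
    """Cache path lookups so repeated requests do not reach the SA."""

    def __init__(self, query_sa, ttl_seconds=300.0, clock=time.monotonic):
        self._query_sa = query_sa  # callable: destination -> PathRecord
        self._ttl = ttl_seconds    # hypothetical entry lifetime
        self._clock = clock
        self._entries = {}         # destination -> (stored_at, record)

    def lookup(self, destination):
        now = self._clock()
        hit = self._entries.get(destination)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]          # served from cache, no SA query
        record = self._query_sa(destination)
        self._entries[destination] = (now, record)
        return record
```

Only the first lookup for a destination, or a lookup after the entry has expired, results in a query, which is how a cache of this kind reduces SA traffic on large subnets.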
Users of RDMA should upgrade to these updated packages, which provide numerous bug fixes and enhancements.