Issues with RDMA and InfiniBand network on RHEL 7.9

I set up a small environment to test InfiniBand throughput and am running into an issue I could use some help with.

The network looks good according to most tools, such as iblinkinfo, ibdiagnet, and ibstatus. I can ping between the two servers using ibping.

However, I get errors if I try to use rping, or ib_write_bw with '-z' for RDMA.

Examples:
[root@016514 ~]# ib_write_bw 10.0.0.170 --report_gbits -F -a -d mlx5_0 -i 1 -z
Received 10 times ADDR_ERROR
Unable to perform rdma_client function
Unable to init the socket connection

[root@016515 ~]# rping -c -v -V -a 10.0.0.170 -C 10000
cma event RDMA_CM_EVENT_ADDR_ERROR, error -19
waiting for addr/route resolution state 1
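
The -19 in the rping output looks like a negated Linux errno value. Below is a minimal sketch for decoding it, assuming standard Linux errno numbering; decode_errno is a hypothetical helper name used only for illustration and is not part of any RDMA tool.

```python
import errno
import os


def decode_errno(value):
    """Return (symbolic name, message) for a possibly negated errno value."""
    number = abs(value)  # kernel-style error codes are often reported negated
    name = errno.errorcode.get(number, "UNKNOWN")
    return name, os.strerror(number)


if __name__ == "__main__":
    # -19 is the value reported alongside RDMA_CM_EVENT_ADDR_ERROR above.
    name, message = decode_errno(-19)
    print(name, "-", message)
```

On Linux, 19 maps to ENODEV. That may mean the RDMA CM could not associate 10.0.0.170 with an RDMA device, but this is an interpretation, not a confirmed diagnosis.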

The hardware involved is Dell R740 servers running RHEL 7.9, SB7800 and SB7790 switches, and ConnectX-5 NICs in InfiniBand mode. The interfaces in RHEL are set up for IPoIB.

NVIDIA has not been much help, as there are no entitlements on any of the hardware.

Responses