RDMA services fail to start after kernel update
Issue
- After a kernel update, Infiniband infrastructure no longer works.
- Systemd indicates the RDMA related services are not loading with failure messages similar to the following:
[root@<HOSTNAME>]# systemctl status rdma.service -l
* rdma.service - Initialize the iWARP/InfiniBand/RDMA stack in the kernel
Loaded: loaded (/usr/lib/systemd/system/rdma.service; enabled; vendor preset: disabled)
Active: failed (Result: exit-code) since Mon 2018-02-12 12:05:59 EST; 4min 5s ago
Docs: file:/etc/rdma/rdma.conf
Process: 30425 ExecStart=/usr/libexec/rdma-init-kernel (code=exited, status=3)
Main PID: 30425 (code=exited, status=3)
Feb 12 12:05:59 <HOSTNAME> rdma-init-kernel[30425]: modprobe: FATAL: Module ib_mad not found.
Feb 12 12:05:59 <HOSTNAME> rdma-init-kernel[30425]: Failed to load module ib_mad
Feb 12 12:05:59 <HOSTNAME> rdma-init-kernel[30425]: modprobe: FATAL: Module ib_sa not found.
Feb 12 12:05:59 <HOSTNAME> rdma-init-kernel[30425]: Failed to load module ib_sa
Feb 12 12:05:59 <HOSTNAME> rdma-init-kernel[30425]: modprobe: FATAL: Module ib_addr not found.
Feb 12 12:05:59 <HOSTNAME> rdma-init-kernel[30425]: Failed to load module ib_addr
Feb 12 12:05:59 <HOSTNAME> systemd[1]: rdma.service: main process exited, code=exited, status=3/NOTIMPLEMENTED
Feb 12 12:05:59 <HOSTNAME> systemd[1]: Failed to start Initialize the iWARP/InfiniBand/RDMA stack in the kernel.
Feb 12 12:05:59 <HOSTNAME> systemd[1]: Unit rdma.service entered failed state.
Feb 12 12:05:59 <HOSTNAME> systemd[1]: rdma.service failed.
Environment
- Red Hat Enterprise Linux 7
- Specifically
kernel-3.10.0-514.el7
and above
- Specifically
- RDMA-specific hardware using Red Hat provided Infiniband/RDMA modules
Subscriber exclusive content
A Red Hat subscription provides unlimited access to our knowledgebase, tools, and much more.