Configuring InfiniBand interfaces for RoCE on Oracle OCI bare metal shapes
Environment
- Red Hat Enterprise Linux (RHEL) 8
- Red Hat Enterprise Linux (RHEL) 9
Oracle OCI bare metal shapes that support RoCE:
- BM.Optimized3.36
- BM.GPU4.8
- BM.GPU.A100-v2.8
Issue
- How is the InfiniBand interface configured for RoCE on Oracle OCI bare metal shapes?
Resolution
NOTE: To enable RDMA on a supported Oracle OCI bare metal instance, the instance must be created inside a Compute Cluster. After the instance is created, the Oracle-provided oci-cn-auth (Oracle OCI Cluster Network Authentication) helper tool is required to configure authentication for the interface.
Install the required InfiniBand packages
# dnf install libibverbs libibverbs-utils infiniband-diags rdma-core librdmacm-utils
# modprobe rdma_cm
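After loading the module, it can help to confirm that it is actually present. The following is a minimal sketch, not part of the original procedure: the helper name module_loaded is hypothetical, and the check simply looks for the module name in /proc/modules. If ibv_devices (from libibverbs-utils) is installed, it is also run to list the RDMA devices.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the named kernel module appears in a
# /proc/modules-style listing (second argument defaults to /proc/modules).
module_loaded() {
    grep -q "^$1 " "${2:-/proc/modules}" 2>/dev/null
}

if module_loaded rdma_cm; then
    echo "rdma_cm is loaded"
else
    echo "rdma_cm is not loaded; run: modprobe rdma_cm"
fi

# List RDMA devices when the libibverbs-utils tools are available.
if command -v ibv_devices >/dev/null 2>&1; then
    ibv_devices
fi
```

Note that modprobe does not persist across reboots on its own, so the same check is useful after a restart.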
Install the packages required for the Oracle oci-cn-auth package
For RHEL 8
# dnf install python3-cffi python3-cryptography python3-pycparser python3-psutil python3-pyOpenSSL wpa_supplicant
For RHEL 9
NOTE: python3-pyOpenSSL was deprecated and removed in RHEL 9. It is still available in the following channels:
- OpenStack Tools for RHEL 9
- Red Hat OpenShift Container Platform
It is also available in the Extra Packages for Enterprise Linux (EPEL) repository for RHEL 9, or from Oracle support [1].
# dnf install python3-cffi python3-cryptography python3-pycparser python3-psutil wpa_supplicant
# rpm -ivh python3-pyOpenSSL-*.noarch.rpm
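When the same steps are scripted for both releases, the package list can be selected from the release number. The sketch below is illustrative only: os_major_version is a hypothetical helper that reads VERSION_ID from /etc/os-release, and the script only prints the dnf command it would suggest rather than running it.

```shell
#!/bin/sh
# Hypothetical helper: print the major version from an os-release style file
# (defaults to /etc/os-release). The subshell keeps sourced variables local.
os_major_version() {
    ( . "${1:-/etc/os-release}" && printf '%s\n' "${VERSION_ID%%.*}" )
}

# Packages common to both releases, taken from the commands above.
common="python3-cffi python3-cryptography python3-pycparser python3-psutil wpa_supplicant"

case "$(os_major_version 2>/dev/null)" in
    8) echo "dnf install $common python3-pyOpenSSL" ;;
    9) echo "dnf install $common (python3-pyOpenSSL must be installed separately)" ;;
    *) echo "Unrecognized release; choose the package list manually" ;;
esac
```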
Add a firewall rule for RoCE
# firewall-cmd --permanent --zone=public --add-port=4791/udp
# firewall-cmd --reload
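To confirm that the rule took effect after the reload, the open ports of the zone can be listed and checked for 4791/udp. This is a small sketch under the assumption that the public zone is the one in use, as in the commands above; port_listed is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the given port/protocol pair appears in a
# space-separated port list read from stdin (the format of --list-ports).
port_listed() {
    tr -s ' ' '\n' | grep -qxF "$1"
}

if command -v firewall-cmd >/dev/null 2>&1; then
    if firewall-cmd --zone=public --list-ports 2>/dev/null | port_listed 4791/udp; then
        echo "4791/udp is open in the public zone"
    else
        echo "4791/udp is not open in the public zone"
    fi
fi
```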
Install oci-cn-auth
Contact Oracle support [1] to obtain the oci-cn-auth RPM package.
# rpm -ivh oci-cn-auth-*.noarch.rpm
# systemctl enable --now oci-cn-auth
SELinux
NOTE: If SELinux is enabled on the instance, a file context mapping must be added for the client.p12 certificate. If the SELinux context is incorrect, the journal will contain errors similar to "SELinux is preventing /usr/sbin/wpa_supplicant from read access on the file client.p12".
# semanage fcontext -a -t NetworkManager_etc_t "/var/run/oci-cn-auth(/.*)?"
# restorecon -vFR /var/run/oci-cn-auth
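The resulting label can be checked once restorecon has run. The following sketch assumes the directory and type from the semanage rule above; selinux_type is a hypothetical helper that extracts the type field from a context string as printed by stat -c %C.

```shell
#!/bin/sh
# Hypothetical helper: print the SELinux type (third colon-separated field)
# of a context string such as system_u:object_r:NetworkManager_etc_t:s0.
selinux_type() {
    printf '%s\n' "$1" | cut -d: -f3
}

dir=/var/run/oci-cn-auth        # path from the semanage rule above
expected=NetworkManager_etc_t   # type from the semanage rule above

if [ -e "$dir" ]; then
    ctx=$(stat -c %C "$dir" 2>/dev/null) || ctx=""
    if [ "$(selinux_type "$ctx")" = "$expected" ]; then
        echo "OK: $dir is labeled $expected"
    else
        echo "Unexpected context on $dir: $ctx"
    fi
else
    echo "$dir does not exist yet"
fi
```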
Configure the interface
# ip addr add 10.0.0.87/24 dev eth2
# ip link set dev eth2 up
# oci-cn-auth --interface eth2
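Before moving on, it is worth confirming that the interface is up and carries the address. The sketch below parses the brief output of ip -br addr; iface_ready is a hypothetical helper, and eth2 and 10.0.0.87/24 are the example values from the commands above.

```shell
#!/bin/sh
# Hypothetical helper: reads `ip -br addr show` output on stdin and succeeds
# if a line reports state UP and lists the expected CIDR address ($1).
iface_ready() {
    awk -v addr="$1" '
        $2 == "UP" { for (i = 3; i <= NF; i++) if ($i == addr) found = 1 }
        END { exit !found }
    '
}

if command -v ip >/dev/null 2>&1 &&
   ip -br addr show dev eth2 2>/dev/null | iface_ready 10.0.0.87/24; then
    echo "eth2 is UP with 10.0.0.87/24"
else
    echo "eth2 is not UP or does not have 10.0.0.87/24"
fi
```

Addresses added with ip addr add are not persistent, so this check is also useful after a reboot.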
Check the authentication status of oci-cn-auth
# oci-cn-auth -i
If everything is configured correctly then you will see output similar to:
current certificate valid until: 2023-07-20 22:14:30 | next certificate valid until: 2023-07-20 22:14:30
Certificate not changed. Skipping...
stat: 1689884945.3273556 1689884497.1571069
eth2 active: True authenticated: True
AUTHENTICATED
0d2558dc737cdbd3d75688205ff...
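For monitoring or automation, the status line can be checked with a small script. This is a sketch that assumes the output format matches the sample above (a line of the form "eth2 active: True authenticated: True"); the format is not a documented interface and may differ between oci-cn-auth versions. The helper name iface_authenticated is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: reads status text on stdin and succeeds if the given
# interface ($1) is reported as both active and authenticated.
iface_authenticated() {
    grep -Eq "^$1[[:space:]]+active:[[:space:]]+True[[:space:]]+authenticated:[[:space:]]+True"
}

if command -v oci-cn-auth >/dev/null 2>&1; then
    if oci-cn-auth -i 2>&1 | iface_authenticated eth2; then
        echo "eth2 is authenticated"
    else
        echo "eth2 is NOT authenticated"
    fi
fi
```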
[1] Disclaimer: Links contained herein to external website(s) are provided for convenience only. Red Hat has not reviewed the links and is not responsible for the content or its availability. The inclusion of any link to an external website does not imply endorsement by Red Hat of the website or their entities, products or services. You agree that Red Hat is not responsible or liable for any loss or expenses that may result due to your use of (or reliance on) the external site or content.