25GbE and 100GbE Ethernet devices not reaching expected performance on a system with an AMD EPYC CPU

Solution Verified

Environment

  • Red Hat Enterprise Linux 7.4
  • AMD EPYC based servers
  • 25Gb or 100Gb Ethernet adapter

Issue

  • 25GbE and 100GbE adapters deliver lower-than-expected performance on AMD EPYC based systems.
  • The IOMMU on AMD Naples (first-generation EPYC) interferes with network processing.

Resolution

  • RHEL 7 - Install kernel-3.10.0-862.el7 (or newer) released via RHSA-2018:1062
  • RHEL 7.4 Extended Update Support (EUS) - Install kernel-3.10.0-693.21.1.el7 (or newer) released via RHSA-2018:0395
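To check whether a given RHEL 7 kernel is at or above the fixed builds listed above, the comparison can be sketched as follows. This is a minimal illustration, not a Red Hat tool: the function names are hypothetical, and it assumes that the "or newer" wording covers every 3.10.0 el7 build whose release number is 693.21.1 or higher, including later 7.4 EUS z-stream kernels.

```python
import platform
import re


def parse_el7_release(uname_r):
    """Return the release part of a RHEL 7 kernel string as a tuple of ints.

    Example: "3.10.0-693.21.1.el7.x86_64" -> (693, 21, 1).
    Returns None if the string is not a 3.10.0 el7 kernel.
    """
    m = re.match(r"^3\.10\.0-([0-9.]+?)\.el7", uname_r)
    if m is None:
        return None
    return tuple(int(part) for part in m.group(1).split("."))


def has_iommu_fix(uname_r):
    """True if the kernel is at or above 3.10.0-693.21.1.el7.

    Assumption: 3.10.0-862.el7 and later compare as newer, so a single
    lower bound covers both builds named in the Resolution section.
    Returns None for kernels that are not RHEL 7 builds.
    """
    release = parse_el7_release(uname_r)
    if release is None:
        return None
    return release >= (693, 21, 1)


if __name__ == "__main__":
    running = platform.release()  # same string as `uname -r`
    print(running, has_iommu_fix(running))
```

A result of None means the kernel string was not recognized as a RHEL 7 build, in which case this check does not apply.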

Workaround

If you are running an older release, the suggested workaround is to boot the system with the kernel argument "iommu=pt". More information on how to make this change persistent is available here.
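After rebooting, it can be useful to confirm that the argument actually reached the running kernel. The sketch below reads the kernel command line from /proc/cmdline and looks for an exact "iommu=pt" token. The helper names are hypothetical and chosen only for illustration.

```python
def has_kernel_arg(cmdline, arg="iommu=pt"):
    """Return True if `arg` appears as a whole token on the kernel command line."""
    # Kernel arguments are whitespace-separated, so an exact token match
    # avoids false positives such as "amd_iommu=pt".
    return arg in cmdline.split()


def read_cmdline(path="/proc/cmdline"):
    """Return the command line the running kernel was booted with."""
    with open(path) as f:
        return f.read()


if __name__ == "__main__":
    if has_kernel_arg(read_cmdline()):
        print("iommu=pt is active")
    else:
        print("iommu=pt is not set on the running kernel")
```

If the argument is missing after a reboot, the change was most likely not written to the boot loader configuration for the kernel that was actually booted.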

Root Cause

With newer, higher-speed devices, perf data shows that the amount of MMIO performed when submitting commands to the IOMMU degrades network performance.

This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.
