Latency spikes with i40e 10Gb Network Card

Solution Verified

Environment

  • Red Hat Enterprise Linux 8.x
  • network-latency Tuned profile
  • Intel i40e (X710) network card.
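To confirm that a host matches this environment, the driver and the active Tuned profile can be checked from the command line. The following is a minimal sketch, not part of the original solution: the helper names `uses_i40e` and `uses_network_latency_profile` are hypothetical, and the sketch assumes the usual output formats of `ethtool -i` and `tuned-adm active`.

```shell
# Hypothetical helpers; each one reads command output on stdin.

# Succeeds if `ethtool -i <interface>` output names i40e as the driver.
uses_i40e() {
    grep -q '^driver: i40e$'
}

# Succeeds if `tuned-adm active` output names the network-latency profile.
uses_network_latency_profile() {
    grep -q '^Current active profile: network-latency$'
}

# Usage (replace <interface> with the name of the X710 port):
#   ethtool -i <interface> | uses_i40e && echo "i40e driver in use"
#   tuned-adm active | uses_network_latency_profile && echo "network-latency active"
#   uname -r    # running kernel version
```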

Issue

During network performance tests and benchmarks between two Red Hat Enterprise Linux hosts with Intel X710 (i40e) network cards and the network-latency Tuned profile applied, latency spikes were observed.
The spikes are around 200 ms and are usually followed by retransmitted packets.
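One simple way to look for spikes of this size is to filter `ping` replies by round-trip time. This is only a sketch under assumptions: `ping` is not necessarily the tool used in the original tests, the helper name `rtt_spikes` is hypothetical, and the 100 ms default threshold is an arbitrary value chosen to sit below the reported ~200 ms.

```shell
# Hypothetical helper: print ping replies whose RTT exceeds a threshold (ms).
# Reads `ping` output on stdin; the threshold defaults to 100 ms (assumed value).
rtt_spikes() {
    threshold_ms="${1:-100}"
    awk -v limit="$threshold_ms" '
        {
            # Look for the "time=<rtt>" field in each reply line.
            for (i = 1; i <= NF; i++) {
                if ($i ~ /^time=/) {
                    rtt = substr($i, 6) + 0
                    if (rtt > limit) print
                }
            }
        }'
}

# Usage (<peer-address> is a placeholder for the other host):
#   ping -c 1000 -i 0.2 <peer-address> | rtt_spikes 100
```

Retransmissions around the same time can be cross-checked with the TCP counters, for example `nstat -az TcpRetransSegs`.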

Resolution

To mitigate the issue, upgrade to the latest kernel version available for your RHEL minor release. The fix was released in RHEL 8.10 and in the latest z-stream kernels for the other supported RHEL 8 releases.

As a workaround, it is also possible to disable adaptive ITR (Interrupt Throttle Rate) on the affected interface:

ethtool -C <interface> adaptive-rx off adaptive-tx off
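After changing the coalescing settings, the current state can be read back with `ethtool -c <interface>` (lowercase `-c` shows the settings, uppercase `-C` changes them). The sketch below wraps that check in a hypothetical helper, `adaptive_itr_is_off`, and assumes the usual `Adaptive RX: ... TX: ...` line in the output, which may differ between ethtool versions.

```shell
# Hypothetical helper: succeeds (exit 0) only if both adaptive RX and
# adaptive TX are reported as off. Reads `ethtool -c` output on stdin.
adaptive_itr_is_off() {
    grep -Eq '^Adaptive RX: off[[:space:]]+TX: off'
}

# Usage (replace <interface> with the name of the X710 port):
#   ethtool -c <interface> | adaptive_itr_is_off && echo "adaptive ITR disabled"
```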

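Or, to make the setting persistent, use NetworkManager (here <interface> is the name of the connection profile, which often matches the interface name):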

nmcli con mod <interface> ethtool.coalesce-adaptive-tx 0
nmcli con mod <interface> ethtool.coalesce-adaptive-rx 0
nmcli con up <interface>

Root Cause

The adaptive ITR implementation in the i40e network driver tries to keep the ITR value at approximately 54 µs during a session. Occasionally the algorithm drops the value to 2 µs and then immediately switches back to 54 µs, and these oscillations correspond to the observed spikes. When the oscillation occurs, the driver may fail to process some completed descriptors from the hardware when exiting busy-poll mode. A new approach to ITR handling in this driver mitigates the effect.

