Packet loss caused by VGA controller latency with nomodeset
Issue
Interaction with integrated VGA controllers can cause latency spikes on the memory bus. This can occur when nomodeset and console=tty0 are in the boot arguments and any message is logged by the kernel (network device being set up, martian packet entering an interface, PCI device being assigned to a virtual machine, etc.).
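To check whether a host boots with the combination described above, the kernel command line can be inspected. The following is a minimal sketch, not an official diagnostic: it assumes the running kernel's arguments are readable from /proc/cmdline, and the function names trigger_args_present and is_exposed are illustrative.

```python
from pathlib import Path

# Boot arguments that this article names as the triggering combination.
TRIGGER_ARGS = ("nomodeset", "console=tty0")


def trigger_args_present(cmdline):
    """Return the trigger arguments found in a kernel command line string."""
    tokens = cmdline.split()
    return [arg for arg in TRIGGER_ARGS if arg in tokens]


def is_exposed(cmdline):
    """Return True only when every trigger argument is present."""
    return len(trigger_args_present(cmdline)) == len(TRIGGER_ARGS)


if __name__ == "__main__":
    # Assumption: /proc/cmdline holds the arguments of the running kernel.
    proc_cmdline = Path("/proc/cmdline")
    cmdline = proc_cmdline.read_text() if proc_cmdline.is_file() else ""
    found = trigger_args_present(cmdline)
    print("trigger arguments found:", ", ".join(found) or "none")
    print("both present:", is_exposed(cmdline))
```

Matching whole tokens rather than substrings avoids a false positive on a different console setting such as console=ttyS0,115200.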
This affects all CPUs of the NUMA node where the VGA controller is located. All isolation mechanisms are bypassed (isolcpus, tuned cpu-partitioning, irqbalance, etc.). This is probably a hardware issue related to integrated GPU devices that are often found in off-the-shelf servers.
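Because the impact is scoped to the NUMA node that hosts the VGA controller, it can help to identify that node and its CPUs. The sketch below is one possible approach using sysfs, under two assumptions that are not stated in this article: the controller reports the PCI class code for VGA-compatible devices (0x0300), and the platform populates the numa_node attribute (it reads -1 when the firmware provides no affinity). The function names are illustrative.

```python
from pathlib import Path

# PCI class code prefix for VGA-compatible controllers (assumed to apply here).
VGA_CLASS_PREFIX = "0x0300"


def vga_numa_nodes(sysfs_root="/sys"):
    """Map each VGA controller's PCI address to its NUMA node (-1 if unknown)."""
    result = {}
    for dev in sorted(Path(sysfs_root, "bus/pci/devices").glob("*")):
        class_file = dev / "class"
        numa_file = dev / "numa_node"
        if not class_file.is_file() or not numa_file.is_file():
            continue
        if class_file.read_text().strip().startswith(VGA_CLASS_PREFIX):
            result[dev.name] = int(numa_file.read_text().strip())
    return result


def node_cpulist(node, sysfs_root="/sys"):
    """Return the CPU list string of a NUMA node, or None if it is not exposed."""
    path = Path(sysfs_root, "devices/system/node", f"node{node}", "cpulist")
    return path.read_text().strip() if path.is_file() else None


if __name__ == "__main__":
    for address, node in vga_numa_nodes().items():
        cpus = node_cpulist(node) if node >= 0 else None
        print(f"{address}: NUMA node {node}, CPUs {cpus or 'unknown'}")
```

Comparing the reported CPU list with the CPUs pinned to latency-sensitive workloads shows whether those workloads share a node with the controller.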
This issue impacts all latency-sensitive workloads running on that NUMA node (OVS-DPDK, virtual machines running with SR-IOV, etc.). Symptoms include random packet drops and sudden round-trip latency spikes.
Environment
- RHEL 8
- RHEL 9
- OpenStack Platform
- OpenShift Container Platform
- Servers with an integrated VGA controller (Matrox, Aspeed)
The exact list of affected platforms is not yet known.