NICs with RX acceleration (GRO/LRO/TPA/etc.) may suffer from poor TCP performance
Environment
- Red Hat Enterprise Linux versions:
  - Broadcom bnx2x module prior to RHEL 5.7 (kernel earlier than 2.6.18-274.el5)
  - QLogic NetXen netxen_nic module prior to RHEL 5.9 (kernel earlier than 2.6.18-348.el5)
  - Intel 10Gbps ixgbe module prior to RHEL 6.4 (kernel earlier than 2.6.32-358.el6)
  - Intel 10Gbps ixgbe module from RHEL 5.6 (kernel version 2.6.18-238.el5 and later)
- Receive offloading enabled on network interface
Issue
Network interface cards (NICs) with receive (RX) acceleration (GRO, LRO, TPA, etc.) may suffer from poor performance. Some effects include:
- NFS transfers over 10Gbps links only reach about 100MiB/sec (roughly 1Gbps)
- TCP connections never get anywhere near wire speed
- In tcpdump, the TCP window is observed to clamp down to a small value (such as 720 bytes) and never recover
Resolution
Solution
Upgrade to the following kernel versions:
- Broadcom bnx2x - RHEL 5.7, kernel-2.6.18-274.el5
- QLogic NetXen netxen_nic - RHEL 5.9, kernel-2.6.18-348.el5
- Intel 10Gbps ixgbe - RHEL 6.4, kernel-2.6.32-358.el6
- There is no resolution on RHEL 5 for Intel 10Gbps ixgbe
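To check whether a host is already running a fixed kernel, compare the running kernel and the driver bound to the interface against the versions listed above; eth0 below is only an example interface name:
$ uname -r
$ ethtool -i eth0 | grep '^driver'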
Workaround
- Disable GRO/LRO/TPA or other RX (receive) accelerations.
- Most NICs can be handled via the ethtool utility with commands such as:
# ethtool -K eth0 gro off
# ethtool -K eth0 lro off
- Write an /sbin/ifup-local script to persist the ethtool configuration across reboots (a sketch is shown after this list).
- For bnx2x, offloading can be controlled by a module option in /etc/modprobe.conf:
options bnx2x disable_tpa=1
A module option such as this requires a module reload or a reboot to apply.
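As an illustration of the persistence step above, a minimal /sbin/ifup-local sketch could look like the following; the interface name eth0 and the exact offload flags are assumptions and should be adjusted to the environment:
#!/bin/bash
# /sbin/ifup-local is run by the RHEL network scripts after an interface
# comes up, with the interface name as the first argument.
# Sketch only: eth0 and the specific offload flags are assumptions.
IF="$1"
if [ "$IF" = "eth0" ]; then
    # Turn off generic and large receive offload on this interface
    /sbin/ethtool -K "$IF" gro off
    /sbin/ethtool -K "$IF" lro off
fi
Remember to make the script executable (for example, chmod +x /sbin/ifup-local) so the network scripts will run it.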
Root Cause
- The receive MSS estimate was miscalculated from the raw packet size when the packet had been aggregated by GRO/LRO/TPA, which confused the TCP stack.
- This issue was tracked via the following Red Hat private bugs: 629609, 651546, 653357, 656360, 786403, 819647, 819101.
- The bugs listed above are closed with the following resolutions:
  - 629609, 651546, 653357, 656360 - bnx2x driver update in RHEL 5.7: https://access.redhat.com/errata/RHSA-2011:1065
  - 786403 - netxen_nic driver update in RHEL 5.9: https://access.redhat.com/errata/RHBA-2013:0006
  - 819647, 819101 - ixgbe driver update in RHEL 6.4: https://access.redhat.com/errata/RHSA-2013:0496
For netxen_nic specifically:
Due to incorrect information provided by firmware, the netxen_nic driver did
not calculate the correct Generic Segmentation Offload (GSO) length of
packets that were received using the Large Receive Offload (LRO)
optimization. This caused network traffic flow to be extensively delayed for
NICs using LRO on netxen_nic, which had a huge impact on NIC performance
(in some cases, throughput for some 1Gb NICs could be below 100 kbps). With
this update, firmware now provides the correct GSO packet length and the
netxen_nic driver has been modified to handle new information provided by
firmware correctly. Throughput of the NICs using the LRO optimization with
the netxen_nic driver is now within expected levels.
Diagnostic Steps
Two main effects were observed:
- The TCP connection window never increases to a sufficiently large value (staying as low as 720 bytes). This can be checked via general traffic captures.
- The RHEL host may delay an awaited ACK by 40ms, due to improper activation of the delayed ACK mechanism. Observe the TCP connection using Wireshark: open IO Graphs with 0.1 second per tick, displaying packets/tick. There will be valleys where delayed ACKs occur (a command-line alternative is sketched below).
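As an alternative to the Wireshark GUI, a tshark invocation along these lines can flag ACKs that arrive noticeably late; the capture file name trace.pcap and the 30ms threshold are assumptions chosen for illustration (on older tshark versions, use -R in place of -Y):
# List ACKs whose measured ACK RTT exceeds 30ms, which typically exposes
# delayed ACKs firing (file name and threshold are only examples)
$ tshark -r trace.pcap -Y 'tcp.analysis.ack_rtt > 0.030' \
    -T fields -e frame.number -e tcp.stream -e tcp.analysis.ack_rtt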
Check if receive offloading is enabled:
$ grep 'receive-offload' sos_commands/networking/ethtool_-k_eth0 | grep ': on'
generic-receive-offload: on
large-receive-offload: on
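On a live system (rather than in a sosreport), the same settings can be inspected directly with ethtool; eth0 is an example interface name:
# ethtool -k eth0 | grep -i 'receive-offload'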
Try disabling it and see if the issue improves:
# ethtool -K eth0 gro off lro off