TCP performance considerations (F-RTO and SACK)
Red Hat Insights can detect this issue
Environment
- Red Hat Enterprise Linux 6
- Red Hat Enterprise Linux 7
- TCP connection
- F-RTO is enabled on the sender's end
- TCP Selective Acknowledgement (SACK) and TCP timestamps are disabled on either the sender's or receiver's end
Issue
- Slow TCP packet flow that can affect low-latency environments
- Slow file transfer
- Slow NFS access
- Packet flow may stall for tens of seconds, up to 2 minutes
Resolution
On the sending system
Either:
- For RHEL 7, update to kernel-3.10.0-1160.el7 (bz#1694860) or later.
Or:
- Disable F-RTO:
  - Add (or change) this line in /etc/sysctl.conf:
    net.ipv4.tcp_frto = 0
  - Run this command to activate the setting:
    # sysctl -p
- Note: Disabling F-RTO is strongly advisable for all hard-wired RHEL 6 servers. We recommend leaving SACK enabled in most use cases.
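The sysctl.conf edit above can be made repeatable, so that re-running it does not leave duplicate lines in the file. The following is a minimal sketch, not a Red Hat supplied tool: the function name set_sysctl_conf is hypothetical, and it assumes GNU grep and sed as shipped with RHEL.

```shell
# Hypothetical helper (not part of any Red Hat package): set "KEY = VALUE" in
# a sysctl-style file, replacing an existing assignment rather than appending
# a duplicate. Commented-out lines are left untouched.
set_sysctl_conf() {
    file=$1 key=$2 value=$3
    # Escape dots so the key is matched literally in the regular expressions.
    key_re=$(printf '%s' "$key" | sed 's/\./\\./g')
    if grep -Eq "^[[:space:]]*${key_re}[[:space:]]*=" "$file" 2>/dev/null; then
        # An assignment already exists: rewrite it in place.
        sed -i -E "s|^[[:space:]]*${key_re}[[:space:]]*=.*|${key} = ${value}|" "$file"
    else
        # No assignment yet: append one.
        printf '%s = %s\n' "$key" "$value" >> "$file"
    fi
}

# Example (as root), followed by the activation step from the article:
#   set_sysctl_conf /etc/sysctl.conf net.ipv4.tcp_frto 0 && sysctl -p
```

To change the running value without persisting it across reboots, `sysctl -w net.ipv4.tcp_frto=0` can be used instead.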
Root Cause
The code handling F-RTO does not correctly reset the retransmission timeout (RTO) timer.
For the behavior to occur, the following combination must be present:
- packet losses between the sender and receiver
- F-RTO is enabled on the sender's end
- TCP Selective Acknowledgement (SACK) and TCP timestamps are disabled on either the sender's or receiver's end
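The local half of these conditions can be checked from the three relevant tunables. The sketch below is illustrative only: the function name frto_stall_risk and its output labels are hypothetical, and it reads the condition as SACK and timestamps both being disabled, which is an assumption about the wording above. It can only inspect the local host, so a result of depends-on-peer means the remote end must be checked as well.

```shell
# Hypothetical helper: classify the local tunables against the conditions
# listed above. Arguments are the values of net.ipv4.tcp_frto,
# net.ipv4.tcp_sack and net.ipv4.tcp_timestamps, in that order.
# Assumption: "SACK and timestamps disabled" means both are 0.
frto_stall_risk() {
    frto=$1 sack=$2 timestamps=$3
    if [ "$frto" -eq 0 ]; then
        # F-RTO is off, so this host does not hit the issue as a sender.
        echo "not-affected"
    elif [ "$sack" -eq 0 ] && [ "$timestamps" -eq 0 ]; then
        # F-RTO on, SACK and timestamps off locally.
        echo "exposed"
    else
        # F-RTO on; the peer may still have SACK and timestamps disabled.
        echo "depends-on-peer"
    fi
}

# Example using the live values:
#   frto_stall_risk "$(sysctl -n net.ipv4.tcp_frto)" \
#                   "$(sysctl -n net.ipv4.tcp_sack)" \
#                   "$(sysctl -n net.ipv4.tcp_timestamps)"
```

Packet loss, the remaining condition, cannot be determined from tunables and has to be confirmed from a packet capture.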
F-RTO (Forward RTO-Recovery) is a standard TCP performance feature that provides better recovery from packet drops with random causes, as opposed to congestion. Wireless links, where packets are dropped because of radio interference, are the major beneficiary. The feature was first supported in Red Hat Enterprise Linux 6 and is enabled by default per upstream recommendation, but it can be disabled with a kernel tunable.
F-RTO operates by sending a probe segment after a timeout-induced retransmission, where the probe is the first as-yet unacknowledged segment. When a single packet loss caused the retransmit timeout and the segments following that packet were properly received, the F-RTO probe elicits a full cumulative ACK; the sender then knows that the loss was not congestion-induced and can cancel congestion-control measures. Alternatively, when several segments were lost, the receiver is not fully up to date with the data sequence and should respond to the probe with an ACK only for the in-sequence data received (a dup-ACK); the sender then knows that the loss could have been caused by congestion and must continue with full recovery measures.
The most straightforward action is to disable F-RTO on the sending system(s).
References
- RFC 2001 - TCP Slow Start, Congestion Avoidance, Fast Retransmit, and Fast Recovery Algorithms (1997)
- RFC 4138 - Forward RTO-Recovery (2005)
- Network-related kernel tunables
- RHEL: Very low TCP connection throughput
- TCP performance issues and stalls when using kernel-3.10.0-957.21.3.el7 or any kernel with TCP SACK PANIC CVE fixes
Diagnostic Steps
The following trace shows an example of the worst-case scenario. Starting at frame 282, the time between retransmissions (column 2) doubles until it reaches approximately 120 seconds in frame 310, and then stays there for two further retransmissions before recovering. The cycle repeats starting at frame 604. The doubling of the interval between successive retransmissions with different sequence numbers is the key to identifying this issue.
$ tshark -tdd -r test4-2020-04-09.stream-17.pcapng -Y "tcp.analysis.retransmission"
280 0.000000000 192.168.1.15 → 192.168.1.126 1514 TCP [TCP Retransmission] 47618 → 22 [ACK] Seq=1474902 Ack=4674 Win=35456 Len=1460
282 0.411983178 192.168.1.15 → 192.168.1.126 1514 TCP [TCP Retransmission] 47618 → 22 [ACK] Seq=1476362 Ack=4674 Win=35456 Len=1460
286 0.809050437 192.168.1.15 → 192.168.1.126 1514 TCP [TCP Retransmission] 47618 → 22 [ACK] Seq=1477822 Ack=4674 Win=35456 Len=1460
288 1.615942999 192.168.1.15 → 192.168.1.126 1514 TCP [TCP Retransmission] 47618 → 22 [ACK] Seq=1479282 Ack=4674 Win=35456 Len=1460
292 3.233008685 192.168.1.15 → 192.168.1.126 1514 TCP [TCP Retransmission] 47618 → 22 [ACK] Seq=1480742 Ack=4674 Win=35456 Len=1460
296 6.463819240 192.168.1.15 → 192.168.1.126 1514 TCP [TCP Retransmission] 47618 → 22 [ACK] Seq=1482202 Ack=4674 Win=35456 Len=1460
298 12.928044340 192.168.1.15 → 192.168.1.126 1514 TCP [TCP Retransmission] 47618 → 22 [ACK] Seq=1483662 Ack=4674 Win=35456 Len=1460
302 25.856007397 192.168.1.15 → 192.168.1.126 1514 TCP [TCP Retransmission] 47618 → 22 [ACK] Seq=1485122 Ack=4674 Win=35456 Len=1460
304 51.711962848 192.168.1.15 → 192.168.1.126 1514 TCP [TCP Retransmission] 47618 → 22 [ACK] Seq=1486582 Ack=4674 Win=35456 Len=1460
308 103.423746785 192.168.1.15 → 192.168.1.126 1514 TCP [TCP Retransmission] 47618 → 22 [ACK] Seq=1488042 Ack=4674 Win=35456 Len=1460
310 119.999773844 192.168.1.15 → 192.168.1.126 1514 TCP [TCP Retransmission] 47618 → 22 [ACK] Seq=1489502 Ack=4674 Win=35456 Len=1460
314 119.999683264 192.168.1.15 → 192.168.1.126 1514 TCP [TCP Retransmission] 47618 → 22 [ACK] Seq=1490962 Ack=4674 Win=35456 Len=1460
316 119.999877669 192.168.1.15 → 192.168.1.126 1514 TCP [TCP Retransmission] 47618 → 22 [ACK] Seq=1492422 Ack=4674 Win=35456 Len=1460
604 0.220891991 192.168.1.15 → 192.168.1.126 1514 TCP [TCP Retransmission] 47618 → 22 [ACK] Seq=3832982 Ack=6390 Win=35456 Len=1460
608 0.401968912 192.168.1.15 → 192.168.1.126 1514 TCP [TCP Retransmission] 47618 → 22 [ACK] Seq=3834442 Ack=6390 Win=35456 Len=1460
612 0.804016727 192.168.1.15 → 192.168.1.126 1514 TCP [TCP Retransmission] 47618 → 22 [ACK] Seq=3835902 Ack=6390 Win=35456 Len=1460
614 1.607982304 192.168.1.15 → 192.168.1.126 1514 TCP [TCP Retransmission] 47618 → 22 [ACK] Seq=3837362 Ack=6390 Win=35456 Len=1460
618 3.216003018 192.168.1.15 → 192.168.1.126 1514 TCP [TCP Retransmission] 47618 → 22 [ACK] Seq=3838822 Ack=6390 Win=35456 Len=1460
620 6.431954725 192.168.1.15 → 192.168.1.126 1514 TCP [TCP Retransmission] 47618 → 22 [ACK] Seq=3840282 Ack=6390 Win=35456 Len=1460
624 12.864162873 192.168.1.15 → 192.168.1.126 1514 TCP [TCP Retransmission] 47618 → 22 [ACK] Seq=3841742 Ack=6390 Win=35456 Len=1460
628 25.727975788 192.168.1.15 → 192.168.1.126 1514 TCP [TCP Retransmission] 47618 → 22 [ACK] Seq=3843202 Ack=6442 Win=35456 Len=1460
632 51.455794083 192.168.1.15 → 192.168.1.126 1514 TCP [TCP Retransmission] 47618 → 22 [ACK] Seq=3844662 Ack=6442 Win=35456 Len=1460
634 102.911688796 192.168.1.15 → 192.168.1.126 1514 TCP [TCP Retransmission] 47618 → 22 [ACK] Seq=3846122 Ack=6442 Win=35456 Len=1460
638 119.999842230 192.168.1.15 → 192.168.1.126 1514 TCP [TCP Retransmission] 47618 → 22 [ACK] Seq=3847582 Ack=6442 Win=35456 Len=1460
640 119.999690216 192.168.1.15 → 192.168.1.126 1514 TCP [TCP Retransmission] 47618 → 22 [ACK] Seq=3849042 Ack=6442 Win=35456 Len=1460
644 119.999860524 192.168.1.15 → 192.168.1.126 1514 TCP [TCP Retransmission] 47618 → 22 [ACK] Seq=3850502 Ack=6442 Win=35456 Len=1460
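The doubling signature in the trace above can also be looked for mechanically. The following is a rough sketch rather than a validated detector: the function name max_doubling_run and the 1.9x to 2.1x tolerance are illustrative choices, and it assumes the tshark output layout shown above (delta time in column 2, a Seq= value in the info text). It prints the longest run of consecutive retransmissions whose interval roughly doubled while the sequence number changed.

```shell
# Hypothetical helper: read "tshark -tdd ... -Y tcp.analysis.retransmission"
# output on stdin and print the longest run of consecutive retransmissions
# where the delta time (column 2) is about twice the previous delta and the
# sequence number differs from the previous line.
max_doubling_run() {
    awk '
    {
        delta = $2 + 0
        seq = ""
        # Extract the number after "Seq=" when the line carries one.
        if (match($0, /Seq=[0-9]+/))
            seq = substr($0, RSTART + 4, RLENGTH - 4)
        if (prev > 0 && seq != "" && seq != prev_seq &&
            delta >= 1.9 * prev && delta <= 2.1 * prev)
            run++
        else
            run = 0
        if (run > best)
            best = run
        prev = delta
        prev_seq = seq
    }
    END { print best + 0 }
    '
}

# Example (capture.pcapng is a placeholder file name):
#   tshark -tdd -r capture.pcapng -Y "tcp.analysis.retransmission" | max_doubling_run
```

A long run suggests the pattern described in this article; a short run, as in a connection that recovers quickly, is inconclusive on its own.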
This trace shows the same time doubling at frames 305 through 315, but the connection then recovers. The overall delay in this connection was marginal, but that is due entirely to the randomness of the dropped TCP segments.
$ tshark -tdd -r test4-2020-04-09.stream-5.pcapng -Y "tcp.analysis.retransmission" | head -20
303 0.000000000 192.168.1.15 → 192.168.1.126 1514 TCP [TCP Retransmission] 47594 → 22 [ACK] Seq=734606 Ack=3946 Win=27776 Len=1460
305 0.413028730 192.168.1.15 → 192.168.1.126 1514 TCP [TCP Retransmission] 47594 → 22 [ACK] Seq=736066 Ack=3946 Win=27776 Len=1460
309 0.820980349 192.168.1.15 → 192.168.1.126 1514 TCP [TCP Retransmission] 47594 → 22 [ACK] Seq=737526 Ack=3946 Win=27776 Len=1460
313 1.639994464 192.168.1.15 → 192.168.1.126 1514 TCP [TCP Retransmission] 47594 → 22 [ACK] Seq=738986 Ack=3946 Win=27776 Len=1460
315 3.280761528 192.168.1.15 → 192.168.1.126 1514 TCP [TCP Retransmission] 47594 → 22 [ACK] Seq=740446 Ack=3946 Win=27776 Len=1460
1727 0.271278004 192.168.1.15 → 192.168.1.126 1514 SSHv2 Client: [TCP Fast Retransmission] , Encrypted packet (len=1460)
1729 0.202038780 192.168.1.15 → 192.168.1.126 1514 TCP [TCP Retransmission] 47594 → 22 [ACK] Seq=15777126 Ack=11434 Win=35456 Len=1460
1731 0.401669941 192.168.1.15 → 192.168.1.126 1514 TCP [TCP Retransmission] 47594 → 22 [ACK] Seq=15778586 Ack=11434 Win=35456 Len=1460
1733 0.804237081 192.168.1.15 → 192.168.1.126 1514 TCP [TCP Retransmission] 47594 → 22 [ACK] Seq=15780046 Ack=11434 Win=35456 Len=1460
1737 0.001507916 192.168.1.15 → 192.168.1.126 1514 TCP [TCP Retransmission] 47594 → 22 [ACK] Seq=15781506 Ack=11434 Win=35456 Len=1460
1738 0.000004747 192.168.1.15 → 192.168.1.126 1514 TCP [TCP Retransmission] 47594 → 22 [ACK] Seq=15782966 Ack=11434 Win=35456 Len=1460
1739 0.000003852 192.168.1.15 → 192.168.1.126 1514 TCP [TCP Retransmission] 47594 → 22 [ACK] Seq=15784426 Ack=11434 Win=35456 Len=1460
1741 0.000726424 192.168.1.15 → 192.168.1.126 1514 TCP [TCP Retransmission] 47594 → 22 [ACK] Seq=15785886 Ack=11434 Win=35456 Len=1460
1742 0.000005613 192.168.1.15 → 192.168.1.126 1514 TCP [TCP Retransmission] 47594 → 22 [ACK] Seq=15787346 Ack=11434 Win=35456 Len=1460
1743 0.000004308 192.168.1.15 → 192.168.1.126 1514 TCP [TCP Retransmission] 47594 → 22 [ACK] Seq=15788806 Ack=11434 Win=35456 Len=1460
1745 0.000586095 192.168.1.15 → 192.168.1.126 1514 TCP [TCP Retransmission] 47594 → 22 [ACK] Seq=15790266 Ack=11434 Win=35456 Len=1460
1746 0.000008334 192.168.1.15 → 192.168.1.126 430 TCP [TCP Retransmission] 47594 → 22 [PSH, ACK] Seq=15791726 Ack=11434 Win=35456 Len=376
1747 0.000004693 192.168.1.15 → 192.168.1.126 1514 TCP [TCP Retransmission] 47594 → 22 [ACK] Seq=15792102 Ack=11434 Win=35456 Len=1460
4940 0.410322757 192.168.1.15 → 192.168.1.126 1514 SSHv2 Client: [TCP Fast Retransmission] , Encrypted packet (len=1460)
4943 0.200633324 192.168.1.15 → 192.168.1.126 1514 TCP [TCP Retransmission] 47594 → 22 [ACK] Seq=56200334 Ack=28230 Win=35456 Len=1460