TCP performance considerations (F-RTO and SACK)

Solution Verified - Updated -

Environment

  • Red Hat Enterprise Linux 6
  • Red Hat Enterprise Linux 7
  • TCP connection
  • F-RTO is enabled on the sender's end
  • TCP Selective Acknowledgement (SACK) and TCP timestamps are disabled on either the sender's or receiver's end

Issue

  • Slow TCP packet flow that can affect low-latency environments
  • Slow file transfer
  • Slow NFS access
  • Packet flow may stall for tens of seconds, up to 2 minutes

Resolution

On the sending system
Either:

  • For RHEL 7, update to kernel-3.10.0-1160.el7 (bz#1694860) or later.

Or:

  • Disable F-RTO:

    1. Add (or change) a line in /etc/sysctl.conf

      net.ipv4.tcp_frto = 0
      
    2. Run this command to activate the setting

      # sysctl -p
      

Note: Disabling F-RTO is strongly advised for all hard-wired RHEL 6 servers. We recommend leaving SACK enabled in most use cases.
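
To decide which of the two options applies on a given sender, the running kernel release can be compared against the fixed build. The following is a minimal sketch, not an official tool: the function name `kernel_has_frto_fix` is hypothetical, it assumes a RHEL 7 style release string such as the output of `uname -r`, and it is only meaningful on RHEL 7 (any other kernel series is reported as lacking the fix).

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit status 0) when a RHEL 7 kernel release
# string is at or above 3.10.0-1160, the build named in the Resolution.
# Assumption: the release looks like 3.10.0-<build>[.<minor>...].el7[.<arch>].
kernel_has_frto_fix() {
    # Split the release on "." and "-" and compare the build field numerically.
    printf '%s\n' "$1" | awk -F'[.-]' '
        $1 == 3 && $2 == 10 && $3 == 0 && $4 ~ /^[0-9]+$/ && $4 + 0 >= 1160 { ok = 1 }
        END { exit (ok ? 0 : 1) }'
}

# Example call on a live system (not executed here):
#   kernel_has_frto_fix "$(uname -r)" && echo "fixed kernel" || echo "disable F-RTO"
```

When the fix is not present and F-RTO has been disabled instead, `sysctl -n net.ipv4.tcp_frto` prints the running value, which should be 0 once the setting is active.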

Root Cause

The code handling F-RTO does not correctly reset the retransmission timeout (RTO) timer.

For the behavior to occur, the following combination must be present:

  • packet losses between the sender and receiver
  • F-RTO is enabled on the sender's end
  • TCP Selective Acknowledgement (SACK) and TCP timestamps are disabled on either the sender's or receiver's end
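
The sender-side part of this combination can be checked from the kernel tunables. This is a rough sketch under stated assumptions: the function name `frto_exposure` is hypothetical, any non-zero `net.ipv4.tcp_frto` value is treated as enabled, and the third condition is read as both SACK and timestamps being off. It looks at local settings only, so it cannot see packet loss or the peer's configuration.

```shell
#!/bin/sh
# Hypothetical helper: classify local tunables against the conditions listed.
# Arguments: the values of net.ipv4.tcp_frto, net.ipv4.tcp_sack and
# net.ipv4.tcp_timestamps, in that order.
frto_exposure() {
    frto=$1 sack=$2 timestamps=$3
    if [ "$frto" -eq 0 ]; then
        echo "not exposed: F-RTO disabled"
    elif [ "$sack" -eq 0 ] && [ "$timestamps" -eq 0 ]; then
        echo "exposed: F-RTO on, SACK and timestamps off locally"
    else
        # SACK and timestamps are negotiated, so the peer can still turn them off.
        echo "depends on peer: F-RTO on, check the remote end for SACK and timestamps"
    fi
}

# Example call on a live system (not executed here):
#   frto_exposure "$(sysctl -n net.ipv4.tcp_frto)" \
#                 "$(sysctl -n net.ipv4.tcp_sack)" \
#                 "$(sysctl -n net.ipv4.tcp_timestamps)"
```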

F-RTO is a standard TCP protocol performance feature which provides better recovery for packet drops from random causes, as opposed to congestion. Wireless links are the major beneficiary, where packet drops occur due to radio interference. The feature was first supported in Red Hat Enterprise Linux 6 and is enabled by default per upstream recommendation, but can be disabled by a kernel tunable.

It operates by sending a probe segment after a timeout-induced retransmission, where the probe is the first as-yet unacknowledged segment. When a single packet loss was the cause of the retransmit timeout and the segments following that packet were properly received, reception of the F-RTO probe will elicit a full cumulative ACK and the sender will know that the loss was not congestion-induced and can cancel congestion-control measures. Alternatively, when several segments were lost, the receiver is not fully up to date with the data sequence and should respond to the probe with an ACK only for the in-sequence data received (a dup-ACK); the sender will then know that the loss could have been caused by congestion and must continue with full recovery measures.
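
The decision described above comes down to a comparison of sequence numbers. The sketch below is a toy model of that description only, not the kernel's implementation or the complete F-RTO algorithm: the function name `frto_probe_verdict` and its arguments are hypothetical, and sequence-number wraparound is ignored.

```shell
#!/bin/sh
# Hypothetical helper. Arguments:
#   $1 ack      cumulative ACK number received in reply to the probe
#   $2 snd_nxt  sequence number just past the last byte sent before the timeout
frto_probe_verdict() {
    ack=$1 snd_nxt=$2
    if [ "$ack" -ge "$snd_nxt" ]; then
        # Full cumulative ACK: only the probed segment had been missing.
        echo "isolated loss: cancel congestion-control measures"
    else
        # The receiver acknowledged in-sequence data only, so gaps remain.
        echo "multiple losses: continue full recovery"
    fi
}
```

For example, `frto_probe_verdict 5000 5000` models a full cumulative ACK, while `frto_probe_verdict 2460 5000` models a receiver that is still missing later segments.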

The most straightforward action is to disable F-RTO on the sending system(s).

Diagnostic Steps

The following trace shows an example of the worst-case scenario. Starting at frame 282, the time between retransmissions (column 2) doubles until it reaches approximately 120 seconds at frame 310, then stays there for 2 more retransmissions before recovering. The cycle repeats starting at frame 604. Intervals that double across successive retransmissions with different sequence numbers are the key to identifying this issue.

$ tshark -tdd -r test4-2020-04-09.stream-17.pcapng -Y "tcp.analysis.retransmission"
  280 0.000000000 192.168.1.15 → 192.168.1.126 1514 TCP [TCP Retransmission] 47618 → 22 [ACK] Seq=1474902 Ack=4674 Win=35456 Len=1460
  282 0.411983178 192.168.1.15 → 192.168.1.126 1514 TCP [TCP Retransmission] 47618 → 22 [ACK] Seq=1476362 Ack=4674 Win=35456 Len=1460
  286 0.809050437 192.168.1.15 → 192.168.1.126 1514 TCP [TCP Retransmission] 47618 → 22 [ACK] Seq=1477822 Ack=4674 Win=35456 Len=1460
  288 1.615942999 192.168.1.15 → 192.168.1.126 1514 TCP [TCP Retransmission] 47618 → 22 [ACK] Seq=1479282 Ack=4674 Win=35456 Len=1460
  292 3.233008685 192.168.1.15 → 192.168.1.126 1514 TCP [TCP Retransmission] 47618 → 22 [ACK] Seq=1480742 Ack=4674 Win=35456 Len=1460
  296 6.463819240 192.168.1.15 → 192.168.1.126 1514 TCP [TCP Retransmission] 47618 → 22 [ACK] Seq=1482202 Ack=4674 Win=35456 Len=1460
  298 12.928044340 192.168.1.15 → 192.168.1.126 1514 TCP [TCP Retransmission] 47618 → 22 [ACK] Seq=1483662 Ack=4674 Win=35456 Len=1460
  302 25.856007397 192.168.1.15 → 192.168.1.126 1514 TCP [TCP Retransmission] 47618 → 22 [ACK] Seq=1485122 Ack=4674 Win=35456 Len=1460
  304 51.711962848 192.168.1.15 → 192.168.1.126 1514 TCP [TCP Retransmission] 47618 → 22 [ACK] Seq=1486582 Ack=4674 Win=35456 Len=1460
  308 103.423746785 192.168.1.15 → 192.168.1.126 1514 TCP [TCP Retransmission] 47618 → 22 [ACK] Seq=1488042 Ack=4674 Win=35456 Len=1460
  310 119.999773844 192.168.1.15 → 192.168.1.126 1514 TCP [TCP Retransmission] 47618 → 22 [ACK] Seq=1489502 Ack=4674 Win=35456 Len=1460
  314 119.999683264 192.168.1.15 → 192.168.1.126 1514 TCP [TCP Retransmission] 47618 → 22 [ACK] Seq=1490962 Ack=4674 Win=35456 Len=1460
  316 119.999877669 192.168.1.15 → 192.168.1.126 1514 TCP [TCP Retransmission] 47618 → 22 [ACK] Seq=1492422 Ack=4674 Win=35456 Len=1460
  604 0.220891991 192.168.1.15 → 192.168.1.126 1514 TCP [TCP Retransmission] 47618 → 22 [ACK] Seq=3832982 Ack=6390 Win=35456 Len=1460
  608 0.401968912 192.168.1.15 → 192.168.1.126 1514 TCP [TCP Retransmission] 47618 → 22 [ACK] Seq=3834442 Ack=6390 Win=35456 Len=1460
  612 0.804016727 192.168.1.15 → 192.168.1.126 1514 TCP [TCP Retransmission] 47618 → 22 [ACK] Seq=3835902 Ack=6390 Win=35456 Len=1460
  614 1.607982304 192.168.1.15 → 192.168.1.126 1514 TCP [TCP Retransmission] 47618 → 22 [ACK] Seq=3837362 Ack=6390 Win=35456 Len=1460
  618 3.216003018 192.168.1.15 → 192.168.1.126 1514 TCP [TCP Retransmission] 47618 → 22 [ACK] Seq=3838822 Ack=6390 Win=35456 Len=1460
  620 6.431954725 192.168.1.15 → 192.168.1.126 1514 TCP [TCP Retransmission] 47618 → 22 [ACK] Seq=3840282 Ack=6390 Win=35456 Len=1460
  624 12.864162873 192.168.1.15 → 192.168.1.126 1514 TCP [TCP Retransmission] 47618 → 22 [ACK] Seq=3841742 Ack=6390 Win=35456 Len=1460
  628 25.727975788 192.168.1.15 → 192.168.1.126 1514 TCP [TCP Retransmission] 47618 → 22 [ACK] Seq=3843202 Ack=6442 Win=35456 Len=1460
  632 51.455794083 192.168.1.15 → 192.168.1.126 1514 TCP [TCP Retransmission] 47618 → 22 [ACK] Seq=3844662 Ack=6442 Win=35456 Len=1460
  634 102.911688796 192.168.1.15 → 192.168.1.126 1514 TCP [TCP Retransmission] 47618 → 22 [ACK] Seq=3846122 Ack=6442 Win=35456 Len=1460
  638 119.999842230 192.168.1.15 → 192.168.1.126 1514 TCP [TCP Retransmission] 47618 → 22 [ACK] Seq=3847582 Ack=6442 Win=35456 Len=1460
  640 119.999690216 192.168.1.15 → 192.168.1.126 1514 TCP [TCP Retransmission] 47618 → 22 [ACK] Seq=3849042 Ack=6442 Win=35456 Len=1460
  644 119.999860524 192.168.1.15 → 192.168.1.126 1514 TCP [TCP Retransmission] 47618 → 22 [ACK] Seq=3850502 Ack=6442 Win=35456 Len=1460
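
The doubling pattern can also be spotted mechanically. The following is a rough sketch that reads tshark output in the column layout shown above (frame number, delta time, then a summary containing `Seq=`) and prints the longest run of consecutive `[TCP Retransmission]` lines whose delta is roughly double the previous one while the sequence number changes. The function name `longest_doubling_run` and the 1.8 to 2.2 tolerance are assumptions chosen for illustration; `[TCP Fast Retransmission]` lines are skipped.

```shell
#!/bin/sh
# Hypothetical helper: reads tshark lines on stdin and prints the longest run
# of back-to-back retransmissions whose delta time roughly doubled.
longest_doubling_run() {
    awk '
    /\[TCP Retransmission\]/ {
        delta = $2 + 0
        seq = ""
        for (i = 3; i <= NF; i++)
            if ($i ~ /^Seq=/) seq = substr($i, 5)
        # Count a step when the delta doubled and a different segment was resent.
        if (prev > 0 && seq != prevseq && delta >= 1.8 * prev && delta <= 2.2 * prev)
            run++
        else
            run = 0
        if (run > best) best = run
        prev = delta
        prevseq = seq
    }
    END { print best + 0 }'
}

# Example call (capture.pcapng is a placeholder file name, not executed here):
#   tshark -tdd -r capture.pcapng -Y "tcp.analysis.retransmission" | longest_doubling_run
```

A long run of consecutive doublings, as in the trace above, points toward this issue; a result of 0 means no doubling was seen between adjacent retransmissions.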

This trace shows the same doubling at frames 305 through 315, but the connection then recovers. The overall delay on this connection was marginal, but that is due entirely to which TCP segments happened to be dropped.

$ tshark -tdd -r test4-2020-04-09.stream-5.pcapng -Y "tcp.analysis.retransmission" | head -20
  303 0.000000000 192.168.1.15 → 192.168.1.126 1514 TCP [TCP Retransmission] 47594 → 22 [ACK] Seq=734606 Ack=3946 Win=27776 Len=1460
  305 0.413028730 192.168.1.15 → 192.168.1.126 1514 TCP [TCP Retransmission] 47594 → 22 [ACK] Seq=736066 Ack=3946 Win=27776 Len=1460
  309 0.820980349 192.168.1.15 → 192.168.1.126 1514 TCP [TCP Retransmission] 47594 → 22 [ACK] Seq=737526 Ack=3946 Win=27776 Len=1460
  313 1.639994464 192.168.1.15 → 192.168.1.126 1514 TCP [TCP Retransmission] 47594 → 22 [ACK] Seq=738986 Ack=3946 Win=27776 Len=1460
  315 3.280761528 192.168.1.15 → 192.168.1.126 1514 TCP [TCP Retransmission] 47594 → 22 [ACK] Seq=740446 Ack=3946 Win=27776 Len=1460
 1727 0.271278004 192.168.1.15 → 192.168.1.126 1514 SSHv2 Client: [TCP Fast Retransmission] , Encrypted packet (len=1460)
 1729 0.202038780 192.168.1.15 → 192.168.1.126 1514 TCP [TCP Retransmission] 47594 → 22 [ACK] Seq=15777126 Ack=11434 Win=35456 Len=1460
 1731 0.401669941 192.168.1.15 → 192.168.1.126 1514 TCP [TCP Retransmission] 47594 → 22 [ACK] Seq=15778586 Ack=11434 Win=35456 Len=1460
 1733 0.804237081 192.168.1.15 → 192.168.1.126 1514 TCP [TCP Retransmission] 47594 → 22 [ACK] Seq=15780046 Ack=11434 Win=35456 Len=1460
 1737 0.001507916 192.168.1.15 → 192.168.1.126 1514 TCP [TCP Retransmission] 47594 → 22 [ACK] Seq=15781506 Ack=11434 Win=35456 Len=1460
 1738 0.000004747 192.168.1.15 → 192.168.1.126 1514 TCP [TCP Retransmission] 47594 → 22 [ACK] Seq=15782966 Ack=11434 Win=35456 Len=1460
 1739 0.000003852 192.168.1.15 → 192.168.1.126 1514 TCP [TCP Retransmission] 47594 → 22 [ACK] Seq=15784426 Ack=11434 Win=35456 Len=1460
 1741 0.000726424 192.168.1.15 → 192.168.1.126 1514 TCP [TCP Retransmission] 47594 → 22 [ACK] Seq=15785886 Ack=11434 Win=35456 Len=1460
 1742 0.000005613 192.168.1.15 → 192.168.1.126 1514 TCP [TCP Retransmission] 47594 → 22 [ACK] Seq=15787346 Ack=11434 Win=35456 Len=1460
 1743 0.000004308 192.168.1.15 → 192.168.1.126 1514 TCP [TCP Retransmission] 47594 → 22 [ACK] Seq=15788806 Ack=11434 Win=35456 Len=1460
 1745 0.000586095 192.168.1.15 → 192.168.1.126 1514 TCP [TCP Retransmission] 47594 → 22 [ACK] Seq=15790266 Ack=11434 Win=35456 Len=1460
 1746 0.000008334 192.168.1.15 → 192.168.1.126 430 TCP [TCP Retransmission] 47594 → 22 [PSH, ACK] Seq=15791726 Ack=11434 Win=35456 Len=376
 1747 0.000004693 192.168.1.15 → 192.168.1.126 1514 TCP [TCP Retransmission] 47594 → 22 [ACK] Seq=15792102 Ack=11434 Win=35456 Len=1460
 4940 0.410322757 192.168.1.15 → 192.168.1.126 1514 SSHv2 Client: [TCP Fast Retransmission] , Encrypted packet (len=1460)
 4943 0.200633324 192.168.1.15 → 192.168.1.126 1514 TCP [TCP Retransmission] 47594 → 22 [ACK] Seq=56200334 Ack=28230 Win=35456 Len=1460
