Possible TCP stack bug
Issue
- We observed unexpected behavior in an application that appears to be caused by a bug in the TCP re-transmission logic.
- In tcpdump captures, we see a second, unnecessary re-transmission timeout (RTO) occur after a first, valid RTO.
- Detailed description of the bug:
- The unexpected behavior appears in the server applications when TCP needs to re-transmit dropped packets. It appears in all server applications, and quite frequently.
- The bug appears only when the server detects a drop (through an RTO after 200ms) while it is still waiting for the ACKs of 2 packets. In that case, 200ms after all packets were sent, the RTO triggers the re-transmission of the first packet and the ACK for that packet is received, but the second packet is not re-transmitted at that moment. After another 400ms, a second RTO is triggered, and only then is the second packet re-transmitted and ACKed. To our understanding, this second re-transmission timeout should not occur: the expected behavior is that the second packet is re-transmitted right after the ACK for the first re-transmitted packet is received.
- This unexpected second RTO occurs only if exactly 2 packets are pending at the moment of the first RTO. If there is one packet to re-transmit, or more than 2, the behavior is as expected: all packets are re-transmitted and ACKed after the first RTO, and there is no second RTO.
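
To make the cost of the second RTO concrete, the timeline described above can be sketched as a small model. This is only an illustration, not an analysis of the kernel code. The 200ms and 400ms values come from the description; the function names, the `rtt_ms` parameter, and the assumption that the 400ms interval is measured from the first RTO are hypothetical choices made for this sketch.

```python
# Illustrative model of the reported timeline; not taken from the kernel source.
FIRST_RTO_MS = 200    # first RTO, as seen in the captures
SECOND_RTO_MS = 400   # interval before the second RTO (the first RTO doubled)


def observed_retransmit_times():
    """Times in ms after the original send, as reported for 2 pending packets.

    Assumption: the 400ms interval starts at the first RTO.
    """
    first = FIRST_RTO_MS
    second = first + SECOND_RTO_MS  # packet 2 waits for the second RTO
    return {"packet_1": first, "packet_2": second}


def expected_retransmit_times(rtt_ms):
    """Times in ms after the original send for the behavior we expect.

    rtt_ms is a hypothetical round-trip time: packet 2 is re-sent as soon
    as the ACK for the re-transmitted packet 1 arrives.
    """
    first = FIRST_RTO_MS
    second = first + rtt_ms
    return {"packet_1": first, "packet_2": second}


def extra_delay_ms(rtt_ms):
    """Additional delay on packet 2 caused by the second RTO."""
    observed = observed_retransmit_times()["packet_2"]
    expected = expected_retransmit_times(rtt_ms)["packet_2"]
    return observed - expected


if __name__ == "__main__":
    print(observed_retransmit_times())
    print(expected_retransmit_times(rtt_ms=1))
    print(extra_delay_ms(rtt_ms=1))
```

Under these assumptions, with a round-trip time that is small compared to the RTO, nearly the whole 400ms interval is added latency for the second packet.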
Environment
- Red Hat Enterprise Linux 6
