No TCP FIN when task killed, leaving local link in FIN_WAIT1/ESTABLISHED state
We are seeing some TCP behavior that we have not seen before. Normally when we kill a task that has a link over the local interface to another local task, in tcpdump we see that TCP sends a [FIN, ACK] from the side that died to the running task. The running task sends an [ACK] back. Then the side that died sends an [RST] back to the running task, and the running task is notified via normal socket ops that the link has disappeared.
Occasionally when the link has been up for a number of days, when we kill a task we are not seeing any traffic in tcpdump between the two sides. The task that is killed disappears. The link in netstat going to the running task has no client process listed and is in the FIN_WAIT1 state. The link in the other direction from the running task to the killed task is still in ESTABLISHED. The socket layer in the running task is never notified that the link has disappeared.
In both cases, when we receive the notification and when we don't, the link has been up for at least three or four days with no traffic. If we kill the task in the first couple of days, we always get the [FIN]. Kernel version is 2.6.32-573.12.1.el6.x86_64 (RHEL 6.7). Any ideas?
Responses
For the TCP Reset: if there is still unread data in the receive buffer when the socket is closed, the kernel sends a reset instead of completing a graceful close. That's expected behaviour.
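You can see the reset-on-unread-data behaviour with a minimal sketch on loopback (Python, arbitrary ephemeral ports and payload; timings are just to let the loopback stack settle):

```python
import socket
import time

# Listener side ("peer B")
srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.bind(("127.0.0.1", 0))
srv.listen(1)

# Connecting side ("peer A"), which will close with unread data pending
a = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
a.connect(srv.getsockname())
b, _ = srv.accept()

b.sendall(b"unread data")   # lands in a's receive buffer, never read
time.sleep(0.2)             # give loopback time to deliver it
a.close()                   # unread data pending -> kernel sends RST, not a clean FIN close

time.sleep(0.2)
try:
    b.recv(100)             # the RST surfaces on the peer as a connection reset
    reset_seen = False
except ConnectionResetError:
    reset_seen = True
print(reset_seen)
```

On Linux this prints True: because `a` never read the payload, its close is abortive and `b` sees ECONNRESET rather than an orderly EOF.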
For the mystery socket sitting in FIN_WAIT1 and ESTABLISHED, try to reproduce it on the latest EL6 kernel; it might be something already fixed.
Ah, I understand now.
You kill the process, the socket file descriptor is closed and TCP sends a FIN, but it sounds like the remote end never receives it. The active closer's socket correctly sits in FIN_WAIT1 as an orphan; the FIN is retransmitted and the socket eventually times out. The passive closer never actually got the FIN, so it continues to sit in ESTABLISHED.
Everything's working fine from the OS perspective. Figure out why the FIN sent from the active closer never reaches the other end.
Capture at three points:
1. On the active closer, to confirm the FIN is actually submitted to the NIC for transmission.
2. On a mirror port of the active closer, to confirm the NIC actually puts the FIN on the wire.
3. On the passive closer, to see whether the FIN is received.
I suspect you'll see the FIN going to the NIC on the sender and out the NIC at the switch, but never arriving at the other end. Then troubleshoot the network in between.
If you have a stateful firewall on the network path, make sure its session timeout is longer than the longest idle period of your connections. It could be that the firewall drops its session state after, say, 48 hours, which is why you never see the behaviour in the first two days but do see it afterwards. You may be able to have the application do some sort of keepalive on its socket to refresh the firewall's timer.
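At the socket level, the kernel can do the keepalives for you. A sketch in Python (the interval values here are illustrative assumptions; pick them well below whatever session timeout the firewall uses):

```python
import socket

# Enable TCP keepalive so the kernel sends probes on an otherwise idle
# connection, refreshing any stateful firewall's idle timer along the path.
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

# Linux-specific per-socket tuning (values are examples, not recommendations):
s.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, 3600)  # idle seconds before first probe
s.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, 75)   # seconds between probes
s.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, 9)      # failed probes before the socket errors out
```

With these settings an idle connection generates a probe every hour, so a multi-day idle session never looks dead to a stateful middlebox, and if the peer really is gone the application gets a socket error instead of hanging in ESTABLISHED forever.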
Quinn has been on vacation. My name is Ray and I have been working on this as well. I suspect conntrack, because I see the following line in sysctl -a on the customer's system: net.netfilter.nf_conntrack_tcp_timeout_established = 432000 (432000 seconds is 5 days).
Every time we have seen the client side disappear without any packet shown in tcpdump, the connection has been up for more than 5 days. Whenever they restart the client application before 5 days, we see the FIN and RST packets being sent from the client. The RST makes sense because we are killing the client with kill -9, so there is no graceful shutdown.
I have been trying to reproduce this on a local VCloud system I have here running the same version of Red Hat. I am learning how to use conntrack.
So I am seeing:

[NEW] tcp 6 120 SYN_SENT src=127.0.0.1 dst=127.0.0.1 sport=47818 dport=49286 [UNREPLIED] src=127.0.0.1 dst=127.0.0.1 sport=49286 dport=47818
[UPDATE] tcp 6 60 SYN_RECV src=127.0.0.1 dst=127.0.0.1 sport=47818 dport=49286 src=127.0.0.1 dst=127.0.0.1 sport=49286 dport=47818
[UPDATE] tcp 6 120 ESTABLISHED src=127.0.0.1 dst=127.0.0.1 sport=47818 dport=49286 src=127.0.0.1 dst=127.0.0.1 sport=49286 dport=47818 [ASSURED]
[DESTROY] tcp 6 src=127.0.0.1 dst=127.0.0.1 sport=47818 dport=49286 src=127.0.0.1 dst=127.0.0.1 sport=49286 dport=47818 [ASSURED]
I believe the [DESTROY] is where my connections are being dropped, but the connection still shows up in netstat:

tcp    0    0 127.0.0.1:47818    127.0.0.1:49286    ESTABLISHED    1265/SCADA_SCADA_EM
tcp    0    0 127.0.0.1:49286    127.0.0.1:47818    ESTABLISHED    32490/CFGCTRL_CFGCT
Is there a reason why conntrack is not killing the connection? I am thinking it may be a configuration difference between my test machine and the customer's machine. On their machine I suspect it really is being destroyed. It would be nice if I could reproduce it here...
Perhaps I am barking up the wrong tree...
PS. We are pretty sure a keepalive will fix the problem, but both we and our customer want an explanation of what is happening and why we have not seen this before. Note: we did not configure their system, but they have assured us that they have no rules in iptables filtering traffic on 127.0.0.1.
Nice find, that could be the issue. You could confirm it by increasing that timeout and observing, or by temporarily disabling iptables.
netfilter's conntrack state and the actual socket state are two different things. conntrack won't actively kill established sockets; once its entry expires, packets on that connection simply stop matching any stateful firewall rule.
I assume your customer is allowing ctstate RELATED,ESTABLISHED and matching everything else on ctstate NEW. A connection whose conntrack entry has expired matches neither, so its traffic is denied by the firewall.
If the firewall is silently dropping traffic, rejecting with a TCP Reset instead should at least allow the TCP session to end.
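A ruleset along these lines would produce the behaviour described (this is a hypothetical reconstruction from the description, not the customer's actual configuration):

```shell
# Default-deny, with stateful accepts. Once conntrack has expired the entry
# for a long-idle connection, its packets match neither ACCEPT rule and fall
# through to the DROP policy -- including the FIN sent when the task dies.
iptables -P OUTPUT DROP
iptables -A OUTPUT -m conntrack --ctstate NEW -j ACCEPT
iptables -A OUTPUT -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT

# Optional: reject leftover TCP instead of silently dropping it, so the
# peers at least get a reset and can tear the session down.
iptables -A OUTPUT -p tcp -j REJECT --reject-with tcp-reset
```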
Failing that, I'd follow the traffic with captures as I said above, and determine at which point the FIN from the active closer disappears.
I have verified that, the way the customer had iptables configured, it would drop all outgoing packets unless the connection was in the NEW or ESTABLISHED state. This explains everything. I was sure it was a firewall issue but they insisted it was not; iptables -L does not lie.
I appreciate your suggestions, and I finally learned how iptables and netfilter work. The documentation is more complicated than the software. My big break was getting the conntrack tools working so I could see what was going on.
