RHEL 6.9 nfs or lockd server not responding - due to TCP 3-way handshake failure during reconnect, NFS client erroneously sends burst of SYNs

Solution Verified - Updated -

Issue

  • A RHEL 6.9 NFS client fails to reconnect to the NFS server, and "server not responding" messages are logged. At the TCP level, the second step of the 3-way handshake (the SYN,ACK) fails: the NFS client answers the server's SYN,ACK with an ICMP Destination unreachable (Host administratively prohibited) message.
  • The NFS share cannot reconnect because the 3-way TCP handshake fails. The NFS client erroneously sends multiple SYN packets from the same source port but with different sequence numbers. The NFS server responds with a SYN,ACK to one of the SYNs but not to the others, which leaves the client's and server's TCP stacks in disagreement about the state of the connection. The NFS client therefore never sends the final ACK, so the share cannot be reconnected, resulting in a denial of service (DoS) on the NFS share.
  • The NFS client's TCP connection goes idle and disconnects. On the next attempt to use the NFS share, the share is unusable and the following error is logged (errno 107 is ENOTCONN on Linux): xs_tcp_setup_socket: connect returned unhandled error -107
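
The SYN burst described above can be spotted in a packet capture by looking for flows that carry SYNs with more than one initial sequence number. The following is only a minimal sketch: the record layout (src_ip, src_port, dst_ip, dst_port, seq, flags), the function name find_syn_bursts, and the sample addresses are assumptions made for illustration, not part of any capture tool's output format.

```python
from collections import defaultdict


def find_syn_bursts(packets):
    """Return {flow: sorted distinct SYN sequence numbers} for suspicious flows.

    `packets` is an iterable of hypothetical records:
    (src_ip, src_port, dst_ip, dst_port, seq, flags), where flags is "S"
    for a pure SYN and "SA" for a SYN,ACK. A flow is reported only when
    its SYNs carry more than one distinct sequence number; a plain
    retransmission reuses the same sequence number and is not reported.
    """
    isns = defaultdict(set)
    for src_ip, src_port, dst_ip, dst_port, seq, flags in packets:
        if flags == "S":  # pure SYN only, ignore SYN,ACK and everything else
            isns[(src_ip, src_port, dst_ip, dst_port)].add(seq)
    return {flow: sorted(seqs) for flow, seqs in isns.items() if len(seqs) > 1}


# Illustrative data only (documentation addresses, made-up sequence numbers).
sample = [
    ("192.0.2.10", 875, "192.0.2.20", 2049, 1000, "S"),
    ("192.0.2.10", 875, "192.0.2.20", 2049, 1000, "S"),   # retransmit, same seq
    ("192.0.2.10", 875, "192.0.2.20", 2049, 5000, "S"),   # same port, new seq
    ("192.0.2.20", 2049, "192.0.2.10", 875, 9000, "SA"),  # server's SYN,ACK
    ("192.0.2.10", 900, "192.0.2.20", 2049, 7000, "S"),   # unrelated, normal flow
]
print(find_syn_bursts(sample))
# {('192.0.2.10', 875, '192.0.2.20', 2049): [1000, 5000]}
```

Keying on the full 4-tuple rather than the source port alone avoids flagging unrelated connections that happen to reuse a port number toward a different server.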

Environment

  • Red Hat Enterprise Linux 6.9
    • kernels from 2.6.32-696.el6 up to, but not including, 2.6.32-696.6.3.el6
    • Fixed in 2.6.32-696.6.3.el6
  • NFS
    • seen with NFSv3 and lockd traffic
    • so far this issue has not been seen with NFSv4.0 or NFSv4.1
  • seen with iptables / nf_conntrack
    • NOTE: It is possible iptables is not a necessary condition for this problem to occur, and there may be other failure modes unrelated to iptables
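
The affected kernel range listed above can be checked with a short script. This is a simplified sketch, not a full RPM version comparison: it assumes the kernel string has the form 2.6.32-<release>.el6 with a purely numeric, dot-separated release, and the helper names release_tuple and is_affected are hypothetical.

```python
import re

FIRST_AFFECTED = (696,)       # 2.6.32-696.el6
FIRST_FIXED = (696, 6, 3)     # 2.6.32-696.6.3.el6


def release_tuple(kernel):
    """Turn '2.6.32-696.6.3.el6.x86_64' into (696, 6, 3).

    Returns None for anything that is not a 2.6.32 el6 kernel string.
    """
    m = re.match(r"2\.6\.32-(\d+(?:\.\d+)*)\.el6", kernel)
    if m is None:
        return None
    return tuple(int(part) for part in m.group(1).split("."))


def is_affected(kernel):
    """True if the kernel is in [2.6.32-696.el6, 2.6.32-696.6.3.el6)."""
    rel = release_tuple(kernel)
    if rel is None:
        return False
    return FIRST_AFFECTED <= rel < FIRST_FIXED


for k in ("2.6.32-642.el6.x86_64",
          "2.6.32-696.el6.x86_64",
          "2.6.32-696.3.2.el6.x86_64",
          "2.6.32-696.6.3.el6.x86_64"):
    print(k, is_affected(k))
```

On a live system the string to test could be taken from `uname -r`, or from `platform.release()` in Python.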
