RHEL6.9: NFSv4 TCP transport stuck in FIN_WAIT_2 forever

Solution Verified - Updated -

Issue

After an NFSv4 server crashed or restarted, the NFSv4 mounts on the client hung and did not recover. Although network connectivity was restored (verified with 'ping'), the NFS shares remain hung on the NFS client. The TCP transport is stuck in FIN_WAIT_2 on the NFS client.
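One way to confirm the symptom is to look for client sockets in FIN_WAIT_2 whose remote port is the NFS port. The sketch below is illustrative and not part of the original solution: it parses text in the format of /proc/net/tcp (IPv4 only), and it assumes the server listens on the standard NFS port 2049 and that the client is a little-endian host such as x86_64. The function names are hypothetical.

```python
NFS_PORT = 2049    # assumption: server uses the standard NFS port
FIN_WAIT2 = 0x05   # TCP state code for FIN_WAIT_2 in /proc/net/tcp


def parse_ipv4_endpoint(field):
    """Convert 'HHHHHHHH:PPPP' from /proc/net/tcp to (dotted_ip, port).

    Assumes a little-endian host, where the address bytes appear reversed.
    """
    hex_addr, hex_port = field.split(":")
    octets = [str(int(hex_addr[i:i + 2], 16)) for i in (6, 4, 2, 0)]
    return ".".join(octets), int(hex_port, 16)


def find_stuck_nfs_sockets(proc_net_tcp_text):
    """Return (local, remote) endpoint pairs in FIN_WAIT_2 towards the NFS port."""
    stuck = []
    # The first line of /proc/net/tcp is a column header, so skip it.
    for line in proc_net_tcp_text.splitlines()[1:]:
        fields = line.split()
        if len(fields) < 4:
            continue
        local = parse_ipv4_endpoint(fields[1])
        remote = parse_ipv4_endpoint(fields[2])
        state = int(fields[3], 16)
        if state == FIN_WAIT2 and remote[1] == NFS_PORT:
            stuck.append((local, remote))
    return stuck
```

On an affected client, the function could be called with the contents of /proc/net/tcp, for example find_stuck_nfs_sockets(open("/proc/net/tcp").read()). A socket that stays in the result across repeated checks matches the state described above.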

Environment

  • Red Hat Enterprise Linux 6.9 (NFS client)
    • kernel 2.6.32-696.el6 or later, but earlier than kernel-2.6.32-696.10.1.el6
  • NFSv4.0 or NFSv3
  • Seen with Solaris NFS server
  • Seen with Linux NFS server
  • NOTE: This failure requires that the NFS server performs a TCP half-close, i.e. it never sends its final FIN. This is likely caused either by a bug or failure on the NFS server (for example, an NFS server crash) or by a networking component, such as a firewall, that strips a FIN from the TCP teardown sequence. The issue was reproduced with a carefully timed simulated outage (kernel crash) of a Linux NFS server.
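
The kernel range listed above can be checked mechanically. The following is a minimal sketch, not an official tool: it compares the numeric parts of a kernel release string (such as the output of 'uname -r') against the two boundaries stated in the Environment list. The function and constant names are hypothetical, and the comparison is a simplification of full RPM version ordering.

```python
import re

# Boundaries taken from the Environment list above.
FIRST_AFFECTED = (2, 6, 32, 696)           # 2.6.32-696.el6
FIRST_UNAFFECTED = (2, 6, 32, 696, 10, 1)  # kernel-2.6.32-696.10.1.el6


def kernel_tuple(release):
    """Turn '2.6.32-696.3.2.el6.x86_64' into (2, 6, 32, 696, 3, 2)."""
    match = re.search(r"(\d+(?:\.\d+)*)-(\d+(?:\.\d+)*)", release)
    if match is None:
        raise ValueError("unrecognized kernel release: %r" % release)
    numbers = match.group(1) + "." + match.group(2)
    return tuple(int(part) for part in numbers.split("."))


def in_listed_range(release):
    """True if the release falls inside the kernel range listed above."""
    version = kernel_tuple(release)
    return FIRST_AFFECTED <= version < FIRST_UNAFFECTED
```

For example, in_listed_range("2.6.32-696.3.2.el6.x86_64") evaluates to True, while in_listed_range("2.6.32-696.10.1.el6.x86_64") evaluates to False. A kernel in the range is only one of the listed conditions; the server-side half-close described in the note is also required.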
