RHEL6.9: NFSv4 TCP transport stuck in FIN_WAIT_2 forever

Solution Verified - Updated -

Issue

After an NFSv4 server crashed or restarted, the NFSv4 mounts on the client hung and did not recover. Although network connectivity has been restored (verified with 'ping'), the NFS shares remain unresponsive on the NFS client. The TCP transport is stuck in FIN_WAIT_2 on the NFS client.
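
One way to confirm the symptom on the client is to look for TCP sockets in FIN_WAIT_2 whose remote port is the NFS port. The following is a minimal sketch, not an official diagnostic: the function names are hypothetical, it assumes the server listens on the standard port 2049, it assumes IPv4 and a little-endian host (e.g. x86_64), and it relies on state code 05 denoting FIN_WAIT2 in /proc/net/tcp.

```python
import socket
import struct

FIN_WAIT2 = "05"  # state code for FIN_WAIT2 in /proc/net/tcp
NFS_PORT = 2049   # assumption: server uses the standard NFS port


def decode_addr(hex_addr):
    """Convert an 'AABBCCDD:PPPP' field from /proc/net/tcp to (ip, port).

    Assumes IPv4 on a little-endian host, where the address bytes
    appear reversed in the hex string.
    """
    ip_hex, port_hex = hex_addr.split(":")
    ip = socket.inet_ntoa(struct.pack("<I", int(ip_hex, 16)))
    return ip, int(port_hex, 16)


def find_stuck_nfs_sockets(proc_net_tcp_text, port=NFS_PORT):
    """Return (local, remote) address pairs in FIN_WAIT_2 toward `port`."""
    stuck = []
    # The first line of /proc/net/tcp is a column header; skip it.
    for line in proc_net_tcp_text.splitlines()[1:]:
        fields = line.split()
        if len(fields) < 4:
            continue
        local, remote, state = fields[1], fields[2], fields[3]
        remote_addr = decode_addr(remote)
        if state == FIN_WAIT2 and remote_addr[1] == port:
            stuck.append((decode_addr(local), remote_addr))
    return stuck


# Example usage on a live client (hypothetical):
#   with open("/proc/net/tcp") as f:
#       for local, remote in find_stuck_nfs_sockets(f.read()):
#           print("FIN_WAIT_2: %s:%d -> %s:%d" % (local + remote))
```

A socket that stays in this list across repeated checks, while the mount remains hung, is consistent with the issue described here. A socket that appears only briefly is part of normal TCP teardown.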

Environment

  • Red Hat Enterprise Linux 6.9 (NFS client)
    • kernel 2.6.32-696.el6 or later, but earlier than kernel-2.6.32-696.10.1.el6
  • NFSv4.0 or NFSv3
  • Seen with Solaris NFS server
  • Seen with Linux NFS server
  • NOTE: A necessary condition for this failure is that the NFS server leaves the TCP connection half-closed (i.e. it never sends its final FIN). This is likely caused either by a failure on the NFS server (e.g. a bug or an NFS server crash) or by something in the network environment, such as a firewall, that strips a FIN from the TCP teardown sequence. The issue was reproduced with a carefully timed simulated outage (kernel crash) of a Linux NFS server.
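
The kernel range listed above can be checked against the output of 'uname -r'. The sketch below is illustrative only: the function and constant names are hypothetical, the lower bound is assumed to be inclusive, and the comparison looks only at the numeric build part of a 2.6.32 el6 release string.

```python
import re

# Bounds taken from the Environment list above.
FIRST_AFFECTED = (696,)          # 2.6.32-696.el6 (assumed inclusive)
FIRST_UNAFFECTED = (696, 10, 1)  # 2.6.32-696.10.1.el6


def el6_build(release):
    """Extract the build number tuple from a 2.6.32 el6 release string.

    '2.6.32-696.3.2.el6.x86_64' -> (696, 3, 2); returns None for any
    release string that is not a 2.6.32 el6 kernel.
    """
    m = re.match(r"2\.6\.32-(\d+(?:\.\d+)*)\.el6", release)
    if not m:
        return None
    return tuple(int(part) for part in m.group(1).split("."))


def is_in_affected_range(release):
    """True if the release falls inside the kernel range listed above."""
    build = el6_build(release)
    return build is not None and FIRST_AFFECTED <= build < FIRST_UNAFFECTED


# Example usage on a client (hypothetical):
#   import platform
#   print(is_in_affected_range(platform.release()))
```

Tuple comparison is used so that 696.3.2 sorts before 696.10.1, which a plain string comparison would get wrong.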
