RHEL6.9: NFSv4 TCP transport stuck in FIN_WAIT_2 forever
Issue
After an NFSv4 server crashed or restarted, the NFSv4 client mounts hung and did not recover. Although network connectivity was restored (verified with 'ping'), the NFS shares remained hung on the NFS client. The TCP transport is stuck in FIN_WAIT_2 on the NFS client.
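To confirm the symptom, one option is to look for client sockets toward the NFS server that are sitting in FIN_WAIT_2. The following is a minimal sketch, not part of the original article: it parses text in the /proc/net/tcp format and reports IPv4 connections in state 05 (FIN_WAIT_2) whose remote port is 2049. The function names are hypothetical, the port assumes the server uses the standard NFS port, and the address decoding assumes a little-endian host such as x86. IPv6 sockets (/proc/net/tcp6) are not covered.

```python
NFS_PORT = 2049        # assumption: server listens on the standard NFS port
TCP_FIN_WAIT2 = 0x05   # state code for FIN_WAIT_2 in /proc/net/tcp


def decode_ipv4(hex_addr):
    """Decode 'AABBCCDD:PPPP' as printed in /proc/net/tcp.

    Assumes a little-endian host, where the IPv4 bytes appear reversed.
    """
    ip_hex, port_hex = hex_addr.split(":")
    octets = [str(int(ip_hex[i:i + 2], 16)) for i in (6, 4, 2, 0)]
    return ".".join(octets), int(port_hex, 16)


def find_fin_wait2(proc_net_tcp_text, remote_port=NFS_PORT):
    """Return [((local_ip, local_port), (remote_ip, remote_port)), ...]
    for sockets in FIN_WAIT_2 whose remote port matches remote_port."""
    stuck = []
    # The first line of /proc/net/tcp is a column header; skip it.
    for line in proc_net_tcp_text.splitlines()[1:]:
        fields = line.split()
        if len(fields) < 4:
            continue
        state = int(fields[3], 16)
        local = decode_ipv4(fields[1])
        remote = decode_ipv4(fields[2])
        if state == TCP_FIN_WAIT2 and remote[1] == remote_port:
            stuck.append((local, remote))
    return stuck
```

On an affected client, the function could be called as `find_fin_wait2(open("/proc/net/tcp").read())`. A socket that stays in the result across repeated checks while the mount remains hung matches the behavior described above.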
Environment
- Red Hat Enterprise Linux 6.9 (NFS client)
- kernel-2.6.32-696.el6 or later, but earlier than kernel-2.6.32-696.10.1.el6
- NFSv4.0 or NFSv3
- Seen with Solaris NFS server
- Seen with Linux NFS server
- NOTE: A necessary condition for this failure is that the NFS server does a TCP half-close (i.e. it never sends a final FIN). This is likely due to either a problem on the NFS server (e.g. an NFS server crash) or something in the networking environment, such as a firewall, that strips a FIN from the TCP teardown sequence. This issue was reproduced with a carefully timed simulated outage (kernel crash) of a Linux NFS server.
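
As a quick way to check whether a client's kernel falls in the range listed above, the comparison can be sketched as below. This is an illustrative sketch only: the function names are hypothetical, and comparing the numeric release fields as integer tuples is a simplification of RPM's real version comparison that is assumed to be adequate for RHEL 6 kernel strings of the form 2.6.32-N[.N...].el6.

```python
import re

RANGE_START = (696,)        # kernel-2.6.32-696.el6 (inclusive)
RANGE_END = (696, 10, 1)    # kernel-2.6.32-696.10.1.el6 (exclusive)


def release_tuple(kernel):
    """Extract the numeric release fields from a RHEL 6 kernel string,
    e.g. '2.6.32-696.3.2.el6.x86_64' -> (696, 3, 2).

    Returns None if the string is not a 2.6.32 el6 kernel.
    """
    m = re.match(r"(?:kernel-)?2\.6\.32-([0-9]+(?:\.[0-9]+)*)\.el6", kernel)
    if m is None:
        return None
    return tuple(int(part) for part in m.group(1).split("."))


def in_listed_range(kernel):
    """True if the kernel is at or after RANGE_START and before RANGE_END."""
    rel = release_tuple(kernel)
    return rel is not None and RANGE_START <= rel < RANGE_END
```

On a live system the input would typically be the output of `uname -r`, which Python exposes as `platform.release()`.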