Stale file handle on NFS client during failover
Issue
We have a pair of servers, both running RHEL 7.9 and providing NFS server services, with shared storage between them. One server exports three filesystems to a number of clients (RHEL 7.4). If the primary server fails, the following happens:
- The storage is brought online on the other server
- The filesystems are mounted
- The NFS service is restarted
- The filesystems are shared out with the same FSID
- The NFS server is restarted
- The IP used for NFS clients is brought online
NOTE: The above failover is performed by Veritas Cluster Server (VCS). VCS has a mechanism for copying the contents of /var/statmon/sm and /var/lib/nfs from one server to the other to preserve state.
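The "same FSID" step in the list above is one place where the two nodes can drift apart without anyone noticing. The following is a minimal sketch, not part of the original setup, that compares the fsid= options in two export definitions written in exports(5) syntax. It assumes each node's exports can be captured as text in that syntax and that no line uses backslash continuation; the function names, export paths, and fsid values are hypothetical.

```python
import re

# Matches an fsid option inside an export option list, e.g. "fsid=101".
FSID_RE = re.compile(r"fsid=([^,)\s]+)")


def parse_fsids(exports_text):
    """Map each exported path to the set of fsid values given on its line.

    Assumes exports(5) syntax with one export per line and no backslash
    continuation lines.
    """
    fsids = {}
    for raw in exports_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        parts = line.split(None, 1)
        path = parts[0]
        options = parts[1] if len(parts) > 1 else ""
        fsids.setdefault(path, set()).update(FSID_RE.findall(options))
    return fsids


def fsid_mismatches(primary_text, standby_text):
    """Return {path: (primary_fsids, standby_fsids)} where the nodes differ.

    A path that is missing on one node is reported with None on that side.
    """
    primary = parse_fsids(primary_text)
    standby = parse_fsids(standby_text)
    return {
        path: (primary.get(path), standby.get(path))
        for path in sorted(set(primary) | set(standby))
        if primary.get(path) != standby.get(path)
    }


if __name__ == "__main__":
    # Hypothetical export definitions for the two cluster nodes.
    node_a = "/export/a *(rw,sync,fsid=101)\n/export/b *(rw,sync,fsid=102)\n"
    node_b = "/export/a *(rw,sync,fsid=101)\n/export/b *(rw,sync,fsid=202)\n"
    for path, (a, b) in fsid_mismatches(node_a, node_b).items():
        print(f"{path}: primary={a} standby={b}")
```

An empty result only shows that the declared fsid values agree. It does not rule out other causes of stale handles.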
When a controlled failover occurs (the primary server is powered off), the clients see the share freeze and then become available again, but shortly afterwards we start to see "Stale file handle" messages, mainly from the "df" command.
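To see which client mounts are affected after a failover, the symptom can be probed directly. The sketch below is an illustration under assumptions rather than a diagnostic from the original report: it calls os.statvfs, which queries filesystem statistics in a similar way to "df", on each mount point and collects those that fail with ESTALE. The function name and mount paths are hypothetical, and on a hard-mounted share that is still frozen the call may block until the server responds.

```python
import errno
import os


def find_stale_mounts(mount_points, statvfs=os.statvfs):
    """Return the mount points whose statvfs call fails with ESTALE.

    The statvfs parameter is injectable so the logic can be exercised
    without a real NFS mount. Errors other than ESTALE are re-raised.
    """
    stale = []
    for mount_point in mount_points:
        try:
            statvfs(mount_point)
        except OSError as exc:
            if exc.errno == errno.ESTALE:
                stale.append(mount_point)
            else:
                raise
    return stale


if __name__ == "__main__":
    # Hypothetical client-side mount points for the three exported filesystems.
    candidates = ["/mnt/fs1", "/mnt/fs2", "/mnt/fs3"]
    existing = [path for path in candidates if os.path.isdir(path)]
    print(find_stale_mounts(existing))
```

Running this on a client shortly after the failover, and again a few minutes later, would show whether the stale handles appear immediately or only after some delay.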
Environment
Red Hat Enterprise Linux 7
Red Hat Enterprise Linux 8