RHEL6: NFS4 server incorrectly returning NFS4ERR_EXPIRED to WRITE due to wraparound of current_fileid leads to infinite protocol loop with NFS4 client
Issue
- NFS4 client hangs when accessing an NFS server that has been up for a long time or has handled OPENs for 2^32 or more files
- Linux NFS4 client hangs on a single WRITE with the Linux NFS4 server returning NFS4ERR_EXPIRED to the WRITE repeatedly even after error recovery.
- In the Linux NFS server, current_fileid is a 32-bit counter. If it wraps around, you can get into an infinite protocol loop of:
  - WRITE / NFS4ERR_EXPIRED
  - RENEW / NFS4_OK
  - OPEN / NFS4_OK (with a new stateid; the OPEN uses the same clientid that was sent in the RENEW)
  - WRITE / NFS4ERR_EXPIRED (uses the new stateid returned in the OPEN reply)
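The wraparound and the resulting loop can be illustrated with a small simulation. This is only a sketch, not the kernel's code: `next_id`, `broken_server`, and `client_write` are hypothetical names, the server stub simply assumes the failing state described above, and the `max_rounds` limit exists only so the example terminates. The loop described above has no such bound, which is why the client hangs.

```python
MASK32 = 0xFFFFFFFF  # a 32-bit unsigned counter holds 0 .. 2**32 - 1


def next_id(counter):
    """Advance a hypothetical 32-bit counter, wrapping like unsigned C arithmetic."""
    return (counter + 1) & MASK32


def broken_server(op):
    """Stand-in for a server in the failing state: every WRITE is rejected."""
    return "NFS4ERR_EXPIRED" if op == "WRITE" else "NFS4_OK"


def client_write(server, max_rounds=3):
    """Replay the WRITE / RENEW / OPEN recovery cycle, bounded by max_rounds."""
    trace = []
    for _ in range(max_rounds):
        status = server("WRITE")
        trace.append(("WRITE", status))
        if status == "NFS4_OK":
            return trace  # the write succeeded, so recovery is not needed
        # Recovery as described in the issue: RENEW, then OPEN for a new stateid.
        for op in ("RENEW", "OPEN"):
            trace.append((op, server(op)))
    return trace


if __name__ == "__main__":
    # The counter returns to 0 after reaching its maximum value.
    print(next_id(0xFFFFFFFF))
    # Each round repeats the same three exchanges without making progress.
    for op, status in client_write(broken_server):
        print(op, "/", status)
```

Each round in the trace is identical: recovery succeeds, but the retried WRITE still fails, so the client never makes progress.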
Environment
- Red Hat Enterprise Linux 6 (NFS server)
- any kernel prior to kernel-2.6.32-675.el6
- any kernel prior to kernel-2.6.32-642.15.1.el6
- seen on 2.6.32-573.1.1.el6 (a rhel6.7.z kernel) and other RHEL6 kernels
- NFS4
- RHEL used as NFS server
