When an NFS4 lease expires, will in-progress READ / WRITE from an NFS client using locks fail with EIO, or will it be re-sent to the NFS server?


Environment

  • Red Hat Enterprise Linux 5, 6, 7, 8 (NFS4 client)
  • NFS4
  • Application using flock or fcntl locks to protect writes to files on NFS4

Issue

  • Does an application with open files on NFS4 get notified via EIO or some other method if an NFS4 lease expires?
  • When an NFS4 lease expires, does the NFS4 client attempt to reclaim locks lost and/or re-open files and re-issue IO?
  • I'm concerned about data corruption if an NFS4 lease expires. Suppose an application uses flock / fcntl locks to protect concurrent writes to an NFS4 file, the lease expires due to an event such as a network partition, and another node obtains the lock. Will the first node still be able to write to the file, or will the IO fail with an error?
  • If an NFS I/O operation completes with NFS4ERR_BAD_STATEID (10025), will the I/O be retried or failed?

Resolution

Red Hat Enterprise Linux 5

  • All versions attempt to reclaim locks which were lost due to lease expiration. As a result, no notification is given to the application and IO may be retried, which can lead to data corruption if another node obtains the lock during the lease expiration and writes to the file.
  • In the case of an I/O receiving NFS4ERR_BAD_STATEID (10025), the I/O is failed with EIO (errno == 5) back to the application without retrying.

Red Hat Enterprise Linux 6

  • The behavior of the NFS4 client is different depending on which RHEL6 kernel is being run.
    • Kernels prior to kernel-2.6.32-431.28.1.el6: The NFS client attempts to reclaim locks which were lost due to a lease expiration event, and the application is not notified of the lease expiration via invalidated filehandles or errors in system calls. This can result in file corruption if another client modified the file in the meantime.
    • kernel-2.6.32-431.28.1.el6 or above: The NFS client does not attempt to reclaim locks which were lost due to a lease expiration event, and an application is notified via EIO or EBADF errors on system calls such as write. In addition, a module parameter, 'recover_lost_locks', is introduced which allows reversion to the previous behavior if desired. By default, this parameter is set to 'N'. If the previous behavior is desired, the parameter should be set to 'Y' (for example, # echo Y > /sys/module/nfs/parameters/recover_lost_locks).
  • The behavior on any RHEL6 kernel when an I/O receives NFS4ERR_BAD_STATEID (10025) is the same as when a lease expires. That is, if the kernel is configured to retry upon lease expiration, it will retry upon NFS4ERR_BAD_STATEID.

  • The following private Red Hat bugs cover the above change.

    • Bug 1089359 - NFS v4 file lock inefffective following implicit loss of lock, leading to file corruption [rhel-6.5.z]
    • Bug 963785 - NFS v4 file lock inefffective following implicit loss of lock, leading to file corruption
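To tell which of the two RHEL6 behaviors applies on a given client, the kernel release can be compared against kernel-2.6.32-431.28.1.el6 and the 'recover_lost_locks' parameter can be read from sysfs. The following is a minimal sketch, not a supported tool. The function names are hypothetical, the release parsing assumes the usual `uname -r` format for RHEL6 kernels, and the sysfs file is assumed to contain 'Y' or 'N'.

```python
import re
from pathlib import Path

# Path taken from the example command in this article.
PARAM_PATH = "/sys/module/nfs/parameters/recover_lost_locks"

# First RHEL6 kernel with the changed behavior, per this article.
FIXED_EL6 = (2, 6, 32, 431, 28, 1)


def el6_release_tuple(release):
    """Turn '2.6.32-431.28.1.el6.x86_64' into (2, 6, 32, 431, 28, 1).

    Returns None if the string does not look like a RHEL6 kernel release.
    """
    match = re.match(r"^(\d+(?:\.\d+)*)-(\d+(?:\.\d+)*)\.el6", release)
    if match is None:
        return None
    numbers = match.group(1).split(".") + match.group(2).split(".")
    return tuple(int(n) for n in numbers)


def has_lost_lock_fix(release):
    """True/False for RHEL6 kernels, None for anything else."""
    parsed = el6_release_tuple(release)
    if parsed is None:
        return None
    return parsed >= FIXED_EL6


def read_recover_lost_locks(path=PARAM_PATH):
    """Return True for 'Y', False otherwise, None if the file is absent."""
    try:
        value = Path(path).read_text().strip()
    except FileNotFoundError:
        return None
    return value == "Y"
```

On a live system the release string could come from `os.uname().release`. A result of None from `read_recover_lost_locks` would suggest that the nfs module is not loaded or that the kernel predates the parameter.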

Red Hat Enterprise Linux 7 and Red Hat Enterprise Linux 8

  • All versions contain the upstream patches mentioned in the Root Cause section. By default, the NFS client does not attempt to reclaim locks which were lost due to a lease expiration event, and an application is notified via EIO errors on system calls such as write. The module parameter 'recover_lost_locks' can be used to change the behavior if desired. Enabling 'recover_lost_locks' may cause data corruption.
  • The behavior on any RHEL7 or RHEL8 kernel when an I/O receives NFS4ERR_BAD_STATEID (10025) is the same as when a lease expires. That is, if the kernel is configured to retry upon lease expiration, it will retry upon NFS4ERR_BAD_STATEID.
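With the default setting, an application that relies on locks has to treat a failed write as a sign that the lock may no longer be held. The sketch below shows one way an application might do that, assuming a POSIX (fcntl) lock taken with Python's `fcntl.lockf`. `LockLostError` and `locked_append` are hypothetical names, and treating EIO and EBADF as "lock lost" is an assumption based on the errors named in this article; EIO can have other causes.

```python
import errno
import fcntl
import os

# Errors this article says a system call such as write may return
# once the lock has been lost (EBADF is mentioned for RHEL6).
LOST_LOCK_ERRNOS = {errno.EIO, errno.EBADF}


class LockLostError(Exception):
    """Hypothetical application-level error: the lock may be gone."""


def locked_append(path, data):
    """Append data under an exclusive fcntl lock; return bytes written."""
    fd = os.open(path, os.O_WRONLY | os.O_APPEND | os.O_CREAT, 0o644)
    try:
        fcntl.lockf(fd, fcntl.LOCK_EX)  # blocks until the lock is granted
        try:
            written = os.write(fd, data)
            # Flush to the server so a write error is reported here
            # instead of being deferred to a later call.
            os.fsync(fd)
        except OSError as exc:
            if exc.errno in LOST_LOCK_ERRNOS:
                # Do not retry blindly: another node may have taken the
                # lock and changed the file in the meantime.
                raise LockLostError(
                    f"write to {path} failed, lock may be lost"
                ) from exc
            raise
        return written
    finally:
        try:
            os.close(fd)  # closing the descriptor also drops the lock
        except OSError:
            pass
```

A caller that catches `LockLostError` would typically re-open the file, re-acquire the lock, and re-validate the file contents before writing again, instead of re-sending the same data.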

Root Cause

  • The kernel.org (upstream) kernel has been changed to not re-try IO upon loss of lock and certain errors which indicate a stateid has become invalid. Attempting to re-establish state on these conditions without notifying the application that such an error occurred was deemed undesirable as it may cause data corruption.
  • The behavior of the NFS client has been changed by the following upstream patches. These patches have been backported to RHEL6 to address private Red Hat bugs 963785 and 1089359.

commit ef1820f9be27b6ad158f433ab38002ab8131db4d
Author: NeilBrown <neilb@suse.de>
Date:   Wed Sep 4 17:04:49 2013 +1000

    NFSv4: Don't try to recover NFSv4 locks when they are lost.

    When an NFSv4 client loses contact with the server it can lose any
    locks that it holds.

    Currently when it reconnects to the server it simply tries to reclaim
    those locks.  This might succeed even though some other client has
    held and released a lock in the mean time.  So the first client might
    think the file is unchanged, but it isn't.  This isn't good.

    If, when recovery happens, the locks cannot be claimed because some
    other client still holds the lock, then we get a message in the kernel
    logs, but the client can still write.  So two clients can both think
    they have a lock and can both write at the same time.  This is equally
    not good.

    There was a patch a while ago
      http://comments.gmane.org/gmane.linux.nfs/41917

    which tried to address some of this, but it didn't seem to go
    anywhere.  That patch would also send a signal to the process.  That
    might be useful but for now this patch just causes writes to fail.

    For NFSv4 (unlike v2/v3) there is a strong link between the lock and
    the write request so we can fairly easily fail any IO of the lock is
    gone.  While some applications might not expect this, it is still
    safer than allowing the write to succeed.

    Because this is a fairly big change in behaviour a module parameter,
    "recover_locks", is introduced which defaults to true (the current
    behaviour) but can be set to "false" to tell the client not to try to
    recover things that were lost.

    Signed-off-by: NeilBrown <neilb@suse.de>
    Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

commit f6de7a39c181dfb8a2c534661a53c73afb3081cd
Author: Trond Myklebust <Trond.Myklebust@netapp.com>
Date:   Wed Sep 4 10:08:54 2013 -0400

    NFSv4: Document the recover_lost_locks kernel parameter

    Rename the new 'recover_locks' kernel parameter to 'recover_lost_locks'
    and change the default to 'false'. Document why in
    Documentation/kernel-parameters.txt

    Move the 'recover_lost_locks' kernel parameter to fs/nfs/super.c to
    make it easy to backport to kernels prior to 3.6.x, which don't have
    a separate NFSv4 module.

    Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

