Using NFS4 delegations and seeing "Lock reclaim failed" messages

Solution Unverified - Updated -

Environment

  • Red Hat Enterprise Linux 6 (NFS4 client)
    • all kernels prior to kernel-2.6.32-573.el6
  • NFS4 with delegations

Issue

Resolution

Root Cause

  • Once it has been verified that the message is caused by a held delegation, the "Lock reclaim failed" messages can be safely ignored.
  • The upstream commit (6686390bab6a0e049fa7040631aee08b35a55293) that first attempted to address this explains the problem and why the message should not be printed while a delegation is held. A second commit (1acd1c301f4faae80f4d2c7bbd9a4553b131c0e3) is required to correct the test_bit logic and actually remove the erroneous warning. Both commit messages are reproduced below.

commit 6686390bab6a0e049fa7040631aee08b35a55293
Author: NeilBrown <neilb@suse.de>
Date:   Mon Aug 12 16:52:47 2013 +1000

    NFS: remove incorrect "Lock reclaim failed!" warning.

    After reclaiming state that was lost, the NFS client tries to reclaim
    any locks, and then checks that each one has NFS_LOCK_INITIALIZED set
    (which means that the server has confirmed the lock).
    However if the client holds a delegation, nfs_reclaim_locks() simply aborts
    (or more accurately it called nfs_lock_reclaim() and that returns without
    doing anything).

    This is because when a delegation is held, the server doesn't need to
    know about locks.

    So if a delegation is held, NFS_LOCK_INITIALIZED is not expected, and
    its absence is certainly not an error.

    So don't print the warnings if NFS_DELGATED_STATE is set.

commit 1acd1c301f4faae80f4d2c7bbd9a4553b131c0e3
Author: Jeff Layton <jlayton@redhat.com>
Date:   Thu Oct 31 13:03:04 2013 -0400

    nfs: fix inverted test for delegation in nfs4_reclaim_open_state

    commit 6686390bab6a0e0 (NFS: remove incorrect "Lock reclaim failed!"
    warning.) added a test for a delegation before checking to see if any
    reclaimed locks failed. The test however is backward and is only doing
    that check when a delegation is held instead of when one isn't.
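
The effect of the two commits can be modelled in a few lines of user-space C. This is only an illustrative sketch, not the kernel code: the bit numbers, the test_flag() helper and the warns_*() function names are hypothetical stand-ins, and it assumes the delegation flag is kept on the open state while the "initialized" flag is kept on each lock. Each function returns true when the "Lock reclaim failed" message would be printed for a lock.

```c
#include <stdbool.h>

/* Hypothetical bit numbers; the real values are defined in the kernel's
 * NFS client headers and are not reproduced here. */
enum { LOCK_INITIALIZED_BIT = 0, DELEGATED_STATE_BIT = 1 };

/* User-space stand-in for the kernel's test_bit(). */
static bool test_flag(unsigned long flags, int bit)
{
    return (flags >> bit) & 1UL;
}

/* Before commit 6686390: warn about every lock the server has not confirmed,
 * whether or not a delegation is held. */
static bool warns_original(unsigned long state_flags, unsigned long lock_flags)
{
    (void)state_flags; /* delegation is not consulted at all */
    return !test_flag(lock_flags, LOCK_INITIALIZED_BIT);
}

/* After commit 6686390 only: the test is inverted, so the lock check runs
 * only when a delegation IS held, which is exactly the harmless case. */
static bool warns_inverted(unsigned long state_flags, unsigned long lock_flags)
{
    if (test_flag(state_flags, DELEGATED_STATE_BIT))
        return !test_flag(lock_flags, LOCK_INITIALIZED_BIT);
    return false;
}

/* After commit 1acd1c3: the lock check runs only when NO delegation is held. */
static bool warns_fixed(unsigned long state_flags, unsigned long lock_flags)
{
    if (!test_flag(state_flags, DELEGATED_STATE_BIT))
        return !test_flag(lock_flags, LOCK_INITIALIZED_BIT);
    return false;
}
```

With a delegation held and an unconfirmed lock, warns_original() and warns_inverted() both report a failure while warns_fixed() does not, which matches the statement above that both commits are needed before the erroneous warning disappears.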

Diagnostic Steps

  • Check whether NFS4 delegations are in use, for example with a tcpdump capture. If delegations are not in use, they are not the cause of the "Lock reclaim failed" message, and further diagnostics are needed.
  • If delegations are confirmed to be in use, proceed with further troubleshooting.
  • This issue is probably best verified with a systemtap script that performs the test_bit(NFS_DELEGATED_STATE, &state->flags) check inside nfs4_reclaim_open_state, before the "Lock reclaim failed" message is printed.

