RHEL 6.5: NFSv4 mounts hang on "touch" or "cat" of files, nfs4 state manager stuck in nfs4_do_reclaim -> msleep loop
Issue
- System has high load, most or all of it due to blocked (D-state) processes.
- An nfs4 state manager thread loops indefinitely, sleeping in msleep called from nfs4_do_reclaim, similar to the following partial backtrace:
#3 [ffff8804375fdde8] msleep at ffffffff810863a0
#4 [ffff8804375fddf8] nfs4_do_reclaim at ffffffffa05d3112 [nfs]
#5 [ffff8804375fde88] nfs4_run_state_manager at ffffffffa05d330f [nfs]
- NFSv4 mounts hang when trying to "touch" or "cat" files on Red Hat Enterprise Linux 6.5, with the following backtrace:
PID: 9577 TASK: ffff880438079540 CPU: 0 COMMAND: "sh"
#0 [ffff8803256677d8] schedule at ffffffff815278c2
#1 [ffff8803256678a0] nfs_wait_bit_killable at ffffffffa05a4f72 [nfs]
#2 [ffff8803256678b0] __wait_on_bit at ffffffff81528b6f
#3 [ffff880325667900] out_of_line_wait_on_bit at ffffffff81528c18
#4 [ffff880325667970] nfs4_wait_clnt_recover at ffffffffa05d1972 [nfs]
#5 [ffff880325667990] nfs4_client_recover_expired_lease at ffffffffa05d1d60 [nfs]
#6 [ffff8803256679b0] _nfs4_do_open at ffffffffa05bfac5 [nfs]
#7 [ffff880325667a90] nfs4_do_open at ffffffffa05bffe5 [nfs]
#8 [ffff880325667b30] nfs4_atomic_open at ffffffffa05c00f8 [nfs]
#9 [ffff880325667b50] nfs_open_revalidate at ffffffffa05a1d6c [nfs]
#10 [ffff880325667bf0] do_lookup at ffffffff811988f6
#11 [ffff880325667c50] __link_path_walk at ffffffff81199354
#12 [ffff880325667d30] path_walk at ffffffff81199e6a
#13 [ffff880325667d70] filename_lookup at ffffffff8119a07b
#14 [ffff880325667db0] do_filp_open at ffffffff8119b554
#15 [ffff880325667f20] do_sys_open at ffffffff81185d29
#16 [ffff880325667f70] sys_open at ffffffff81185e40
#17 [ffff880325667f80] system_call_fastpath at ffffffff8100b072
RIP: 000000316dcdb540 RSP: 00007fff7bb42b30 RFLAGS: 00010206
RAX: 0000000000000002 RBX: ffffffff8100b072 RCX: 00000000025ce000
RDX: 000000000068732e RSI: 0000000000000000 RDI: 00000000025cdde0
RBP: 00000000025cdde0 R8: 0000000000000002 R9: 0000000000000010
R10: 0000000000000008 R11: 0000000000000246 R12: ffffffff81185e40
R13: ffff880325667f78 R14: 0000000000000000 R15: 0000000000000000
ORIG_RAX: 0000000000000002 CS: 0033 SS: 002b
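The first symptom above (load driven by D-state processes) can be checked on a live client without a vmcore. The following is a minimal sketch, assuming a Linux `/proc` filesystem; the helper names `parse_stat` and `blocked_tasks` are hypothetical and not part of any Red Hat tool. It only lists tasks in uninterruptible sleep and does not prove they are waiting on NFSv4 recovery; the backtraces above remain the way to confirm that.

```python
import os

def parse_stat(text):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    open_paren = text.index("(")
    close_paren = text.rindex(")")
    pid = int(text[:open_paren].strip())
    comm = text[open_paren + 1:close_paren]
    state = text[close_paren + 1:].split()[0]
    return pid, comm, state

def blocked_tasks(proc_root="/proc"):
    """Return a list of (pid, comm) for tasks currently in state D."""
    blocked = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (IOError, OSError, ValueError, IndexError):
            continue  # process exited or entry unreadable
        if state == "D":
            blocked.append((pid, comm))
    return blocked

if __name__ == "__main__":
    for pid, comm in blocked_tasks():
        print("%d %s" % (pid, comm))
```

On an affected client, a command such as the `sh` task shown in the backtrace above would be expected to appear in this list while it waits in `nfs4_wait_clnt_recover`.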
Environment
- Red Hat Enterprise Linux 6.5 (NFS client)
- kernel prior to `kernel-2.6.32-431.37.1.el6`
- NFSv4 with delegations enabled
- NFS Server
  - any NFSv4 server with delegations enabled may be affected
  - seen on EMC Storage VNX 5300 (NFS server), NAS version: 7.1.71-1
  - seen on IBM Storage N6210
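The kernel condition listed above can be checked with a rough comparison of the running kernel release against `2.6.32-431.37.1.el6`. This is a simplified sketch, not a full RPM version comparison: it compares only the leading numeric fields of the release string, and it is meaningful only for RHEL 6 kernels. The function names `numeric_prefix` and `predates_fix` are hypothetical.

```python
import platform
import re

# Version named in the Environment section of this article.
FIXED = "2.6.32-431.37.1.el6"

def numeric_prefix(release):
    """Return the leading numeric fields of a kernel release as a tuple."""
    parts = []
    for token in re.split(r"[.-]", release):
        if not token.isdigit():
            break  # stop at the first non-numeric field, e.g. "el6"
        parts.append(int(token))
    return tuple(parts)

def predates_fix(release, fixed=FIXED):
    """True if release sorts before the fixed kernel (numeric fields only)."""
    return numeric_prefix(release) < numeric_prefix(fixed)

if __name__ == "__main__":
    running = platform.release()  # same string as `uname -r`
    print("%s predates %s: %s" % (running, FIXED, predates_fix(running)))
```

A result of `True` only indicates that the kernel is older than the version named above; the client must also use NFSv4 with delegations enabled to match the environment described here.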