RHEL8.1: NFS client hangs with kworker in D-state stack inside writeback and nfs_free_request .. __put_nfs_open_context
Issue
We're migrating workflows from RHEL6 to RHEL8. On RHEL6 we have been using drop_caches to improve performance with no problems, but on RHEL8 we're seeing the drop_caches processes hang.
On a sample host, we are seeing two processes in 'D' state:
$ ps -e -opid,lstart,s,comm,cmd | sort -k7,7 | head
PID STARTED S COMMAND CMD
46748 Fri Aug 7 08:22:00 2020 D drop_caches /bin/sh /foo/bin/drop_caches 1
54771 Fri Aug 7 07:04:43 2020 D kworker/u113:5+ [kworker/u113:5+flush-0:63]
with these stacks:
$ sudo cat /proc/46748/stack
[<0>] iterate_supers+0x7f/0x100
[<0>] drop_caches_sysctl_handler+0x54/0x7c
[<0>] proc_sys_call_handler+0xab/0x100
[<0>] vfs_write+0xa5/0x1a0
[<0>] ksys_write+0x4f/0xb0
[<0>] do_syscall_64+0x5b/0x1b0
[<0>] entry_SYSCALL_64_after_hwframe+0x65/0xca
[<0>] 0xffffffffffffffff
$ sudo cat /proc/54771/stack
[<0>] deactivate_super+0x43/0x50
[<0>] __put_nfs_open_context+0xd8/0x110 [nfs]
[<0>] nfs_free_request+0xb7/0x170 [nfs]
[<0>] nfs_page_group_destroy+0x36/0x60 [nfs]
[<0>] nfs_do_writepage+0x1d7/0x310 [nfs]
[<0>] nfs_writepages_callback+0xf/0x20 [nfs]
[<0>] write_cache_pages+0x1a5/0x400
[<0>] nfs_writepages+0xb4/0x180 [nfs]
[<0>] do_writepages+0x41/0xd0
[<0>] __writeback_single_inode+0x3d/0x360
[<0>] writeback_sb_inodes+0x1e3/0x450
[<0>] __writeback_inodes_wb+0x5d/0xb0
[<0>] wb_writeback+0x25f/0x2f0
[<0>] wb_workfn+0x186/0x400
[<0>] process_one_work+0x1a7/0x3b0
[<0>] worker_thread+0x30/0x390
[<0>] kthread+0x112/0x130
[<0>] ret_from_fork+0x35/0x40
[<0>] 0xffffffffffffffff
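The manual steps above (find the tasks in 'D' state with ps, then read /proc/<pid>/stack for each) can be sketched as a small script. This is only an illustrative sketch: the function names filter_d_state and dump_d_state_stacks are hypothetical and not part of any Red Hat tooling, and it assumes the procps ps and a kernel that exposes /proc/<pid>/stack, as on the host shown above.

```shell
#!/bin/sh
# Hypothetical helpers (names are illustrative, not from any Red Hat tool).

# Read "pid state comm" lines on stdin and print the PID of every task
# whose state column is exactly D (uninterruptible sleep).
filter_d_state() {
    awk '$2 == "D" { print $1 }'
}

# For every D-state task on this host, print its PID, its command name,
# and its kernel stack. Reading /proc/<pid>/stack normally requires root.
dump_d_state_stacks() {
    ps -e -o pid= -o s= -o comm= | filter_d_state | while read -r pid; do
        printf '== PID %s (%s) ==\n' "$pid" "$(cat "/proc/$pid/comm" 2>/dev/null)"
        cat "/proc/$pid/stack" 2>/dev/null || echo "(stack not readable; try as root)"
    done
}
```

Running dump_d_state_stacks as root on an affected client should print stacks similar to the two shown above. Keeping the filter separate from the ps call means it can also be applied to saved ps output.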
Environment
- Red Hat Enterprise Linux 8.1 (NFS client)
- seen on kernel-4.18.0-147.3.1.el8_1.x86_64
- NFS client that encounters a non-recoverable error during writeback (for example, EDQUOT or ENOSPC)
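To check whether a particular NFS mount is the one hitting a quota or space error, it can help to identify which mount the hung flush worker is servicing. The sketch below assumes that the suffix in the worker name from the Issue section (flush-0:63) is the major:minor device number of the filesystem being written back, and that this matches the third field of /proc/self/mountinfo. The function name mount_for_dev is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper (name is illustrative).
# Reads mountinfo-formatted lines on stdin and prints the mount point
# (field 5) of every entry whose major:minor device id (field 3) equals $1.
# Assumption: the "0:63" in "flush-0:63" is that same device id.
mount_for_dev() {
    awk -v dev="$1" '$3 == dev { print $5 }'
}
```

For example, `mount_for_dev 0:63 < /proc/self/mountinfo` would print the mount point to inspect for quota or free space on the server side.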