RHEL5: soft lockup in nfs4_reclaim_open_state called from reclaimer after NFSv4 server became unavailable
Issue
- The NFSv4 client hung after the NFSv4 server, a NetApp vFiler head, went through failback, and a large number of processes entered an uninterruptible state.
- The system had a very high load average because of the many processes in uninterruptible state.
- Messages similar to the following are seen in the log:
Dec 21 21:21:29 linux kernel: nfs4_reclaim_open_state: unhandled error -116. Zeroing state
Dec 21 21:21:29 linux kernel: nfs4_reclaim_open_state: unhandled error -10026. Zeroing state
- A kernel soft lockup message similar to the following is logged:
BUG: soft lockup - CPU#2 stuck for 60s! [10.52.18.23-rec:11854]
...
Pid: 11854, comm: 5.25.81.32-rec Not tainted 2.6.18-238.9.1.el5 #1
RIP: 0010:[<ffffffff88614dfa>] [<ffffffff88614dfa>] :nfs:nfs4_reclaim_open_state+0x135/0x150
...
Call Trace:
ffffffff88614fb9 :nfs:reclaimer+0x1a4/0x2ac
ffffffff88614e15 :nfs:reclaimer+0x0/0x2ac
ffffffff80032afc kthread+0xfe/0x132
ffffffff8005dfb1 child_rip+0xa/0x11
ffffffff800a26db keventd_create_kthread+0x0/0xc4
ffffffff800329fe kthread+0x0/0x132
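
The numeric codes in the nfs4_reclaim_open_state messages above can be decoded when triaging a client log. The following is a minimal sketch, not part of the original solution: the helper name decode_reclaim_errors and the lookup table are illustrative, and the table covers only the two codes shown in this article (116 is ESTALE on Linux, 10026 is NFS4ERR_BAD_SEQID in the NFSv4 protocol). The values are hard-coded because errno numbers differ between operating systems.

```python
import re

# Only the two codes from the sample log lines are listed here.
# 116 is the Linux errno ESTALE; 10026 is the NFSv4 status NFS4ERR_BAD_SEQID.
KNOWN_ERRORS = {
    116: "ESTALE (stale file handle, Linux errno)",
    10026: "NFS4ERR_BAD_SEQID (NFSv4 protocol status)",
}

# Matches the kernel message format shown in the Issue section.
LOG_PATTERN = re.compile(r"nfs4_reclaim_open_state: unhandled error -(\d+)\.")


def decode_reclaim_errors(lines):
    """Return (code, description) pairs for each matching log line."""
    results = []
    for line in lines:
        match = LOG_PATTERN.search(line)
        if match is None:
            continue  # not a reclaim error message
        code = int(match.group(1))
        results.append((code, KNOWN_ERRORS.get(code, "unknown")))
    return results


if __name__ == "__main__":
    sample = [
        "Dec 21 21:21:29 linux kernel: nfs4_reclaim_open_state: unhandled error -116. Zeroing state",
        "Dec 21 21:21:29 linux kernel: nfs4_reclaim_open_state: unhandled error -10026. Zeroing state",
    ]
    for code, description in decode_reclaim_errors(sample):
        print(f"-{code}: {description}")
```

Lines that do not match the message format are skipped, and codes missing from the table are reported as "unknown" rather than guessed.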
Environment
- Red Hat Enterprise Linux (RHEL) 5
- NFSv4 client
- Often seen with NetApp filers and Data ONTAP 8.1.2P3 7-Mode
