RHEL5: soft lockup in nfs4_reclaim_open_state called from reclaimer after NFSv4 server became unavailable

Solution Verified - Updated -

Issue

  • An NFSv4 client hung after the NFSv4 server, a NetApp Vfiler head, went through failback, with a large number of processes entering an uninterruptible state.
  • The system had a very high load average because of the many processes in uninterruptible state.
  • Messages similar to the following are seen in the log:
Dec 21 21:21:29 linux kernel: nfs4_reclaim_open_state: unhandled error -116. Zeroing state
Dec 21 21:21:29 linux kernel: nfs4_reclaim_open_state: unhandled error -10026. Zeroing state
  • A kernel soft lockup message similar to the following is logged:
BUG: soft lockup - CPU#2 stuck for 60s! [10.52.18.23-rec:11854]
...
Pid: 11854, comm: 5.25.81.32-rec Not tainted 2.6.18-238.9.1.el5 #1
RIP: 0010:[<ffffffff88614dfa>]  [<ffffffff88614dfa>] :nfs:nfs4_reclaim_open_state+0x135/0x150
...
Call Trace:
 ffffffff88614fb9 :nfs:reclaimer+0x1a4/0x2ac
 ffffffff88614e15 :nfs:reclaimer+0x0/0x2ac
 ffffffff80032afc kthread+0xfe/0x132
 ffffffff8005dfb1 child_rip+0xa/0x11
 ffffffff800a26db keventd_create_kthread+0x0/0xc4
 ffffffff800329fe kthread+0x0/0x132
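
To check whether a client's logs match the signature above, a small script can pick out the two message types and decode the error numbers. The following is a minimal sketch, not a supported tool: the names `scan`, `KNOWN_CODES`, `RECLAIM_RE`, and `LOCKUP_RE` are hypothetical, and the regular expressions assume the exact message wording shown in the excerpts above. The code table covers only the two values in the excerpt: 116 is `ESTALE` on Linux, and 10026 is `NFS4ERR_BAD_SEQID` in the NFSv4 protocol.

```python
import re

# Names for the two error numbers shown in the log excerpt (assumed table,
# limited to these two values): 116 is ESTALE on Linux, 10026 is the NFSv4
# protocol status NFS4ERR_BAD_SEQID.
KNOWN_CODES = {
    116: "ESTALE (stale file handle)",
    10026: "NFS4ERR_BAD_SEQID",
}

# Patterns assume the exact message wording from the excerpts above.
RECLAIM_RE = re.compile(
    r"nfs4_reclaim_open_state: unhandled error (-?\d+)\. Zeroing state"
)
LOCKUP_RE = re.compile(
    r"BUG: soft lockup - CPU#(\d+) stuck for (\d+)s! \[(.+?):(\d+)\]"
)


def scan(lines):
    """Return (reclaim_errors, lockups) found in an iterable of log lines."""
    reclaim_errors = []
    lockups = []
    for line in lines:
        match = RECLAIM_RE.search(line)
        if match:
            # The kernel logs the value negated; store the positive number.
            code = abs(int(match.group(1)))
            reclaim_errors.append((code, KNOWN_CODES.get(code, "unknown")))
            continue
        match = LOCKUP_RE.search(line)
        if match:
            lockups.append(
                {
                    "cpu": int(match.group(1)),
                    "seconds": int(match.group(2)),
                    "comm": match.group(3),
                    "pid": int(match.group(4)),
                }
            )
    return reclaim_errors, lockups
```

On a client, the function could be fed the system log, for example `scan(open("/var/log/messages"))`; that path is an assumption about where kernel messages are written on the affected host. A soft lockup entry whose `comm` ends in `-rec` alongside reclaim errors corresponds to the symptom described in this article.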

Environment

  • Red Hat Enterprise Linux (RHEL) 5
    • NFSv4 client
  • Often seen with NetApp filers and ONTAP 8.1.2P3 7-Mode
