RHEL 6 NFS server panics in lru_put_end called from nfsd_cache_lookup

Solution Verified

Environment

  • Red Hat Enterprise Linux 6.5
  • NFS Server
  • Observed in the following kernels, though other 6.5 kernels may be affected:
    • 2.6.32-431.1.2.el6
    • 2.6.32-431.17.1.el6

Issue

  • The kernel panics with a general protection fault in lru_put_end, called from nfsd_cache_lookup:
general protection fault: 0000 [#1] SMP 
last sysfs file: /sys/devices/system/cpu/online
CPU 18 
Modules linked in: nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack iptable_filter ip_tables nfs fuse mptctl mptbase dell_rbu nfsd lockd nfs_acl auth_rpcgss autofs4 sunrpc cachefiles fscache(T) bonding 8021q garp stp llc ipv6 ipt_REJECT xfs exportfs scsi_dh_rdac dm_round_robin dm_multipath uinput ipmi_devintf iTCO_wdt iTCO_vendor_support dcdbas microcode power_meter sg shpchp bnx2x libcrc32c mdio sb_edac edac_core lpc_ich mfd_core ext4 jbd2 mbcache sr_mod cdrom sd_mod crc_t10dif ahci megaraid_sas wmi mpt2sas scsi_transport_sas raid_class dm_mirror dm_region_hash dm_log dm_mod [last unloaded: nf_conntrack]

Pid: 6777, comm: nfsd Tainted: G           ---------------  T 2.6.32-431.1.2.el6.x86_64 #1 Dell Inc. PowerEdge R720/0C4Y3R
RIP: 0010:[<ffffffffa05bbf90>]  [<ffffffffa05bbf90>] lru_put_end+0x20/0x50 [nfsd]
RSP: 0018:ffff88081ea51d50  EFLAGS: 00010286
RAX: dead000000200200 RBX: ffff88081e956000 RCX: ffff88081e956028
RDX: dead000000100100 RSI: ffff88070193d908 RDI: ffff88070193d918
RBP: ffff88081ea51d50 R08: ffff88070193d918 R09: 0000000000000001
R10: 0000000000000000 R11: 0000000000000006 R12: ffff88070193d908
R13: 0000000000000000 R14: ffff88070193d908 R15: 0000000000000002
FS:  0000000000000000(0000) GS:ffff880044720000(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 00007fdb8efe9000 CR3: 0000000001a85000 CR4: 00000000000407e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process nfsd (pid: 6777, threadinfo ffff88081ea50000, task ffff88082fee3540)
Stack:
 ffff88081ea51dd0 ffffffffa05bc4a6 ffffffffa0508f78 ffff88070193d92c
<d> ffff88081e956028 ffffffffdbccad6a 02b588081e957a28 ffffffffa05e2f40
<d> 00000006269f319b 0000000700000003 ffff88081ea51dd0 ffff88081e956000
Call Trace:
 [<ffffffffa05bc4a6>] nfsd_cache_lookup+0x3a6/0x700 [nfsd]
 [<ffffffffa0508f78>] ? svc_authenticate+0xc8/0x170 [sunrpc]
 [<ffffffffa05b03e8>] nfsd_dispatch+0xa8/0x230 [nfsd]
 [<ffffffffa0505844>] svc_process_common+0x344/0x640 [sunrpc]
 [<ffffffff81065df0>] ? default_wake_function+0x0/0x20
 [<ffffffffa0505e80>] svc_process+0x110/0x160 [sunrpc]
 [<ffffffffa05b0b52>] nfsd+0xc2/0x160 [nfsd]
 [<ffffffffa05b0a90>] ? nfsd+0x0/0x160 [nfsd]
 [<ffffffff8109af06>] kthread+0x96/0xa0
 [<ffffffff8100c20a>] child_rip+0xa/0x20
 [<ffffffff8109ae70>] ? kthread+0x0/0xa0
 [<ffffffff8100c200>] ? child_rip+0x0/0x20
Code: 66 66 90 48 83 c4 08 5b c9 c3 90 55 48 89 e5 0f 1f 44 00 00 48 8b 05 40 19 65 e1 48 8b 57 10 48 89 47 58 48 8b 47 18 48 83 c7 10 <48> 89 42 08 48 89 10 48 c7 c2 50 2f 5e a0 48 8b 35 b3 6f 02 00 
RIP  [<ffffffffa05bbf90>] lru_put_end+0x20/0x50 [nfsd]
 RSP <ffff88081ea51d50>

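Resolution

  • Red Hat Enterprise Linux 6.5: update to kernel-2.6.32-431.20.3.el6 or later (errata RHSA-2014-0771), which contains the first patch.
  • Red Hat Enterprise Linux 6.6: update to kernel-2.6.32-504.el6 or later (errata RHSA-2014-1392), which contains the final resolution.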

Root Cause

In the new duplicate reply cache (DRC) code, nfsd can designate a cache entry for reuse and then, in some cases, free that same entry before reusing it, which results in this panic. Two patches address the problem.

The DRC code attempts to reuse an existing, expired cache entry in preference to allocating a new one. It then searches the cache for the incoming request and, on a hit, frees the entry that it was going to reuse.

However, the cache code doesn't unhash the entry that it is going to reuse, so the search can find that very entry. The code then frees the entry it has just found. This leads to a later use-after-free, which usually shows up as list corruption warnings or an oops.

The first patch fixes this by unhashing the entry that the code intends to reuse. The entry can then no longer be found by a search, which prevents this situation from occurring.
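
The sequence above can be illustrated with a toy model. This is only a sketch, not the kernel code: the names Entry, ToyReplyCache, unhash_on_reuse and returns_freed_entry are hypothetical, a Python dict stands in for the hash table, a list stands in for the LRU list, and the oldest entry is simply assumed to be expired.

```python
class Entry:
    """Hypothetical stand-in for one duplicate reply cache entry."""

    def __init__(self, xid):
        self.xid = xid
        self.freed = False


class ToyReplyCache:
    """Toy model only: a dict plays the hash table, a list plays the LRU."""

    def __init__(self, unhash_on_reuse):
        self.unhash_on_reuse = unhash_on_reuse
        self.by_xid = {}  # hash table: xid -> Entry
        self.lru = []     # oldest entry first

    def add(self, xid):
        entry = Entry(xid)
        self.by_xid[xid] = entry
        self.lru.append(entry)
        return entry

    def lookup(self, xid):
        # Step 1: take the oldest entry as the reuse candidate
        # (assumed expired in this model), or allocate a fresh one.
        candidate = self.lru.pop(0) if self.lru else Entry(None)
        if self.unhash_on_reuse:
            # Behaviour of the first patch: hide the candidate from the search.
            self.by_xid.pop(candidate.xid, None)

        # Step 2: search the cache for the incoming request.
        found = self.by_xid.get(xid)
        if found is not None:
            # Hit: the candidate is no longer needed, so it is freed.
            candidate.freed = True
            return found

        # Miss: reuse the candidate for the new request.
        self.by_xid.pop(candidate.xid, None)
        candidate.xid = xid
        self.by_xid[xid] = candidate
        self.lru.append(candidate)
        return candidate


def returns_freed_entry(unhash_on_reuse):
    cache = ToyReplyCache(unhash_on_reuse)
    cache.add(7)             # the only, and therefore oldest, entry
    entry = cache.lookup(7)  # the same request arrives again
    return entry.freed


print("without unhash:", returns_freed_entry(False))
print("with unhash:   ", returns_freed_entry(True))
```

Without the unhash step, the lookup returns an entry that has already been freed, which corresponds to the use-after-free described above. With the unhash step, the search misses and the candidate is reused safely.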

Even with the first patch applied, request processing still tried to scrape an expired or over-limit entry off the list in preference to allocating a new one from the slab. This was unnecessarily complicated, and allocating from the slab looked to be a better approach. The second patch makes that change.

The first patch was released for the 6.5 kernel series in kernel-2.6.32-431.20.3.el6; update to that kernel or later. See errata RHSA-2014-0771 for details.

The final resolution came with Red Hat Enterprise Linux 6.6, in kernel-2.6.32-504.el6 or later. See errata RHSA-2014-1392 for details.

The following bug is associated with this case: Bug 1036972 - use after free in new nfsd DRC code

Diagnostic Steps

Simple verification steps (from kernel log / oops message)

1) Check the symbol from the "RIP:" line to see if it matches lru_put_end

RIP: 0010:[<ffffffffa057af90>]  [<ffffffffa057af90>] lru_put_end+0x20/0x50 [nfsd]

2) Check the kernel version from the "Pid:" line to see if it is earlier than the fixed kernels named in this article (2.6.32-431.20.3.el6 for the 6.5 series, 2.6.32-504.el6 for 6.6)

Pid: 4636, comm: nfsd Not tainted 2.6.32-431.3.1.el6.x86_64 #1 Dell Inc. PowerEdge R620/0PXXHP

3) Check the first line underneath "Call Trace:" and see if it matches nfsd_cache_lookup. NOTE: If this line matches nfsd_cache_update, it may match https://access.redhat.com/solutions/1254403 instead.

Call Trace:
 [<ffffffffa057b4a6>] nfsd_cache_lookup+0x3a6/0x700 [nfsd]
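
The three checks above could be scripted roughly as follows. This is a minimal sketch, not a supported tool: the names check_oops, release_tuple, FIRST_FIX and FINAL_FIX are hypothetical, the regular expressions assume the oops layout shown in this article, and the version comparison is a simplified numeric comparison of el6 kernel release numbers rather than a full RPM version comparison.

```python
import re

# Fixed kernels named in this article.
FIRST_FIX = (431, 20, 3)  # kernel-2.6.32-431.20.3.el6 (first patch)
FINAL_FIX = (504,)        # kernel-2.6.32-504.el6 (final resolution)


def release_tuple(kernel):
    """Turn '2.6.32-431.3.1.el6.x86_64' into (431, 3, 1)."""
    match = re.match(r"2\.6\.32-([\d.]+)\.el6", kernel)
    if match is None:
        raise ValueError("not a RHEL 6 kernel version: %r" % kernel)
    return tuple(int(part) for part in match.group(1).split("."))


def check_oops(text):
    """Apply the three verification steps to a kernel log excerpt."""
    # Step 1: symbol on the "RIP:" line.
    rip = re.search(r"^RIP: .*\]\s+(\w+)\+0x", text, re.M)
    # Step 2: kernel version on the "Pid:" line.
    pid = re.search(r"^Pid: .*?(2\.6\.32-\S+)", text, re.M)
    # Step 3: first frame underneath "Call Trace:".
    top = re.search(
        r"^Call Trace:\n\s*\[<[0-9a-f]+>\] (?:\? )?(\w+)\+0x", text, re.M
    )
    kernel = release_tuple(pid.group(1)) if pid else None
    return {
        "rip_is_lru_put_end": bool(rip) and rip.group(1) == "lru_put_end",
        "kernel_lacks_first_patch": kernel is not None and kernel < FIRST_FIX,
        "kernel_lacks_final_fix": kernel is not None and kernel < FINAL_FIX,
        "top_frame_is_nfsd_cache_lookup":
            bool(top) and top.group(1) == "nfsd_cache_lookup",
    }


# Lines taken from the examples in the steps above.
SAMPLE = """\
Pid: 4636, comm: nfsd Not tainted 2.6.32-431.3.1.el6.x86_64 #1 Dell Inc. PowerEdge R620/0PXXHP
RIP: 0010:[<ffffffffa057af90>]  [<ffffffffa057af90>] lru_put_end+0x20/0x50 [nfsd]
Call Trace:
 [<ffffffffa057b4a6>] nfsd_cache_lookup+0x3a6/0x700 [nfsd]
"""

for name, value in check_oops(SAMPLE).items():
    print(name, value)
```

If all four values are true, the oops is consistent with the issue described in this article. A top frame of nfsd_cache_update instead points to the other solution referenced in step 3.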
