Kernel crashes while dereferencing a null pointer in vfs_setlease

Solution Verified

Environment

  • Red Hat Enterprise Linux 8.3
  • NFS

Issue

  • Kernel crashes while dereferencing a null pointer in vfs_setlease
  • The kernel panics with the message BUG: unable to handle kernel NULL pointer dereference at 0000000000000028, preceded by a warning in laundromat_main:
Jan 23 10:41:36 hostname kernel: WARNING: CPU: 33 PID: 939775 at fs/nfsd/nfs4state.c:5270 laundromat_main+0x33e/0x6d0 [nfsd]
Jan 23 10:41:36 hostname kernel: Modules linked in: rpcsec_gss_krb5 md4 sha512_ssse3 sha512_generic cmac nls_utf8 cifs libarc4 dns_resolver nfsd auth_rpcgss nfs_acl lockd grace rpcrdma ib_isert iscsi_target_mod ib_iser ib_srpt target_core_mod ib_srp scsi_transport_srp ib_ipoib rdma_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_core overlay tcp_diag udp_diag raw_diag inet_diag mmfs26(OE) mmfslinux(OE) tracedev(OE) xt_conntrack nft_counter xt_REDIRECT nft_compat nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables nfnetlink dm_round_robin sd_mod iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi sunrpc vfat fat ext4 mbcache jbd2 dm_multipath intel_rapl_msr intel_rapl_common isst_if_common nfit libnvdimm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel iTCO_wdt iTCO_vendor_support intel_rapl_perf joydev pcspkr lpc_ich i2c_i801 virtio_balloon binfmt_misc ip_tables xfs libcrc32c sr_mod cdrom sg ahci libahci crc32c_intel serio_raw libata qxl drm_ttm_helper ttm drm_kms_helper syscopyarea
Jan 23 10:41:36 hostname kernel:  sysfillrect sysimgblt fb_sys_fops virtio_blk drm virtio_console virtio_net net_failover failover dm_mirror dm_region_hash dm_log dm_mod
Jan 23 10:41:36 hostname kernel: CPU: 5 PID: 1387908 Comm: kworker/u128:2 Kdump: loaded Tainted: G        W  OE    --------- -  - 4.18.0-240.1.1.el8_3.x86_64 #1
Jan 23 10:41:36 hostname kernel: Hardware name: Red Hat RHEL/RHEL-AV, BIOS 0.0.0 02/06/2015
Jan 23 10:41:36 hostname kernel: Workqueue: nfsd4 laundromat_main [nfsd]
Jan 23 10:41:36 hostname kernel: RIP: 0010:laundromat_main+0x33e/0x6d0 [nfsd]
Jan 23 10:41:36 hostname kernel: Code: 49 8b 16 4c 39 74 24 18 74 24 49 8b 46 20 49 8d 7e a8 4d 89 f7 49 39 c4 0f 8c 46 03 00 00 49 89 d6 e8 46 df ff ff 84 c0 75 af <0f> 0b eb ab 48 8b 5c 24 28 48 c7 c7 40 7b 05 c1 e8 7d 45 2f d3 66
Jan 23 10:41:36 hostname kernel: RSP: 0018:ffffa6c5cdcf7e10 EFLAGS: 00010246
Jan 23 10:41:36 hostname kernel: RAX: 0000000000000000 RBX: ffff8e5f2216a4d8 RCX: ffffffff96087080
Jan 23 10:41:36 hostname kernel: RDX: ffff8e5db40af0b0 RSI: 0000000000000000 RDI: ffff8e5db40af078
Jan 23 10:41:36 hostname kernel: RBP: ffffa6c5cdcf7e50 R08: 0000000000000000 R09: ffff8e5f2216a498
Jan 23 10:41:36 hostname kernel: R10: 8080808080808080 R11: 0000000000000000 R12: 000000000007c686
Jan 23 10:41:36 hostname kernel: R13: 000000000000005a R14: ffff8e5f2216a4b8 R15: ffff8e5db40af0d0
Jan 23 10:41:36 hostname kernel: FS:  0000000000000000(0000) GS:ffff8e631ed40000(0000) knlGS:0000000000000000
Jan 23 10:41:36 hostname kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jan 23 10:41:36 hostname kernel: CR2: 00007f801aa59030 CR3: 000000077800a001 CR4: 00000000007606e0
Jan 23 10:41:36 hostname kernel: PKRU: 55555554
Jan 23 10:41:36 hostname kernel: Call Trace:
Jan 23 10:41:36 hostname kernel:  process_one_work+0x1a7/0x360
Jan 23 10:41:36 hostname kernel:  worker_thread+0x30/0x390
Jan 23 10:41:36 hostname kernel:  ? create_worker+0x1a0/0x1a0
Jan 23 10:41:36 hostname kernel:  kthread+0x112/0x130
Jan 23 10:41:36 hostname kernel:  ? kthread_flush_work_fn+0x10/0x10
Jan 23 10:41:36 hostname kernel:  ret_from_fork+0x35/0x40
Jan 23 10:41:36 hostname kernel: ---[ end trace e398cc78e1473e66 ]---
Jan 23 10:41:36 hostname kernel: BUG: unable to handle kernel NULL pointer dereference at 0000000000000028

Resolution

Possible Workaround

Root Cause

This issue occurs because the following upstream patch was not applied in RHEL 8:

commit 548ec0805c399c65ed66c6641be467f717833ab5
Author: J. Bruce Fields <bfields@redhat.com>
Date:   Mon Nov 29 15:08:00 2021 -0500

    nfsd: fix use-after-free due to delegation race

    A delegation break could arrive as soon as we've called vfs_setlease.  A
    delegation break runs a callback which immediately (in
    nfsd4_cb_recall_prepare) adds the delegation to del_recall_lru.  If we
    then exit nfs4_set_delegation without hashing the delegation, it will be
    freed as soon as the callback is done with it, without ever being
    removed from del_recall_lru.

    Symptoms show up later as use-after-free or list corruption warnings,
    usually in the laundromat thread.

    I suspect aba2072f4523 "nfsd: grant read delegations to clients holding
    writes" made this bug easier to hit, but I looked as far back as v3.0
    and it looks to me it already had the same problem.  So I'm not sure
    where the bug was introduced; it may have been there from the beginning.

    Cc: stable@vger.kernel.org
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
  • The crash occurs at line 1873 of fs/locks.c, while evaluating if (filp->f_op->setlease).
    Here filp is the first argument of the call vfs_setlease(fp->fi_deleg_file, F_UNLCK, NULL, (void **)&dp), that is, fp->fi_deleg_file, and it is NULL (RDI is 0 in the exception frame).
  • The stateid is taken from nfs4_file.fi_stateids if one exists; otherwise a new stateid is created. The file in the nfs4_open is then released, so the delegations on that file are released as well.
   1871 vfs_setlease(struct file *filp, long arg, struct file_lock **lease, void **priv)
   1872 {
   1873         if (filp->f_op->setlease)
   1874                 return filp->f_op->setlease(filp, arg, lease, priv);
   1875         else
   1876                 return generic_setlease(filp, arg, lease, priv);
   1877 }
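
The race described in the commit message can be illustrated with a toy model. This is only a sketch of the list-handling problem, not kernel code: the class, the list, and the function names below are simplified stand-ins, and the only_if_hashed guard is a hypothetical way to avoid the stale entry, not a claim about what the upstream patch does.

```python
from dataclasses import dataclass


@dataclass
class Delegation:
    name: str
    hashed: bool = False  # True once the delegation has been hashed
    freed: bool = False   # stand-in for the memory having been freed


# Toy stand-in for the del_recall_lru list walked by the laundromat.
del_recall_lru = []


def recall_prepare(dp, only_if_hashed=False):
    """Model the recall callback adding the delegation to the LRU list."""
    if only_if_hashed and not dp.hashed:
        return  # hypothetical guard: ignore delegations that were never hashed
    del_recall_lru.append(dp)


def set_delegation_fails(dp, only_if_hashed=False):
    """A break arrives right after the lease is set, then setup exits unhashed."""
    recall_prepare(dp, only_if_hashed)  # delegation break races in
    dp.freed = True                     # freed without leaving the LRU list


def laundromat_scan():
    """Return the names of freed delegations still reachable from the list."""
    return [dp.name for dp in del_recall_lru if dp.freed]


set_delegation_fails(Delegation("racy"))
print(laundromat_scan())  # ['racy']

del_recall_lru.clear()
set_delegation_fails(Delegation("guarded"), only_if_hashed=True)
print(laundromat_scan())  # []
```

In the first run the freed object is still on the list, which corresponds to the use-after-free or list corruption that the commit message says usually surfaces in the laundromat thread.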

Diagnostic Steps

The following is a brief vmcore analysis:

  • Backtrace of panic task
crash> bt
PID: 939775  TASK: ffff9bacb0b8af80  CPU: 33  COMMAND: "nfsd"
 #0 [ffffbe592c3bba10] machine_kexec at ffffffff84a5bf3e
 #1 [ffffbe592c3bba68] __crash_kexec at ffffffff84b6072d
 #2 [ffffbe592c3bbb30] crash_kexec at ffffffff84b6160d
 #3 [ffffbe592c3bbb48] oops_end at ffffffff84a22d4d
 #4 [ffffbe592c3bbb68] no_context at ffffffff84a6ba9e
 #5 [ffffbe592c3bbbc0] do_page_fault at ffffffff84a6c5c2
 #6 [ffffbe592c3bbbf0] page_fault at ffffffff8540122e
    [exception RIP: vfs_setlease+5]
    RIP: ffffffff84d39385  RSP: ffffbe592c3bbca8  RFLAGS: 00010246
    RAX: 0000000000000000  RBX: ffff9bacb0b944a0  RCX: ffffbe592c3bbd10
    RDX: 0000000000000000  RSI: 0000000000000002  RDI: 0000000000000000
    RBP: ffffbe592c3bbd98   R8: 0000000000010bd5   R9: ffff9b65c1dafaa8
    R10: 0000000000000021  R11: 0000000000000000  R12: ffff9b65c1dafa80
    R13: ffff9b65c1dafa80  R14: 00000000fffffff5  R15: ffff9b6b7e0b4150
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #7 [ffffbe592c3bbca8] nfsd4_process_open2 at ffffffffc089a8cb [nfsd]
 #8 [ffffbe592c3bbda0] nfsd4_open at ffffffffc088859c [nfsd]
 #9 [ffffbe592c3bbde0] nfsd4_proc_compound at ffffffffc0888c53 [nfsd]
#10 [ffffbe592c3bbe40] nfsd_dispatch at ffffffffc087516e [nfsd]
#11 [ffffbe592c3bbe70] svc_process_common at ffffffffc0811c23 [sunrpc]
#12 [ffffbe592c3bbed8] svc_process at ffffffffc0812141 [sunrpc]
#13 [ffffbe592c3bbef0] nfsd at ffffffffc0874bc3 [nfsd]
#14 [ffffbe592c3bbf10] kthread at ffffffff84ad9502
#15 [ffffbe592c3bbf50] ret_from_fork at ffffffff85400242
  • Disassembling exception RIP
crash> dis -rl vfs_setlease+5
/usr/src/debug/kernel-4.18.0-240.el8/linux-4.18.0-240.el8.x86_64/fs/locks.c: 1872
0xffffffff84d39380 <vfs_setlease>:  nopl   0x0(%rax,%rax,1) [FTRACE NOP]
/usr/src/debug/kernel-4.18.0-240.el8/linux-4.18.0-240.el8.x86_64/fs/locks.c: 1873
0xffffffff84d39385 <vfs_setlease+5>:    mov    0x28(%rdi),%rax
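
The faulting address reported in the BUG message can be cross-checked against this instruction. The sketch below simply adds the displacement from the disassembly to the RDI value from the exception frame; the helper name effective_address is made up for this illustration. Reading 0x28 as the offset of f_op within struct file on this kernel is an inference from source line 1873, not something printed by crash here.

```python
def effective_address(base: int, displacement: int) -> int:
    """Address accessed by an x86-64 `disp(%base)` operand, wrapped to 64 bits."""
    return (base + displacement) & 0xFFFFFFFFFFFFFFFF


rdi = 0x0            # RDI in the exception frame: the filp argument
displacement = 0x28  # from `mov 0x28(%rdi),%rax`

fault = effective_address(rdi, displacement)
print(f"{fault:016x}")  # 0000000000000028
```

The result matches the address in "unable to handle kernel NULL pointer dereference at 0000000000000028", which is consistent with filp being NULL when vfs_setlease was called.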
  • Finding the dentry of the file being opened
crash> fregs -r nfsd4_process_open2

PID: 939775  TASK: ffff9bacb0b8af80  CPU: 33  COMMAND: nfsd

#7 nfsd4_process_open2 called from 0xffffffffc088859c <nfsd4_open+924>
 +R12: 0x0
 +R13: 0xffff9bacb0b90000
 +R14: 0xffff9b6e3a998800
 +R15: 0xffff9bacb0b96070
6 RAX: 0x0
 +RBP: 0xffff9bacb0b96070
 +RBX: 0xffff9bacb0b944a0
1 RDI: 0xffff9bacb0b90000
3 RDX: 0xffff9bacb0b944a0
2 RSI: 0xffff9bacb0b96070
crash>
crash> struct svc_fh 0xffff9bacb0b96070
struct svc_fh {
  fh_handle = {
  ...
  fh_maxsize = 128,
  fh_dentry = 0xffff9b6a67524300,
  fh_export = 0xffff9ba77f8d0a00,
  • Identifying the file at dentry 0xffff9b6a67524300
crash> files -d  0xffff9b6a67524300
     DENTRY           INODE           SUPERBLK     TYPE PATH
ffff9b6a67524300 ffff9b6e76eea000 ffff9badec2af000 REG  /pfss/home/user/.wget-hsts
  • The affected mount point is
crash> mount | grep ffff9badec2af000
ffff9b6d1d9c4600 ffff9badec2af000 nfs   nfsshare      /nfsshare     
  • Leases are enabled
crash> crashinfo --sysctl | grep leases
fs.leases-enable     1
  • Disassembling up to the return address in nfsd4_proc_compound (frame #9) to locate the nfsd4_open structure on the stack
crash> dis -lr ffffffffc0888c53 | tail
0xffffffffc0888c33 <nfsd4_proc_compound+851>:   mov    0x8(%rsp),%r10
0xffffffffc0888c38 <nfsd4_proc_compound+856>:   mov    0x8(%r10),%rax
0xffffffffc0888c3c <nfsd4_proc_compound+860>:   mov    %r10,0x8(%rsp)
/usr/src/debug/kernel-4.18.0-240.el8/linux-4.18.0-240.el8.x86_64/fs/nfsd/nfs4proc.c: 2012
0xffffffffc0888c41 <nfsd4_proc_compound+865>:   mov    (%rax),%rax
0xffffffffc0888c44 <nfsd4_proc_compound+868>:   mov    %r13,%rsi
0xffffffffc0888c47 <nfsd4_proc_compound+871>:   mov    %rbp,%rdi
0xffffffffc0888c4a <nfsd4_proc_compound+874>:   mov    (%rsp),%rdx
0xffffffffc0888c4e <nfsd4_proc_compound+878>:   call   0xffffffff85601220 <__x86_indirect_thunk_rax>
0xffffffffc0888c53 <nfsd4_proc_compound+883>:   mov    0x8(%rsp),%r10
crash> 
Stack slot address: ffffbe592c3bbde0 - 6 * 8 - 0x28 = ffffbe592c3bbd88
crash> rd FFFFBE592C3BBD88
ffffbe592c3bbd88:  ffff9b6e3a998800                    ...:n...
crash>  struct nfsd4_open.op_file ffff9b6e3a998800
  op_file = 0xffff9baf3ea1b4b8,
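
The stack address passed to rd above can be reproduced with a few lines of arithmetic. This is only a check of the calculation as written in the analysis; the variable names are illustrative, and the meaning of the two subtracted terms (6 * 8 and 0x28) is not stated in the original analysis, so it is left uninterpreted here.

```python
frame_addr = 0xFFFFBE592C3BBDE0  # stack address shown for frame #9 in `bt`

# Same expression as in the analysis: subtract 6 eight-byte slots, then 0x28.
slot = frame_addr - 6 * 8 - 0x28

print(f"{slot:x}")  # ffffbe592c3bbd88
```

The value matches the address read with rd, whose contents (ffff9b6e3a998800) are then interpreted as the struct nfsd4_open pointer.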
  • Checking the memory details
crash> kmem 0xffff9baf3ea1b4b8
      PAGE         PHYSICAL      MAPPING       INDEX CNT FLAGS
ffffe4effdfa86c0 7f7ea1b000                0        0  1 57ffffc0000800 reserved
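  • The op_file pointer refers to a reserved page rather than a valid nfs4_file object, which suggests either an extra put of the nfs4_file reference or a use-after-free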
