NFS server is randomly sending malformed replies to READDIR requests
Issue
- NFS client hangs on the getdents64() system call.
- An NFS server is observed to be sending NFSv4 owner groups with incorrect data.
- In some cases, the kernel on the NFS client crashes with the following log:
[132015.011271] BUG: unable to handle kernel paging request at ffff88082bf68000
[132015.046471] IP: [<ffffffff81326696>] memcpy+0x6/0x110
[132015.072112] PGD 1f9e067 PUD 1fa1067 PMD 82cff8063 PTE 800000082bf68161
[132015.102296] Oops: 0003 [#1] SMP
[132015.116789] Modules linked in: loop rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache intel_powerclamp coretemp kvm_intel kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd ipmi_ssif ipmi_devintf ipmi_si iTCO_wdt pcspkr sb_edac sg ipmi_msghandler hpilo iTCO_vendor_support hpwdt wmi acpi_power_meter edac_core lpc_ich pcc_cpufreq ioatdma nfsd shpchp dca auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sr_mod cdrom sd_mod crc_t10dif crct10dif_generic mgag200 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm ata_generic pata_acpi ata_piix crct10dif_pclmul crct10dif_common tg3 drm crc32c_intel serio_raw libata hpsa ptp i2c_core scsi_transport_sas pps_core fjes dm_mirror dm_region_hash dm_log dm_mod
[132015.431515] CPU: 6 PID: 54488 Comm: find Not tainted 3.10.0-510.el7.bz1375457.x86_64 #1
[132015.468026] Hardware name: HP ProLiant DL380p Gen8, BIOS P70 08/02/2014
[132015.498181] task: ffff88034f410fb0 ti: ffff8801ae180000 task.ti: ffff8801ae180000
[132015.532299] RIP: 0010:[<ffffffff81326696>] [<ffffffff81326696>] memcpy+0x6/0x110
[132015.569035] RSP: 0018:ffff8801ae183ac0 EFLAGS: 00010282
[132015.595289] RAX: ffff88082bf4ae42 RBX: ffff8801ae183c28 RCX: fffffffffffe2e41
[132015.628541] RDX: ffffffffffffffff RSI: ffff8803879e2196 RDI: ffff88082bf68000
[132015.661145] RBP: ffff8801ae183b48 R08: 0000000000000000 R09: 0000000000000000
[132015.693802] R10: 0000000000000012 R11: ffff8803879c4eac R12: ffff88042b4fb900
[132015.726315] R13: ffff88082bf4ae40 R14: 00000000ffffffff R15: ffff8801ae183b6c
[132015.758918] FS: 00007fd33c2a0800(0000) GS:ffff88042f780000(0000) knlGS:0000000000000000
[132015.795712] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[132015.822045] CR2: ffff88082bf68000 CR3: 00000005e56c8000 CR4: 00000000001407e0
[132015.854577] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[132015.887146] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[132015.919704] Stack:
[132015.929121] ffffffffa056c48d ffff8801ae183ad8 ffffffffa05756f3 ffffffffffffffff
[132015.962946] ffff88042d64e000 0000000000000000 0000000000000000 ffff8801ae183c28
[132015.996626] ffff8801ae183b64 ffff8801ae183b68 ffff88032d98a000 00000000b3363ae6
[132016.030992] Call Trace:
[132016.042864] [<ffffffffa056c48d>] ? decode_getfattr_attrs+0x2cd/0x1510 [nfsv4]
[132016.078474] [<ffffffffa05756f3>] ? nfs4_have_delegation+0x13/0x20 [nfsv4]
[132016.113650] [<ffffffffa056ffb7>] nfs4_decode_dirent+0x137/0x1c0 [nfsv4]
[132016.145652] [<ffffffffa045e945>] nfs_readdir_page_filler+0x135/0x5b0 [nfs]
[132016.177613] [<ffffffffa045efcd>] nfs_readdir_xdr_to_array+0x20d/0x3b0 [nfs]
[132016.209323] [<ffffffff8118acc6>] ? __alloc_pages_nodemask+0x176/0x420
[132016.239264] [<ffffffffa045f170>] ? nfs_readdir_xdr_to_array+0x3b0/0x3b0 [nfs]
[132016.272644] [<ffffffffa045f192>] nfs_readdir_filler+0x22/0x90 [nfs]
[132016.301694] [<ffffffff8118105f>] do_read_cache_page+0x7f/0x190
[132016.328718] [<ffffffff81212270>] ? fillonedir+0xe0/0xe0
[132016.353029] [<ffffffff811811ac>] read_cache_page+0x1c/0x30
[132016.378577] [<ffffffffa045f3db>] nfs_readdir+0x1db/0x6b0 [nfs]
[132016.405766] [<ffffffffa056fe80>] ? nfs4_xdr_dec_layoutget+0x270/0x270 [nfsv4]
[132016.438268] [<ffffffff81212270>] ? fillonedir+0xe0/0xe0
[132016.462088] [<ffffffff81212160>] vfs_readdir+0xb0/0xe0
[132016.485961] [<ffffffff81212585>] SyS_getdents+0x95/0x120
[132016.510752] [<ffffffff816962c9>] system_call_fastpath+0x16/0x1b
[132016.538324] Code: 43 60 48 2b 43 50 88 43 4e 5b 5d c3 66 0f 1f 84 00 00 00 00 00 e8 7b fc ff ff eb e2 90 90 90 90 90 90 90 90 90 48 89 f8 48 89 d1 <f3> a4 c3 03 83 e2 07 f3 48 a5 89 d1 f3 a4 c3 20 4c 8b 06 4c 8b
[132016.629326] RIP [<ffffffff81326696>] memcpy+0x6/0x110
[132016.653592] RSP <ffff8801ae183ac0>
[132016.669725] CR2: ffff88082bf68000
NOTE: The panic stack can differ depending on the task that triggers the panic, which is not always an NFS-related one. The stack shown above is a strong indication of this problem. More details can be found in the "Root Cause" section.
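To compare a captured oops against the trace above, a small script can help. The following is a minimal sketch in Python, not an official diagnostic tool. It assumes the log uses the same RHEL 7 style frame format shown above. The choice of signature frames, and the names matches_readdir_oops, SIGNATURE_FRAMES, FRAME_RE and RIP_RE, are illustrative assumptions taken from this article's trace.

```python
import re

# Assumption: these reliable frames (no "?" marker) from the trace in this
# article, plus a fault inside memcpy, are treated as the signature.
SIGNATURE_FRAMES = (
    "nfs4_decode_dirent",
    "nfs_readdir_page_filler",
    "nfs_readdir_xdr_to_array",
)

# Matches "[<addr>] func+0xOFF/0xSIZE", optionally with the "? " marker that
# the kernel prints for frames it cannot verify.
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+(\?\s+)?([A-Za-z0-9_.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
)

# Matches the function name on a "RIP" line of the oops.
RIP_RE = re.compile(r"RIP\b.*?\[<[0-9a-f]+>\]\s+([A-Za-z0-9_.]+)\+0x")


def matches_readdir_oops(log_text):
    """Return True if the log faulted in memcpy under the NFS readdir path."""
    faulted_in_memcpy = "memcpy" in RIP_RE.findall(log_text)
    # Keep only frames printed without the "?" marker.
    reliable_frames = {
        m.group(2) for m in FRAME_RE.finditer(log_text) if not m.group(1)
    }
    return faulted_in_memcpy and all(
        frame in reliable_frames for frame in SIGNATURE_FRAMES
    )
```

The function takes the log as a string, for example the contents of a saved console log or kdump dmesg file. As the NOTE above states, the panic stack can differ, so a False result does not rule this problem out.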
Environment
- Red Hat Enterprise Linux 7 or 8 as the NFS server
- Red Hat Enterprise Linux as an NFS client
- NFS clients from other vendors
- NFSv4.0