RHEL7.2: NFS4 server repeated soft lockups due to laundromat_main kworker process stuck in __destroy_client

Solution Verified - Updated -

Issue

Apr  9 18:12:16 nfs-server kernel: BUG: soft lockup - CPU#3 stuck for 23s! [kworker/u64:0:14786]
Apr  9 18:12:16 nfs-server kernel: Modules linked in: nfnetlink_queue nfnetlink_log nfnetlink bluetooth rfkill bnx2i libiscsi nfsv3 rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache iptable_filter fuse btrfs zlib_deflate raid6_pq xor vfat msdos fat ext4 mbcache jbd2 bridge ses dm_service_time enclosure mptctl mptbase bnx2fc cnic uio fcoe libfcoe libfc scsi_transport_fc scsi_tgt scsi_transport_iscsi 8021q garp stp mrp llc bonding intel_powerclamp coretemp intel_rapl kvm_intel kvm iTCO_wdt crc32_pclmul ghash_clmulni_intel iTCO_vendor_support aesni_intel ipmi_devintf lrw gf128mul glue_helper ablk_helper cryptd sg hpwdt hpilo sb_edac i2c_i801 pcspkr edac_core ioatdma shpchp lpc_ich mfd_core dca ipmi_si wmi ipmi_msghandler pcc_cpufreq acpi_power_meter binfmt_misc nfsd auth_rpcgss nfs_acl lockd grace sunrpc dm_multipath ip_tables
Apr  9 18:12:16 nfs-server kernel: xfs sd_mod crc_t10dif crct10dif_generic mgag200 syscopyarea sysfillrect crct10dif_pclmul crct10dif_common sysimgblt crc32c_intel i2c_algo_bit serio_raw drm_kms_helper ttm bnx2x drm mdio i2c_core ptp pps_core hpsa libcrc32c dm_mirror dm_region_hash dm_log dm_mod [last unloaded: stap_dbb9ab46bcd5709557faf4d2945b181f_13752]
Apr  9 18:12:16 nfs-server kernel: CPU: 3 PID: 14786 Comm: kworker/u64:0 Tainted: G           OE  ------------   3.10.0-327.10.1.el7.x86_64 #1
Apr  9 18:12:16 nfs-server kernel: Hardware name: HP ProLiant BL460c Gen9, BIOS I36 05/06/2015
Apr  9 18:12:16 nfs-server kernel: Workqueue: nfsd4 laundromat_main [nfsd]
Apr  9 18:12:16 nfs-server kernel: task: ffff8810514f5c00 ti: ffff880924968000 task.ti: ffff880924968000
Apr  9 18:12:16 nfs-server kernel: RIP: 0010:[<ffffffff8163cb0b>]  [<ffffffff8163cb0b>] _raw_spin_unlock_irqrestore+0x1b/0x40
Apr  9 18:12:16 nfs-server kernel: RSP: 0018:ffff88092496bcc8  EFLAGS: 00000246
Apr  9 18:12:16 nfs-server kernel: RAX: ffffffffa046b970 RBX: ffffffff810b0d54 RCX: ffffffffa046b988
Apr  9 18:12:16 nfs-server kernel: RDX: ffffffffa046b988 RSI: 0000000000000246 RDI: 0000000000000246
Apr  9 18:12:16 nfs-server kernel: RBP: ffff88092496bcd0 R08: 0000000000000000 R09: ffff88085fc77540
Apr  9 18:12:16 nfs-server kernel: R10: ffffea00054dd240 R11: ffffffffa045f6bb R12: ffffffff81983f60
Apr  9 18:12:16 nfs-server kernel: R13: 0000000000000246 R14: 0000000000000003 R15: 0000000000000246
Apr  9 18:12:16 nfs-server kernel: FS:  0000000000000000(0000) GS:ffff88085fc60000(0000) knlGS:0000000000000000
Apr  9 18:12:16 nfs-server kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Apr  9 18:12:16 nfs-server kernel: CR2: 00007fb61f554140 CR3: 000000000194a000 CR4: 00000000001407e0
Apr  9 18:12:16 nfs-server kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Apr  9 18:12:16 nfs-server kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Apr  9 18:12:16 nfs-server kernel: Stack:
Apr  9 18:12:16 nfs-server kernel: ffffffffa046b980 ffff88092496bd08 ffffffff810b0d54 ffff88092496bcf8
Apr  9 18:12:16 nfs-server kernel: ffffffffa04538d5 ffffffffa0453595 ffff88092496bd48 ffff8810528af888
Apr  9 18:12:16 nfs-server kernel: ffff88092496bd38 ffffffffa0453595 ffff88092496bd48 ffff8810528af800
Apr  9 18:12:16 nfs-server kernel: Call Trace:
Apr  9 18:12:16 nfs-server kernel: [<ffffffff810b0d54>] __wake_up+0x44/0x50
Apr  9 18:12:16 nfs-server kernel: [<ffffffffa04538d5>] ? __destroy_client+0x135/0x180 [nfsd]
Apr  9 18:12:16 nfs-server kernel: [<ffffffffa0453595>] ? nfs4_put_stid+0x75/0x80 [nfsd]
Apr  9 18:12:16 nfs-server kernel: [<ffffffffa0453595>] nfs4_put_stid+0x75/0x80 [nfsd]
Apr  9 18:12:16 nfs-server kernel: [<ffffffffa045388f>] __destroy_client+0xef/0x180 [nfsd]
Apr  9 18:12:16 nfs-server kernel: [<ffffffffa0453942>] expire_client+0x22/0x30 [nfsd]
Apr  9 18:12:16 nfs-server kernel: [<ffffffffa0457506>] laundromat_main+0x166/0x4e0 [nfsd]
Apr  9 18:12:16 nfs-server kernel: [<ffffffff8109d5db>] process_one_work+0x17b/0x470
Apr  9 18:12:16 nfs-server kernel: [<ffffffff8109e3ab>] worker_thread+0x11b/0x400
...
Apr  9 18:12:44 nfs-server kernel: BUG: soft lockup - CPU#3 stuck for 22s! [kworker/u64:0:14786]
Apr  9 18:12:44 nfs-server kernel: Modules linked in: nfnetlink_queue nfnetlink_log nfnetlink bluetooth rfkill bnx2i libiscsi nfsv3 rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache iptable_filter fuse btrfs zlib_deflate raid6_pq xor vfat msdos fat ext4 mbcache jbd2 bridge ses dm_service_time enclosure mptctl mptbase bnx2fc cnic uio fcoe libfcoe libfc scsi_transport_fc scsi_tgt scsi_transport_iscsi 8021q garp stp mrp llc bonding intel_powerclamp coretemp intel_rapl kvm_intel kvm iTCO_wdt crc32_pclmul ghash_clmulni_intel iTCO_vendor_support aesni_intel ipmi_devintf lrw gf128mul glue_helper ablk_helper cryptd sg hpwdt hpilo sb_edac i2c_i801 pcspkr edac_core ioatdma shpchp lpc_ich mfd_core dca ipmi_si wmi ipmi_msghandler pcc_cpufreq acpi_power_meter binfmt_misc nfsd auth_rpcgss nfs_acl lockd grace sunrpc dm_multipath ip_tables
Apr  9 18:12:44 nfs-server kernel: xfs sd_mod crc_t10dif crct10dif_generic mgag200 syscopyarea sysfillrect crct10dif_pclmul crct10dif_common sysimgblt crc32c_intel i2c_algo_bit serio_raw drm_kms_helper ttm bnx2x drm mdio i2c_core ptp pps_core hpsa libcrc32c dm_mirror dm_region_hash dm_log dm_mod [last unloaded: stap_dbb9ab46bcd5709557faf4d2945b181f_13752]
Apr  9 18:12:44 nfs-server kernel: CPU: 3 PID: 14786 Comm: kworker/u64:0 Tainted: G           OEL ------------   3.10.0-327.10.1.el7.x86_64 #1
Apr  9 18:12:44 nfs-server kernel: Hardware name: HP ProLiant BL460c Gen9, BIOS I36 05/06/2015
Apr  9 18:12:44 nfs-server kernel: Workqueue: nfsd4 laundromat_main [nfsd]
Apr  9 18:12:44 nfs-server kernel: task: ffff8810514f5c00 ti: ffff880924968000 task.ti: ffff880924968000
Apr  9 18:12:44 nfs-server kernel: RIP: 0010:[<ffffffff812f2a0c>]  [<ffffffff812f2a0c>] _atomic_dec_and_lock+0x1c/0x70
Apr  9 18:12:44 nfs-server kernel: RSP: 0018:ffff88092496bcf8  EFLAGS: 00000246
Apr  9 18:12:44 nfs-server kernel: RAX: 000000002496bd48 RBX: ffffffffa046b988 RCX: 000000002496bd47
Apr  9 18:12:44 nfs-server kernel: RDX: 000000002496bd48 RSI: ffffffffa04538d5 RDI: ffff88092496bcf8
Apr  9 18:12:44 nfs-server kernel: RBP: ffff88092496bd08 R08: ffff88092496bd48 R09: ffff88085fc77540
Apr  9 18:12:44 nfs-server kernel: R10: ffffea00054dd240 R11: ffffffffa045f6bb R12: 0000000000000000
Apr  9 18:12:44 nfs-server kernel: R13: ffff88085fc77540 R14: ffffea00054dd240 R15: ffffffffa045f6bb
Apr  9 18:12:44 nfs-server kernel: FS:  0000000000000000(0000) GS:ffff88085fc60000(0000) knlGS:0000000000000000
Apr  9 18:12:44 nfs-server kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Apr  9 18:12:44 nfs-server kernel: CR2: 00007fb61f554140 CR3: 000000000194a000 CR4: 00000000001407e0
Apr  9 18:12:44 nfs-server kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Apr  9 18:12:44 nfs-server kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Apr  9 18:12:44 nfs-server kernel: Stack:
Apr  9 18:12:44 nfs-server kernel: ffff88092496bd47 ffff88092496bcf8 ffff88092496bd38 ffffffffa045354a
Apr  9 18:12:44 nfs-server kernel: ffff88092496bd48 ffff8810528af800 ffff88092496bcf8 ffff8810528af878
Apr  9 18:12:44 nfs-server kernel: ffff88092496bd80 ffffffffa045388f ffff88092496bd48 ffff88092496bd48
Apr  9 18:12:44 nfs-server kernel: Call Trace:
Apr  9 18:12:44 nfs-server kernel: [<ffffffffa045354a>] nfs4_put_stid+0x2a/0x80 [nfsd]
Apr  9 18:12:44 nfs-server kernel: [<ffffffffa045388f>] __destroy_client+0xef/0x180 [nfsd]
Apr  9 18:12:44 nfs-server kernel: [<ffffffffa0453942>] expire_client+0x22/0x30 [nfsd]
Apr  9 18:12:44 nfs-server kernel: [<ffffffffa0457506>] laundromat_main+0x166/0x4e0 [nfsd]
Apr  9 18:12:44 nfs-server kernel: [<ffffffff8109d5db>] process_one_work+0x17b/0x470
Apr  9 18:12:44 nfs-server kernel: [<ffffffff8109e3ab>] worker_thread+0x11b/0x400

Environment

  • Red Hat Enterprise Linux 7.2 (NFS server)
    • kernel prior to kernel-3.10.0-327.18.2.el7
    • reported on kernel-3.10.0-327.10.1.el7
  • NFSv4.1
  • Connected to a RHEL 6.6 NFS client that rebooted
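The kernel boundary in the list above can be checked against a kernel release string such as the output of `uname -r`. This is a rough sketch under stated assumptions: it compares only the leading numeric fields of the release string and is not a full RPM version comparison, and the helper names are hypothetical. The boundary value is the one named in the Environment list.

```python
import re

# Boundary release taken from the Environment list above.
BOUNDARY = "3.10.0-327.18.2.el7"


def release_key(release):
    """Turn '3.10.0-327.10.1.el7.x86_64' into (3, 10, 0, 327, 10, 1).

    Only the leading numeric fields are kept; parsing stops at the first
    non-numeric field (such as 'el7'), so dist tags and the architecture
    suffix are ignored.
    """
    key = []
    for field in re.split(r"[.-]", release):
        if not field.isdigit():
            break
        key.append(int(field))
    return tuple(key)


def is_prior_to_boundary(release, boundary=BOUNDARY):
    """True if the release sorts before the boundary kernel."""
    return release_key(release) < release_key(boundary)


# The kernel shown in the log above.
print(is_prior_to_boundary("3.10.0-327.10.1.el7.x86_64"))
```

On a live system, `platform.release()` from the Python standard library returns the running kernel's release string and can be passed to `is_prior_to_boundary`.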
