RHEL7.1: nfsd serving GPFS mounts, soft lockups with all CPUs executing in nfsd4_delegreturn, trying to obtain state_lock
Issue
- The NFS server reports soft lockups, and all CPUs eventually become stuck in a backtrace similar to the following:
[ 685.098475] BUG: soft lockup - CPU#3 stuck for 23s! [nfsd:8121]
[ 685.098493] Modules linked in: nfsv3 nfs fscache mmfs26(OF) mmfslinux(OF) tracedev(OF) bonding iTCO_wdt iTCO_vendor_support dcdbas intel_powerclamp coretemp kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd pcspkr sb_edac edac_core lpc_ich mfd_core mei_me mei ext4 mbcache jbd2 dm_service_time ipmi_devintf acpi_pad wmi ipmi_si ipmi_msghandler acpi_power_meter shpchp nfsd auth_rpcgss dm_multipath nfs_acl lockd binfmt_misc sunrpc xfs libcrc32c sd_mod sr_mod crc_t10dif cdrom crct10dif_common mgag200 syscopyarea sysfillrect sysimgblt drm_kms_helper ahci libahci ttm libata drm igb i2c_algo_bit i2c_core qla2xxx ixgbe megaraid_sas mdio ptp scsi_transport_fc pps_core scsi_tgt dca dm_mirror dm_region_hash dm_log dm_mod
[ 685.098513] CPU: 3 PID: 8121 Comm: nfsd Tainted: GF W O-------------- 3.10.0-229.7.2.el7.x86_64 #1
[ 685.098514] Hardware name: Dell Inc. PowerEdge R620/0VV3F2, BIOS 2.4.3 07/09/2014
[ 685.098514] task: ffff881c5d6271c0 ti: ffff881c5cfc4000 task.ti: ffff881c5cfc4000
[ 685.098515] RIP: 0010:[<ffffffff8160b59a>] [<ffffffff8160b59a>] _raw_spin_lock+0x3a/0x50
[ 685.098517] RSP: 0018:ffff881c5cfc7d48 EFLAGS: 00000206
[ 685.098518] RAX: 00000000000035aa RBX: ffff880058ae93e4 RCX: 0000000000005876
[ 685.098519] RDX: 000000000000587a RSI: 000000000000587a RDI: ffffffffa0272710
[ 685.098519] RBP: ffff881c5cfc7d48 R08: ffff881f8d1a7280 R09: ffff881c472d75e4
[ 685.098520] R10: 0000000000000000 R11: 000000000000ffff R12: ffff881c5cfc7cd8
[ 685.098521] R13: ffff881f8fc79888 R14: ffff881f8d1a7200 R15: ffff881f8fc79800
[ 685.098522] FS: 0000000000000000(0000) GS:ffff881fbf060000(0000) knlGS:0000000000000000
[ 685.098522] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 685.098523] CR2: 00007fa42b56f40c CR3: 0000001f8e537000 CR4: 00000000001407e0
[ 685.098524] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 685.098525] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 685.098525] Stack:
[ 685.098526] ffff881c5cfc7d88 ffffffffa025d26c ffff881c5f25db30 00000000ab960103
[ 685.098527] ffff880058aea000 ffff881c5cc6e000 ffff880058aea068 ffff880058aea1a8
[ 685.098528] ffff881c5cfc7dd8 ffffffffa024b257 ffff880058ae93c0 0000000000000180
[ 685.098530] Call Trace:
[ 685.098534] [<ffffffffa025d26c>] nfsd4_delegreturn+0x10c/0x140 [nfsd]
[ 685.098538] [<ffffffffa024b257>] nfsd4_proc_compound+0x4d7/0x7f0 [nfsd]
[ 685.098541] [<ffffffffa0236e1b>] nfsd_dispatch+0xbb/0x200 [nfsd]
[ 685.098547] [<ffffffffa01fcb33>] svc_process_common+0x453/0x6f0 [sunrpc]
[ 685.098552] [<ffffffffa01fced3>] svc_process+0x103/0x170 [sunrpc]
[ 685.098555] [<ffffffffa02367a7>] nfsd+0xe7/0x150 [nfsd]
[ 685.098558] [<ffffffffa02366c0>] ? nfsd_destroy+0x80/0x80 [nfsd]
[ 685.098560] [<ffffffff8109726f>] kthread+0xcf/0xe0
[ 685.098561] [<ffffffff810971a0>] ? kthread_create_on_node+0x140/0x140
[ 685.098563] [<ffffffff81614158>] ret_from_fork+0x58/0x90
[ 685.098564] [<ffffffff810971a0>] ? kthread_create_on_node+0x140/0x140
[ 685.098565] Code: 0f c1 07 89 c2 c1 ea 10 66 39 c2 75 02 5d c3 83 e2 fe 0f b7 f2 b8 00 80 00 00 eb 0c 0f 1f 44 00 00 f3 90 83 e8 01 74 0a 0f b7 0f <66> 39 ca 75 f1 5d c3 0f 1f 80 00 00 00 00 eb da 66 0f 1f 44 00
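When many CPUs report soft lockups, it can help to confirm that they are all stuck in the same function. The following is a minimal sketch, not a Red Hat tool: it scans kernel log text for `BUG: soft lockup` reports and records the first reliable frame under `Call Trace:` for each one. The function names `summarize_lockups` and `stuck_by_frame` are hypothetical, and the regular expressions assume the log format shown in the backtrace above.

```python
import re
from collections import Counter

# Header line the kernel prints for each soft lockup report, e.g.
#   BUG: soft lockup - CPU#3 stuck for 23s! [nfsd:8121]
LOCKUP_RE = re.compile(
    r"BUG: soft lockup - CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
    r"\[(?P<comm>[^:\]]+):(?P<pid>\d+)\]"
)

# Reliable call trace frame, e.g.
#   [<ffffffffa025d26c>] nfsd4_delegreturn+0x10c/0x140 [nfsd]
# Frames marked with "?" do not match because "?" is not a symbol character.
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\] (?P<sym>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def summarize_lockups(log_text):
    """Return one dict per soft lockup report found in log_text."""
    reports = []
    current = None
    in_trace = False
    for line in log_text.splitlines():
        header = LOCKUP_RE.search(line)
        if header:
            current = {
                "cpu": int(header["cpu"]),
                "secs": int(header["secs"]),
                "comm": header["comm"],
                "pid": int(header["pid"]),
                "top_frame": None,
            }
            reports.append(current)
            in_trace = False
            continue
        if current is None:
            continue
        if "Call Trace:" in line:
            # Only frames after this marker count; the RIP line is skipped.
            in_trace = True
            continue
        if in_trace and current["top_frame"] is None:
            frame = FRAME_RE.search(line)
            if frame:
                current["top_frame"] = frame["sym"]
    return reports


def stuck_by_frame(reports):
    """Count lockup reports per top call trace frame."""
    return Counter(r["top_frame"] for r in reports)
```

Passing the contents of a saved kernel log (for example, text captured from `dmesg`) to `summarize_lockups` and then to `stuck_by_frame` gives a per-function tally. In the situation described here, the expectation is that the reports cluster on `nfsd4_delegreturn`.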
Environment
- Red Hat Enterprise Linux 7.1 (NFS server)
- Seen on kernel 3.10.0-229.7.2.el7
- nfsd exporting IBM GPFS mount points
- Third-party kernel modules loaded, as reported by crash:
crash> mod -t
NAME       TAINTS
tracedev   FO
mmfslinux  FO
mmfs26     FO