RHEL6.3: nfsiod BUG: soft lockup - CPU#0 stuck for 67s! - fusionIO in use

Solution Unverified - Updated -

Issue

  • Repeated soft lockup messages appear in the logs, indicating that nfsiod is stuck for 67 seconds at various places:
Jul 13 10:55:02 localhost kernel: BUG: soft lockup - CPU#0 stuck for 67s! [nfsiod:3965]
Jul 13 10:55:02 localhost kernel: Modules linked in: bridge stp llc fuse autofs4 nfs lockd fscache nfs_acl auth_rpcgss sunrpc ipv6 uinput power_meter iomemory_vsl(P)(U) sg netxen_nic microcode serio_raw iTCO_wdt iTCO_vendor_support hpwdt hpilo i7core_edac edac_core shpchp ext4 mbcache jbd2 sr_mod cdrom sd_mod crc_t10dif pata_acpi ata_generic ata_piix hpsa radeon ttm drm_kms_helper drm i2c_algo_bit i2c_core dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan]
Jul 13 10:55:02 localhost kernel: CPU 0 
Jul 13 10:55:02 localhost kernel: 
Jul 13 10:55:02 localhost kernel: Pid: 3965, comm: nfsiod Tainted: P           ---------------    2.6.32-279.el6.x86_64 #1 HP ProLiant DL980 G7
Jul 13 10:55:02 localhost kernel: RIP: 0010:[<ffffffff81500112>]  [<ffffffff81500112>] _spin_lock+0x12/0x30
Jul 13 10:55:02 localhost kernel: RSP: 0018:ffff885fb1911d50  EFLAGS: 00000202
Jul 13 10:55:02 localhost kernel: RAX: 000000007c547c54 RBX: ffff885fb1911d50 RCX: 0000000000004000
Jul 13 10:55:02 localhost kernel: RDX: ffff88621f062610 RSI: 0000000000000000 RDI: ffff881fb3f403b8
Jul 13 10:55:02 localhost kernel: RBP: ffffffff8100bc0e R08: 0080000000000000 R09: 0400000000000000
Jul 13 10:55:02 localhost kernel: R10: 00000000ffffffff R11: ffff88fb5554ad58 R12: ffff885fb1911d00
Jul 13 10:55:02 localhost kernel: R13: ffff881fb20bccf8 R14: ffffffff81faff00 R15: 0000000000000000
Jul 13 10:55:02 localhost kernel: FS:  0000000000000000(0000) GS:ffff880118800000(0000) knlGS:0000000000000000
Jul 13 10:55:02 localhost kernel: CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
Jul 13 10:55:02 localhost kernel: CR2: 00007f7ad09580a0 CR3: 0000000001a85000 CR4: 00000000000006f0
Jul 13 10:55:02 localhost kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jul 13 10:55:02 localhost kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Jul 13 10:55:02 localhost kernel: Process nfsiod (pid: 3965, threadinfo ffff885fb1910000, task ffff885fb1eb0040)
Jul 13 10:55:02 localhost kernel: Stack:
Jul 13 10:55:02 localhost kernel: ffff885fb1911d90 ffffffffa0503d19 ffff885fb1911d90 ffff8885a27648c0
Jul 13 10:55:02 localhost kernel: <d> ffff88621f062440 ffffea01c6c62048 0000000000008000 ffff88621f062600
Jul 13 10:55:02 localhost kernel: <d> ffff885fb1911de0 ffffffffa0504a1b 0000000000000010 ffff88621f062610
Jul 13 10:55:02 localhost kernel: Call Trace:
Jul 13 10:55:02 localhost kernel: [<ffffffffa0503d19>] ? nfs_mark_request_commit+0x49/0x150 [nfs]
Jul 13 10:55:02 localhost kernel: [<ffffffffa0504a1b>] ? nfs_writeback_release_full+0xeb/0x1f0 [nfs]
Jul 13 10:55:02 localhost kernel: [<ffffffffa0468967>] ? rpc_release_calldata+0x17/0x20 [sunrpc]
Jul 13 10:55:02 localhost kernel: [<ffffffffa046a490>] ? rpc_free_task+0x50/0x80 [sunrpc]
Jul 13 10:55:02 localhost kernel: [<ffffffffa046a5a0>] ? rpc_async_release+0x0/0x20 [sunrpc]
Jul 13 10:55:02 localhost kernel: [<ffffffffa046a5b5>] ? rpc_async_release+0x15/0x20 [sunrpc]
Jul 13 10:55:02 localhost kernel: [<ffffffff8108c760>] ? worker_thread+0x170/0x2a0
Jul 13 10:55:02 localhost kernel: [<ffffffff810920d0>] ? autoremove_wake_function+0x0/0x40
Jul 13 10:55:02 localhost kernel: [<ffffffff8108c5f0>] ? worker_thread+0x0/0x2a0
Jul 13 10:55:02 localhost kernel: [<ffffffff81091d66>] ? kthread+0x96/0xa0
Jul 13 10:55:02 localhost kernel: [<ffffffff8100c14a>] ? child_rip+0xa/0x20
Jul 13 10:55:02 localhost kernel: [<ffffffff81091cd0>] ? kthread+0x0/0xa0
Jul 13 10:55:02 localhost kernel: [<ffffffff8100c140>] ? child_rip+0x0/0x20
Jul 13 10:55:02 localhost kernel: Code: 00 00 fa 66 0f 1f 44 00 00 f0 81 2f 00 00 00 01 74 05 e8 e2 e3 d7 ff c9 c3 55 48 89 e5 0f 1f 44 00 00 b8 00 00 01 00 f0 0f c1 07 <0f> b7 d0 c1 e8 10 39 c2 74 0e f3 90 0f b7 17 eb f5 83 3f 00 75
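
When the message repeats many times, it can help to pull the key fields out of the log programmatically and compare where each report was stuck. The following is a minimal sketch, not a Red Hat tool: it assumes the syslog format shown above, and the names parse_lockups, LOCKUP_RE and FRAME_RE are hypothetical helpers introduced only for illustration.

```python
import re

# Header line of a soft lockup report, in the format shown in the log above.
LOCKUP_RE = re.compile(
    r"BUG: soft lockup - CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
    r"\[(?P<comm>[^:\]]+):(?P<pid>\d+)\]"
)
# Call Trace entry: "[<address>] ? symbol+0xOFFSET/0xSIZE [module]"
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\] (?:\? )?(?P<sym>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?: \[(?P<mod>\w+)\])?"
)


def parse_lockups(lines):
    """Return one dict per soft lockup report, including its call trace."""
    reports = []
    in_trace = False
    for line in lines:
        header = LOCKUP_RE.search(line)
        if header:
            reports.append({
                "cpu": int(header["cpu"]),
                "seconds": int(header["secs"]),
                "comm": header["comm"],
                "pid": int(header["pid"]),
                "frames": [],
            })
            in_trace = False
        elif "Call Trace:" in line:
            # Only collect frames that belong to a report we have seen.
            in_trace = bool(reports)
        elif in_trace:
            frame = FRAME_RE.search(line)
            if frame:
                # Frames without a "[module]" suffix are core kernel symbols.
                reports[-1]["frames"].append(
                    (frame["sym"], frame["mod"] or "kernel")
                )
            else:
                # The first non-frame line (e.g. "Code: ...") ends the trace.
                in_trace = False
    return reports


if __name__ == "__main__":
    sample = [
        "Jul 13 10:55:02 localhost kernel: BUG: soft lockup - CPU#0 stuck for 67s! [nfsiod:3965]",
        "Jul 13 10:55:02 localhost kernel: Call Trace:",
        "Jul 13 10:55:02 localhost kernel: [<ffffffffa0503d19>] ? nfs_mark_request_commit+0x49/0x150 [nfs]",
        "Jul 13 10:55:02 localhost kernel: [<ffffffff8108c760>] ? worker_thread+0x170/0x2a0",
    ]
    for report in parse_lockups(sample):
        print(report)
```

Grouping the reports by their first trace frame is one way to see whether the lockups cluster around a single function, such as nfs_mark_request_commit in the trace above, or are spread over several.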

Environment

  • Red Hat Enterprise Linux 6
    • NFSv3 client
    • 2.6.32-279.el6
    • Fusion IO direct cache module, iomemory_vsl(P)(U)
fio-common-3.1.1.172-1.0.el6.x86_64                         Wed 11 Jul 2012 03:23:21 PM CDT
fio-firmware-107053-1.0.noarch                              Tue 10 Jul 2012 10:06:42 AM CDT
fio-firmware-ioaccelerator-107004-1.0.noarch                Wed 11 Jul 2012 03:23:26 PM CDT
fio-remote-util-3.1.0.63-1.0.noarch                         Wed 11 Jul 2012 03:23:25 PM CDT
fio-smis-3.1.0.63-1.0.x86_64                                Wed 11 Jul 2012 03:23:24 PM CDT
fio-snmp-agentx-3.1.0.63-1.0.x86_64                         Wed 11 Jul 2012 03:23:22 PM CDT
fio-snmp-mib-hp-3.1.0.63-1.0.noarch                         Wed 11 Jul 2012 03:23:23 PM CDT
fio-sysvinit-3.1.1.172-1.0.el6.x86_64                       Wed 11 Jul 2012 03:23:21 PM CDT
fio-util-3.1.1.172-1.0.el6.x86_64                           Wed 11 Jul 2012 03:23:21 PM CDT
libfio-dev-3.1.0.63-1.0.x86_64                              Wed 11 Jul 2012 03:29:05 PM CDT
libfio-doc-3.1.0.63-1.0.noarch                              Wed 11 Jul 2012 03:29:04 PM CDT
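
The package listing above resembles the output of `rpm -qa --last` filtered for fio packages; that is an assumption, since the article does not state which command produced it. As a rough sketch under that assumption, the lines can be split into name, version, release, architecture and install time, which makes it easier to compare component versions across systems. The function name parse_package_line is hypothetical, and the date parsing assumes an English (C) locale.

```python
from datetime import datetime


def parse_package_line(line):
    """Split 'name-version-release.arch   <install date>' into its parts."""
    nvra, date_text = line.split(None, 1)

    # Separate the trailing timezone abbreviation (e.g. "CDT") and keep it
    # as plain text; the returned datetime is naive local time.
    date_part, tz = date_text.strip().rsplit(" ", 1)
    installed = datetime.strptime(date_part, "%a %d %b %Y %I:%M:%S %p")

    # RPM names may contain hyphens, so split from the right:
    # name-version-release.arch
    name_version, release_arch = nvra.rsplit("-", 1)
    name, version = name_version.rsplit("-", 1)
    release, arch = release_arch.rsplit(".", 1)

    return {
        "name": name,
        "version": version,
        "release": release,
        "arch": arch,
        "installed": installed,
        "tz": tz,
    }


if __name__ == "__main__":
    listing = [
        "fio-common-3.1.1.172-1.0.el6.x86_64                         Wed 11 Jul 2012 03:23:21 PM CDT",
        "fio-firmware-107053-1.0.noarch                              Tue 10 Jul 2012 10:06:42 AM CDT",
    ]
    for entry in listing:
        print(parse_package_line(entry))
```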
