NFS client kernel panic in rpciod: kernel BUG at net/sunrpc/sched.c:616, RIP: __rpc_execute+0x278

Solution Verified - Updated -

Issue

  • The NFS client kernel crashes in __rpc_execute when an asynchronous task that is still on a queue hits BUG_ON(RPC_IS_QUEUED(task));
kernel BUG at net/sunrpc/sched.c:616!
invalid opcode: 0000 [#1] SMP 
last sysfs file: /sys/devices/system/cpu/cpu15/cache/index2/shared_cpu_map
CPU 8 
Modules linked in: nfs lockd fscache nfs_acl auth_rpcgss pcc_cpufreq sunrpc power_meter hpilo
hpwdt igb mlx4_ib(U) mlx4_en(U) raid0 mlx4_core(U) sg microcode serio_raw iTCO_wdt
iTCO_vendor_support ioatdma dca shpchp ext4 mbcache jbd2 raid1 sd_mod crc_t10dif mpt2sas
scsi_transport_sas raid_class ahci dm_mirror dm_region_hash dm_log dm_mod 
[last unloaded: scsi_wait_scan]

Pid: 2256, comm: rpciod/8 Not tainted 2.6.32-220.el6.x86_64 #1 HP ProLiant SL250s Gen8/
RIP: 0010:[<ffffffffa01fe458>]  [<ffffffffa01fe458>] __rpc_execute+0x278/0x2a0 [sunrpc]
...
Process rpciod/8 (pid: 2256, threadinfo ffff882016152000, task ffff8820162e80c0)
...
Call Trace:
 [<ffffffffa01fe4d0>] ? rpc_async_schedule+0x0/0x20 [sunrpc]
 [<ffffffffa01fe4e5>] rpc_async_schedule+0x15/0x20 [sunrpc]
 [<ffffffff8108b2b0>] worker_thread+0x170/0x2a0
 [<ffffffff81090bf0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff8108b140>] ? worker_thread+0x0/0x2a0
 [<ffffffff81090886>] kthread+0x96/0xa0
 [<ffffffff8100c14a>] child_rip+0xa/0x20
Code: db df 2e e1 f6 05 e0 26 02 00 40 0f 84 48 fe ff ff 0f b7 b3 d4 00 00 00 48 c7 
c7 94 39 21 a0 31 c0 e8 b9 df 2e e1 e9 2e fe ff ff <0f> 0b eb fe 0f b7 b7 d4 00 00 00 
31 c0 48 c7 c7 60 63 21 a0 e8 
RIP  [<ffffffffa01fe458>] __rpc_execute+0x278/0x2a0 [sunrpc]
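  • A second panic is similar in that it also occurs in an rpciod thread, but at a different location. It hits a kernel BUG at kernel/workqueue.c, namely BUG_ON(get_wq_data(work) != cwq);. Before the oops, a list corruption warning is also printed, raised by __list_add called from xprt_reserve_xprt. Based on the code location, the corruption is flagged on the 'sending' or 'resend' queue of the rpc_xprt.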
------------[ cut here ]------------
WARNING: at lib/list_debug.c:30 __list_add+0x8f/0xa0() (Not tainted)
Hardware name: ProLiant SL250s Gen8
list_add corruption. prev->next should be next (ffff88201900e998), but was ffffe8efec8232c1. (prev=ffff881bc0b5c150).
Modules linked in: nfs lockd fscache nfs_acl auth_rpcgss pcc_cpufreq sunrpc power_meter hpilo hpwdt igb mlx4_ib(U)
mlx4_en(U) raid0 mlx4_core(U) sg microcode serio_raw iTCO_wdt iTCO_vendor_support ioatdma dca shpchp ext4 mbcache
jbd2 raid1 sd_mod crc_t10dif mpt2sas scsi_transport_sas raid_class ahci dm_mirror dm_region_hash dm_log dm_mod
[last unloaded: scsi_wait_scan]
Pid: 16460, comm: 10.2.8.2-m Not tainted 2.6.32-220.el6.x86_64 #1
Call Trace:
 [<ffffffff81069b77>] ? warn_slowpath_common+0x87/0xc0
 [<ffffffff81069c66>] ? warn_slowpath_fmt+0x46/0x50
 [<ffffffff8127b86f>] ? __list_add+0x8f/0xa0
 [<ffffffffa01fe7db>] ? rpc_sleep_on+0x10b/0x2f0 [sunrpc]
 [<ffffffffa01f8cf3>] ? xprt_reserve_xprt+0x83/0x120 [sunrpc]
 [<ffffffffa01f8173>] ? xprt_prepare_transmit+0x63/0xb0 [sunrpc]
 [<ffffffffa01f5ab7>] ? call_transmit+0x47/0x2c0 [sunrpc]
 [<ffffffffa01fe23e>] ? __rpc_execute+0x5e/0x2a0 [sunrpc]
 [<ffffffffa01fe4c3>] ? rpc_execute+0x43/0x50 [sunrpc]
 [<ffffffffa01f6cc5>] ? rpc_run_task+0x75/0x90 [sunrpc]
 [<ffffffffa01f6de2>] ? rpc_call_sync+0x42/0x70 [sunrpc]
 [<ffffffffa0298d2d>] ? nfs4_proc_renew+0x4d/0xa0 [nfs]
 [<ffffffffa02a935e>] ? nfs4_run_state_manager+0x3fe/0x5e0 [nfs]
 [<ffffffffa02a8f60>] ? nfs4_run_state_manager+0x0/0x5e0 [nfs]
 [<ffffffff81090886>] ? kthread+0x96/0xa0
 [<ffffffff8100c14a>] ? child_rip+0xa/0x20
 [<ffffffff810907f0>] ? kthread+0x0/0xa0
 [<ffffffff8100c140>] ? child_rip+0x0/0x20
---[ end------------[ cut here ]------------

kernel BUG at kernel/workqueue.c:287!
invalid opcode: 0000 [#1] SMP 
last sysfs file: /sys/devices/system/cpu/cpu15/cache/index2/shared_cpu_map
CPU 1 
Modules linked in: nfs lockd fscache nfs_acl auth_rpcgss pcc_cpufreq sunrpc power_meter hpilo hpwdt igb mlx4_ib(U)
mlx4_en(U) raid0 mlx4_core(U) sg microcode serio_raw iTCO_wdt iTCO_vendor_support ioatdma dca shpchp ext4 mbcache
jbd2 raid1 sd_mod crc_t10dif mpt2sas scsi_transport_sas raid_class ahci dm_mirror dm_region_hash dm_log dm_mod
[last unloaded: scsi_wait_scan]

Pid: 2338, comm: rpciod/1 Tainted: G        W  ----------------   2.6.32-220.el6.x86_64 #1 HP ProLiant SL250s Gen8/
RIP: 0010:[<ffffffff8108b38d>]  [<ffffffff8108b38d>] worker_thread+0x24d/0x2a0
RSP: 0018:ffff8810041f5e40  EFLAGS: 00010216
RAX: ffff881bc0b5c158 RBX: ffffe8efec8232c0 RCX: ffffe8efec8232c8
RDX: ffff881bc0b5c150 RSI: ffff8800966aa118 RDI: ffffe8efec8232c0
RBP: ffff8810041f5ee0 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000000 R12: ffff88096ba7cd48
R13: ffffffffa01fe4d0 R14: ffff8810041f5fd8 R15: ffffe8efec8232c8
FS:  0000000000000000(0000) GS:ffff880065620000(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 000000361ea03088 CR3: 0000002016148000 CR4: 00000000000406e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process rpciod/1 (pid: 2338, threadinfo ffff8810041f4000, task ffff8810182fa080)
Stack:
 0000000000000000 0000000000000000 ffff8810041f5e60 ffff8810182fa6f8
<0> ffff8810182fa080 ffff8810182fa080 ffff8810182fa080 ffffe8efec8232d8
<0> 0000000000000000 ffff8810182fa080 ffffffff81090bf0 ffff8810041f5e98
Call Trace:
 [<ffffffff81090bf0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff8108b140>] ? worker_thread+0x0/0x2a0
 [<ffffffff81090886>] kthread+0x96/0xa0
 [<ffffffff8100c14a>] child_rip+0xa/0x20
 [<ffffffff810907f0>] ? kthread+0x0/0xa0
 [<ffffffff8100c140>] ? child_rip+0x0/0x20
Code: 48 89 95 70 ff ff ff 4c 89 e7 ff d1 48 8b 85 68 ff ff ff 48 8b 95 70 ff ff ff 48 83 c0 08 48 8b 08 48 85 c9 75 d0 e9 df fe ff ff <0f> 0b eb fe 48 8b 45 80 48 8b b5 78 ff ff ff 48 c7 c7 58 0e 79 
RIP  [<ffffffff8108b38d>] worker_thread+0x24d/0x2a0
 RSP <ffff8810041f5e40>
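
The two signatures above can be picked out of a saved console log or vmcore-dmesg with a few regular expressions. The following is a minimal sketch and not part of the original solution. The function name classify_oops and the result keys are hypothetical, and the patterns are based only on the output quoted in this article. The source line number (616, 287) is intentionally not matched, on the assumption that it may differ between kernel builds.

```python
import re

# Patterns derived from the console output quoted in this article.
BUG_RE = re.compile(r"kernel BUG at (?P<file>[\w./-]+):(?P<line>\d+)!")
# Matches the final "RIP  [<addr>] symbol+0xoff/0xlen" summary line.
RIP_RE = re.compile(
    r"RIP\s+\[<[0-9a-f]+>\]\s+(?P<symbol>\w+)\+0x[0-9a-f]+/0x[0-9a-f]+"
)
LIST_RE = re.compile(r"list_add corruption")


def classify_oops(log_text):
    """Report which of the signatures from this article appear in log_text.

    Hypothetical helper: it only recognises the text patterns shown above
    and does not prove that a crash has the same root cause.
    """
    bug_files = [m.group("file") for m in BUG_RE.finditer(log_text)]
    rip_symbols = [m.group("symbol") for m in RIP_RE.finditer(log_text)]
    return {
        # First panic: BUG_ON(RPC_IS_QUEUED(task)) in __rpc_execute
        "rpc_execute_bug": (
            "net/sunrpc/sched.c" in bug_files and "__rpc_execute" in rip_symbols
        ),
        # Second panic: BUG_ON(get_wq_data(work) != cwq) in worker_thread
        "workqueue_bug": (
            "kernel/workqueue.c" in bug_files and "worker_thread" in rip_symbols
        ),
        # Warning printed before the second oops
        "list_corruption_warning": bool(LIST_RE.search(log_text)),
    }
```

A match only shows that a log looks like the output in this article. Confirming the cause still requires analysis of the vmcore.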

Environment

  • Red Hat Enterprise Linux 6
    • Kernels prior to kernel-2.6.32-358.14.1.el6
    • Reported on at least 2.6.32-220.el6, 2.6.32-279.9.1.el6, and 2.6.32-358.6.1.el6.
  • MRG 2.x
    • Reported on kernel 2.6.33.9-rt31.66.el6rt.
  • NFS client
    • NFSv4 is often in use.
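
As a rough aid for checking a host against the Red Hat Enterprise Linux 6 kernel range listed above, the release string (for example from uname -r) can be compared numerically. This is a minimal sketch under stated assumptions. The names FIXED_IN, release_key and is_listed_as_affected are hypothetical. The threshold is treated as exclusive, which is an interpretation of "prior to". Only the stock RHEL 6 naming scheme is handled, so MRG realtime kernels such as 2.6.33.9-rt31.66.el6rt are out of scope.

```python
# Threshold taken from the Environment list; treating it as exclusive
# ("prior to") is an assumption.
FIXED_IN = "2.6.32-358.14.1.el6"


def _to_ints(dotted):
    """Convert '358.6.1' to (358, 6, 1), skipping non-numeric parts."""
    return tuple(int(part) for part in dotted.split(".") if part.isdigit())


def release_key(release):
    """Turn '2.6.32-358.6.1.el6.x86_64' into a comparable tuple.

    Only the upstream version and the numeric build fields before the
    '.elN' tag are used. Stock RHEL 6 kernel naming is assumed.
    """
    version, _, rest = release.partition("-")
    build = rest.split(".el", 1)[0]
    return _to_ints(version), _to_ints(build)


def is_listed_as_affected(release):
    """True if release sorts before the threshold kernel (hypothetical check)."""
    return release_key(release) < release_key(FIXED_IN)
```

For example, is_listed_as_affected(platform.release()) could be run on a client after importing the standard platform module. A result of True means only that the kernel falls in the listed range, not that the panic will occur.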
