NFS client kernel panic in rpciod: kernel BUG at net/sunrpc/sched.c:616, RIP: __rpc_execute+0x278
Issue
The NFS client kernel crashes because an asynchronous task that is still on a wait queue hits BUG_ON(RPC_IS_QUEUED(task)); in __rpc_execute (see the sketch after the log below).
kernel BUG at net/sunrpc/sched.c:616!
invalid opcode: 0000 [#1] SMP
last sysfs file: /sys/devices/system/cpu/cpu15/cache/index2/shared_cpu_map
CPU 8
Modules linked in: nfs lockd fscache nfs_acl auth_rpcgss pcc_cpufreq sunrpc power_meter hpilo
hpwdt igb mlx4_ib(U) mlx4_en(U) raid0 mlx4_core(U) sg microcode serio_raw iTCO_wdt
iTCO_vendor_support ioatdma dca shpchp ext4 mbcache jbd2 raid1 sd_mod crc_t10dif mpt2sas
scsi_transport_sas raid_class ahci dm_mirror dm_region_hash dm_log dm_mod
[last unloaded: scsi_wait_scan]
Pid: 2256, comm: rpciod/8 Not tainted 2.6.32-220.el6.x86_64 #1 HP ProLiant SL250s Gen8/
RIP: 0010:[<ffffffffa01fe458>] [<ffffffffa01fe458>] __rpc_execute+0x278/0x2a0 [sunrpc]
...
Process rpciod/8 (pid: 2256, threadinfo ffff882016152000, task ffff8820162e80c0)
...
Call Trace:
[<ffffffffa01fe4d0>] ? rpc_async_schedule+0x0/0x20 [sunrpc]
[<ffffffffa01fe4e5>] rpc_async_schedule+0x15/0x20 [sunrpc]
[<ffffffff8108b2b0>] worker_thread+0x170/0x2a0
[<ffffffff81090bf0>] ? autoremove_wake_function+0x0/0x40
[<ffffffff8108b140>] ? worker_thread+0x0/0x2a0
[<ffffffff81090886>] kthread+0x96/0xa0
[<ffffffff8100c14a>] child_rip+0xa/0x20
Code: db df 2e e1 f6 05 e0 26 02 00 40 0f 84 48 fe ff ff 0f b7 b3 d4 00 00 00 48 c7
c7 94 39 21 a0 31 c0 e8 b9 df 2e e1 e9 2e fe ff ff <0f> 0b eb fe 0f b7 b7 d4 00 00 00
31 c0 48 c7 c7 60 63 21 a0 e8
RIP [<ffffffffa01fe458>] __rpc_execute+0x278/0x2a0 [sunrpc]
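For orientation, the assertion that fires sits at the top of __rpc_execute() in net/sunrpc/sched.c. The following is a simplified, paraphrased sketch of the 2.6.32-era logic (not the exact RHEL 6 source): a task that rpciod is about to run must no longer be linked on an RPC wait queue, so a racing re-queue or corrupted queue links trip the BUG.

/* Paraphrased sketch of __rpc_execute() (2.6.32-era sunrpc); simplified. */
static void __rpc_execute(struct rpc_task *task)
{
        int task_is_async = RPC_IS_ASYNC(task);

        /* A task handed to rpciod must not still be on an rpc_wait_queue;
         * this is the check behind "kernel BUG at net/sunrpc/sched.c:616". */
        BUG_ON(RPC_IS_QUEUED(task));

        for (;;) {
                /* Run the tk_action state machine until the task completes. */
                if (!RPC_IS_QUEUED(task)) {
                        if (task->tk_action == NULL)
                                break;
                        task->tk_action(task);
                }

                /* If the task went back to sleep on a wait queue, an async
                 * task returns here; rpc_async_schedule() (seen in the
                 * backtrace) re-submits it to rpciod once it is woken. */
                if (RPC_IS_QUEUED(task) && task_is_async)
                        return;
        }
        /* ... release the completed task ... */
}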
- A second panic looks similar to the rpciod thread panic but occurs in a different place: it reaches kernel BUG at kernel/workqueue.c, which is BUG_ON(get_wq_data(work) != cwq);. In addition, before the oops there is a warning about list corruption raised by __list_add, called from xprt_reserve_xprt. Based on the code location, the corruption is flagged on the rpc_xprt's 'sending' or 'resend' queue (see the sketch after the log below).
------------[ cut here ]------------
WARNING: at lib/list_debug.c:30 __list_add+0x8f/0xa0() (Not tainted)
Hardware name:ProLiant SL250s Gen8
list_add corruption. prev->next should be next (ffff88201900e998), but was ffffe8efec8232c1. (prev=ffff881bc0b5c150).
Modules linked in: nfs lockd fscache nfs_acl auth_rpcgss pcc_cpufreq sunrpc power_meter hpilo hpwdt igb mlx4_ib(U) mlx4_en(U) raid0 mlx4_core(U) sg microcode serio_raw iTCO_wdt iTCO_vendor_support ioatdma dca shpchp ext4 mbcache jbd2 raid1 sd_mod crc_t10dif mpt2sas scsi_transport_sas raid_class ahci dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan]
Pid: 16460, comm: 10.2.8.2-m Not tainted 2.6.32-220.el6.x86_64 #1
Call Trace:
[<ffffffff81069b77>] ? warn_slowpath_common+0x87/0xc0
[<ffffffff81069c66>] ? warn_slowpath_fmt+0x46/0x50
[<ffffffff8127b86f>] ? __list_add+0x8f/0xa0
[<ffffffffa01fe7db>] ? rpc_sleep_on+0x10b/0x2f0 [sunrpc]
[<ffffffffa01f8cf3>] ? xprt_reserve_xprt+0x83/0x120 [sunrpc]
[<ffffffffa01f8173>] ? xprt_prepare_transmit+0x63/0xb0 [sunrpc]
[<ffffffffa01f5ab7>] ? call_transmit+0x47/0x2c0 [sunrpc]
[<ffffffffa01fe23e>] ? __rpc_execute+0x5e/0x2a0 [sunrpc]
[<ffffffffa01fe4c3>] ? rpc_execute+0x43/0x50 [sunrpc]
[<ffffffffa01f6cc5>] ? rpc_run_task+0x75/0x90 [sunrpc]
[<ffffffffa01f6de2>] ? rpc_call_sync+0x42/0x70 [sunrpc]
[<ffffffffa0298d2d>] ? nfs4_proc_renew+0x4d/0xa0 [nfs]
[<ffffffffa02a935e>] ? nfs4_run_state_manager+0x3fe/0x5e0 [nfs]
[<ffffffffa02a8f60>] ? nfs4_run_state_manager+0x0/0x5e0 [nfs]
[<ffffffff81090886>] ? kthread+0x96/0xa0
[<ffffffff8100c14a>] ? child_rip+0xa/0x20
[<ffffffff810907f0>] ? kthread+0x0/0xa0
[<ffffffff8100c140>] ? child_rip+0x0/0x20
---[ end------------[ cut here ]------------
kernel BUG at kernel/workqueue.c:287!
invalid opcode: 0000 [#1] SMP
last sysfs file: /sys/devices/system/cpu/cpu15/cache/index2/shared_cpu_map
CPU 1
Modules linked in: nfs lockd fscache nfs_acl auth_rpcgss pcc_cpufreq sunrpc power_meter hpilo hpwdt igb mlx4_ib(U) mlx4_en(U) raid0 mlx4_core(U) sg microcode serio_raw iTCO_wdt iTCO_vendor_support ioatdma dca shpchp ext4 mbcache jbd2 raid1 sd_mod crc_t10dif mpt2sas scsi_transport_sas raid_class ahci dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan]
Pid: 2338, comm: rpciod/1 Tainted: G W ---------------- 2.6.32-220.el6.x86_64 #1 HP ProLiant SL250s Gen8/
RIP: 0010:[<ffffffff8108b38d>] [<ffffffff8108b38d>] worker_thread+0x24d/0x2a0
RSP: 0018:ffff8810041f5e40 EFLAGS: 00010216
RAX: ffff881bc0b5c158 RBX: ffffe8efec8232c0 RCX: ffffe8efec8232c8
RDX: ffff881bc0b5c150 RSI: ffff8800966aa118 RDI: ffffe8efec8232c0
RBP: ffff8810041f5ee0 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000000 R12: ffff88096ba7cd48
R13: ffffffffa01fe4d0 R14: ffff8810041f5fd8 R15: ffffe8efec8232c8
FS: 0000000000000000(0000) GS:ffff880065620000(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 000000361ea03088 CR3: 0000002016148000 CR4: 00000000000406e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process rpciod/1 (pid: 2338, threadinfo ffff8810041f4000, task ffff8810182fa080)
Stack:
0000000000000000 0000000000000000 ffff8810041f5e60 ffff8810182fa6f8
<0> ffff8810182fa080 ffff8810182fa080 ffff8810182fa080 ffffe8efec8232d8
<0> 0000000000000000 ffff8810182fa080 ffffffff81090bf0 ffff8810041f5e98
Call Trace:
[<ffffffff81090bf0>] ? autoremove_wake_function+0x0/0x40
[<ffffffff8108b140>] ? worker_thread+0x0/0x2a0
[<ffffffff81090886>] kthread+0x96/0xa0
[<ffffffff8100c14a>] child_rip+0xa/0x20
[<ffffffff810907f0>] ? kthread+0x0/0xa0
[<ffffffff8100c140>] ? child_rip+0x0/0x20
Code: 48 89 95 70 ff ff ff 4c 89 e7 ff d1 48 8b 85 68 ff ff ff 48 8b 95 70 ff ff ff 48 83 c0 08 48 8b 08 48 85 c9 75 d0 e9 df fe ff ff <0f> 0b eb fe 48 8b 45 80 48 8b b5 78 ff ff ff 48 c7 c7 58 0e 79
RIP [<ffffffff8108b38d>] worker_thread+0x24d/0x2a0
RSP <ffff8810041f5e40>
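The __list_add warning in the preceding trace comes from rpc_sleep_on() linking the task onto one of the transport's wait queues. As a rough sketch of the 2.6.32-era xprt_reserve_xprt() in net/sunrpc/xprt.c (paraphrased, not the exact RHEL 6 source): when the transport is already locked, the task is queued on either the rpc_xprt's 'resend' or 'sending' wait queue, which is where the corruption is detected.

/* Paraphrased sketch of xprt_reserve_xprt() (2.6.32-era sunrpc); simplified. */
int xprt_reserve_xprt(struct rpc_task *task)
{
        struct rpc_rqst *req = task->tk_rqstp;
        struct rpc_xprt *xprt = req->rq_xprt;

        if (test_and_set_bit(XPRT_LOCKED, &xprt->state)) {
                if (task == xprt->snd_task)
                        return 1;               /* this task already holds the lock */
                goto out_sleep;
        }
        return 1;                               /* transport lock acquired */

out_sleep:
        /* Another task owns the transport: queue this one.  rpc_sleep_on()
         * performs the list_add that emits the corruption warning above when
         * the 'sending'/'resend' queue links have been damaged. */
        task->tk_timeout = 0;
        task->tk_status = -EAGAIN;
        if (req->rq_ntrans)
                rpc_sleep_on(&xprt->resend, task, NULL);
        else
                rpc_sleep_on(&xprt->sending, task, NULL);
        return 0;
}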
Environment
- Red Hat Enterprise Linux 6
- Kernels prior to kernel-2.6.32-358.14.1.el6
- Reported on at least 2.6.32-220.el6, 2.6.32-279.9.1.el6, and 2.6.32-358.6.1.el6.
- MRG 2.x
- Reported on kernel 2.6.33.9-rt31.66.el6rt.
- NFS client
- NFSv4 is often in use.