System hangs with soft lockups
Issue
- A RHEL 5 system frequently becomes unresponsive and must be reset to recover. "soft lockup" messages are logged in /var/log/messages:
kernel:Pid:25217, comm: bpbkar Not tainted 2.6.18-308.11.1.el5 #1
kernel:RIP:0010:[<ffffffff80016278>] [<ffffffff80016278>] __bitmap_empty+0xf/0x62
kernel:RSP:0018:ffff810442e35d48 EFLAGS:00000212
kernel:RAX:0000000000000003 RBX: ffff81062b060200 RCX:00000000000000ff
kernel:RDX:000000000000003f RSI:00000000000000ff RDI: ffff81062b060200
kernel:RBP:0000000000000000 R08:0000000000000004 R09: ffff81062b060200
kernel:R10:0000000000000296 R11:0000000000000000 R12: ffffffff8002b44d
kernel:R13: ffff810442e35cb8 R14:0000000024503c78 R15: ffff810624569660
kernel:FS:00002b1ebbf2a080(0000) GS:ffff81062b1e0440(0000) knlGS:00000000f759a9e0
kernel:CS:0010 DS:002b ES:002b CR0:000000008005003b
kernel:CR2:00000000080ef678 CR3:0000000000201000 CR4:00000000000006e0
kernel:
kernel:Call Trace:
kernel:[<ffffffff80023185>] flush_tlb_others+0x9a/0xbd
kernel:[<ffffffff80077afe>] flush_tlb_mm+0xcc/0xd7
kernel:[<ffffffff80007c8b>] unmap_vmas+0x5b4/0x909
kernel:[<ffffffff80039e46>] exit_mmap+0x87/0x104
kernel:[<ffffffff8003bfd5>] mmput+0x30/0x82
kernel:[<ffffffff800158b4>] do_exit+0x2e7/0x931
kernel:[<ffffffff80048e4a>] cpuset_exit+0x0/0x88
kernel:[<ffffffff80061624>] cstar_do_call+0x1b/0x6e
Hung task timeouts are also reported, as shown below:
kernel:INFO: task crond:25236 blocked for more than 120 seconds.
kernel:"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kernel: crond D ffff81022b061420 0 25236 4399 (NOTLB)
kernel:ffff8100287bbe58 0000000000000082 80c40b0607d1142f f60b14fec3c66db6
kernel:4b493b0b9722722c 0000000000000009 ffff8102225b40c0 ffff81062b0ef080
kernel:000c503001a97ac6 000000000020ea4c ffff8102225b42a8 00000002801add06
kernel:Call Trace:
kernel:[<ffffffff80063171>] wait_for_completion+0x79/0xa2
kernel:[<ffffffff8008ee84>] default_wake_function+0x0/0xe
kernel:[<ffffffff801296e5>] __key_instantiate_and_link+0x8f/0xc5
kernel:[<ffffffff800a1820>] synchronize_rcu+0x30/0x36
kernel:[<ffffffff800a135c>] wakeme_after_rcu+0x0/0x9
kernel:[<ffffffff8012c105>] install_session_keyring+0xc0/0xd3
kernel:[<ffffffff8012c633>] join_session_keyring+0x25/0xcb
kernel:[<ffffffff8012baf0>] keyctl_join_session_keyring+0x2d/0x40
kernel:[<ffffffff8005d28d>] tracesys+0xd5/0xe0
These messages occur repeatedly and appear to cascade from one another.
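To gauge how often these events recur, a log can be scanned for the two message types quoted above. The following is a minimal sketch, not a Red Hat tool: the function names `summarize` and `summarize_file` are hypothetical, the patterns are assumptions derived only from the message text shown in this article, and it assumes the log has been copied to a machine with Python 3 (RHEL 5 itself ships an older interpreter).

```python
import re
from collections import Counter

# Assumed patterns, based only on the message text quoted in this article.
SOFT_LOCKUP = re.compile(r"soft lockup", re.IGNORECASE)
HUNG_TASK = re.compile(
    r"INFO: task (\S+):(\d+) blocked for more than (\d+) seconds"
)


def summarize(lines):
    """Return (soft lockup line count, Counter of hung tasks by command name)."""
    soft_lockups = 0
    hung_tasks = Counter()
    for line in lines:
        if SOFT_LOCKUP.search(line):
            soft_lockups += 1
        match = HUNG_TASK.search(line)
        if match:
            # group(1) is the command name, e.g. "crond"
            hung_tasks[match.group(1)] += 1
    return soft_lockups, hung_tasks


def summarize_file(path="/var/log/messages"):
    """Hypothetical helper: summarize a log file on disk."""
    with open(path, errors="replace") as log:
        return summarize(log)


# Example using one line copied from the excerpt above.
sample = [
    "kernel:INFO: task crond:25236 blocked for more than 120 seconds.",
]
lockups, tasks = summarize(sample)
print("soft lockup lines:", lockups)
print("hung tasks by command:", dict(tasks))
```

Running `summarize_file()` against a saved copy of the log gives a rough count per command, which can help confirm whether the hung tasks cluster around the same time as the soft lockups.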
Environment
- Red Hat Enterprise Linux (RHEL) 5.8