System hung with many tasks blocking on "synchronize_rcu"
Issue
- System hung with many tasks blocking on "synchronize_rcu"

    crash> sys
          KERNEL: /cores/20110808233212/work/vmlinux
        DUMPFILE: /cores/20110808233212/work/vmcore  [PARTIAL DUMP]
            CPUS: 8
            DATE: Sun Aug  7 18:12:04 2011
          UPTIME: 08:37:32
    LOAD AVERAGE: 99.02, 97.96, 90.16   <------- High load
           TASKS: 371
        NODENAME: hostname
         RELEASE: 2.6.18-194.11.4.el5
         VERSION: #1 SMP Fri Sep 17 04:57:05 EDT 2010
         MACHINE: x86_64  (2500 Mhz)
          MEMORY: 31.5 GB
           PANIC: ""
- There are many crond tasks blocked with the following call trace:

    [...]
    INFO: task crond:31781 blocked for more than 120 seconds.
    "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
    crond D ffff81000900caa0 0 31781 3991 31782 31780 (NOTLB)
     ffff8107b2f4be58 0000000000000086 ff610134cd92d920 a930c1e2292cdfc7
     170fc99bb453d21d 0000000000000005 ffff8107b338c820 ffff81082ff18100
     0000159af7f5461e 00000000003e2472 ffff8107b338ca08 00000001801a6bea
    Call Trace:
     [<ffffffff80063167>] wait_for_completion+0x79/0xa2
     [<ffffffff8008cf9d>] default_wake_function+0x0/0xe
     [<ffffffff80123954>] __key_instantiate_and_link+0x8f/0xc5
     [<ffffffff8009ed3d>] synchronize_rcu+0x30/0x36
     [<ffffffff8009e879>] wakeme_after_rcu+0x0/0x9
     [<ffffffff801262f0>] install_session_keyring+0xc0/0xd3
     [<ffffffff80003138>] level3_kernel_pgt+0x138/0x1000
     [<ffffffff8012681e>] join_session_keyring+0x25/0xcb
     [<ffffffff80125cdb>] keyctl_join_session_keyring+0x2d/0x40
     [<ffffffff8005d116>] system_call+0x7e/0x83

    INFO: task crond:31782 blocked for more than 120 seconds.
    "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
    crond D ffff81000901d7a0 0 31782 3991 31781 (NOTLB)
     ffff8107b2e99e58 0000000000000086 b453d21da930c1e2 0e43583d170fc99b
     73137ef065dad4fe 0000000000000005 ffff8107b338c0c0 ffff81082fe1b100
     0000159af7f95dfe 000000000044a540 ffff8107b338c2a8 00000003801a6bea
    Call Trace:
     [<ffffffff80063167>] wait_for_completion+0x79/0xa2
     [<ffffffff8008cf9d>] default_wake_function+0x0/0xe
     [<ffffffff80123954>] __key_instantiate_and_link+0x8f/0xc5
     [<ffffffff8009ed3d>] synchronize_rcu+0x30/0x36
     [<ffffffff8009e879>] wakeme_after_rcu+0x0/0x9
     [<ffffffff801262f0>] install_session_keyring+0xc0/0xd3
     [<ffffffff80003238>] level3_kernel_pgt+0x238/0x1000
     [<ffffffff8012681e>] join_session_keyring+0x25/0xcb
     [<ffffffff80125cdb>] keyctl_join_session_keyring+0x2d/0x40
     [<ffffffff8005d116>] system_call+0x7e/0x83
    [...]
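When the log contains dozens of these reports, it can help to confirm mechanically that they all share the same blocking function. The following is a minimal sketch, not a Red Hat tool: the function names `parse_hung_tasks` and `count_blocked_in` are hypothetical, and the regular expressions assume the 2.6.18-style message layout shown above. The sample text is an abbreviated copy of the two reports in this article.

```python
import re
from collections import Counter

# Header printed by the hung-task detector for each blocked task.
TASK_RE = re.compile(
    r"INFO: task (?P<comm>\S+):(?P<pid>\d+) blocked for more than (?P<secs>\d+) seconds"
)
# One call-trace frame in the form [<address>] symbol+0xoffset/0xsize.
FRAME_RE = re.compile(r"\[<[0-9a-f]+>\]\s+(?P<sym>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+")


def parse_hung_tasks(log_text):
    """Return a list of (comm, pid, [symbols]) tuples, one per hung-task report."""
    reports = []
    headers = list(TASK_RE.finditer(log_text))
    for i, header in enumerate(headers):
        # A report's body runs until the next header or the end of the text.
        end = headers[i + 1].start() if i + 1 < len(headers) else len(log_text)
        body = log_text[header.end():end]
        symbols = [frame.group("sym") for frame in FRAME_RE.finditer(body)]
        reports.append((header.group("comm"), int(header.group("pid")), symbols))
    return reports


def count_blocked_in(reports, symbol):
    """Count reports whose trace contains `symbol`, grouped by command name."""
    return Counter(comm for comm, _pid, symbols in reports if symbol in symbols)


# Abbreviated copy of the two reports shown in this article.
SAMPLE = """
INFO: task crond:31781 blocked for more than 120 seconds.
Call Trace:
 [<ffffffff80063167>] wait_for_completion+0x79/0xa2
 [<ffffffff8009ed3d>] synchronize_rcu+0x30/0x36
 [<ffffffff801262f0>] install_session_keyring+0xc0/0xd3
INFO: task crond:31782 blocked for more than 120 seconds.
Call Trace:
 [<ffffffff80063167>] wait_for_completion+0x79/0xa2
 [<ffffffff8009ed3d>] synchronize_rcu+0x30/0x36
 [<ffffffff801262f0>] install_session_keyring+0xc0/0xd3
"""

if __name__ == "__main__":
    reports = parse_hung_tasks(SAMPLE)
    print(count_blocked_in(reports, "synchronize_rcu"))
```

Because the parser scans with regular expressions rather than line by line, it should also cope with logs whose line breaks were lost in transit.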
- The panic occurred due to an NMI:

    NMI Watchdog detected LOCKUP on CPU 6
    CPU 6
    Modules linked in: mptctl sg ipmi_devintf ipmi_si ipmi_msghandler autofs4 lockd sunrpc bonding ipv6 xfrm_nalgo crypto_api dm_multipath scsi_dh video backlight sbs power_meter hwmon i2c_ec i2c_core dell_wmi wmi button battery asus_acpi acpi_memhotplug ac parport_pc lp parport tg3 bnx2 shpchp pcspkr i5000_edac hpilo serio_raw edac_mc dm_raid45 dm_message dm_region_hash dm_mem_cache dm_snapshot dm_zero dm_mirror dm_log dm_mod mptspi mptscsih scsi_transport_spi mptbase cciss sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd
    Pid: 0, comm: swapper Not tainted 2.6.18-194.11.4.el5 #1
    RIP: 0010:[<ffffffff80057082>]  [<ffffffff80057082>] mwait_idle+0x36/0x4a
    RSP: 0018:ffff81082fef5ef0  EFLAGS: 00000246
    RAX: 0000000000000000 RBX: ffffffff8005704c RCX: 0000000000000000
    RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffffffff8030a718
    RBP: 0000000000000006 R08: ffff81082fef4000 R09: 000000000000003a
    R10: ffff81011cb74038 R11: ffff8107e15cfb58 R12: 00000000000000ff
    R13: ffffffff803d2580 R14: 0000000000000600 R15: ffffffff803f4320
    FS:  0000000000000000(0000) GS:ffff81082feabb40(0000) knlGS:0000000000000000
    CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
    CR2: 0000000018df49fc CR3: 00000007e4584000 CR4: 00000000000006e0
    Process swapper (pid: 0, threadinfo ffff81082fef4000, task ffff81082feaf080)
    Stack:  ffffffff8004923a 00000000000000c0 ffffffff8007796b ffffffff803f2340
     0000000000000000 0000000000000000 0000000000000000 0000000000000000
     0000000000000000 0000000000000000 0000000000000000 0000000000000000
    Call Trace:
     [<ffffffff8004923a>] cpu_idle+0x95/0xb8
     [<ffffffff8007796b>] start_secondary+0x498/0x4a7

    Code: 65 48 8b 04 25 10 00 00 00 8b 80 38 e0 ff ff a8 08 74 ba c3
    [....]
- Memory usage is normal. The following crond tasks are in the UN (uninterruptible) state:

    crash> ps | grep UN
        684   3991   7  ffff8107b0337040  UN   0.0   88296   3052  crond
        685   3991   3  ffff8107b071b7e0  UN   0.0   88296   3052  crond
        686   3991   1  ffff8107b06c2820  UN   0.0   88296   3052  crond
        687   3991   0  ffff8107b0112860  UN   0.0   88296   3052  crond
        688   3991   4  ffff8107afd1f7a0  UN   0.0   88296   3052  crond
       1355   3991   0  ffff8107aefe9080  UN   0.0   88296   3052  crond
       1488   3991   4  ffff8107ae96c080  UN   0.0   88296   3052  crond

    crash> ps | grep UN | wc -l
    99

  They are spawned by PID 3991:

    crash> bt 3991
    PID: 3991   TASK: ffff81082e7010c0  CPU: 7   COMMAND: "crond"
     #0 [ffff81081e87bde8] schedule at ffffffff80062f96
     #1 [ffff81081e87bec0] do_nanosleep at ffffffff80063cfd
     #2 [ffff81081e87bed0] hrtimer_nanosleep at ffffffff8005a3dd
     #3 [ffff81081e87bf50] sys_nanosleep at ffffffff80054c2b
     #4 [ffff81081e87bf80] system_call at ffffffff8005d116
        RIP: 00002b11279683c0  RSP: 00007fffabc18100  RFLAGS: 00010297
        RAX: 0000000000000023  RBX: ffffffff8005d116  RCX: 0000000000000000
        RDX: 0000000000000000  RSI: 00007fffabc18800  RDI: 00007fffabc18800
        RBP: 00000000ffffffff   R8: 0000000000000000   R9: 00007fffabc18660
        R10: 0000000000000008  R11: 0000000000000246  R12: 00007fffabc18780
        R13: 00007fffabc18780  R14: 0000000000000000  R15: 000000000000003c
        ORIG_RAX: 0000000000000023  CS: 0033  SS: 002b
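On Linux, tasks in uninterruptible sleep count toward the load average, so 99 tasks in the UN state are consistent with the load average of about 99 reported by `sys`. To check which parent owns the blocked tasks without reading the listing by eye, the saved `ps` output can be tallied by parent PID. This is a rough sketch under stated assumptions: the helper name `uninterruptible_by_parent` is hypothetical, and it assumes the crash `ps` column order PID, PPID, CPU, TASK, ST, %MEM, VSZ, RSS, COMM seen in the output above.

```python
from collections import Counter


def uninterruptible_by_parent(ps_text):
    """Count UN-state tasks per (ppid, command) from saved crash `ps` output."""
    counts = Counter()
    for line in ps_text.splitlines():
        # crash marks tasks that are active on a CPU with a leading ">".
        fields = line.lstrip("> ").split()
        # Assumed columns: PID PPID CPU TASK ST %MEM VSZ RSS COMM
        if len(fields) < 9 or fields[4] != "UN":
            continue
        ppid = int(fields[1])
        comm = " ".join(fields[8:])
        counts[(ppid, comm)] += 1
    return counts


# The seven rows shown in this article (the full dump has 99 UN tasks).
SAMPLE_PS = """\
crash> ps | grep UN
    684   3991   7  ffff8107b0337040  UN   0.0   88296   3052  crond
    685   3991   3  ffff8107b071b7e0  UN   0.0   88296   3052  crond
    686   3991   1  ffff8107b06c2820  UN   0.0   88296   3052  crond
    687   3991   0  ffff8107b0112860  UN   0.0   88296   3052  crond
    688   3991   4  ffff8107afd1f7a0  UN   0.0   88296   3052  crond
   1355   3991   0  ffff8107aefe9080  UN   0.0   88296   3052  crond
   1488   3991   4  ffff8107ae96c080  UN   0.0   88296   3052  crond
"""

if __name__ == "__main__":
    for (ppid, comm), n in uninterruptible_by_parent(SAMPLE_PS).most_common():
        print(f"parent {ppid} ({comm}): {n} task(s) in UN state")
```

The prompt line and any header row are skipped automatically, because neither has "UN" in the state column.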
Environment
- Red Hat Enterprise Linux 5.5
