Kernel panic - not syncing: Watchdog detected hard LOCKUP on cpu xx due to looping in 'sched_cfs_period_timer'
Issue
The system panics with the message "Kernel panic - not syncing: Watchdog detected hard LOCKUP on cpu xx", where xx is the number of the affected CPU.
Typical stack trace:
<0>Kernel panic - not syncing: Watchdog detected hard LOCKUP on cpu 16
<4>Pid: 0, comm: swapper Tainted: P --------------- 2.6.32-358.11.1.el6.x86_64 #1
<4>Call Trace:
<4> <NMI> [<ffffffff8150d4f8>] ? panic+0xa7/0x16f
<4> [<ffffffff810e112d>] ? watchdog_overflow_callback+0xcd/0xd0
<4> [<ffffffff81116f30>] ? __perf_event_overflow+0xb0/0x2a0
<4> [<ffffffff8101a89d>] ? x86_perf_event_update+0x5d/0xb0
<4> [<ffffffff8101b82d>] ? x86_perf_event_set_period+0xdd/0x170
<4> [<ffffffff81117554>] ? perf_event_overflow+0x14/0x20
<4> [<ffffffff810208c2>] ? intel_pmu_handle_irq+0x192/0x300
<4> [<ffffffff815130d6>] ? kprobe_exceptions_notify+0x16/0x430
<4> [<ffffffff81511c49>] ? perf_event_nmi_handler+0x39/0xb0
<4> [<ffffffff81513705>] ? notifier_call_chain+0x55/0x80
<4> [<ffffffff8151376a>] ? atomic_notifier_call_chain+0x1a/0x20
<4> [<ffffffff8109cc1e>] ? notify_die+0x2e/0x30
<4> [<ffffffff815113cb>] ? do_nmi+0x1bb/0x340
<4> [<ffffffff81510c90>] ? nmi+0x20/0x30
<4> [<ffffffff810657e1>] ? enqueue_entity+0x1/0x410
<4> <<EOE>> <IRQ> [<ffffffff81065dfb>] ? unthrottle_cfs_rq+0x10b/0x190
<4> [<ffffffffa034e62d>] ? __stp_time_timer_callback+0xbd/0xe0 [stap_13e229648ae4daabe7ad13af6149179_105193]
<4> [<ffffffff81065f2b>] ? distribute_cfs_runtime+0xab/0xd0
<4> [<ffffffff8106614d>] ? sched_cfs_period_timer+0x11d/0x160
<4> [<ffffffff81066030>] ? sched_cfs_period_timer+0x0/0x160
<4> [<ffffffff8109b3ae>] ? __run_hrtimer+0x8e/0x1a0
<4> [<ffffffff810a209f>] ? ktime_get_update_offsets+0x4f/0xd0
<4> [<ffffffff8109b716>] ? hrtimer_interrupt+0xe6/0x260
<4> [<ffffffff815172ab>] ? smp_apic_timer_interrupt+0x6b/0x9b
<4> [<ffffffff8100bb93>] ? apic_timer_interrupt+0x13/0x20
<4> <EOI> [<ffffffff812d39fe>] ? intel_idle+0xde/0x170
<4> [<ffffffff812d39e1>] ? intel_idle+0xc1/0x170
<4> [<ffffffff814152d7>] ? cpuidle_idle_call+0xa7/0x140
<4> [<ffffffff81009fc6>] ? cpu_idle+0xb6/0x110
<4> [<ffffffff8150704c>] ? start_secondary+0x2ac/0x2ef
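To check whether a captured console or vmcore-dmesg log shows this same signature, the trace can be scanned for the hard LOCKUP panic line followed by a frame in sched_cfs_period_timer. The following is a minimal sketch, not a Red Hat tool: the function name find_cfs_lockup and the two regular expressions are hypothetical helpers written for illustration, and they assume the log uses the RHEL 6 frame format shown above.

```python
import re

# Matches one call-trace frame in the format shown above, e.g.
#   <4> [<ffffffff8106614d>] ? sched_cfs_period_timer+0x11d/0x160
# and captures the symbol name. (Hypothetical helper pattern.)
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+(?:\?\s+)?([A-Za-z0-9_.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
)

# Matches the panic line and captures the CPU number.
PANIC_RE = re.compile(r"Watchdog detected hard LOCKUP on cpu (\d+)")


def find_cfs_lockup(log_text, symbol="sched_cfs_period_timer"):
    """Return the CPU number if the log contains a hard LOCKUP panic
    followed by a call-trace frame in `symbol`; otherwise return None."""
    cpu = None
    for line in log_text.splitlines():
        panic = PANIC_RE.search(line)
        if panic:
            # Remember the CPU named by the most recent panic line.
            cpu = int(panic.group(1))
            continue
        if cpu is None:
            # Frames seen before any panic line are not part of this signature.
            continue
        frame = FRAME_RE.search(line)
        if frame and frame.group(1) == symbol:
            return cpu
    return None
```

Passing the contents of a saved log to find_cfs_lockup returns the CPU number named in the panic line when the trace matches, and None when either the panic line or the sched_cfs_period_timer frame is missing. A match only indicates that the log resembles this issue; it does not confirm the root cause.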
Statistically, the crashes are more frequent on high-power, high-RAM machines running RHEL 6.6.
On the same hardware, RHEL 6.6 is roughly 8 times more susceptible to this problem than RHEL 6.4.
Environment
- Red Hat Enterprise Linux (RHEL) 6.6
