Why do we have a soft lockup in multi_cpu_stop+0x7f?
Issue
- The server hangs with the following soft lockup error messages:
May 5 22:40:51 Host1 kernel: BUG: soft lockup - CPU#31 stuck for 23s! [migration/31:242]
May 5 22:41:19 Host1 kernel: BUG: soft lockup - CPU#8 stuck for 22s! [migration/8:115]
- During that time, CPU usage is at 100% in %sys.
- The core dump shows a soft lockup with the following backtrace:
#3 [ffff88123ac03ea0] watchdog_timer_fn at ffffffff8110a4f5
#4 [ffff88123ac03ed0] __run_hrtimer at ffffffff8109b1a7
#5 [ffff88123ac03f10] hrtimer_interrupt at ffffffff8109b9e7
#6 [ffff88123ac03f80] local_apic_timer_interrupt at ffffffff810441c7
#7 [ffff88123ac03f98] smp_apic_timer_interrupt at ffffffff8161634f
#8 [ffff88123ac03fb0] apic_timer_interrupt at ffffffff81614a1d
--- <IRQ stack> ---
#9 [ffff881238847ce8] apic_timer_interrupt at ffffffff81614a1d
[exception RIP: multi_cpu_stop+0x7f]
RIP: ffffffff810f26df RSP: ffff881238847d90 RFLAGS: 00000293
RAX: ffffffff81633ce0 RBX: ffff881238847d20 RCX: dead000000200200
RDX: 0000000000000001 RSI: 0000000000000282 RDI: ffff880663277af0
RBP: ffff881238847db0 R8: 0000000000000000 R9: 0000000000000000
R10: 0000000000000001 R11: 0000000000000008 R12: 000000000000001a
R13: 000000000000001f R14: ffff8814bfd13680 R15: ffff8814bfd13680
ORIG_RAX: ffffffffffffff10 CS: 0010 SS: 0000
#10 [ffff881238847db8] cpu_stopper_thread at ffffffff810f28e8
#11 [ffff881238847e80] smpboot_thread_fn at ffffffff8109fc7f
#12 [ffff881238847ec8] kthread at ffffffff8109726f
#13 [ffff881238847f50] ret_from_fork at ffffffff81613cfc
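When many of these messages appear in the logs, it can help to tabulate which CPUs and tasks were reported. The following is a minimal sketch, not a Red Hat tool: it assumes the message format is exactly the one quoted above, and the names `SOFT_LOCKUP_RE` and `parse_soft_lockups` are hypothetical helpers introduced only for illustration.

```python
import re

# Assumed format, taken from the two log lines quoted in this article:
#   BUG: soft lockup - CPU#<n> stuck for <s>s! [<task>:<pid>]
SOFT_LOCKUP_RE = re.compile(
    r"BUG: soft lockup - CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
    r"\[(?P<task>.+):(?P<pid>\d+)\]"
)


def parse_soft_lockups(lines):
    """Return one dict per soft lockup message found in the given log lines."""
    events = []
    for line in lines:
        match = SOFT_LOCKUP_RE.search(line)
        if match is None:
            continue  # not a soft lockup message
        events.append({
            "cpu": int(match.group("cpu")),
            "seconds": int(match.group("secs")),
            "task": match.group("task"),
            "pid": int(match.group("pid")),
        })
    return events


# The two messages from the Issue section, used as sample input.
sample = [
    "May 5 22:40:51 Host1 kernel: BUG: soft lockup - CPU#31 stuck for 23s! [migration/31:242]",
    "May 5 22:41:19 Host1 kernel: BUG: soft lockup - CPU#8 stuck for 22s! [migration/8:115]",
]

events = parse_soft_lockups(sample)
for event in events:
    print(event)
```

In this sample both reported tasks are per-CPU migration threads, which matches the cpu_stopper_thread frame in the backtrace above.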
Environment
- Red Hat Enterprise Linux 7.1
