kernel-rt hangs and panics with softlockup_panic enabled while several tasks wait for one CPU to handle TLB flush IPIs
Issue
- The kernel-rt kernel hangs and then panics because the softlockup_panic knob is enabled, while several tasks wait for a single CPU to handle the IPIs sent for TLB flushing.
[123444.036251] Kernel panic - not syncing: softlockup: hung tasks
[123444.036252] 00
[123444.036253]
[123444.036255] CPU: 25 PID: 43873 Comm: kubelet Kdump: loaded Tainted: G OEL ------------ T 3.10.0-1127.18.2.rt56.1116.el7.x86_64 #1
[123444.036256] Hardware name: xxxx
[123444.036256] Call Trace:
[123444.036257] 28 00 00
[123444.036261] <IRQ> [<ffffffff8b575f91>] dump_stack+0x19/0x1b
[123444.036264] [<ffffffff8b57014a>] panic+0xe8/0x21f
[123444.036268] [<ffffffff8af458e1>] watchdog_timer_fn+0x231/0x240
[123444.036270] [<ffffffff8af456b0>] ? watchdog+0x40/0x40
[123444.036272] [<ffffffff8aebe041>] __hrtimer_run_queues+0x121/0x3b0
[123444.036275] [<ffffffff8aebef99>] hrtimer_interrupt+0xb9/0x270
[123444.036277] [<ffffffff8af88a35>] ? irq_work_run_list+0x45/0x70
[123444.036280] [<ffffffff8ae4ccab>] local_apic_timer_interrupt+0x3b/0x60
[123444.036283] [<ffffffff8b58c6a3>] smp_apic_timer_interrupt+0x43/0x60
[123444.036285] [<ffffffff8b5891ba>] apic_timer_interrupt+0x16a/0x170
[123444.036288] <EOI> [<ffffffff8af0c2f2>] ? generic_exec_single+0x102/0x1c0
[123444.036291] [<ffffffff8ae6dde0>] ? leave_mm+0x130/0x130
[123444.036293] [<ffffffff8ae6dde0>] ? leave_mm+0x130/0x130
[123444.036295] [<ffffffff8af0c422>] smp_call_function_single+0x72/0xb0
[123444.036297] [<ffffffff8ae6dde0>] ? leave_mm+0x130/0x130
[123444.036299] [<ffffffff8af0ca5a>] smp_call_function_many+0x23a/0x280
[123444.036302] [<ffffffff8ae6dfa8>] native_flush_tlb_others+0xb8/0xc0
[123444.036304] [<ffffffff8ae6e02a>] flush_tlb_mm_range+0x7a/0x180
[123444.036307] [<ffffffff8afec128>] zap_page_range+0xd8/0x150
[123444.036310] [<ffffffff8afe67f5>] SyS_madvise+0x4b5/0xac0
[123444.036312] [<ffffffff8aebed19>] ? __hrtimer_nanosleep+0xc9/0x190
[123444.036315] [<ffffffff8af0a3a0>] ? SyS_futex+0x80/0x190
[123444.036317] [<ffffffff8b588428>] tracesys+0xa6/0xcc
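The call trace above can be checked mechanically for the pattern described in this issue. The following is a minimal sketch, not a Red Hat tool: the function names `reliable_frames` and `waits_on_tlb_flush_ipi` are hypothetical, and the check is only a heuristic that looks for a remote TLB flush function together with an `smp_call_function_*` frame among the frames not marked with `?`.

```python
import re

# Matches frames such as "[<ffffffff8ae6dfa8>] native_flush_tlb_others+0xb8/0xc0".
# The optional "? " marks a frame the unwinder considers unreliable.
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+(\?\s+)?([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+"
)

# A few lines copied from the panic message in this article.
SAMPLE = """\
[123444.036288] <EOI> [<ffffffff8af0c2f2>] ? generic_exec_single+0x102/0x1c0
[123444.036295] [<ffffffff8af0c422>] smp_call_function_single+0x72/0xb0
[123444.036297] [<ffffffff8ae6dde0>] ? leave_mm+0x130/0x130
[123444.036299] [<ffffffff8af0ca5a>] smp_call_function_many+0x23a/0x280
[123444.036302] [<ffffffff8ae6dfa8>] native_flush_tlb_others+0xb8/0xc0
[123444.036304] [<ffffffff8ae6e02a>] flush_tlb_mm_range+0x7a/0x180
"""


def reliable_frames(log_text):
    """Return the symbol names of frames not marked '?', innermost first."""
    frames = []
    for line in log_text.splitlines():
        match = FRAME_RE.search(line)
        if match and not match.group(1):
            frames.append(match.group(2))
    return frames


def waits_on_tlb_flush_ipi(frames):
    """Heuristic (an assumption of this sketch): the task is in a remote
    TLB flush and is issuing a cross-CPU function call (IPI)."""
    flush_symbols = ("native_flush_tlb_others", "flush_tlb_mm_range")
    has_flush = any(name in flush_symbols for name in frames)
    has_ipi = any(name.startswith("smp_call_function") for name in frames)
    return has_flush and has_ipi
```

Calling `waits_on_tlb_flush_ipi(reliable_frames(SAMPLE))` on the excerpt reports whether the trace matches the pattern. A match suggests, but does not prove, that the task was waiting for another CPU to acknowledge the flush.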
PID: 43873 TASK: ffff972a1051b3f0 CPU: 25 COMMAND: "kubelet"
#0 [ffff973b1cc43cf8] machine_kexec at ffffffff8ae56404
#1 [ffff973b1cc43d58] __crash_kexec at ffffffff8af18a22
#2 [ffff973b1cc43e28] panic at ffffffff8b570155
#3 [ffff973b1cc43ea8] watchdog_timer_fn at ffffffff8af458e1
#4 [ffff973b1cc43ee0] __hrtimer_run_queues at ffffffff8aebe041
#5 [ffff973b1cc43f60] hrtimer_interrupt at ffffffff8aebef99
#6 [ffff973b1cc43fc0] local_apic_timer_interrupt at ffffffff8ae4ccab
#7 [ffff973b1cc43fd8] smp_apic_timer_interrupt at ffffffff8b58c6a3
#8 [ffff973b1cc43ff0] apic_timer_interrupt at ffffffff8b5891ba
--- <IRQ stack> ---
#9 [ffff972d250f3b88] apic_timer_interrupt at ffffffff8b5891ba
[exception RIP: generic_exec_single+258]
RIP: ffffffff8af0c2f2 RSP: ffff972d250f3c30 RFLAGS: 00000202
RAX: ffff9725bec7fb00 RBX: ffff97260ff73dc0 RCX: ffff9725bec7fb50
RDX: ffff973b1c61e4c0 RSI: ffff972d250f3c30 RDI: ffff972d250f3c30
RBP: ffff972d250f3c78 R8: 0000000000000001 R9: 0000000000f51d6e
R10: ffff97343e7a61c0 R11: 0000000000000246 R12: 0000000080ca5eb1
R13: ffff972d250f3bf0 R14: ffffffff8aed9be1 R15: ffff972d250f3bd0
ORIG_RAX: ffffffffffffff10 CS: 0010 SS: 0018
#10 [ffff972d250f3c80] smp_call_function_single at ffffffff8af0c422
#11 [ffff972d250f3cb8] smp_call_function_many at ffffffff8af0ca5a
#12 [ffff972d250f3d00] native_flush_tlb_others at ffffffff8ae6dfa8
#13 [ffff972d250f3d50] flush_tlb_mm_range at ffffffff8ae6e02a
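The panic in this issue occurs only when the softlockup_panic knob is enabled. As a minimal sketch for confirming that precondition on a running system, the snippet below reads `kernel.softlockup_panic` and `kernel.watchdog_thresh` from `/proc/sys/kernel`. The helper names `read_knob` and `describe_softlockup_policy` are hypothetical, and the note that the soft lockup threshold is twice `watchdog_thresh` follows the upstream kernel sysctl documentation, so treat it as an assumption for this specific kernel build.

```python
from pathlib import Path

PROC_SYS_KERNEL = Path("/proc/sys/kernel")


def read_knob(name, base=PROC_SYS_KERNEL):
    """Return the integer value of a kernel sysctl, or None if unreadable."""
    try:
        return int((Path(base) / name).read_text().strip())
    except (OSError, ValueError):
        return None


def describe_softlockup_policy(base=PROC_SYS_KERNEL):
    """Summarize what the kernel does when it detects a soft lockup."""
    panic = read_knob("softlockup_panic", base)
    thresh = read_knob("watchdog_thresh", base)
    if panic is None:
        return "softlockup_panic is not readable on this system"
    action = "panic" if panic else "log a warning and keep running"
    message = f"softlockup_panic={panic}: a soft lockup will {action}"
    if thresh is not None:
        # Assumption: soft lockup threshold is 2 * watchdog_thresh seconds.
        message += f" (watchdog_thresh={thresh}s, threshold about {2 * thresh}s)"
    return message


if __name__ == "__main__":
    print(describe_softlockup_policy())
```

The `base` parameter exists only so the helpers can be pointed at a copy of the sysctl files, for example files extracted from a sosreport, instead of the live `/proc` tree.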
Environment
- Red Hat Enterprise Linux 7 for Real Time
- kernel-rt-3.10.0-1127.18.2.rt56.1116.el7.x86_64