kernel-rt crashes in a condition resembling a hard lockup involving __start_cfs_bandwidth()


Issue

  • The kernel crashed in a way that resembles a hard lockup (the panic was raised from the NMI watchdog path), but no hard lockup message appears in the kernel log.
crash> log | tail
[4305042.346407] type=1302 audit(1645754444.705:236896737): item=0 name="/proc/self/fd/5" inode=268436855 dev=fd:00 mode=0100755 ouid=0 ogid=0 rdev=00:00 objtype=NORMAL cap_fp=0000000000000000 cap_fi=0000000000000000 cap_fe=0 cap_fver=0
[4305042.367065] type=1302 audit(1645754444.705:236896737): item=1 name="/lib64/ld-linux-x86-64.so.2" inode=537539313 dev=fd:00 mode=0100755 ouid=0 ogid=0 rdev=00:00 objtype=NORMAL cap_fp=0000000000000000 cap_fi=0000000000000000 cap_fe=0 cap_fver=0
[4305042.388764] type=1327 audit(1645754444.705:236896737): proctitle=72756E6300696E6974
[4305042.447049] type=1300 audit(1645754444.878:236896740): arch=c000003e syscall=59 success=yes exit=0 a0=c0001fc770 a1=c000200f20 a2=c0001a5500 a3=0 items=2 ppid=274264 pid=274295 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="bash" exe="/usr/bin/bash" key="rootcmd"
[4305042.475422] type=1321 audit(1645754444.878:236896740): fver=0 fp=0000000000000000 fi=0000000000000000 fe=0 old_pp=00000000a80425fb old_pi=00000000a80425fb old_pe=00000000a80425fb old_pa=0000000000000000 pp=00000000a80425fb pi=00000000a80425fb pe=00000000a80425fb pa=0000000000000000
[5080110.006277] IPv6: ADDRCONF(NETDEV_UP): l1l2-int-host: link is not ready
[5080110.015717] IPv6: ADDRCONF(NETDEV_CHANGE): l1l2-int-host: link becomes ready
[5080112.354704] igb_uio 0000:1e:00.0: irq 548 for MSI/MSI-X
[5080112.354739] igb_uio 0000:1e:00.0: uio device registered with irq 548
[5080135.634363] vfio-pci 0000:19:02.7: enabling device (0000 -> 0002)

crash> bt -p
PID: 83975  TASK: ffffa12e56d6d6e0  CPU: 24  COMMAND: "rdk:broker8"
 #0 [ffffa1381cc09978] machine_kexec at ffffffffb1e56464
 #1 [ffffa1381cc099d8] __crash_kexec at ffffffffb1f18342
 #2 [ffffa1381cc09aa8] panic at ffffffffb2572150
 #3 [ffffa1381cc09b28] nmi_panic at ffffffffb1e8b29f
 #4 [ffffa1381cc09b38] watchdog_overflow_callback at ffffffffb1f45628
 #5 [ffffa1381cc09b58] __perf_event_overflow at ffffffffb1fa2697
 #6 [ffffa1381cc09b90] perf_event_overflow at ffffffffb1fac424
 #7 [ffffa1381cc09ba0] handle_pmi_common at ffffffffb1e09a00
 #8 [ffffa1381cc09dd8] intel_pmu_handle_irq at ffffffffb1e09cdf
 #9 [ffffa1381cc09e30] perf_event_nmi_handler at ffffffffb2581031
#10 [ffffa1381cc09e50] nmi_handle at ffffffffb2582904
#11 [ffffa1381cc09ea8] do_nmi at ffffffffb2582b1d
#12 [ffffa1381cc09ef0] end_repeat_nmi at ffffffffb2581d4c
    [exception RIP: lock_hrtimer_base+51]
    RIP: ffffffffb1ebce83  RSP: ffffa1381cc03cc8  RFLAGS: 00000046
    RAX: 0000000000000083  RBX: ffffa1381c6576a0  RCX: 0000000000000001
    RDX: 0000000000000001  RSI: ffffa1381cc03cf0  RDI: ffffa1381c657620
    RBP: ffffa1381cc03ce0   R8: 0000000000000101   R9: 0000000000000133
    R10: 00000000000000a0  R11: 0000000000000005  R12: ffffa12e3254cd28
    R13: ffffa1381cc03cf0  R14: ffffa12e3254ccc0  R15: 0000000000000000
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
--- <NMI exception stack> ---
#13 [ffffa1381cc03cc8] lock_hrtimer_base at ffffffffb1ebce83
#14 [ffffa1381cc03ce8] hrtimer_try_to_cancel at ffffffffb1ebd413
#15 [ffffa1381cc03d20] __start_cfs_bandwidth at ffffffffb1ed7556
#16 [ffffa1381cc03d48] __account_cfs_rq_runtime at ffffffffb1ed7611
#17 [ffffa1381cc03d78] update_curr at ffffffffb1ed7766
#18 [ffffa1381cc03db8] update_cfs_shares at ffffffffb1ed7be8
#19 [ffffa1381cc03de0] task_tick_fair at ffffffffb1ed8c7c
#20 [ffffa1381cc03e38] scheduler_tick at ffffffffb1ecd724
#21 [ffffa1381cc03e70] update_process_times at ffffffffb1ea1111
#22 [ffffa1381cc03e98] tick_sched_handle at ffffffffb1f042e0
#23 [ffffa1381cc03eb8] tick_sched_timer at ffffffffb1f04649
#24 [ffffa1381cc03ee0] __hrtimer_run_queues at ffffffffb1ebd631
#25 [ffffa1381cc03f60] hrtimer_interrupt at ffffffffb1ebe589
#26 [ffffa1381cc03fc0] local_apic_timer_interrupt at ffffffffb1e4cdab
#27 [ffffa1381cc03fd8] smp_apic_timer_interrupt at ffffffffb258e723
#28 [ffffa1381cc03ff0] apic_timer_interrupt at ffffffffb258b23a
--- <IRQ stack> ---
    ...
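
The pattern in the output above (an NMI watchdog panic in the backtrace, with no lockup message in the log tail) can be checked mechanically when triaging similar vmcores. The following is a minimal Python sketch under stated assumptions, not part of the original analysis: the helper names `parse_frames` and `summarize` are hypothetical, the input is plain text saved from `bt -p` and `log | tail`, and the search string "hard lockup" is an assumption about how the kernel would word such a message.

```python
import re

# Matches crash-utility frame lines such as:
#   "#13 [ffffa1381cc03cc8] lock_hrtimer_base at ffffffffb1ebce83"
FRAME_RE = re.compile(r"^\s*#(\d+)\s+\[[0-9a-f]+\]\s+(\S+)\s+at\s+[0-9a-f]+")
# Matches the line naming the function that the NMI interrupted.
RIP_RE = re.compile(r"\[exception RIP: (\w+)\+\d+\]")


def parse_frames(bt_text):
    """Return a list of (frame_number, function_name) from `bt` output."""
    frames = []
    for line in bt_text.splitlines():
        m = FRAME_RE.match(line)
        if m:
            frames.append((int(m.group(1)), m.group(2)))
    return frames


def summarize(bt_text, log_text):
    """Summarize whether the panic came from the NMI watchdog path."""
    names = [name for _, name in parse_frames(bt_text)]
    rip = RIP_RE.search(bt_text)
    return {
        # True when the hard lockup detector callback is on the stack.
        "nmi_watchdog_in_backtrace": "watchdog_overflow_callback" in names,
        # Assumption: a lockup message would contain "hard lockup".
        "lockup_message_in_log": "hard lockup" in log_text.lower(),
        # Function that was executing when the NMI fired, if reported.
        "interrupted_function": rip.group(1) if rip else None,
        "cfs_bandwidth_involved": "__start_cfs_bandwidth" in names,
    }


# Shortened excerpts of the output shown in this article.
SAMPLE_BT = """\
 #3 [ffffa1381cc09b28] nmi_panic at ffffffffb1e8b29f
 #4 [ffffa1381cc09b38] watchdog_overflow_callback at ffffffffb1f45628
#12 [ffffa1381cc09ef0] end_repeat_nmi at ffffffffb2581d4c
    [exception RIP: lock_hrtimer_base+51]
--- <NMI exception stack> ---
#13 [ffffa1381cc03cc8] lock_hrtimer_base at ffffffffb1ebce83
#14 [ffffa1381cc03ce8] hrtimer_try_to_cancel at ffffffffb1ebd413
#15 [ffffa1381cc03d20] __start_cfs_bandwidth at ffffffffb1ed7556
"""
SAMPLE_LOG = "[5080135.634363] vfio-pci 0000:19:02.7: enabling device (0000 -> 0002)\n"

if __name__ == "__main__":
    for key, value in summarize(SAMPLE_BT, SAMPLE_LOG).items():
        print(f"{key}: {value}")
```

A result with `nmi_watchdog_in_backtrace` true and `lockup_message_in_log` false corresponds to the condition described in this issue; it only flags the pattern and does not identify the root cause.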

Environment

  • Red Hat Enterprise Linux 7.9.z Realtime
    • kernel-rt-3.10.0-1160.24.1.rt56.1161.el7.x86_64
