[RT kernel] Process is set to RU state but stalled in CPU .
Environment
- Red Hat Enterprise Linux 8.5
- kernel-rt 4.18.0-348.23.1.rt7.153.el8_5
- Red Hat Enterprise Linux 8.6
- Red Hat Enterprise Linux 8.7
Issue
- A specific process is set to RU state but is not executing on a CPU nor is it in any CPU's runqueue.
PID: 931918 COMMAND: "sh" TASK: ffff959a97d63e80 CPU: 62 STATE: TASK_RUNNING
crash> runq -c 62
CPU 62 RUNQUEUE: ffff959fbf9ac2c0
CURRENT: PID: 0 TASK: ffff959003200000 COMMAND: "swapper/62"
RT PRIO_ARRAY: ffff959fbf9ac500
[no tasks queued]
CFS RB_ROOT: ffff959fbf9ac370
[no tasks queued]
Resolution
Red Hat Enterprise Linux 8.6
- Update the kernel-rt to
kernel-rt-4.18.0-372.36.1.rt7.193.el8_6
. The package is available through repo rhocp-4.11-for-rhel-8-x86_64-rpms
Red Hat Enterprise Linux 8.7
- Update the kernel-rt to
kernel-rt-4.18.0-425.13.1.rt7.223.el8_7
through Errata:RHSA-2023:0854.
Root Cause
- The deadly embrace is brought about by the CFS throttling mechanism waiting to add processes to the various CPU run queues which will never occur as the lock that prevents the softirqs from being auctioned is held waiting for another lock to be released once the owner gets CPU time.
Diagnostic Steps
CPUS: 80
DATE: Wed Jul 13 02:08:34 EDT 2022
UPTIME: 19:23:20
LOAD AVERAGE: 94.33, 77.98, 48.58
TASKS: 2808
NODENAME: node1
RELEASE: 4.18.0-348.23.1.rt7.153.el8_5.x86_64
VERSION: #1 SMP PREEMPT_RT Tue Apr 12 14:01:36 EDT 2022
MACHINE: x86_64 (1600 Mhz)
MEMORY: 127.7 GB
PANIC: "Kernel panic - not syncing: hung_task: blocked tasks"
PID: 699
COMMAND: "khungtaskd"
TASK: ffff95881ac43e80 [THREAD_INFO: ffff95881ac43e80]
CPU: 0
STATE: TASK_RUNNING (PANIC)
-
The
sh
process is currently in Runnable state [RU]. As it is the next waiter in the list, we know this task will acquire the lock the next time it runs on the CPU:PID: 931918 COMMAND: "sh" TASK: ffff959a97d63e80 CPU: 62 STATE: TASK_RUNNING PID: 517 COMMAND: "rcuop/62" TASK: ffff959003778000 CPU: 1 STATE: TASK_UNINTERRUPTIBLE PID: 931805 COMMAND: "runc" TASK: ffff959ab41cddc0 CPU: 61 STATE: TASK_UNINTERRUPTIBLE PID: 931793 COMMAND: "conmon" TASK: ffff9586bb0e3e80 CPU: 21 STATE: TASK_UNINTERRUPTIBLE PID: 7983 COMMAND: "crio" TASK: ffff959b012b3e80 CPU: 20 STATE: TASK_UNINTERRUPTIBLE PID: 771529 COMMAND: "crio" TASK: ffff9580f5360000 CPU: 60 STATE: TASK_UNINTERRUPTIBLE PID: 878 COMMAND: "kworker/2:1" TASK: ffff95808cae9f40 CPU: 2 STATE: TASK_UNINTERRUPTIBLE PID: 3578758 COMMAND: "kworker/0:2" TASK: ffff9587a4945dc0 CPU: 0 STATE: TASK_UNINTERRUPTIBLE PID: 926300 COMMAND: "kworker/41:2" TASK: ffff9580e79fddc0 CPU: 41 STATE: TASK_UNINTERRUPTIBLE PID: 1791015 COMMAND: "iperf" TASK: ffff9590651d9f40 CPU: 36 STATE: TASK_UNINTERRUPTIBLE
-
Currently PID 931918
sh
appears to still be waiting in schedule to be started on a CPU.
crash> bt 931918
PID: 931918 TASK: ffff959a97d63e80 CPU: 62 COMMAND: "sh"
#0 [ffffb4de2886fa18] __schedule at ffffffffa5f85fdf
#1 [ffffb4de2886faa8] preempt_schedule_lock at ffffffffa5f86739
#2 [ffffb4de2886fab0] rt_spin_lock_slowlock_locked at ffffffffa5f8956e
#3 [ffffb4de2886faf0] rt_spin_lock_slowlock at ffffffffa5f89770
#4 [ffffb4de2886fb60] rt_spin_lock at ffffffffa5f8bb8a
#5 [ffffb4de2886fb68] get_page_from_freelist at ffffffffa58db53f
#6 [ffffb4de2886fc88] __alloc_pages_nodemask at ffffffffa58dd9e5
#7 [ffffb4de2886fcf8] alloc_pages_vma at ffffffffa58f8ab5
#8 [ffffb4de2886fd38] wp_page_copy at ffffffffa58b9a49
#9 [ffffb4de2886fda8] do_wp_page at ffffffffa58bc33f
#10 [ffffb4de2886fde8] __handle_mm_fault at ffffffffa58bded3
#11 [ffffb4de2886fe90] handle_mm_fault at ffffffffa58be282
#12 [ffffb4de2886feb8] __do_page_fault at ffffffffa566e776
#13 [ffffb4de2886ff18] do_page_fault at ffffffffa566eb24
#14 [ffffb4de2886ff50] page_fault at ffffffffa600109e
RIP: 00007f0997da34a6 RSP: 00007fffd0f0bb60 RFLAGS: 00010202
RAX: 000055bb5d8c1fa0 RBX: 000055bb5d8c39e0 RCX: 000055bb5dadbf50
RDX: 000055bb5d8c4b20 RSI: 00007fffd0f0bc00 RDI: 00007f0997fc09f0
RBP: 00007fffd0f0bc70 R8: 0000000000000000 R9: 0000000000000002
R10: 00007f0997fc1170 R11: 0000000000000206 R12: 0000000000000000
R13: 0000000000000000 R14: 000055bb5d8c0000 R15: 00007f0997fc1170
ORIG_RAX: ffffffffffffffff CS: 0033 SS: 002b
- However, this task is not currently on the runq specifically for CPU 62:
crash> runq -c 62
CPU 62 RUNQUEUE: ffff959fbf9ac2c0
CURRENT: PID: 0 TASK: ffff959003200000 COMMAND: "swapper/62"
RT PRIO_ARRAY: ffff959fbf9ac500
[no tasks queued]
CFS RB_ROOT: ffff959fbf9ac370
[no tasks queued]
Nor in fact, is the process found on the runq for any CPU:
crash> p runqueues:a | grep ffff959a97d63e80 | wc -l
0
- The task_struct for PID 931918
sh
does have theon_rq
flag set:
crash> struct task_struct ffff959a97d63e80
struct task_struct {
thread_info = {
flags = 0x80000,
status = 0x0,
preempt_lazy_count = 0x1
},
state = 0x0,
saved_state = 0x0,
stack = 0xffffb4de2886c000,
---------<snip>-------------------
on_cpu = 0x0,
cpu = 0x3e,
wakee_flips = 0x5,
wakee_flip_decay_ts = 0x1041cb050,
last_wakee = 0xffff959046671f40,
recent_used_cpu = 0x3e,
wake_cpu = 0x3e,
on_rq = 0x1, <--------------------------
prio = 0x78,
static_prio = 0x78,
normal_prio = 0x78,
rt_priority = 0x0,
sched_class = 0xffffffffa6423d80 <fair_sched_class>,
se = {
load = {
weight = 0x100000,
inv_weight = 0x400000
},
runnable_weight = 0x0,
run_node = {
__rb_parent_color = 0xffff959ab3c92018,
rb_right = 0x0,
rb_left = 0x0
},
group_node = {
next = 0xffff959ab44bdeb0,
prev = 0xffff959ab44b80f0
},
on_rq = 0x1,
exec_start = 0x3f031dc6c510,
sum_exec_runtime = 0xa8c64,
vruntime = 0x17544c4559026,
prev_sum_exec_runtime = 0x0,
nr_migrations = 0x0,
This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.
Comments