RCU stall WARNING: at kernel/rcu/tree.c:1392 rcu_advance_cbs_nowake+0x51/0x60
Issue
- The system or OpenShift Container Platform (OCP) worker node hangs after logging the following warning message.
[741381.244673] WARNING: CPU: 42 PID: 2110302 at kernel/rcu/tree.c:1392 rcu_advance_cbs_nowake+0x51/0x60 <<<---
[--]
[741381.407883] RIP: 0010:rcu_advance_cbs_nowake+0x51/0x60
[--]
[741381.516213] Call Trace:
[741381.519352] call_rcu+0x32d/0x490
[741381.527385] task_work_run+0x8a/0xb0
[741381.531671] exit_to_usermode_loop+0xeb/0xf0
[741381.536658] do_syscall_64+0x198/0x1a0
[--]
- As a result, other tasks get stuck waiting for an RCU grace period to complete, which causes a spike in the load average.
crash> bt 126619
PID: 126619 TASK: ffff92db5473bd80 CPU: 3 COMMAND: "exe"
#0 [ffffad6f231c7c98] __schedule at ffffffffa9f49fac
#1 [ffffad6f231c7d28] schedule at ffffffffa9f4a448
#2 [ffffad6f231c7d38] schedule_timeout at ffffffffa9f4db86
#3 [ffffad6f231c7dd0] wait_for_completion at ffffffffa9f4ae27
#4 [ffffad6f231c7e10] __wait_rcu_gp at ffffffffa9758b72
#5 [ffffad6f231c7e50] synchronize_rcu at ffffffffa975fa86
#6 [ffffad6f231c7e98] namespace_unlock at ffffffffa993b347
#7 [ffffad6f231c7eb0] ksys_umount at ffffffffa993d8a4
#8 [ffffad6f231c7f30] __x64_sys_umount at ffffffffa993db22
#9 [ffffad6f231c7f38] do_syscall_64 at ffffffffa960420b
#10 [ffffad6f231c7f50] entry_SYSCALL_64_after_hwframe at ffffffffaa0000ad
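To gauge how many tasks are blocked in the same way, the saved output of the crash command "foreach bt" can be scanned for stacks that contain synchronize_rcu. The following Python sketch is illustrative only: the helper name tasks_waiting_on_rcu is hypothetical, the second task in the sample is made up for demonstration, and the parser assumes the backtrace layout shown above.

```python
import re

# Header line printed by crash before each task's backtrace.
PID_HEADER = re.compile(
    r'^PID:\s+(\d+)\s+TASK:\s+\S+\s+CPU:\s+\d+\s+COMMAND:\s+"([^"]*)"'
)


def tasks_waiting_on_rcu(bt_text, symbol="synchronize_rcu"):
    """Return [(pid, command)] for tasks whose stack contains `symbol`."""
    frame = re.compile(rf"\]\s+{re.escape(symbol)}\s+at\s")
    waiting = []
    current = None
    for line in bt_text.splitlines():
        header = PID_HEADER.match(line.strip())
        if header:
            # A new task's backtrace starts here.
            current = (int(header.group(1)), header.group(2))
            continue
        if current and frame.search(line) and current not in waiting:
            waiting.append(current)
    return waiting


# First task is taken from the trace above (abridged); the second task is
# a made-up example of a task that is NOT waiting on RCU.
SAMPLE_BT = """\
PID: 126619 TASK: ffff92db5473bd80 CPU: 3 COMMAND: "exe"
#0 [ffffad6f231c7c98] __schedule at ffffffffa9f49fac
#4 [ffffad6f231c7e10] __wait_rcu_gp at ffffffffa9758b72
#5 [ffffad6f231c7e50] synchronize_rcu at ffffffffa975fa86

PID: 1 TASK: ffff000000000000 CPU: 0 COMMAND: "example"
#0 [ffff000000000000] __schedule at ffffffff00000000
"""

if __name__ == "__main__":
    for pid, command in tasks_waiting_on_rcu(SAMPLE_BT):
        print(f"PID {pid} ({command}) is waiting in synchronize_rcu")
```

A large number of matching tasks is consistent with the load-average spike described above, but the count alone does not identify the root cause.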
- It is also observed that usage of the filp slab cache increases under Unreclaimable slab.
[Fri Mar 3 06:02:23 +03 2023] Unreclaimable slab info:
[Fri Mar 3 06:02:23 +03 2023] Name Used Total
..
[Fri Mar 3 06:02:23 +03 2023] mnt_cache 1102KB 1102KB
[Fri Mar 3 06:02:23 +03 2023] filp 10808048KB 10808048KB <<<------
[Fri Mar 3 06:02:23 +03 2023] names_cache 3264KB 3264KB
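When the "Unreclaimable slab info" dump is long, a small script can pick out the largest cache. The Python sketch below is a minimal example under stated assumptions: the function names are hypothetical, and the regular expression assumes the "Name Used Total" line layout shown above, with sizes suffixed by KB.

```python
import re

# Matches lines such as "[...] filp 10808048KB 10808048KB".
# The "Unreclaimable slab info:" and "Name Used Total" lines do not match.
SLAB_LINE = re.compile(r"\]\s+(\S+)\s+(\d+)KB\s+(\d+)KB")


def parse_unreclaimable_slabs(log_text):
    """Return {cache_name: used_kb} for every slab line in log_text."""
    usage = {}
    for line in log_text.splitlines():
        match = SLAB_LINE.search(line)
        if match:
            name, used_kb, _total_kb = match.groups()
            usage[name] = int(used_kb)
    return usage


def largest_cache(usage):
    """Return (name, used_kb) of the biggest cache, or None if empty."""
    if not usage:
        return None
    name = max(usage, key=usage.get)
    return name, usage[name]


# Lines copied from the log excerpt above.
SAMPLE = """\
[Fri Mar 3 06:02:23 +03 2023] Unreclaimable slab info:
[Fri Mar 3 06:02:23 +03 2023] Name Used Total
[Fri Mar 3 06:02:23 +03 2023] mnt_cache 1102KB 1102KB
[Fri Mar 3 06:02:23 +03 2023] filp 10808048KB 10808048KB <<<------
[Fri Mar 3 06:02:23 +03 2023] names_cache 3264KB 3264KB
"""

if __name__ == "__main__":
    name, used_kb = largest_cache(parse_unreclaimable_slabs(SAMPLE))
    print(f"{name}: {used_kb / (1024 * 1024):.1f} GiB")
```

For the excerpt above, this reports filp at roughly 10.3 GiB, which dwarfs the other caches listed.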
Environment
- Red Hat Enterprise Linux 8.5
- Red Hat Enterprise Linux 8.4
- OpenShift Container Platform
- OpenStack Platform 16.2