What is the use of "kernel.panic_on_rcu_stall" parameter?
Environment
- Red Hat Enterprise Linux 8
Issue
- How do I configure panic on rcu_stall messages?
- What is the use of kernel.panic_on_rcu_stall parameter?
- How to set the panic_on_rcu_stall parameter?
- Host displays rcu_stall messages. How to generate a vmcore automatically?
Resolution
RCU (read-copy-update) is a kernel synchronization mechanism that increases parallelism on a Linux system by allowing readers and writers to access shared data concurrently. Although RCU readers and writers may always access the shared data, a writer may not free the old version of dynamically allocated data that it has modified until the grace period ends. The end of a grace period guarantees that no reader is still accessing the old version, so the writer can safely return that memory to the system. A drawback of RCU is therefore that a long wait for the end of a grace period can cause the system to run out of memory.
To warn that a grace period is taking too long to end, RCU stall messages are printed to the kernel log, indicating that the wait for the end of the grace period has exceeded the defined timeout. By default, the timeout is 60 seconds.
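As a minimal sketch, the two helper functions below show one way to read the timeout currently in effect and to pick stall warnings out of a saved kernel log. The function names and the RCU_TIMEOUT_FILE variable are hypothetical, introduced only for illustration. The sketch assumes the timeout is exposed through the rcupdate module parameter in sysfs, and that stall warnings contain either the "detected stalls on CPUs/tasks" text or the related "self-detected stall on CPU" text.

```shell
#!/bin/sh
# Sketch only: helper names and RCU_TIMEOUT_FILE are illustrative assumptions.

# Assumed location of the stall timeout (in seconds); overridable for testing.
RCU_TIMEOUT_FILE="${RCU_TIMEOUT_FILE:-/sys/module/rcupdate/parameters/rcu_cpu_stall_timeout}"

# Print the configured stall timeout, or "unknown" if the file is not readable.
rcu_stall_timeout() {
    if [ -r "$RCU_TIMEOUT_FILE" ]; then
        cat "$RCU_TIMEOUT_FILE"
    else
        echo "unknown"
    fi
}

# Print every line of a saved kernel log (path in $1) that looks like
# an RCU stall warning.
rcu_stall_lines() {
    grep -E 'rcu.*(detected stalls on CPUs/tasks|self-detected stall on CPU)' "$1"
}
```

For example, after saving the kernel log with `dmesg > /tmp/dmesg.txt`, running `rcu_stall_lines /tmp/dmesg.txt` lists any stall warnings it contains, and `rcu_stall_timeout` prints the timeout in seconds.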
Although an RCU stall can be a side effect of a kernel bug, this is not the typical case for real-time kernel users. In the vast majority of cases, real-time users encounter RCU stalls because the execution of RCU callbacks is delayed. The RCU callbacks perform the RCU work needed to reach the end of a grace period.
- This leads to the logging of "rcu_sched detected stalls on CPUs/tasks" messages.
- To get a vmcore for analysing the cause of the rcu_stall messages, the system must be configured to panic on rcu_stall events.
- To configure this temporarily on the command line:
  # echo 1 > /proc/sys/kernel/panic_on_rcu_stall
- To configure this permanently in the sysctl configuration file:
  # echo "kernel.panic_on_rcu_stall = 1" >> /etc/sysctl.conf
  # sysctl -p
- When kernel.panic_on_rcu_stall is set to 1, the kernel calls panic() after printing the RCU stall detection messages. This is useful for determining the root cause of RCU stalls from a vmcore.
- Analyze the vmcore for the root cause of RCU stalls.
- Configuring kdump
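The following sketch checks whether a host is ready to produce a vmcore on an RCU stall. It tests that panic_on_rcu_stall is set to 1 and that a crash kernel is loaded. The function names and the PANIC_FILE and KEXEC_FILE variables are hypothetical. The sketch assumes that /sys/kernel/kexec_crash_loaded reads 1 when kdump has loaded its crash kernel. It is a quick sanity check, not a replacement for the kdump configuration procedure.

```shell
#!/bin/sh
# Sketch only: helper names and the *_FILE variables are illustrative assumptions.

# Files consulted; overridable so the functions can be tried on test files.
PANIC_FILE="${PANIC_FILE:-/proc/sys/kernel/panic_on_rcu_stall}"
KEXEC_FILE="${KEXEC_FILE:-/sys/kernel/kexec_crash_loaded}"

# Print the first line of the file in $1, or "missing" if it cannot be read.
read_flag() {
    if [ -r "$1" ]; then
        head -n 1 "$1"
    else
        echo "missing"
    fi
}

# Print both values and return 0 only if panic_on_rcu_stall is 1
# and a crash kernel is loaded.
rcu_stall_vmcore_ready() {
    panic=$(read_flag "$PANIC_FILE")
    kexec=$(read_flag "$KEXEC_FILE")
    echo "panic_on_rcu_stall=$panic kexec_crash_loaded=$kexec"
    [ "$panic" = "1" ] && [ "$kexec" = "1" ]
}
```

Calling `rcu_stall_vmcore_ready` prints the two values it found. A non-zero exit status means that either the sysctl is not set or no crash kernel is loaded, in which case a panic on an RCU stall would not leave a vmcore behind.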
This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.