RHEL7.9 standard (non-RT) kernel and RHEL7.9 Realtime (kernel-rt) crash due to blocked task detection. The blocked task is stuck in unregister_shrinker() trying to take shrinker_rwsem, which has been taken in shrink_slab() by other tasks.
Issue
- RHEL7.9 Realtime (kernel-rt) crashes when the hung task detector finds a blocked task. The blocked task is stuck in unregister_shrinker(), trying to take shrinker_rwsem in write mode, while multiple other tasks hold that shrinker_rwsem in read mode and contend for a dentry's d_lockref lock (an rt_mutex). Those tasks took shrinker_rwsem in shrink_slab().
[668817.440247] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[668817.448256] systemd-machine D ffffa011f67e8000 0 2408 1 0x00000082
[668817.456018] Call Trace:
[668817.458570] [<ffffffffb777c690>] schedule+0x30/0x96
[668817.463677] [<ffffffffb777dcbd>] __down_write_common+0xed/0x140
[668817.469775] [<ffffffffb777de93>] __down_write+0x13/0x20
[668817.475182] [<ffffffffb777cdfe>] down_write+0xe/0x10
[668817.480348] [<ffffffffb71cb7b9>] unregister_shrinker+0x19/0x40
[668817.486366] [<ffffffffb723ef31>] deactivate_locked_super+0x41/0x70
[668817.492730] [<ffffffffb723f6d6>] deactivate_super+0x46/0x60
[668817.498566] [<ffffffffb726030f>] cleanup_mnt+0x3f/0x80
[668817.504646] [<ffffffffb72603a2>] __cleanup_mnt+0x12/0x20
[668817.510353] [<ffffffffb70b570b>] task_work_run+0xbb/0xe0
[668817.516679] [<ffffffffb70924bf>] do_exit+0x2df/0xa50
[668817.521834] [<ffffffffb7105d6e>] ? up_read+0xe/0x10
[668817.526897] [<ffffffffb7783fe8>] ? __do_page_fault+0x238/0x5a0
[668817.532910] [<ffffffffb7092cbc>] do_group_exit+0x4c/0xd0
[668817.538405] [<ffffffffb7092d54>] SyS_exit_group+0x14/0x20
[668817.543990] [<ffffffffb77894a8>] tracesys+0xa6/0xcc
[668817.555195] Kernel panic - not syncing: hung_task: blocked tasks
[668817.555199] CPU: 40 PID: 810 Comm: khungtaskd Kdump: loaded Tainted: G OE ------------ T 3.10.0-1160.11.1.rt56.1145.el7.x86_64 #1
[668817.555200] Hardware name: Quanta Cloud Technology Inc. QuantaGrid D52BE-2U 1S5BU9Z0045/S5BE-MB 3UPI (LBG-1G), BIOS 3A11.BT20 09/20/2019
[668817.555201] Call Trace:
[668817.555212] [<ffffffffb7776f05>] dump_stack+0x19/0x1b
[668817.555217] [<ffffffffb7771065>] panic+0xe8/0x21f
[668817.555224] [<ffffffffb7144ad0>] watchdog+0x2b0/0x330
[668817.555228] [<ffffffffb7144820>] ? reset_hung_task_detector+0x20/0x20
[668817.555233] [<ffffffffb70b9261>] kthread+0xd1/0xe0
[668817.555236] [<ffffffffb70b9190>] ? kthread_worker_fn+0x170/0x170
[668817.555240] [<ffffffffb7789077>] ret_from_fork_nospec_begin+0x21/0x21
[668817.555244] [<ffffffffb70b9190>] ? kthread_worker_fn+0x170/0x170
- The same issue is observed on the RHEL7.9 standard (non-RT) kernel.
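The locking pattern described above can be illustrated with a small user-space analogy. This is only a sketch, not kernel code: the ReadWriteLock class and the demo() function are hypothetical names introduced for illustration, and the timeout exists solely so that the demo terminates. In the kernel, the writer in unregister_shrinker() waits without a timeout for as long as readers hold shrinker_rwsem, and that wait is what the hung task detector reports.

```python
import threading


class ReadWriteLock:
    """Toy reader/writer lock (hypothetical; not the kernel's rw_semaphore)."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0      # number of current read-mode holders
        self._writer = False   # True while a writer holds the lock

    def acquire_read(self):
        with self._cond:
            while self._writer:
                self._cond.wait()
            self._readers += 1

    def release_read(self):
        with self._cond:
            self._readers -= 1
            self._cond.notify_all()

    def acquire_write(self, timeout=None):
        """Return True once exclusive access is granted, False on timeout."""
        with self._cond:
            granted = self._cond.wait_for(
                lambda: self._readers == 0 and not self._writer, timeout
            )
            if granted:
                self._writer = True
            return granted

    def release_write(self):
        with self._cond:
            self._writer = False
            self._cond.notify_all()


def demo():
    lock = ReadWriteLock()
    # Stands in for a task in shrink_slab() holding the lock in read mode.
    lock.acquire_read()
    # Stands in for unregister_shrinker() asking for write mode: it cannot
    # proceed while any reader still holds the lock.
    writer_blocked = not lock.acquire_write(timeout=0.2)
    # Once the reader releases, the writer gets through.
    lock.release_read()
    writer_acquired_after_release = lock.acquire_write(timeout=0.2)
    lock.release_write()
    return writer_blocked, writer_acquired_after_release


if __name__ == "__main__":
    print(demo())
```

The sketch shows only the reader/writer relationship. It does not model why the readers are slow to release the lock, which in this issue is their contention for the dentry's d_lockref lock.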
Environment
- Red Hat Enterprise Linux 7.9.z Realtime (kernel-rt-3.10.0-1160.11.1.rt56.1145.el7)
- Red Hat Enterprise Linux 7.9.z standard (non-RT) kernel-3.10.0-1160.6.1.el7