Many percpu_rw_semaphore readers block before entering their critical sections while the writer, having set sem->block to 1 for writer-writer exclusion, waits for all active readers to complete: a possible cgroup_threadgroup_rwsem deadlock.

Solution Unverified - Updated -

Issue

  • Run a CPU-hogging thread on an isolated CPU, CPU X, and then run another command on the same CPU with the taskset command: taskset -c X <command>.

    • The command doesn't run on CPU X, which is expected.
    • However, the entire system hangs at the same time.
    • The system becomes functional again as soon as the thread hogging CPU X exits.
  • While the system is hung:

    • 79 tasks are stuck in the TASK_UNINTERRUPTIBLE sleep state with backtraces like these:
 #0 [ffffa49425b0fca0] __schedule at ffffffffa9f570ed
 #1 [ffffa49425b0fd40] schedule at ffffffffa9f57676
 #2 [ffffa49425b0fd50] percpu_rwsem_wait at ffffffffa9738a4d
 #3 [ffffa49425b0fdb8] __percpu_down_read at ffffffffa9738af0
 #4 [ffffa49425b0fdd0] copy_process at ffffffffa96db321
 #5 [ffffa49425b0fe98] _do_fork at ffffffffa96db48e

 #0 [ffffa494271d3d38] __schedule at ffffffffa9f570ed
 #1 [ffffa494271d3dd8] schedule at ffffffffa9f57676
 #2 [ffffa494271d3de8] percpu_rwsem_wait at ffffffffa9738a4d
 #3 [ffffa494271d3e50] __percpu_down_read at ffffffffa9738af0
 #4 [ffffa494271d3e68] exit_signals at ffffffffa96f06a2
 #5 [ffffa494271d3e88] do_exit at ffffffffa96e17e0
  • Another task is stuck with this backtrace:
 #0 [ffffa4940feffc20] __schedule at ffffffffa9f570ed
 #1 [ffffa4940feffcc0] schedule at ffffffffa9f57676
 #2 [ffffa4940feffcd0] __rt_mutex_slowlock at ffffffffa9f5913e
 #3 [ffffa4940feffd10] rt_mutex_slowlock_locked at ffffffffa9f5926c
 #4 [ffffa4940feffd68] rt_mutex_slowlock.constprop.31 at ffffffffa9f5946f
 #5 [ffffa4940feffde8] proc_cgroup_show at ffffffffa97a0f4a
 #6 [ffffa4940feffe30] proc_single_show at ffffffffa999f381
 #7 [ffffa4940feffe68] seq_read at ffffffffa99430c3
  • One more task is stuck in percpu_down_write:
 #0 [ffffa49428307d08] __schedule at ffffffffa9f570ed
 #1 [ffffa49428307da8] schedule at ffffffffa9f57676
 #2 [ffffa49428307db8] percpu_down_write at ffffffffa9738bf9
 #3 [ffffa49428307de8] cgroup_procs_write_start at ffffffffa979d08c
 #4 [ffffa49428307e18] __cgroup1_procs_write.constprop.15 at ffffffffa97a3257
 #5 [ffffa49428307e60] cgroup_file_write at ffffffffa97999fb
 #6 [ffffa49428307e98] kernfs_fop_write at ffffffffa99b1586
 #7 [ffffa49428307ed0] vfs_write at ffffffffa9918e25
 #8 [ffffa49428307f00] ksys_write at ffffffffa99190b2
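
When many tasks are stuck, it can help to group them by the function that called schedule. The following is a minimal sketch, not a Red Hat tool: it assumes the backtraces have been saved as text in the frame format shown above, and the helper names (parse_backtraces, tally_wait_sites) are hypothetical. The sample frames are copied from the backtraces in this article.

```python
import re
from collections import Counter

# Matches crash-style frames such as:
#   " #2 [ffffa49425b0fd50] percpu_rwsem_wait at ffffffffa9738a4d"
FRAME_RE = re.compile(r"^\s*#(\d+)\s+\[[0-9a-f]+\]\s+(\S+)\s+at\s+[0-9a-f]+\s*$")


def parse_backtraces(text):
    """Split saved backtrace text into stacks (lists of function names).

    A new stack starts whenever a frame numbered #0 is seen.
    Lines that are not frames are ignored.
    """
    stacks, current = [], []
    for line in text.splitlines():
        match = FRAME_RE.match(line)
        if not match:
            continue
        if match.group(1) == "0" and current:
            stacks.append(current)
            current = []
        current.append(match.group(2))
    if current:
        stacks.append(current)
    return stacks


def wait_site(stack):
    """Return the first frame that is not __schedule or schedule."""
    for name in stack:
        if name not in ("__schedule", "schedule"):
            return name
    return None


def tally_wait_sites(text):
    """Count how many stacks are sleeping under each wait site."""
    return Counter(wait_site(stack) for stack in parse_backtraces(text))


# Frames copied from the backtraces in this article (truncated to three each).
SAMPLE = """\
 #0 [ffffa49425b0fca0] __schedule at ffffffffa9f570ed
 #1 [ffffa49425b0fd40] schedule at ffffffffa9f57676
 #2 [ffffa49425b0fd50] percpu_rwsem_wait at ffffffffa9738a4d
 #0 [ffffa49428307d08] __schedule at ffffffffa9f570ed
 #1 [ffffa49428307da8] schedule at ffffffffa9f57676
 #2 [ffffa49428307db8] percpu_down_write at ffffffffa9738bf9
"""

if __name__ == "__main__":
    for site, count in tally_wait_sites(SAMPLE).most_common():
        print(count, site)
```

Grouping this way separates the readers waiting in percpu_rwsem_wait from the single writer waiting in percpu_down_write.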

Environment

  • Red Hat Enterprise Linux 8 Realtime kernel-rt-4.18.0-305.12.1.rt7.84.el8_4.x86_64
  • An application thread that hogs an isolated CPU.
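
For illustration only, a CPU-hogging task pinned to one CPU can be sketched as below. This is not the application from the report, and the function name hog_cpu is hypothetical. The article does not state the scheduling policy or priority of the hogging thread, so this sketch only pins and spins under the default policy and may not reproduce the hang by itself.

```python
import os
import sys
import time


def hog_cpu(cpu, seconds):
    """Pin the calling process to `cpu` and busy-loop for `seconds`.

    Returns the number of loop iterations completed.
    Uses os.sched_setaffinity, which is available on Linux only.
    """
    os.sched_setaffinity(0, {cpu})  # 0 means "the calling process"
    deadline = time.monotonic() + seconds
    spins = 0
    while time.monotonic() < deadline:
        spins += 1  # pure busy work: never sleeps or yields voluntarily
    return spins


if __name__ == "__main__":
    # Usage (assumed): python3 hog.py <cpu> <seconds>
    if len(sys.argv) == 3:
        hog_cpu(int(sys.argv[1]), float(sys.argv[2]))
```

A bounded duration is used instead of an endless loop so that the task always exits, which matches the observation that the system recovers once the hogging thread exits.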
