Timer tree corruption leads to missing wakeup and system freeze
Issue
- What is CVE-2021-20317?
- The host server hangs in certain situations while a VM is running or being destroyed.
- The following messages are logged when the issue occurs:
[ 1940.772191] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1940.780019] kworker/37:2 D 0 2875 2 0x80084080
[ 1940.780028] Workqueue: events slab_caches_to_rcu_destroy_workfn
[ 1940.780029] Call Trace:
[ 1940.780036] ? __schedule+0x26d/0x660
[ 1940.780040] schedule+0x2f/0xa0
[ 1940.780042] schedule_timeout+0x246/0x2f0
[ 1940.780047] ? __queue_work+0x103/0x3f0
[ 1940.780049] ? __switch_to_asm+0x41/0x70
[ 1940.780051] wait_for_completion+0x11f/0x190
[ 1940.780054] ? wake_up_q+0x70/0x70
[ 1940.780058] rcu_barrier+0x17e/0x1e0
[ 1940.780060] slab_caches_to_rcu_destroy_workfn+0x8f/0xe0
[ 1940.780062] process_one_work+0x1a7/0x3b0
[ 1940.780063] worker_thread+0x30/0x390
[ 1940.780066] ? create_worker+0x1a0/0x1a0
[ 1940.780067] kthread+0x112/0x130
[ 1940.780069] ? kthread_flush_work_fn+0x10/0x10
[ 1940.780070] ret_from_fork+0x1f/0x40
[ 2142.705206] INFO: task perf:3201 blocked for more than 120 seconds.
[ 2142.711493] Tainted: G W --------- - - 4.18.0-193.64.1.el8_2.x86_64 #1
[ 2142.719838] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 2142.727673] perf D 0 3201 2724 0x00080084
[ 2142.727674] Call Trace:
[ 2142.727677] ? __schedule+0x26d/0x660
[ 2142.727678] schedule+0x2f/0xa0
[ 2142.727681] schedule_timeout+0x193/0x2f0
[ 2142.727687] ? __next_timer_interrupt+0xf0/0xf0
[ 2142.727688] msleep+0x29/0x30
[ 2142.727693] cpuinfo_open+0xe/0x20
[ 2142.727698] proc_reg_open+0x71/0x130
[ 2142.727699] ? proc_alloc_inode+0x60/0x60
[ 2142.727702] do_dentry_open+0x132/0x330
[ 2142.727705] path_openat+0x573/0x14d0
[ 2142.727708] ? iomap_file_buffered_write+0x62/0x90
[ 2142.727709] do_filp_open+0x93/0x100
[ 2142.727712] ? __check_object_size+0xa8/0x16b
[ 2142.727714] do_sys_open+0x184/0x220
[ 2142.727716] do_syscall_64+0x5b/0x1a0
[ 2142.727717] entry_SYSCALL_64_after_hwframe+0x65/0xca
[ 2142.727718] RIP: 0033:0x7fc2c1dfa861
[ 2142.659039] Workqueue: events slab_caches_to_rcu_destroy_workfn
[ 2142.659043] Call Trace:
[ 2142.659045] ? rcu_barrier+0x1e0/0x1e0
[ 2142.659051] kthread+0x112/0x130
[ 2142.659055] ? kthread_flush_work_fn+0x10/0x10
[ 2142.659057] ret_from_fork+0x1f/0x40
[ 2142.659063] ? __schedule+0x26d/0x660
[ 2142.659066] schedule+0x2f/0xa0
[ 2142.659068] schedule_timeout+0x246/0x2f0
[ 2798.316557] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
[ 2798.317524] INFO: task kworker/37:2:2875 blocked for more than 120 seconds.
[ 2798.322480] rcu: 0-...!: (22 GPs behind) idle=a42/1/0x4000000000000000 softirq=42781/42781 fqs=0
[ 2798.322482] rcu: 2-...!: (10873 GPs behind) idle=dd4/0/0x0 softirq=0/0 fqs=0
[ 2798.322484] rcu: 3-...!: (10872 GPs behind) idle=d00/0/0x0 softirq=0/0 fqs=0
[ 2798.322485] rcu: 4-...!: (10871 GPs behind) idle=d1c/0/0x0 softirq=0/0 fqs=0
[ 2798.322487] rcu: 5-...!: (10870 GPs behind) idle=cf8/0/0x0 softirq=0/0 fqs=0
[ 2798.322488] rcu: 6-...!: (10869 GPs behind) idle=cbc/0/0x0 softirq=0/0 fqs=0
[ 2798.322490] rcu: 7-...!: (10869 GPs behind) idle=c98/0/0x0 softirq=0/0 fqs=0
[ 2798.322491] rcu: 8-...!: (10868 GPs behind) idle=cb0/0/0x0 softirq=0/0 fqs=0
[ 2798.322492] rcu: 9-...!: (10867 GPs behind) idle=c80/0/0x0 softirq=0/0 fqs=0
[ 2798.322497] rcu: 10-...!: (10866 GPs behind) idle=c30/0/0x0 softirq=0/0 fqs=0
[ 2798.656675] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 2798.663984] dump_stack+0x5c/0x80
[ 2798.663988] nmi_cpu_backtrace.cold.5+0x13/0x4e
[ 2798.671287] kworker/0:1 D 0 2881 2 0x80084080
[ 2798.678595] ? lapic_can_unplug_cpu.cold.25+0x3b/0x3b
[ 2798.678598] nmi_trigger_cpumask_backtrace+0xde/0xe0
[ 2798.685932] Workqueue: kvm-irqfd-cleanup irqfd_shutdown [kvm]
[ 2798.692081] rcu_dump_cpu_stacks+0x9c/0xca
[ 2798.692085] rcu_sched_clock_irq.cold.69+0x29b/0x35e
[ 2798.698949] Call Trace:
[ 2798.707298] ? tick_sched_do_timer+0x60/0x60
[ 2798.707302] update_process_times+0x28/0x60
[ 2798.715124] tick_sched_handle+0x22/0x60
[ 2798.715126] ? __schedule+0x26d/0x660
[ 2798.715129] schedule+0x2f/0xa0
[ 2798.715132] schedule_timeout+0x246/0x2f0
[ 2798.715134] tick_sched_timer+0x37/0x70
[ 2798.715136] __hrtimer_run_queues+0x100/0x280
[ 2798.715138] ? internal_add_timer+0x42/0x60
[ 2798.715140] ? add_timer+0x13f/0x1f0
[ 2798.715142] wait_for_completion+0x11f/0x190
[ 2798.715145] hrtimer_interrupt+0x100/0x220
[ 2798.715147] ? wake_up_q+0x70/0x70
[ 2798.715151] smp_apic_timer_interrupt+0x6a/0x140
[ 2798.715153] __synchronize_srcu.part.16+0x81/0xb0
[ 2798.715156] apic_timer_interrupt+0xf/0x20
[ 2798.715158] ? __bpf_trace_rcu_utilization+0x10/0x10
[ 2798.715159] </IRQ>
[ 2798.715163] RIP: 0010:cpuidle_enter_state+0xbc/0x420
[ 2798.715175] irqfd_shutdown+0x38/0xa0 [kvm]
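When comparing a host's kernel log against the messages above, a short script can pull out the hung-task reports and the per-CPU RCU stall lines. The following is a minimal sketch only: the function name `summarize` is hypothetical, and the regular expressions are assumptions derived from the message formats shown in this article, so they may need adjusting for other kernel versions.

```python
import re

# Assumed format, taken from the log above, e.g.:
# "[ 2142.705206] INFO: task perf:3201 blocked for more than 120 seconds."
HUNG_TASK = re.compile(
    r"INFO: task (?P<comm>.+):(?P<pid>\d+) blocked for more than (?P<secs>\d+) seconds"
)

# Assumed format, taken from the log above, e.g.:
# "rcu: 2-...!: (10873 GPs behind) idle=dd4/0/0x0 softirq=0/0 fqs=0"
RCU_BEHIND = re.compile(r"rcu:\s+(?P<cpu>\d+)-\S*\s+\((?P<gps>\d+) GPs behind\)")


def summarize(log_text):
    """Return (hung_tasks, rcu_cpus) found in kernel log text.

    hung_tasks is a list of (command, pid, seconds) tuples.
    rcu_cpus maps a CPU number to the grace periods it is behind.
    """
    hung_tasks = []
    rcu_cpus = {}
    for line in log_text.splitlines():
        match = HUNG_TASK.search(line)
        if match:
            hung_tasks.append(
                (match.group("comm"), int(match.group("pid")), int(match.group("secs")))
            )
            continue
        match = RCU_BEHIND.search(line)
        if match:
            rcu_cpus[int(match.group("cpu"))] = int(match.group("gps"))
    return hung_tasks, rcu_cpus
```

For example, saving the output of `dmesg` to a file and passing its contents to `summarize` lists which tasks were reported as blocked and which CPUs were reported as stalled. Matching lines only indicate that the symptoms resemble those above; they do not by themselves confirm that the system is affected by CVE-2021-20317.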
Environment
- Red Hat Enterprise Linux 8
- Red Hat OpenStack Platform 16.1.5