RHEL guest becomes unresponsive with massive soft lockups and rcu_sched CPU stalls

Solution Unverified - Updated -

Issue

  • The RHEL guest becomes unresponsive, and the kernel log shows a large number of soft lockups and rcu_sched CPU stalls:
[5951166.620004] INFO: rcu_sched self-detected stall on CPU { 62}  (t=158463512 jiffies g=822375595 c=822375594 q=51744570)
...
[5951166.620004] Task dump for CPU 62:
[5951166.620004] migration/62    R  running task        0   326      2 0x00000008
[5951166.620004] Call Trace:
[5951166.620004]  <IRQ>  [<ffffffffb60dab18>] sched_show_task+0xa8/0x110
[5951166.620004]  [<ffffffffb60de919>] dump_cpu_task+0x39/0x70
[5951166.620004]  [<ffffffffb6158b90>] rcu_dump_cpu_stacks+0x90/0xd0
[5951166.620004]  [<ffffffffb615c252>] rcu_check_callbacks+0x442/0x730
[5951166.620004]  [<ffffffffb6110ae0>] ? tick_sched_do_timer+0x50/0x50
[5951166.620004]  [<ffffffffb60af9d6>] update_process_times+0x46/0x80
[5951166.620004]  [<ffffffffb6110850>] tick_sched_handle+0x30/0x70
[5951166.620004]  [<ffffffffb6110b19>] tick_sched_timer+0x39/0x80
[5951166.620004]  [<ffffffffb60caa8e>] __hrtimer_run_queues+0x10e/0x270
[5951166.620004]  [<ffffffffb60cafef>] hrtimer_interrupt+0xaf/0x1d0
[5951166.620004]  [<ffffffffc02ea5d4>] hv_stimer0_isr+0x24/0x40 [hv_vmbus]
[5951166.620004]  [<ffffffffb679794c>] hv_stimer0_vector_handler+0x3c/0x70
[5951166.620004]  [<ffffffffb679699a>] hv_stimer0_callback_vector+0x16a/0x170
[5951166.620004]  <EOI>  [<ffffffffb613602a>] ? multi_cpu_stop+0x4a/0x110
[5951166.620004]  [<ffffffffb6135fe0>] ? cpu_stop_should_run+0x50/0x50
[5951166.620004]  [<ffffffffb61362e9>] cpu_stopper_thread+0x99/0x150
[5951166.620004]  [<ffffffffb6785922>] ? __schedule+0x402/0x840
[5951166.620004]  [<ffffffffb60cf0e4>] smpboot_thread_fn+0x144/0x1a0
[5951166.620004]  [<ffffffffb60cefa0>] ? lg_double_unlock+0x40/0x40
[5951166.620004]  [<ffffffffb60c6691>] kthread+0xd1/0xe0
[5951166.620004]  [<ffffffffb60c65c0>] ? insert_kthread_work+0x40/0x40
[5951166.620004]  [<ffffffffb6792d24>] ret_from_fork_nospec_begin+0xe/0x21
[5951166.620004]  [<ffffffffb60c65c0>] ? insert_kthread_work+0x40/0x40
...
[5951631.057005] NMI watchdog: BUG: soft lockup - CPU#44 stuck for 22s! [migration/44:234]
[5951631.057005] Modules linked in: fuse btrfs raid6_pq xor msdos udp_diag binfmt_misc xt_owner iptable_security unix_diag tcp_diag inet_diag xt_REDIRECT nf_nat_redirect iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack xt_addrtype xt_multiport iptable_filter ext4 mbcache jbd2 nvme nvme_core joydev vfat fat crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd pcspkr hv_utils pci_hyperv ptp sg pps_core hv_balloon i2c_piix4 ip_tables xfs libcrc32c sd_mod crc_t10dif crct10dif_generic ata_generic pata_acpi hv_storvsc scsi_transport_fc hid_hyperv ata_piix hv_netvsc hyperv_keyboard scsi_tgt crct10dif_pclmul hyperv_fb crct10dif_common libata crc32c_intel hv_vmbus floppy serio_raw dm_mirror dm_region_hash dm_log dm_mod
[5951631.057005] CPU: 44 PID: 234 Comm: migration/44 Kdump: loaded Tainted: G             L ------------ T 3.10.0-1127.el7.x86_64 #1
[5951631.057005] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090007  05/18/2018
[5951631.057005] task: ffff8df549f7b150 ti: ffff8df549f8c000 task.ti: ffff8df549f8c000
[5951631.057005] RIP: 0010:[<ffffffffb613602a>]  [<ffffffffb613602a>] multi_cpu_stop+0x4a/0x110
[5951631.057005] RSP: 0000:ffff8df549f8fd98  EFLAGS: 00000246
[5951631.057005] RAX: 0000000000000001 RBX: ffffffffb60d79bc RCX: dead000000000200
[5951631.057005] RDX: ffff8e049fb160b0 RSI: 0000000000000286 RDI: ffff8db53e79bb30
[5951631.057005] RBP: ffff8df549f8fdc0 R08: ffff8db53e79bb00 R09: 0000000000000001
[5951631.057005] R10: ffff8e049fb00000 R11: 000000000000000a R12: 0000000000000000
[5951631.057005] R13: ffff8e247d5d5a34 R14: 0000000000000022 R15: ffff8df49fa9acc0
[5951631.057005] FS:  00007f8de05f1700(0000) GS:ffff8e049fb00000(0000) knlGS:0000000000000000
[5951631.057005] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[5951631.057005] CR2: 00007f56b4007400 CR3: 00000021b5afc000 CR4: 00000000003406e0
[5951631.057005] Call Trace:
[5951631.057005]  [<ffffffffb6135fe0>] ? cpu_stop_should_run+0x50/0x50
[5951631.057005]  [<ffffffffb61362e9>] cpu_stopper_thread+0x99/0x150
[5951631.057005]  [<ffffffffb6785922>] ? __schedule+0x402/0x840
[5951631.057005]  [<ffffffffb60cf0e4>] smpboot_thread_fn+0x144/0x1a0
[5951631.057005]  [<ffffffffb60cefa0>] ? lg_double_unlock+0x40/0x40
[5951631.057005]  [<ffffffffb60c6691>] kthread+0xd1/0xe0
[5951631.057005]  [<ffffffffb60c65c0>] ? insert_kthread_work+0x40/0x40
[5951631.057005]  [<ffffffffb6792d24>] ret_from_fork_nospec_begin+0xe/0x21
[5951631.057005]  [<ffffffffb60c65c0>] ? insert_kthread_work+0x40/0x40
[5951631.057005] Code: 66 90 66 90 49 89 c5 48 8b 47 18 48 85 c0 0f 84 b3 00 00 00 0f a3 18 19 db 85 db 41 0f 95 c6 45 31 ff 31 c0 0f 1f 44 00 00 f3 90 <41> 8b 5c 24 20 39 c3 74 5d 83 fb 02 74 68 83 fb 03 75 05 45 84 
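
To gauge how widespread the problem is, the two message types shown above can be counted per CPU. The following is a minimal sketch only: the regular expressions are derived from the two messages quoted in this article, and the function name `summarize` and the `SAMPLE` list are illustrative assumptions, not part of any Red Hat tool.

```python
import re
from collections import Counter

# Patterns derived from the two messages quoted in the Issue section.
# Other kernel versions may word these messages differently.
SOFT_LOCKUP = re.compile(r"BUG: soft lockup - CPU#(\d+) stuck for \d+s!")
RCU_STALL = re.compile(r"INFO: rcu_sched self-detected stall on CPU \{\s*(\d+)\s*\}")


def summarize(lines):
    """Return (lockups, stalls): per-CPU counts of each message type."""
    lockups = Counter()
    stalls = Counter()
    for line in lines:
        match = SOFT_LOCKUP.search(line)
        if match:
            lockups[int(match.group(1))] += 1
            continue
        match = RCU_STALL.search(line)
        if match:
            stalls[int(match.group(1))] += 1
    return lockups, stalls


# Two lines copied from the log above, used here as sample input.
SAMPLE = [
    "[5951166.620004] INFO: rcu_sched self-detected stall on CPU { 62}  "
    "(t=158463512 jiffies g=822375595 c=822375594 q=51744570)",
    "[5951631.057005] NMI watchdog: BUG: soft lockup - CPU#44 stuck for 22s! "
    "[migration/44:234]",
]

lockups, stalls = summarize(SAMPLE)
print("soft lockups per CPU:", dict(lockups))
print("rcu_sched stalls per CPU:", dict(stalls))
```

In practice the same function could be fed the lines of a saved kernel log or `dmesg` output instead of `SAMPLE`. Many distinct CPUs reporting lockups in `migration/N` threads, as in the trace above, would be consistent with the CPUs waiting on each other in `multi_cpu_stop`.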

Environment

  • Red Hat Enterprise Linux 7.8 (kernel-3.10.0-1127.el7)
  • RHEL guest running on MS Hyper-V hypervisor
