RHEL guest becomes unresponsive with massive soft lockups and rcu_sched CPU stalls
Issue
- A RHEL guest becomes unresponsive, and the kernel log shows a large number of soft lockups and rcu_sched CPU stalls on the migration threads, such as:
[5951166.620004] INFO: rcu_sched self-detected stall on CPU { 62} (t=158463512 jiffies g=822375595 c=822375594 q=51744570)
...
[5951166.620004] Task dump for CPU 62:
[5951166.620004] migration/62 R running task 0 326 2 0x00000008
[5951166.620004] Call Trace:
[5951166.620004] <IRQ> [<ffffffffb60dab18>] sched_show_task+0xa8/0x110
[5951166.620004] [<ffffffffb60de919>] dump_cpu_task+0x39/0x70
[5951166.620004] [<ffffffffb6158b90>] rcu_dump_cpu_stacks+0x90/0xd0
[5951166.620004] [<ffffffffb615c252>] rcu_check_callbacks+0x442/0x730
[5951166.620004] [<ffffffffb6110ae0>] ? tick_sched_do_timer+0x50/0x50
[5951166.620004] [<ffffffffb60af9d6>] update_process_times+0x46/0x80
[5951166.620004] [<ffffffffb6110850>] tick_sched_handle+0x30/0x70
[5951166.620004] [<ffffffffb6110b19>] tick_sched_timer+0x39/0x80
[5951166.620004] [<ffffffffb60caa8e>] __hrtimer_run_queues+0x10e/0x270
[5951166.620004] [<ffffffffb60cafef>] hrtimer_interrupt+0xaf/0x1d0
[5951166.620004] [<ffffffffc02ea5d4>] hv_stimer0_isr+0x24/0x40 [hv_vmbus]
[5951166.620004] [<ffffffffb679794c>] hv_stimer0_vector_handler+0x3c/0x70
[5951166.620004] [<ffffffffb679699a>] hv_stimer0_callback_vector+0x16a/0x170
[5951166.620004] <EOI> [<ffffffffb613602a>] ? multi_cpu_stop+0x4a/0x110
[5951166.620004] [<ffffffffb6135fe0>] ? cpu_stop_should_run+0x50/0x50
[5951166.620004] [<ffffffffb61362e9>] cpu_stopper_thread+0x99/0x150
[5951166.620004] [<ffffffffb6785922>] ? __schedule+0x402/0x840
[5951166.620004] [<ffffffffb60cf0e4>] smpboot_thread_fn+0x144/0x1a0
[5951166.620004] [<ffffffffb60cefa0>] ? lg_double_unlock+0x40/0x40
[5951166.620004] [<ffffffffb60c6691>] kthread+0xd1/0xe0
[5951166.620004] [<ffffffffb60c65c0>] ? insert_kthread_work+0x40/0x40
[5951166.620004] [<ffffffffb6792d24>] ret_from_fork_nospec_begin+0xe/0x21
[5951166.620004] [<ffffffffb60c65c0>] ? insert_kthread_work+0x40/0x40
...
[5951631.057005] NMI watchdog: BUG: soft lockup - CPU#44 stuck for 22s! [migration/44:234]
[5951631.057005] Modules linked in: fuse btrfs raid6_pq xor msdos udp_diag binfmt_misc xt_owner iptable_security unix_diag tcp_diag inet_diag xt_REDIRECT nf_nat_redirect iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack xt_addrtype xt_multiport iptable_filter ext4 mbcache jbd2 nvme nvme_core joydev vfat fat crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd pcspkr hv_utils pci_hyperv ptp sg pps_core hv_balloon i2c_piix4 ip_tables xfs libcrc32c sd_mod crc_t10dif crct10dif_generic ata_generic pata_acpi hv_storvsc scsi_transport_fc hid_hyperv ata_piix hv_netvsc hyperv_keyboard scsi_tgt crct10dif_pclmul hyperv_fb crct10dif_common libata crc32c_intel hv_vmbus floppy serio_raw dm_mirror dm_region_hash dm_log dm_mod
[5951631.057005] CPU: 44 PID: 234 Comm: migration/44 Kdump: loaded Tainted: G L ------------ T 3.10.0-1127.el7.x86_64 #1
[5951631.057005] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090007 05/18/2018
[5951631.057005] task: ffff8df549f7b150 ti: ffff8df549f8c000 task.ti: ffff8df549f8c000
[5951631.057005] RIP: 0010:[<ffffffffb613602a>] [<ffffffffb613602a>] multi_cpu_stop+0x4a/0x110
[5951631.057005] RSP: 0000:ffff8df549f8fd98 EFLAGS: 00000246
[5951631.057005] RAX: 0000000000000001 RBX: ffffffffb60d79bc RCX: dead000000000200
[5951631.057005] RDX: ffff8e049fb160b0 RSI: 0000000000000286 RDI: ffff8db53e79bb30
[5951631.057005] RBP: ffff8df549f8fdc0 R08: ffff8db53e79bb00 R09: 0000000000000001
[5951631.057005] R10: ffff8e049fb00000 R11: 000000000000000a R12: 0000000000000000
[5951631.057005] R13: ffff8e247d5d5a34 R14: 0000000000000022 R15: ffff8df49fa9acc0
[5951631.057005] FS: 00007f8de05f1700(0000) GS:ffff8e049fb00000(0000) knlGS:0000000000000000
[5951631.057005] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[5951631.057005] CR2: 00007f56b4007400 CR3: 00000021b5afc000 CR4: 00000000003406e0
[5951631.057005] Call Trace:
[5951631.057005] [<ffffffffb6135fe0>] ? cpu_stop_should_run+0x50/0x50
[5951631.057005] [<ffffffffb61362e9>] cpu_stopper_thread+0x99/0x150
[5951631.057005] [<ffffffffb6785922>] ? __schedule+0x402/0x840
[5951631.057005] [<ffffffffb60cf0e4>] smpboot_thread_fn+0x144/0x1a0
[5951631.057005] [<ffffffffb60cefa0>] ? lg_double_unlock+0x40/0x40
[5951631.057005] [<ffffffffb60c6691>] kthread+0xd1/0xe0
[5951631.057005] [<ffffffffb60c65c0>] ? insert_kthread_work+0x40/0x40
[5951631.057005] [<ffffffffb6792d24>] ret_from_fork_nospec_begin+0xe/0x21
[5951631.057005] [<ffffffffb60c65c0>] ? insert_kthread_work+0x40/0x40
[5951631.057005] Code: 66 90 66 90 49 89 c5 48 8b 47 18 48 85 c0 0f 84 b3 00 00 00 0f a3 18 19 db 85 db 41 0f 95 c6 45 31 ff 31 c0 0f 1f 44 00 00 f3 90 <41> 8b 5c 24 20 39 c3 74 5d 83 fb 02 74 68 83 fb 03 75 05 45 84
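To gauge how widespread the problem is, the two message types quoted above can be counted per CPU in a saved kernel log. The following is a minimal sketch, not a Red Hat tool: the function name `summarize_lockups` and both regular expressions are assumptions modelled only on the messages shown in this article, and other kernel versions may word these messages differently.

```python
import re
from collections import Counter

# Patterns modelled on the two message types quoted in this article.
# They are assumptions: other kernel versions may format these lines differently.
RCU_STALL = re.compile(r"INFO: rcu_sched self-detected stall on CPU \{\s*(\d+)\}")
SOFT_LOCKUP = re.compile(r"BUG: soft lockup - CPU#(\d+) stuck for (\d+)s!")


def summarize_lockups(lines):
    """Return (stalls, lockups): per-CPU counts of each message type."""
    stalls = Counter()
    lockups = Counter()
    for line in lines:
        match = RCU_STALL.search(line)
        if match:
            stalls[int(match.group(1))] += 1
            continue
        match = SOFT_LOCKUP.search(line)
        if match:
            lockups[int(match.group(1))] += 1
    return stalls, lockups


# Two lines taken from the log excerpt above, used as sample input.
sample = [
    "[5951166.620004] INFO: rcu_sched self-detected stall on CPU { 62} "
    "(t=158463512 jiffies g=822375595 c=822375594 q=51744570)",
    "[5951631.057005] NMI watchdog: BUG: soft lockup - CPU#44 stuck for 22s! "
    "[migration/44:234]",
]

stalls, lockups = summarize_lockups(sample)
print("rcu_sched stalls per CPU:", dict(stalls))
print("soft lockups per CPU:", dict(lockups))
```

To run it against a real log, pass an open file object (for example a saved `dmesg` output) to `summarize_lockups` instead of the `sample` list.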
Environment
- Red Hat Enterprise Linux 7.8 (kernel-3.10.0-1127.el7)
- RHEL guest running on the Microsoft Hyper-V hypervisor