Kernel hangs with multiple soft lockups in multi_cpu_stop() on per-CPU migration threads; possible spinlock deadlock or contention in the Veritas vxio module
Issue
- The kernel hangs, and multiple soft lockups are reported in multi_cpu_stop() on the per-CPU migration threads. A possible cause is a spinlock deadlock or contention in the vxio module from Veritas. The kernel log shows:
[ 3252.147006] NMI watchdog: BUG: soft lockup - CPU#18 stuck for 22s! [migration/18:100]
[ 3252.147006] Modules linked in: nfnetlink_queue nfnetlink_log nfnetlink bluetooth rfkill nfs fscache vxfen(POE) vxglm(POE) vxodm(POE) gab(POE) nf_conntrack_ipv4 nf_defrag_ipv4 xt_owner iptable_security xt_conntrack nf_conntrack dmpjbod(POE) dmpap(POE) dmpaa(POE) vxspec(POE) vxio(POE) vxdmp(POE) llt(POE) rdma_cm iw_cm ib_cm ib_core amf(POE) gsch(OE) redirfs(OE) tcp_diag udp_diag inet_diag unix_diag af_packet_diag netlink_diag vxcafs(POE) vxportal(POE) fdd(POE) vxfs(POE) veki(POE) ext4 mbcache jbd2 mlx4_en mlx4_core devlink joydev sb_edac iosf_mbi kvm_intel kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd pcspkr i2c_piix4 hv_utils hv_balloon ptp sg pps_core pci_hyperv binfmt_misc nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c
[ 3252.147006] sd_mod crc_t10dif crct10dif_generic ata_generic pata_acpi hv_storvsc scsi_transport_fc hv_netvsc hyperv_keyboard hid_hyperv scsi_tgt ata_piix hyperv_fb crct10dif_pclmul crct10dif_common libata crc32c_intel hv_vmbus floppy serio_raw dm_mirror dm_region_hash dm_log dm_mod
[ 3252.147006] CPU: 18 PID: 100 Comm: migration/18 Kdump: loaded Tainted: P W OEL ------------ 3.10.0-1127.13.1.el7.x86_64 #1
[ 3252.147006] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090008 12/07/2018
[ 3252.147006] task: ffff8d363ba9a0e0 ti: ffff8d363baa8000 task.ti: ffff8d363baa8000
[ 3252.147006] RIP: 0010:[<ffffffff9f735fef>] [<ffffffff9f735fef>] multi_cpu_stop+0x4f/0x110
[ 3252.147006] RSP: 0000:ffff8d363baabd98 EFLAGS: 00000246
[ 3252.147006] RAX: 0000000000000001 RBX: ffff8d363baabdc0 RCX: dead000000000200
[ 3252.147006] RDX: ffff8d555f2960b0 RSI: 0000000000000286 RDI: ffff8d34911e3b30
[ 3252.147006] RBP: ffff8d363baabdc0 R08: ffff8d34911e3b00 R09: 0000000000000001
[ 3252.147006] R10: ffff8d555f280000 R11: ffff8d355d5fa438 R12: ffff8d363ba9b150
[ 3252.147006] R13: ffff8d555f2dacc0 R14: ffff8d555f29acc0 R15: ffff8d363b04d800
[ 3252.147006] FS: 0000000000000000(0000) GS:ffff8d555f280000(0000) knlGS:0000000000000000
[ 3252.147006] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 3252.147006] CR2: 00007f2b823582d4 CR3: 0000002036030000 CR4: 00000000003606e0
[ 3252.147006] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 3252.147006] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 3252.147006] Call Trace:
[ 3252.147006] [<ffffffff9f735fa0>] ? cpu_stop_should_run+0x50/0x50
[ 3252.147006] [<ffffffff9f7362a9>] cpu_stopper_thread+0x99/0x150
[ 3252.147006] [<ffffffff9fd85942>] ? __schedule+0x402/0x840
[ 3252.147006] [<ffffffff9f6cf0e4>] smpboot_thread_fn+0x144/0x1a0
[ 3252.147006] [<ffffffff9f6cefa0>] ? lg_double_unlock+0x40/0x40
[ 3252.147006] [<ffffffff9f6c6691>] kthread+0xd1/0xe0
[ 3252.147006] [<ffffffff9f6c65c0>] ? insert_kthread_work+0x40/0x40
[ 3252.147006] [<ffffffff9fd92d37>] ret_from_fork_nospec_begin+0x21/0x21
[ 3252.147006] [<ffffffff9f6c65c0>] ? insert_kthread_work+0x40/0x40
[ 3252.147006] Code: 89 c5 48 8b 47 18 48 85 c0 0f 84 b3 00 00 00 0f a3 18 19 db 85 db 41 0f 95 c6 45 31 ff 31 c0 0f 1f 44 00 00 f3 90 41 8b 5c 24 20 <39> c3 74 5d 83 fb 02 74 68 83 fb 03 75 05 45 84 f6 75 6e f0 41
[ 3252.641871] LLT INFO V-14-1-10541 llt_send_hb: timer not called for 139 secs (139752 ticks). Send out of context hbs to peers from llt_udp_recv. 40 secs more to go
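When many of these reports appear, it can help to tally them per CPU and per task. The following is a minimal sketch, assuming the watchdog line has the same format as in the log above. The names `SOFT_LOCKUP_RE` and `summarize_soft_lockups` are hypothetical helpers introduced for illustration and are not part of any Red Hat or Veritas tool.

```python
import re
from collections import Counter

# Assumption: the watchdog line looks like the one in the log above, e.g.
# "[ 3252.147006] NMI watchdog: BUG: soft lockup - CPU#18 stuck for 22s! [migration/18:100]"
SOFT_LOCKUP_RE = re.compile(
    r"NMI watchdog: BUG: soft lockup - "
    r"CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
    r"\[(?P<comm>[^:\]]+):(?P<pid>\d+)\]"
)


def summarize_soft_lockups(log_text):
    """Count soft lockup reports, keyed by (CPU number, task name)."""
    counts = Counter()
    for line in log_text.splitlines():
        match = SOFT_LOCKUP_RE.search(line)
        if match:
            counts[(int(match.group("cpu")), match.group("comm"))] += 1
    return counts


if __name__ == "__main__":
    sample = (
        "[ 3252.147006] NMI watchdog: BUG: soft lockup - "
        "CPU#18 stuck for 22s! [migration/18:100]\n"
        "[ 3252.147006] Call Trace:\n"
    )
    for (cpu, comm), n in sorted(summarize_soft_lockups(sample).items()):
        print(f"CPU {cpu} ({comm}): {n} report(s)")
```

If the tally shows most CPUs stuck in their migration/N threads, the stop-machine operation is probably waiting on one CPU that has not reached multi_cpu_stop(). That CPU's backtrace is usually the one to examine.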
Environment
- Red Hat Enterprise Linux 7.8.x (kernel-3.10.0-1127.13.1.el7)
- MS Hyper-V guest