[RHEL-7.2] BUG: soft lockup - CPU#11 stuck for 23s! [migration/xx:yyy]

Solution Verified

Issue

Soft lockup messages for migration threads appear in the kernel log on the server, and the CPU hangs.

  • x86_64
[  392.666057] BUG: soft lockup - CPU#31 stuck for 22s! [migration/31:420] <<<<<<<<<<<<<<<<<<
[  392.666057] Modules linked in: isofs adm1021 lm90 nouveau mxm_wmi wmi video ppdev i2c_algo_bit ttm intel_rapl parport_pc parport drm_kms_helper drm i2c_piix4 crct10dif_pclmul crct10dif_common crc32_pclmul ghash_clmulni_intel i2c_core aesni_intel lrw gf128mul pcspkr glue_helper ablk_helper serio_raw cryptd xfs libcrc32c ata_generic pata_acpi xen_netfront xen_blkfront ata_piix libata crc32c_intel floppy
[  392.666057] CPU: 31 PID: 420 Comm: migration/31 Not tainted 3.10.0-229.el7.x86_64 #1
[  392.666057] Hardware name: Xen HVM domU, BIOS 4.2.amazon 10/16/2015
[  392.666057] task: ffff880f158c6660 ti: ffff880f15928000 task.ti: ffff880f15928000
[  392.666057] RIP: 0010:[<ffffffff8107fd74>]  [<ffffffff8107fd74>] run_timer_softirq+0x1c4/0x320
[  392.666057] RSP: 0018:ffff880f20fe3eb8  EFLAGS: 00000282
[  392.666057] RAX: 0000000000000000 RBX: ffffffffffffff0c RCX: 000000000000001f
[  392.666057] RDX: 00000000fffd5f69 RSI: 00000000a06aa068 RDI: ffffffff81903088
[  392.666057] RBP: ffff880f20fe3ed0 R08: ffff880f20fe3e38 R09: 00000000000002e0
[  392.666057] R10: ffff880f20fefe4c R11: 0000000000000003 R12: ffff880f20fe3e28
[  392.666057] R13: ffffffff8161586d R14: ffff880f20fe3ed0 R15: 0000000000000001
[  392.666057] FS:  0000000000000000(0000) GS:ffff880f20fe0000(0000) knlGS:0000000000000000
[  392.666057] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  392.666057] CR2: 00007fa63d0a87b0 CR3: 000000000190a000 CR4: 00000000000406e0
[  392.666057] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  392.666057] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  392.666057] Stack:
[  392.666057]  ffff8807913e7c01 ffffffff81903088 0000000000000001 ffff880f20fe3f40
[  392.666057]  ffffffff81077b2f ffff880f1592bfd8 0000000a0420a040 00000000fffd5f6b
[  392.666057]  0000000000000001 ffff880f1592bfd8 ffff880f1592bfd8 000001000000001f
[  392.666057] Call Trace:
[  392.666057]  <IRQ>
[  392.666057]  [<ffffffff81077b2f>] __do_softirq+0xef/0x280
[  392.666057]  [<ffffffff816156dc>] call_softirq+0x1c/0x30
[  392.666057]  [<ffffffff81015d95>] do_softirq+0x65/0xa0
[  392.666057]  [<ffffffff81077ec5>] irq_exit+0x115/0x120
[  392.666057]  [<ffffffff813815a5>] xen_evtchn_do_upcall+0x35/0x50
[  392.666057]  [<ffffffff8161586d>] xen_hvm_callback_vector+0x6d/0x80
[  392.666057]  <EOI>
[  392.666057]  [<ffffffff810f26dd>] ? multi_cpu_stop+0x7d/0xf0
[  392.666057]  [<ffffffff810f2660>] ? cpu_stop_should_run+0x50/0x50
[  392.666057]  [<ffffffff810f28e8>] cpu_stopper_thread+0x88/0x160
[  392.666057]  [<ffffffff81608d48>] ? __schedule+0x2d8/0x7c0
[  392.666057]  [<ffffffff8109fc7f>] smpboot_thread_fn+0xff/0x1a0
[  392.666057]  [<ffffffff81609259>] ? schedule+0x29/0x70
[  392.666057]  [<ffffffff8109fb80>] ? lg_global_unlock+0xc0/0xc0
[  392.666057]  [<ffffffff8109726f>] kthread+0xcf/0xe0
[  392.666057]  [<ffffffff810971a0>] ? kthread_create_on_node+0x140/0x140
[  392.666057]  [<ffffffff81613cfc>] ret_from_fork+0x7c/0xb0
[  392.666057]  [<ffffffff810971a0>] ? kthread_create_on_node+0x140/0x140
[  392.666057] Code: 00 e9 2e 01 00 00 66 83 03 02 fb 66 66 90 66 66 90 48 8b 45 d0 65 48 33 04 25 28 00 00 00 0f 85 4f 01 00 00 48 83 c4 40 5b 41 5c <41> 5d 41 5e 41 5f 5d c3 0f 1f 40 00 4c 8b 25 09 8c 97 00 4d 85
  • ppc64
[103532.492560] BUG: soft lockup - CPU#11 stuck for 23s! [migration/11:322]
[ 5078.174544] Non critical power or cooling issue cleared
[103532.492578] Modules linked in:
[103532.492581]  pseries_energy fuse btrfs raid6_pq xor vfat msdos fat xfs libcrc32c bridge stp llc bonding uinput pseries_rng nx_crypto xprtrdma sunrpc ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp scsi_transport_srp scsi_tgt ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_sa ib_mad ses enclosure binfmt_misc ext4 mbcache jbd2 sd_mod crc_t10dif crct10dif_common iw_cxgb3 ib_core ib_addr ipr cxgb3 libata mdio dm_mirror dm_region_hash dm_log dm_mod
[103532.492640] CPU: 11 PID: 322 Comm: migration/11 Not tainted 3.10.0-229.el7.ppc64 #1
[103532.492643] task: c000000f6717df50 ti: c000000f674f4000 task.ti: c000000f674f4000
[103532.492646] NIP: c000000000189118 LR: c0000000001894f8 CTR: c0000000001890a0
[103532.492648] REGS: c000000f674f7820 TRAP: 0901   Not tainted  (3.10.0-229.el7.ppc64)
[103532.492650] MSR: 8000000100009032 <SF,EE,ME,IR,DR,RI>  CR: 24000088  XER: 20000000
[103532.492656] CFAR: c000000000189190 SOFTE: 1
                GPR00: c0000000001894f8 c000000f674f7aa0 c00000000130ae00 c000000d02253640
                GPR04: c000000d02253668 c000000d02253668 0000000000000000 0000000000000000
                GPR08: c000000d02253664 0000000000000001 0000000000000001 0000000000000003
                GPR12: 0000000024000028 c000000007b26300
[103532.492675] NIP [c000000000189118] .multi_cpu_stop+0x78/0x290
[103532.492678] LR [c0000000001894f8] .cpu_stopper_thread+0xd8/0x1f0
[103532.492680] Call Trace:
[103532.492684] [c000000f674f7aa0] [c000000000113080] .complete+0xb0/0x130 (unreliable)
[103532.492687] [c000000f674f7b40] [c0000000001894f8] .cpu_stopper_thread+0xd8/0x1f0
[103532.492690] [c000000f674f7c80] [c00000000010c748] .smpboot_thread_fn+0x228/0x280
[103532.492693] [c000000f674f7d30] [c0000000000fe528] .kthread+0xe8/0xf0
[103532.492697] [c000000f674f7e30] [c00000000000a464] .ret_from_kernel_thread+0x58/0x74
[103532.492699] Instruction dump:
[103532.492701] 7d29502a 7d3ef436 7bde07e0 2fbe0000 409e0128 39400000 38c00000 391f0024
[103532.492705] 60000000 60420000 7c210b78 7c421378 <813f0020> 7f895040 2b090002 419e0068
PID: 322    TASK: c000000f6717df50  CPU: 11  COMMAND: "migration/11"
 #0 [c000000f674f75f0] .crash_ipi_callback+0x104 at c00000000004fd64
 #1 [c000000f674f7680] .die+0x354 at c000000000020a54
 #2 [c000000f674f7730] .system_reset_exception+0x5c at c000000000020dec
 #3 [c000000f674f77b0] system_reset_common+0x108 at c000000000002488
 System Reset [100] exception frame:
 R0:  c0000000001894f8    R1:  c000000f674f7aa0    R2:  c00000000130ae00
 R3:  c000000d02253640    R4:  c000000d02253668    R5:  c000000d02253668
 R6:  0000000000000000    R7:  0000000000000000    R8:  c000000d02253664
 R9:  0000000000000001    R10: 0000000000000001    R11: 0000000000000003
 R12: 0000000024000028    R13: c000000007b26300    R14: c0000000000fe440
 R15: c000000f6a907880    R16: 0000000000000000    R17: 0000000000000000
 R18: 0000000000000000    R19: 0000000000000000    R20: 0000000000000000
 R21: 0000000000000000    R22: 0000000000000000    R23: 0000000000000000
 R24: 0000000000000001    R25: c000000f674f4000    R26: c000000001ac8820
 R27: 0000000000000000    R28: 0000000000000001    R29: c0000000012ba5d8
 R30: 0000000000000000    R31: c000000d02253640
 NIP: c000000000189118    MSR: 8000000100089032    OR3: 000000000000011c
 CTR: c0000000001890a0    LR:  c0000000001894f8    XER: 0000000020000000
 CCR: 0000000024000088    MQ:  0000000000000001    DAR: 0000000000000000
 DSISR: c000000f674f7a00     Syscall Result: 0000000000000000
 #4 [c000000f674f7aa0] .multi_cpu_stop+0x78 at c000000000189118
 [Link Register] [c000000f674f7aa0] .cpu_stopper_thread at c0000000001894f8
 #5 [c000000f674f7b40] .cpu_stopper_thread+0xd8 at c0000000001894f8  (unreliable)
 #6 [c000000f674f7c80] .smpboot_thread_fn+0x228 at c00000000010c748
 #7 [c000000f674f7d30] .kthread+0xe8 at c0000000000fe528
 #8 [c000000f674f7e30] .ret_from_kernel_thread+0x58 at c00000000000a464
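
To check whether a saved kernel log shows the same symptom, the soft lockup lines can be pulled out with a small script. The following is a minimal sketch, not part of the original solution: the names SOFT_LOCKUP_RE, SoftLockup and find_soft_lockups are hypothetical, and the pattern assumes dmesg-style lines with a bracketed timestamp, as in the logs above.

```python
import re
from typing import List, NamedTuple

# Assumed line format (taken from the logs in this article):
# [  392.666057] BUG: soft lockup - CPU#31 stuck for 22s! [migration/31:420]
SOFT_LOCKUP_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+BUG: soft lockup - CPU#(?P<cpu>\d+) "
    r"stuck for (?P<secs>\d+)s! \[(?P<comm>[^\]]+):(?P<pid>\d+)\]"
)


class SoftLockup(NamedTuple):
    """One soft lockup report extracted from a kernel log (hypothetical type)."""
    timestamp: float  # seconds since boot, from the bracketed prefix
    cpu: int          # CPU number reported as stuck
    seconds: int      # how long the CPU was reported stuck
    comm: str         # task name, e.g. "migration/31"
    pid: int          # PID of that task


def find_soft_lockups(log_text: str) -> List[SoftLockup]:
    """Return every soft lockup report found in log_text, in log order."""
    return [
        SoftLockup(
            timestamp=float(m.group("ts")),
            cpu=int(m.group("cpu")),
            seconds=int(m.group("secs")),
            comm=m.group("comm"),
            pid=int(m.group("pid")),
        )
        for m in SOFT_LOCKUP_RE.finditer(log_text)
    ]
```

Passing the text of a captured dmesg output to find_soft_lockups() and keeping only the entries whose comm starts with "migration/" gives a quick list of the affected CPUs. Logs written through syslog may carry a different prefix, in which case the pattern would need adjusting.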

Environment

  • Red Hat Enterprise Linux 7.2
  • kernel-3.10.0-327.el7
