ppc64le kernel hangs with soft lockups and rcu_sched CPU stalls. One CPU is stuck waiting on the rq.lock spinlock of another CPU, but the spinlock is not locked

Solution Unverified

Issue

  • The ppc64le kernel hangs with soft lockups and rcu_sched CPU stalls. One CPU is stuck waiting on the rq.lock spinlock of another CPU, but the spinlock is not locked. The kernel log and the backtrace of the waiting task are shown below:
[3849417.502681] watchdog: BUG: soft lockup - CPU#19 stuck for 23s! [migration/19:127]
[3849417.502702] Modules linked in: nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nf_tables_set rpadlpar_io rpaphp mptcp_diag xsk_diag tcp_diag udp_diag raw_diag inet_diag unix_diag af_packet_diag netlink_diag nfsv3 nfs_acl nfs lockd grace fscache bonding nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables libcrc32c nfnetlink sunrpc pseries_rng xts vmx_crypto binfmt_misc ext4 mbcache jbd2 dm_service_time sr_mod cdrom sd_mod t10_pi sg ibmvfc ibmveth scsi_transport_fc ibmvscsi scsi_transport_srp dm_multipath dm_mirror dm_region_hash dm_log dm_mod fuse [last unloaded: nft_reject]
[3849417.502754] CPU: 19 PID: 127 Comm: migration/19 Kdump: loaded Not tainted 4.18.0-305.65.1.el8_4.ppc64le #1
[3849417.502757] NIP:  c0000000002b25bc LR: c0000000002b2680 CTR: c0000000002b2520
[3849417.502761] REGS: c0000017fcb6b990 TRAP: 0901   Not tainted  (4.18.0-305.65.1.el8_4.ppc64le)
[3849417.502762] MSR:  800000000280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE>  CR: 44002422  XER: 20040000
[3849417.502768] CFAR: c0000000002b2684 IRQMASK: 0 
                 GPR00: c0000000002b2680 c0000017fcb6bc20 c000000001c11500 0000000000000100 
                 GPR04: c000001fc7ef38e8 c000001fc7ef38e8 0000000000000000 c0000017feffad58 
                 GPR08: 0000000000000004 c0000017fcbfc000 00000000056083d2 0000000000000001 
                 GPR12: 0000000000000000 c0000017ffffb700 
[3849417.502784] NIP [c0000000002b25bc] multi_cpu_stop+0x9c/0x220
[3849417.502787] LR [c0000000002b2680] multi_cpu_stop+0x160/0x220
[3849417.502789] Call Trace:
[3849417.502792] [c0000017fcb6bc20] [c0000017fcb6bc90] 0xc0000017fcb6bc90 (unreliable)
[3849417.502795] [c0000017fcb6bc90] [c0000000002b28ec] cpu_stopper_thread+0x14c/0x240
[3849417.502798] [c0000017fcb6bd40] [c0000000001ab5f8] smpboot_thread_fn+0x1e8/0x2a0
[3849417.502802] [c0000017fcb6bdb0] [c0000000001a3520] kthread+0x1b0/0x1c0
[3849417.502806] [c0000017fcb6be20] [c00000000000b7d8] ret_from_kernel_thread+0x5c/0x64
    ...
[3849429.512727] watchdog: BUG: soft lockup - CPU#48 stuck for 22s! [lparstat:103399]
[3849429.512749] Modules linked in: nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nf_tables_set rpadlpar_io rpaphp mptcp_diag xsk_diag tcp_diag udp_diag raw_diag inet_diag unix_diag af_packet_diag netlink_diag nfsv3 nfs_acl nfs lockd grace fscache bonding nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables libcrc32c nfnetlink sunrpc pseries_rng xts vmx_crypto binfmt_misc ext4 mbcache jbd2 dm_service_time sr_mod cdrom sd_mod t10_pi sg ibmvfc ibmveth scsi_transport_fc ibmvscsi scsi_transport_srp dm_multipath dm_mirror dm_region_hash dm_log dm_mod fuse [last unloaded: nft_reject]
[3849429.512809] CPU: 48 PID: 103399 Comm: lparstat Kdump: loaded Tainted: G             L   --------- -  - 4.18.0-305.65.1.el8_4.ppc64le #1
[3849429.512813] NIP:  c000000000273804 LR: c000000000273824 CTR: c0000000000bf1b0
[3849429.512816] REGS: c0000016846af810 TRAP: 0901   Tainted: G             L   --------- -  -  (4.18.0-305.65.1.el8_4.ppc64le)
[3849429.512819] MSR:  8000000000009033 <SF,EE,ME,IR,DR,RI,LE>  CR: 44024244  XER: 00000000
[3849429.512823] CFAR: c000000000273810 IRQMASK: 0 
                 GPR00: c000000000273824 c0000016846afaa0 c000000001c11500 0000000000000008 
                 GPR04: 0000000000000008 0000000000000008 0000000000000037 c000000001c42e00 
                 GPR08: 0000000000000008 0000000000000001 c000000ffedeee00 c000000001c47058 
                 GPR12: c0000000000bf1b0 c0000017ffff5300 
[3849429.512840] NIP [c000000000273804] smp_call_function_many_cond+0x464/0x4e0
[3849429.512842] LR [c000000000273824] smp_call_function_many_cond+0x484/0x4e0
[3849429.512844] Call Trace:
[3849429.512847] [c0000016846afaa0] [c000000000273824] smp_call_function_many_cond+0x484/0x4e0 (unreliable)
[3849429.512849] [c0000016846afb30] [c000000000273958] on_each_cpu+0x58/0xb0
[3849429.512853] [c0000016846afb70] [c00000000011d904] pseries_lparcfg_data.isra.0+0xa74/0x1040
[3849429.512857] [c0000016846afcf0] [c00000000059390c] seq_read+0x1cc/0x720
[3849429.512860] [c0000016846afd90] [c00000000062a3e0] proc_reg_read+0x90/0x1a0
[3849429.512863] [c0000016846afdc0] [c000000000545ef8] sys_read+0x118/0x320
[3849429.512867] [c0000016846afe20] [c00000000000b408] system_call+0x5c/0x70
    ...
PID: 770004   TASK: c000000c25b0d800  CPU: 8    COMMAND: "GC Thread#49"
 R0:  0000000044022482    R1:  c000000be2183340    R2:  c000000001c11500   
 R3:  0000000000000000    R4:  0000000000000038    R5:  0000000027b2cd29   
 R6:  000000000000003f    R7:  c000000001c46e00    R8:  00000000000001c0   
 R9:  c000001fffff2700    R10: 0000000080000038    R11: 0000002000000014   
 R12: 0000000000000000    R13: c000000ffffff300    R14: c000000001c42e00   
 R15: 00000000000001f8    R16: 000000000000003f    R17: c000000be21836f0   
 R18: 0000000000000002    R19: 0000000000000003    R20: 0000000000000004   
 R21: c000000c18e5d000    R22: 0000000000000001    R23: 0000000000000005   
 R24: c000000001dfdb80    R25: c000000fa2308e00    R26: c000000001c47514   
 R27: c000000fa2308e30    R28: c000000001c42e00    R29: 0000000000000001   
 R30: 0000000000000001    R31: c000001ffda69d80   
 NIP: c000000000104ea8    MSR: 8000000000181033    OR3: 000000000000011c
 CTR: 0000000000000000    LR:  c0000000000b5684    XER: 0000000000000000
 CCR: 0000000024022482    MQ:  0000000000000001    DAR: 000000000000000c
 DSISR: 0000000000000000     Syscall Result: 0000000000000000
 [NIP  : plpar_hcall_norets+28]
 [LR   : __spin_yield+148]
 #0 [c000000be2183340] (null) at c000000be2183370  (unreliable)
 #1 [c000000be21833a0] _raw_spin_lock_irqsave at c000000000ee5708
 #2 [c000000be21833e0] update_blocked_averages at c0000000001c6f80
 #3 [c000000be2183480] find_busiest_group at c0000000001db070
 #4 [c000000be2183660] load_balance at c0000000001db7ec
 #5 [c000000be21837f0] newidle_balance at c0000000001dd3b0
 #6 [c000000be21838b0] pick_next_task_fair at c0000000001dd85c
 #7 [c000000be2183960] __schedule at c000000000edd984
 #8 [c000000be2183a30] schedule at c000000000ede2a8
 #9 [c000000be2183a60] futex_wait_queue_me at c00000000026d3b8
#10 [c000000be2183ab0] futex_wait at c00000000026da08
#11 [c000000be2183c00] do_futex at c000000000271350
#12 [c000000be2183d90] sys_futex at c000000000272374
#13 [c000000be2183e20] system_call at c00000000000b408
 System Call [c00] exception frame:
 R0:  00000000000000dd    R1:  00007fff2a38e140    R2:  00007fff9a937f00   
 R3:  000000014b79c078    R4:  0000000000000080    R5:  0000000000000000   
 R6:  0000000000000000    R7:  00007fff2a38f278    R8:  0000000000000002   
 R9:  0000000000000000    R10: 0000000000000000    R11: 0000000000000000   
 R12: 0000000000000000    R13: 00007fff2a3968e0    R14: 0000000000000000   
 R15: 0000000000000001    R16: 0000000000000000    R17: 000000014b79c078   
 R18: 0000000000000000    R19: 00007fff2a38e198    R20: 00007fff9a9014b0   
 R21: 0000000000000080    R22: 0000000000107f3e    R23: 0000000000000000   
 R24: 000000000020fe7c    R25: 000000014b79c028    R26: 00007fff2a38e178   
 R27: 0000000000000000    R28: 0000000000000002    R29: 000000014b79c078   
 R30: 000000014b79c050    R31: 000000014b79c060   
 NIP: 00007fff9a90174c    MSR: 800000000280f033    OR3: 000000014b79c078
 CTR: 0000000000000000    LR:  00007fff9a90172c    XER: 0000000000000000
 CCR: 0000000044024888    MQ:  0000000000000000    DAR: 00007fff3d94db30
 DSISR: 0000000008000000     Syscall Result: 0000000000000000
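
When a log contains many such reports, it can help to list which CPUs and tasks the watchdog flagged before reading the individual traces. The following is a minimal sketch, not a Red Hat tool: the function name `find_soft_lockups` and the regular expression are assumptions written only against the message format visible in the two watchdog lines above, and other kernel versions may format the line differently.

```python
import re

# Pattern modelled on the watchdog lines in this article, e.g.
# [3849417.502681] watchdog: BUG: soft lockup - CPU#19 stuck for 23s! [migration/19:127]
SOFT_LOCKUP = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+watchdog: BUG: soft lockup - "
    r"CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
    r"\[(?P<comm>.+):(?P<pid>\d+)\]"
)


def find_soft_lockups(log_text):
    """Return one dict per soft lockup report found in log_text.

    Hypothetical helper for illustration; it only recognises the
    message format shown in this article.
    """
    reports = []
    for line in log_text.splitlines():
        match = SOFT_LOCKUP.search(line)
        if match:
            reports.append({
                "timestamp": float(match["ts"]),
                "cpu": int(match["cpu"]),
                "stuck_seconds": int(match["secs"]),
                "comm": match["comm"],
                "pid": int(match["pid"]),
            })
    return reports


# The two watchdog lines quoted in the Issue section.
sample = """\
[3849417.502681] watchdog: BUG: soft lockup - CPU#19 stuck for 23s! [migration/19:127]
[3849429.512727] watchdog: BUG: soft lockup - CPU#48 stuck for 22s! [lparstat:103399]
"""

reports = find_soft_lockups(sample)
for report in reports:
    print(report["cpu"], report["comm"], report["pid"], report["stuck_seconds"])
```

In practice the input would be the text of a saved kernel log rather than the inline `sample` string. The summary only narrows down which CPUs to inspect; it does not identify the lock holder.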

Environment

  • Red Hat Enterprise Linux 8.4.z for Power, Little Endian
