Soft lockups and hard LOCKUPs occurred on the server and then the ppc64le kernel crashed

Solution Unverified - Updated -

Issue

  • Soft lockups and hard LOCKUPs occurred on the server, and then the ppc64le kernel crashed. The kernel ring buffer (dmesg) is flooded with printk() output from the lockup messages and stack dumps:
crash> log | grep -e soft\ lockup -e hard\ LOCKUP
[15975.170716] watchdog: CPU 57 detected hard LOCKUP on other CPUs 4-7,52-55,64-75,81-83,95,108-110
[16010.788162] watchdog: BUG: soft lockup - CPU#61 stuck for 37s! [swapper/61:0]
[16010.788167] watchdog: BUG: soft lockup - CPU#58 stuck for 37s! [swapper/58:0]
[16010.788170] watchdog: BUG: soft lockup - CPU#56 stuck for 37s! [swapper/56:0]
[16010.788179] watchdog: BUG: soft lockup - CPU#130 stuck for 37s! [swapper/130:0]
[16010.788184] watchdog: BUG: soft lockup - CPU#159 stuck for 37s! [swapper/159:0]
[16010.788188] watchdog: BUG: soft lockup - CPU#157 stuck for 37s! [package-server:30143]
[16010.788196] watchdog: BUG: soft lockup - CPU#156 stuck for 37s! [swapper/156:0]
[16010.788207] watchdog: BUG: soft lockup - CPU#107 stuck for 37s! [swapper/107:0]
[16010.788212] watchdog: BUG: soft lockup - CPU#104 stuck for 37s! [swapper/104:0]
[16010.788216] watchdog: BUG: soft lockup - CPU#59 stuck for 37s! [swapper/59:0]
[16010.788233] watchdog: BUG: soft lockup - CPU#150 stuck for 37s! [swapper/150:0]
[16010.788238] watchdog: BUG: soft lockup - CPU#154 stuck for 37s! [swapper/154:0]
[16010.788249] watchdog: BUG: soft lockup - CPU#151 stuck for 37s! [swapper/151:0]
[16010.788269] watchdog: BUG: soft lockup - CPU#105 stuck for 37s! [swapper/105:0]
[16010.788280] watchdog: BUG: soft lockup - CPU#155 stuck for 37s! [swapper/155:0]
[16010.788299] watchdog: BUG: soft lockup - CPU#106 stuck for 37s! [swapper/106:0]
[16010.788321] watchdog: BUG: soft lockup - CPU#134 stuck for 37s! [swapper/134:0]
[16010.788325] watchdog: BUG: soft lockup - CPU#131 stuck for 37s! [swapper/131:0]
[16010.788335] watchdog: BUG: soft lockup - CPU#135 stuck for 37s! [swapper/135:0]
[16010.788339] watchdog: BUG: soft lockup - CPU#132 stuck for 37s! [swapper/132:0]
[16010.788379] watchdog: BUG: soft lockup - CPU#153 stuck for 37s! [swapper/153:0]
[16010.788383] watchdog: BUG: soft lockup - CPU#129 stuck for 37s! [swapper/129:0]
[16010.788405] watchdog: BUG: soft lockup - CPU#60 stuck for 37s! [swapper/60:0]
[16010.788425] watchdog: BUG: soft lockup - CPU#152 stuck for 37s! [swapper/152:0]
[16010.788442] watchdog: BUG: soft lockup - CPU#149 stuck for 37s! [swapper/149:0]
[16010.788472] watchdog: BUG: soft lockup - CPU#62 stuck for 37s! [swapper/62:0]
[16010.788496] watchdog: BUG: soft lockup - CPU#148 stuck for 37s! [swapper/148:0]
[16010.788609] watchdog: BUG: soft lockup - CPU#57 stuck for 37s! [coredns:31186]
[16010.788709] watchdog: BUG: soft lockup - CPU#128 stuck for 37s! [swapper/128:0]
[16010.788994] watchdog: BUG: soft lockup - CPU#63 stuck for 37s! [swapper/63:0]
[16010.790164] watchdog: BUG: soft lockup - CPU#133 stuck for 37s! [swapper/133:0]
[16011.251947] watchdog: CPU 158 detected hard LOCKUP on other CPUs 0-3,8-51,56-63,76-80,84-94,96-107,111-157,159-175

    ...
[16185.145146] kexec: waiting for cpu 169 (physical 2137) to enter OPAL
[16186.164574] kexec: timed out waiting for cpu 169 (physical 2137) to enter OPAL
[16186.164594] kexec: waiting for cpu 170 (physical 2138) to enter OPAL
[16187.184040] kexec: timed out waiting for cpu 170 (physical 2138) to enter OPAL
[16187.184060] kexec: waiting for cpu 171 (physical 2139) to enter OPAL
[16188.203531] kexec: timed out waiting for cpu 171 (physical 2139) to enter OPAL
[16188.203551] kexec: waiting for cpu 172 (physical 2140) to enter OPAL
[16189.223043] kexec: timed out waiting for cpu 172 (physical 2140) to enter OPAL
[16189.223063] kexec: waiting for cpu 173 (physical 2141) to enter OPAL
[16190.242575] kexec: timed out waiting for cpu 173 (physical 2141) to enter OPAL
[16190.242595] kexec: waiting for cpu 174 (physical 2142) to enter OPAL
[16191.262127] kexec: timed out waiting for cpu 174 (physical 2142) to enter OPAL
[16191.262147] kexec: waiting for cpu 175 (physical 2143) to enter OPAL
[16192.281701] kexec: timed out waiting for cpu 175 (physical 2143) to enter OPAL
[16193.411345] kexec: Starting switchover sequence.
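
Because the ring buffer is flooded, it can help to condense the watchdog and kexec messages into a per-CPU summary. The following is a minimal sketch, not part of the original analysis: it assumes the log has been saved as text, the regular expressions are modeled only on the message formats shown in the excerpt above, and the function names (expand_cpu_list, summarize_lockups) are hypothetical helpers introduced for illustration.

```python
import re

# Patterns modeled on the message formats in the excerpt above (an assumption;
# other kernel versions may word these messages differently).
HARD_RE = re.compile(
    r"watchdog: CPU (\d+) detected hard LOCKUP on other CPUs ([\d,\-]+)")
SOFT_RE = re.compile(
    r"watchdog: BUG: soft lockup - CPU#(\d+) stuck for (\d+)s! \[(.+?):(\d+)\]")
KEXEC_RE = re.compile(r"kexec: timed out waiting for cpu (\d+) ")


def expand_cpu_list(spec):
    """Expand a CPU list such as '4-7,52' into a set of integers."""
    cpus = set()
    for part in spec.split(","):
        lo, _, hi = part.partition("-")
        # A single CPU ('95') has no upper bound, so reuse the lower one.
        cpus.update(range(int(lo), int(hi or lo) + 1))
    return cpus


def summarize_lockups(lines):
    """Collect CPUs reported as hard-locked, soft-locked, or timed out in kexec."""
    hard, soft, kexec_timeouts = set(), {}, set()
    for line in lines:
        m = HARD_RE.search(line)
        if m:
            hard |= expand_cpu_list(m.group(2))
            continue
        m = SOFT_RE.search(line)
        if m:
            # Map the stuck CPU to the task name that was running on it.
            soft[int(m.group(1))] = m.group(3)
            continue
        m = KEXEC_RE.search(line)
        if m:
            kexec_timeouts.add(int(m.group(1)))
    return {"hard": hard, "soft": soft, "kexec_timeouts": kexec_timeouts}
```

Passing the saved log lines to summarize_lockups() returns the set of CPUs reported in hard LOCKUP messages, a mapping from each soft-locked CPU to the task it was running, and the CPUs that kexec timed out waiting for. In the excerpt above, most soft-locked CPUs were running swapper (the idle task).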

Environment

  • Red Hat Enterprise Linux for Power, little endian 8.4.z
    • kernel-4.18.0-305.72.1.el8_4.ppc64le
