Soft lockups and hard LOCKUPs occurred on the server and then the ppc64le kernel crashed
Issue
- Soft lockups and hard LOCKUPs were reported on many CPUs of the server, and then the ppc64le kernel crashed. The kernel ring buffer (dmesg) is flooded with printk() output from the lockup messages and stack dumps:
crash> log | grep -e soft\ lockup -e hard\ LOCKUP
[15975.170716] watchdog: CPU 57 detected hard LOCKUP on other CPUs 4-7,52-55,64-75,81-83,95,108-110
[16010.788162] watchdog: BUG: soft lockup - CPU#61 stuck for 37s! [swapper/61:0]
[16010.788167] watchdog: BUG: soft lockup - CPU#58 stuck for 37s! [swapper/58:0]
[16010.788170] watchdog: BUG: soft lockup - CPU#56 stuck for 37s! [swapper/56:0]
[16010.788179] watchdog: BUG: soft lockup - CPU#130 stuck for 37s! [swapper/130:0]
[16010.788184] watchdog: BUG: soft lockup - CPU#159 stuck for 37s! [swapper/159:0]
[16010.788188] watchdog: BUG: soft lockup - CPU#157 stuck for 37s! [package-server:30143]
[16010.788196] watchdog: BUG: soft lockup - CPU#156 stuck for 37s! [swapper/156:0]
[16010.788207] watchdog: BUG: soft lockup - CPU#107 stuck for 37s! [swapper/107:0]
[16010.788212] watchdog: BUG: soft lockup - CPU#104 stuck for 37s! [swapper/104:0]
[16010.788216] watchdog: BUG: soft lockup - CPU#59 stuck for 37s! [swapper/59:0]
[16010.788233] watchdog: BUG: soft lockup - CPU#150 stuck for 37s! [swapper/150:0]
[16010.788238] watchdog: BUG: soft lockup - CPU#154 stuck for 37s! [swapper/154:0]
[16010.788249] watchdog: BUG: soft lockup - CPU#151 stuck for 37s! [swapper/151:0]
[16010.788269] watchdog: BUG: soft lockup - CPU#105 stuck for 37s! [swapper/105:0]
[16010.788280] watchdog: BUG: soft lockup - CPU#155 stuck for 37s! [swapper/155:0]
[16010.788299] watchdog: BUG: soft lockup - CPU#106 stuck for 37s! [swapper/106:0]
[16010.788321] watchdog: BUG: soft lockup - CPU#134 stuck for 37s! [swapper/134:0]
[16010.788325] watchdog: BUG: soft lockup - CPU#131 stuck for 37s! [swapper/131:0]
[16010.788335] watchdog: BUG: soft lockup - CPU#135 stuck for 37s! [swapper/135:0]
[16010.788339] watchdog: BUG: soft lockup - CPU#132 stuck for 37s! [swapper/132:0]
[16010.788379] watchdog: BUG: soft lockup - CPU#153 stuck for 37s! [swapper/153:0]
[16010.788383] watchdog: BUG: soft lockup - CPU#129 stuck for 37s! [swapper/129:0]
[16010.788405] watchdog: BUG: soft lockup - CPU#60 stuck for 37s! [swapper/60:0]
[16010.788425] watchdog: BUG: soft lockup - CPU#152 stuck for 37s! [swapper/152:0]
[16010.788442] watchdog: BUG: soft lockup - CPU#149 stuck for 37s! [swapper/149:0]
[16010.788472] watchdog: BUG: soft lockup - CPU#62 stuck for 37s! [swapper/62:0]
[16010.788496] watchdog: BUG: soft lockup - CPU#148 stuck for 37s! [swapper/148:0]
[16010.788609] watchdog: BUG: soft lockup - CPU#57 stuck for 37s! [coredns:31186]
[16010.788709] watchdog: BUG: soft lockup - CPU#128 stuck for 37s! [swapper/128:0]
[16010.788994] watchdog: BUG: soft lockup - CPU#63 stuck for 37s! [swapper/63:0]
[16010.790164] watchdog: BUG: soft lockup - CPU#133 stuck for 37s! [swapper/133:0]
[16011.251947] watchdog: CPU 158 detected hard LOCKUP on other CPUs 0-3,8-51,56-63,76-80,84-94,96-107,111-157,159-175
...
[16185.145146] kexec: waiting for cpu 169 (physical 2137) to enter OPAL
[16186.164574] kexec: timed out waiting for cpu 169 (physical 2137) to enter OPAL
[16186.164594] kexec: waiting for cpu 170 (physical 2138) to enter OPAL
[16187.184040] kexec: timed out waiting for cpu 170 (physical 2138) to enter OPAL
[16187.184060] kexec: waiting for cpu 171 (physical 2139) to enter OPAL
[16188.203531] kexec: timed out waiting for cpu 171 (physical 2139) to enter OPAL
[16188.203551] kexec: waiting for cpu 172 (physical 2140) to enter OPAL
[16189.223043] kexec: timed out waiting for cpu 172 (physical 2140) to enter OPAL
[16189.223063] kexec: waiting for cpu 173 (physical 2141) to enter OPAL
[16190.242575] kexec: timed out waiting for cpu 173 (physical 2141) to enter OPAL
[16190.242595] kexec: waiting for cpu 174 (physical 2142) to enter OPAL
[16191.262127] kexec: timed out waiting for cpu 174 (physical 2142) to enter OPAL
[16191.262147] kexec: waiting for cpu 175 (physical 2143) to enter OPAL
[16192.281701] kexec: timed out waiting for cpu 175 (physical 2143) to enter OPAL
[16193.411345] kexec: Starting switchover sequence.
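Because the ring buffer is this noisy, it can help to condense the watchdog messages into a per-CPU summary. The following is a minimal Python sketch, not part of the original analysis. It assumes the kernel log has been saved as plain text and that the messages follow exactly the two watchdog formats shown in the excerpt above. The names `summarize_lockups` and `expand_cpu_list` are hypothetical helpers introduced only for illustration.

```python
import re

# Patterns follow the two watchdog message formats seen in the excerpt above.
SOFT_RE = re.compile(
    r"watchdog: BUG: soft lockup - CPU#(\d+) stuck for (\d+)s! \[(.+):(\d+)\]"
)
HARD_RE = re.compile(
    r"watchdog: CPU (\d+) detected hard LOCKUP on other CPUs ([\d,\-]+)"
)


def expand_cpu_list(spec):
    """Expand a CPU list such as '4-7,52' into a sorted list of CPU numbers."""
    cpus = set()
    for part in spec.split(","):
        if "-" in part:
            first, last = part.split("-")
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def summarize_lockups(log_text):
    """Return (soft, hard) summaries of the lockup messages in log_text.

    soft maps each soft-locked CPU number to the task name it was running.
    hard is a list of (reporting CPU, list of CPUs reported as hard locked).
    """
    soft = {}
    hard = []
    for line in log_text.splitlines():
        match = SOFT_RE.search(line)
        if match:
            soft[int(match.group(1))] = match.group(3)
            continue
        match = HARD_RE.search(line)
        if match:
            hard.append((int(match.group(1)), expand_cpu_list(match.group(2))))
    return soft, hard


# Three lines copied from the log excerpt above, used as sample input.
SAMPLE = """\
[15975.170716] watchdog: CPU 57 detected hard LOCKUP on other CPUs 4-7,52-55,64-75,81-83,95,108-110
[16010.788162] watchdog: BUG: soft lockup - CPU#61 stuck for 37s! [swapper/61:0]
[16010.788188] watchdog: BUG: soft lockup - CPU#157 stuck for 37s! [package-server:30143]
"""

soft, hard = summarize_lockups(SAMPLE)
for cpu, task in sorted(soft.items()):
    print(f"soft lockup: CPU {cpu} running {task}")
for reporter, locked in hard:
    print(f"hard LOCKUP: CPU {reporter} reported {len(locked)} CPUs")
```

Separating the CPUs that report a hard LOCKUP from the CPUs they report as stuck makes it easier to see which CPUs stopped responding first. The sketch only summarizes the messages and does not identify a root cause.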
Environment
- Red Hat Enterprise Linux for Power, little endian 8.4.z
- kernel-4.18.0-305.72.1.el8_4.ppc64le