A hard LOCKUP occurs on a CPU that is stuck waiting on the rq.lock spinlock, which is held by another CPU that is in the middle of a context switch

Solution Unverified

Issue

  • A hard LOCKUP occurs on a CPU that is stuck waiting on the rq.lock spinlock. The spinlock is held by another CPU that is in the middle of a context switch. The NMI watchdog reports the following before the kernel panics:
[183211.656894] NMI watchdog: Watchdog detected hard LOCKUP on cpu 42 Modules linked in: [...]
    ...
[183211.656964] CPU: 42 PID: 0 Comm: swapper/42 Kdump: loaded Tainted: P           OE    --------- -  - 4.18.0-372.57.1.el8_6.x86_64 #1
[183211.656966] Hardware name: Dell Inc. PowerEdge FC630/0R10KJ, BIOS 2.16.0 10/27/2022
[183211.656967] RIP: 0010:native_queued_spin_lock_slowpath+0x144/0x1c0
[183211.656968] Code: c1 e9 12 83 e0 03 83 e9 01 48 c1 e0 05 48 63 c9 48 05 c0 bb 02 00 48 03 04 cd 20 38 fa a7 48 89 10 8b 42 08 85 c0 75 09 f3 90 <8b> 42 08 85 c0 74 f7 48 8b 02 48 85 c0 74 22 48 89 c1 0f 0d 08 eb
[183211.656969] RSP: 0018:ffffa69d995a87f0 EFLAGS: 00000046
[183211.656970] RAX: 0000000000000000 RBX: ffff8bd68a7bd000 RCX: 0000000000000000
[183211.656971] RDX: ffff8c30ff96bbc0 RSI: 0000000000ac0000 RDI: ffff8c30ff4aae80
[183211.656972] RBP: ffff8c30ff4aae80 R08: ffff8c30ff940000 R09: ffffa69db0fdbd10
[183211.656973] R10: ffff8bd601c28218 R11: 0000000000000000 R12: 0000000000000000
[183211.656974] R13: ffff8bd68a7bdbbc R14: 0000000000000087 R15: 0000000000000004
[183211.656975] FS:  0000000000000000(0000) GS:ffff8c30ff940000(0000) knlGS:0000000000000000
[183211.656976] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[183211.656977] CR2: 000000c000e7f000 CR3: 0000004556a10004 CR4: 00000000003706e0
[183211.656977] Call Trace:
[183211.656978]  <IRQ>
[183211.656979]  _raw_spin_lock+0x1e/0x30
[183211.656980]  try_to_wake_up+0x15d/0x4e0
[183211.656980]  pollwake+0x74/0xa0
[183211.656981]  ? wake_up_q+0x70/0x70
[183211.656982]  __wake_up_common+0x7a/0x190
[183211.656983]  __wake_up_common_lock+0x7c/0xc0
[183211.656984]  ep_poll_callback+0x103/0x2c0
[183211.656985]  __wake_up_common+0x7a/0x190
[183211.656985]  __wake_up_common_lock+0x7c/0xc0
[183211.656986]  sock_def_readable+0x37/0x70
[183211.656987]  __netlink_sendskb+0x3d/0x50
[183211.656988]  netlink_unicast+0x20e/0x230
[183211.656988]  queue_userspace_packet+0x4b3/0x5f0 [openvswitch]
[183211.656989]  ovs_dp_upcall+0x78/0x100 [openvswitch]
[183211.656990]  ovs_dp_process_packet+0x157/0x200 [openvswitch]
[183211.656991]  ovs_vport_receive+0x6c/0xc0 [openvswitch]
[183211.656992]  ? lin_nf_packet_wrapper.isra.18.constprop.22+0x1c4/0x430 [dsa_filter]
[183211.656993]  ? lin_pkt_read_start+0x70/0x70 [dsa_filter]
[183211.656994]  ? update_group_capacity+0x25/0x220
[183211.656994]  ? cpumask_next_and+0x1a/0x20
[183211.656995]  ? update_sd_lb_stats.constprop.120+0xca/0x840
[183211.656996]  netdev_frame_hook+0xc0/0x180 [openvswitch]
[183211.656997]  __netif_receive_skb_core+0x2d6/0xcc0
[183211.656997]  ? load_balance+0x144/0xc70
[183211.656998]  process_backlog+0xaa/0x170
[183211.656999]  __napi_poll+0x2d/0x130
[183211.656999]  net_rx_action+0x257/0x320
[183211.657000]  ? __note_gp_changes+0x166/0x170
[183211.657001]  __do_softirq+0xd7/0x2c8
[183211.657001]  irq_exit_rcu+0xd7/0xe0
[183211.657002]  irq_exit+0xa/0x10
[183211.657003]  smp_apic_timer_interrupt+0x74/0x130
[183211.657004]  apic_timer_interrupt+0xf/0x20
[183211.657004]  </IRQ>
[183211.657005] RIP: 0010:cpuidle_enter_state+0xda/0x3d0
[183211.657006] Code: e8 1b 1e 9a ff 80 7c 24 0f 00 74 17 9c 58 0f 1f 44 00 00 f6 c4 02 0f 85 aa 02 00 00 31 ff e8 1d 29 a1 ff fb 66 0f 1f 44 00 00 <45> 85 f6 0f 88 29 01 00 00 49 63 d6 48 8b 4c 24 10 48 2b 0c 24 48
[183211.657007] RSP: 0018:ffffa69d98c2be58 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13
[183211.657009] RAX: ffff8c30ff96ae80 RBX: ffffffffa8ab89d8 RCX: 000000000000001f
[183211.657009] RDX: 0000a69e8e998c42 RSI: 0000000037a70fd4 RDI: 0000000000000000
[183211.657010] RBP: ffffc63d7f940508 R08: 0000000000000002 R09: 000000000002a6c0
[183211.657011] R10: 001deaee09e00a98 R11: ffff8c30ff969b84 R12: 0000000000000004
[183211.657012] R13: ffffffffa8ab8800 R14: 0000000000000004 R15: 0000000000000004
[183211.657013]  ? cpuidle_enter_state+0xb5/0x3d0
[183211.657013]  cpuidle_enter+0x2c/0x40
[183211.657014]  do_idle+0x268/0x2d0
[183211.657015]  cpu_startup_entry+0x6f/0x80
[183211.657016]  start_secondary+0x1a6/0x1f0
[183211.657016]  secondary_startup_64_no_verify+0xc2/0xcb
[183211.657017] Kernel panic - not syncing: Hard LOCKUP
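
To compare a captured log against the signature above, a small script can pull out the locked-up CPU, the function at RIP, and the reliable call-trace frames (those without a leading "?"). The following is a minimal sketch, not a Red Hat tool: the function name summarize_lockup and the regular expressions are illustrative assumptions, based only on the log format shown in this article.

```python
import re

# "... Watchdog detected hard LOCKUP on cpu 42 ..."
LOCKUP_RE = re.compile(r"Watchdog detected hard LOCKUP on cpu (\d+)")
# "RIP: 0010:native_queued_spin_lock_slowpath+0x144/0x1c0"
RIP_RE = re.compile(r"RIP: \w+:([A-Za-z_][\w.]*)\+0x")
# "[ts]  _raw_spin_lock+0x1e/0x30" or "[ts]  ? wake_up_q+0x70/0x70"
FRAME_RE = re.compile(r"\]\s+(\? )?([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+")


def summarize_lockup(log_text):
    """Return (cpu, first_rip, frames) parsed from a hard LOCKUP log.

    Frames marked with "?" are skipped because the unwinder could not
    confirm them as part of the actual call chain.
    """
    cpu = None
    first_rip = None
    frames = []
    for line in log_text.splitlines():
        m = LOCKUP_RE.search(line)
        if m:
            cpu = int(m.group(1))
            continue
        m = RIP_RE.search(line)
        if m:
            # Only the first RIP belongs to the interrupted lock spin;
            # a later RIP describes the context below the interrupt.
            if first_rip is None:
                first_rip = m.group(1)
            continue
        m = FRAME_RE.search(line)
        if m and m.group(1) is None:
            frames.append(m.group(2))
    return cpu, first_rip, frames


# Abbreviated copy of the log in this article, used as sample input.
SAMPLE = """\
[183211.656894] NMI watchdog: Watchdog detected hard LOCKUP on cpu 42 Modules linked in: [...]
[183211.656967] RIP: 0010:native_queued_spin_lock_slowpath+0x144/0x1c0
[183211.656977] Call Trace:
[183211.656978]  <IRQ>
[183211.656979]  _raw_spin_lock+0x1e/0x30
[183211.656980]  try_to_wake_up+0x15d/0x4e0
[183211.656981]  ? wake_up_q+0x70/0x70
"""

if __name__ == "__main__":
    cpu, rip, frames = summarize_lockup(SAMPLE)
    print(f"cpu={cpu} rip={rip} frames={frames}")
```

A log matching this issue should show native_queued_spin_lock_slowpath at RIP, with _raw_spin_lock called from try_to_wake_up at the top of the frame list.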

Environment

  • Red Hat Enterprise Linux 8.6.z - kernel-4.18.0-372.57.1.el8_6
  • Dell PowerEdge series
