Server frequently becomes unresponsive or hits a soft lockup when ip_recv_error() is interrupted by a hard IRQ
Issue
- The server frequently becomes unresponsive. Backtrace of the affected task:
PID: 1234 TASK: ffff8f0d121ae300 CPU: 5 COMMAND: "task"
#0 [ffff8f189ff48e48] crash_nmi_callback at ffffffff83258597
#1 [ffff8f189ff48e58] nmi_handle at ffffffff8398d93c
#2 [ffff8f189ff48eb0] do_nmi at ffffffff8398db5d
#3 [ffff8f189ff48ef0] end_repeat_nmi at ffffffff8398cd9c
[exception RIP: native_queued_spin_lock_slowpath+0x1ce]
RIP: ffffffff83317b4e RSP: ffff8f189ff43da0 RFLAGS: 00000002
RAX: 0000000000000001 RBX: 0000000000000046 RCX: 0000000000000001
RDX: 0000000000000101 RSI: 0000000000000001 RDI: ffff8f189e471b1c
RBP: ffff8f189ff43da0 R8: 0000000000000101 R9: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff8f0d06409e00
R13: ffff8f189e471b1c R14: 0000000000000001 R15: 0000000000000000
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
--- <NMI exception stack> ---
#4 [ffff8f189ff43da0] native_queued_spin_lock_slowpath at ffffffff83317b4e
#5 [ffff8f189ff43da8] queued_spin_lock_slowpath at ffffffff8397dd33
#6 [ffff8f189ff43db8] _raw_spin_lock_irqsave at ffffffff8398bb67
#7 [ffff8f189ff43dd0] skb_queue_tail at ffffffff8383ee50
#8 [ffff8f189ff43df8] sock_queue_err_skb at ffffffff8383ef1f
#9 [ffff8f189ff43e20] skb_tstamp_tx at ffffffff8384115d
#10 [ffff8f189ff43e48] i40e_ptp_tx_hwtstamp at ffffffffc03adcdd [i40e]
#11 [ffff8f189ff43e78] i40e_intr at ffffffffc0383567 [i40e]
#12 [ffff8f189ff43eb0] __handle_irq_event_percpu at ffffffff833502f4
#13 [ffff8f189ff43ef8] handle_irq_event_percpu at ffffffff833504a2
#14 [ffff8f189ff43f28] handle_irq_event at ffffffff8335052c
#15 [ffff8f189ff43f50] handle_edge_irq at ffffffff8335331f
#16 [ffff8f189ff43f70] handle_irq at ffffffff8322f5f4
#17 [ffff8f189ff43fb8] do_IRQ at ffffffff8399a8cd
--- <IRQ stack> ---
#18 [ffff8f0d2059fa18] ret_from_intr at ffffffff8398c36a
[exception RIP: _raw_spin_lock_bh+0x23]
RIP: ffffffff8398bb13 RSP: ffff8f0d2059fac8 RFLAGS: 00000246
RAX: 0000000000000000 RBX: ffffffff8344b13a RCX: 0000000000000000
RDX: 0000000000000001 RSI: ffff8f0d2059fb00 RDI: ffff8f189e471b1c
RBP: ffff8f0d2059fb40 R8: ffff8f0d205a0000 R9: 00000000000001a0
R10: 0000000000000002 R11: 0000000000000000 R12: 00000000f77fd4f0
R13: ffffffff83834d5c R14: ffff8f0d2059fac0 R15: ffff8f0d2059fa60
ORIG_RAX: ffffffffffffff58 CS: 0010 SS: 0018
#19 [ffff8f0d2059fac8] ip_recv_error at ffffffff838ab683
#20 [ffff8f0d2059fb48] udp_recvmsg at ffffffff838d3dd2
#21 [ffff8f0d2059fbb8] inet_recvmsg at ffffffff838dfff0
#22 [ffff8f0d2059fbe8] sock_recvmsg at ffffffff838364f5
#23 [ffff8f0d2059fd58] ___sys_recvmsg at ffffffff83837673
#24 [ffff8f0d2059fed0] __sys_recvmsg at ffffffff83838b91
#25 [ffff8f0d2059ff40] sys_recvmsg at ffffffff83838be2
#26 [ffff8f0d2059ff50] tracesys at ffffffff83996226 (via system_call)
RIP: 00007f2b072babad RSP: 00007f2af77fd490 RFLAGS: 00000293
RAX: ffffffffffffffda RBX: 00000000ffffffff RCX: ffffffffffffffff
RDX: 0000000000002040 RSI: 00007f2af77fde50 RDI: 000000000000000f
RBP: 00007f2af00585a0 R8: 00007f2af77fdeb0 R9: 000000000023cae4
R10: 0000000000000000 R11: 0000000000000293 R12: c4893b50d113593d
R13: 000000000000401d R14: 0000000000000036 R15: 00007f2af8001880
ORIG_RAX: 000000000000002f CS: 0033 SS: 002b
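
One way to read the backtrace above, offered as an interpretation rather than a confirmed root cause: the task was inside ip_recv_error() at _raw_spin_lock_bh (which disables bottom halves but not hard IRQs) when the i40e interrupt arrived on the same CPU, and the interrupt path then spins in _raw_spin_lock_irqsave via skb_queue_tail. If both frames refer to the same lock, the interrupt handler can never make progress. The minimal Python sketch below checks that by comparing the RDI value logged in the two exception frames. It assumes RDI still holds the lock pointer at both sample points, which is plausible on x86_64 but not guaranteed; the helper name rdi_per_exception_frame is hypothetical and the input is an excerpt copied from the backtrace.

```python
import re

# Excerpt of the two exception frames, copied from the backtrace above.
BACKTRACE = """\
[exception RIP: native_queued_spin_lock_slowpath+0x1ce]
RDX: 0000000000000101 RSI: 0000000000000001 RDI: ffff8f189e471b1c
[exception RIP: _raw_spin_lock_bh+0x23]
RDX: 0000000000000001 RSI: ffff8f0d2059fb00 RDI: ffff8f189e471b1c
"""


def rdi_per_exception_frame(text):
    """Map each exception RIP symbol to the first RDI value logged after it.

    Hypothetical helper for illustration; it is not part of the crash utility.
    """
    frames = {}
    current = None
    for line in text.splitlines():
        m = re.search(r"\[exception RIP: (\w+)\+0x[0-9a-f]+\]", line)
        if m:
            current = m.group(1)
            continue
        m = re.search(r"\bRDI: ([0-9a-f]{16})", line)
        if m and current is not None:
            frames[current] = m.group(1)
            current = None
    return frames


frames = rdi_per_exception_frame(BACKTRACE)
# Assumption: RDI holds the spinlock address in both frames.
same_lock = len(set(frames.values())) == 1
print(frames)
print("same lock address in both frames:", same_lock)
```

In this backtrace R13 in the NMI frame carries the same value as RDI, which is consistent with that reading.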
Environment
- Red Hat Enterprise Linux (RHEL) 7.9
- kernel-3.10.0-1160.53.1.el7