Server frequently becomes unresponsive or a soft lockup occurs when ip_recv_error() is interrupted by a hard IRQ

Solution Verified

Issue

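  • The server frequently becomes unresponsive or reports soft lockups. In the backtrace of the affected task, a hard IRQ (i40e_intr) interrupts ip_recv_error() while it is in _raw_spin_lock_bh(); the interrupt handler then spins in _raw_spin_lock_irqsave(), and both exception frames show the same address in RDI (ffff8f189e471b1c):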
PID: 1234   TASK: ffff8f0d121ae300  CPU: 5   COMMAND: "task"
 #0 [ffff8f189ff48e48] crash_nmi_callback at ffffffff83258597
 #1 [ffff8f189ff48e58] nmi_handle at ffffffff8398d93c
 #2 [ffff8f189ff48eb0] do_nmi at ffffffff8398db5d
 #3 [ffff8f189ff48ef0] end_repeat_nmi at ffffffff8398cd9c
    [exception RIP: native_queued_spin_lock_slowpath+0x1ce]
    RIP: ffffffff83317b4e  RSP: ffff8f189ff43da0  RFLAGS: 00000002
    RAX: 0000000000000001  RBX: 0000000000000046  RCX: 0000000000000001
    RDX: 0000000000000101  RSI: 0000000000000001  RDI: ffff8f189e471b1c
    RBP: ffff8f189ff43da0   R8: 0000000000000101   R9: 0000000000000000
    R10: 0000000000000000  R11: 0000000000000000  R12: ffff8f0d06409e00
    R13: ffff8f189e471b1c  R14: 0000000000000001  R15: 0000000000000000
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
--- <NMI exception stack> ---
 #4 [ffff8f189ff43da0] native_queued_spin_lock_slowpath at ffffffff83317b4e
 #5 [ffff8f189ff43da8] queued_spin_lock_slowpath at ffffffff8397dd33
 #6 [ffff8f189ff43db8] _raw_spin_lock_irqsave at ffffffff8398bb67
 #7 [ffff8f189ff43dd0] skb_queue_tail at ffffffff8383ee50
 #8 [ffff8f189ff43df8] sock_queue_err_skb at ffffffff8383ef1f
 #9 [ffff8f189ff43e20] skb_tstamp_tx at ffffffff8384115d
#10 [ffff8f189ff43e48] i40e_ptp_tx_hwtstamp at ffffffffc03adcdd [i40e]
#11 [ffff8f189ff43e78] i40e_intr at ffffffffc0383567 [i40e]
#12 [ffff8f189ff43eb0] __handle_irq_event_percpu at ffffffff833502f4
#13 [ffff8f189ff43ef8] handle_irq_event_percpu at ffffffff833504a2
#14 [ffff8f189ff43f28] handle_irq_event at ffffffff8335052c
#15 [ffff8f189ff43f50] handle_edge_irq at ffffffff8335331f
#16 [ffff8f189ff43f70] handle_irq at ffffffff8322f5f4
#17 [ffff8f189ff43fb8] do_IRQ at ffffffff8399a8cd
--- <IRQ stack> ---
#18 [ffff8f0d2059fa18] ret_from_intr at ffffffff8398c36a
    [exception RIP: _raw_spin_lock_bh+0x23]
    RIP: ffffffff8398bb13  RSP: ffff8f0d2059fac8  RFLAGS: 00000246
    RAX: 0000000000000000  RBX: ffffffff8344b13a  RCX: 0000000000000000
    RDX: 0000000000000001  RSI: ffff8f0d2059fb00  RDI: ffff8f189e471b1c
    RBP: ffff8f0d2059fb40   R8: ffff8f0d205a0000   R9: 00000000000001a0
    R10: 0000000000000002  R11: 0000000000000000  R12: 00000000f77fd4f0
    R13: ffffffff83834d5c  R14: ffff8f0d2059fac0  R15: ffff8f0d2059fa60
    ORIG_RAX: ffffffffffffff58  CS: 0010  SS: 0018
#19 [ffff8f0d2059fac8] ip_recv_error at ffffffff838ab683
#20 [ffff8f0d2059fb48] udp_recvmsg at ffffffff838d3dd2
#21 [ffff8f0d2059fbb8] inet_recvmsg at ffffffff838dfff0
#22 [ffff8f0d2059fbe8] sock_recvmsg at ffffffff838364f5
#23 [ffff8f0d2059fd58] ___sys_recvmsg at ffffffff83837673
#24 [ffff8f0d2059fed0] __sys_recvmsg at ffffffff83838b91
#25 [ffff8f0d2059ff40] sys_recvmsg at ffffffff83838be2
#26 [ffff8f0d2059ff50] tracesys at ffffffff83996226 (via system_call)
    RIP: 00007f2b072babad  RSP: 00007f2af77fd490  RFLAGS: 00000293
    RAX: ffffffffffffffda  RBX: 00000000ffffffff  RCX: ffffffffffffffff
    RDX: 0000000000002040  RSI: 00007f2af77fde50  RDI: 000000000000000f
    RBP: 00007f2af00585a0   R8: 00007f2af77fdeb0   R9: 000000000023cae4
    R10: 0000000000000000  R11: 0000000000000293  R12: c4893b50d113593d
    R13: 000000000000401d  R14: 0000000000000036  R15: 00007f2af8001880
    ORIG_RAX: 000000000000002f  CS: 0033  SS: 002b
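
The repeated RDI value in the two kernel exception frames above can also be spotted mechanically. The following is a minimal sketch, not part of the original solution, that scans `bt` output for exception frames stopped in spin lock functions and reports any RDI value that appears in more than one of them. The function names (`exception_frames`, `shared_lock_addresses`) and the `SAMPLE` excerpt are illustrative. Treating RDI as the lock pointer is a heuristic based on the x86_64 calling convention (first argument in RDI), and the register is not guaranteed to still hold that argument at the point of interruption.

```python
import re

# Matches "[exception RIP: symbol+0xoff]" lines as printed by crash's bt.
EXC_RE = re.compile(r"\[exception RIP: (\w+)\+0x[0-9a-f]+\]")
# Matches the RDI value in a register dump line.
RDI_RE = re.compile(r"\bRDI: ([0-9a-f]{16})\b")


def exception_frames(bt_text):
    """Return a list of (symbol, rdi) pairs, one per exception frame."""
    frames = []
    symbol = None
    for line in bt_text.splitlines():
        m = EXC_RE.search(line)
        if m:
            symbol = m.group(1)
            continue
        m = RDI_RE.search(line)
        # Only pair an RDI value with the exception RIP that precedes it;
        # register dumps without an exception RIP line are skipped.
        if m and symbol is not None:
            frames.append((symbol, m.group(1)))
            symbol = None
    return frames


def shared_lock_addresses(frames):
    """Return RDI values seen in more than one spin lock exception frame."""
    seen = {}
    for symbol, rdi in frames:
        # Heuristic (assumption): any symbol containing "spin_lock" takes
        # the lock pointer as its first argument.
        if "spin_lock" in symbol:
            seen.setdefault(rdi, []).append(symbol)
    return {addr: syms for addr, syms in seen.items() if len(syms) > 1}


# Excerpt of the two kernel exception frames from the backtrace above.
SAMPLE = """\
    [exception RIP: native_queued_spin_lock_slowpath+0x1ce]
    RDX: 0000000000000101  RSI: 0000000000000001  RDI: ffff8f189e471b1c
    [exception RIP: _raw_spin_lock_bh+0x23]
    RDX: 0000000000000001  RSI: ffff8f0d2059fb00  RDI: ffff8f189e471b1c
"""

if __name__ == "__main__":
    shared = shared_lock_addresses(exception_frames(SAMPLE))
    for addr, syms in shared.items():
        print(addr, "appears in:", ", ".join(syms))
```

A match only indicates that the interrupted code and the interrupt handler may be contending for the same lock on the same CPU. Confirming it still requires inspecting the lock and its owner in the vmcore.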

Environment

  • Red Hat Enterprise Linux (RHEL) 7.9
  • kernel-3.10.0-1160.53.1.el7
