Hard lockup panic in third-party mlx5 driver

Solution Unverified

Issue

  • A RHEL 8 system using the third-party Mellanox mlx5 driver panics with "Kernel panic - not syncing: Hard LOCKUP". The panic backtrace is:
PANIC: "Kernel panic - not syncing: Hard LOCKUP"
crash> bt
PID: 8781     TASK: ff399479f9bc3c80  CPU: 113  COMMAND: "node_exporter"
...
    [exception RIP: native_queued_spin_lock_slowpath+402]
    RIP: ffffffff96d3e6c2  RSP: ff4f930938d8fb70  RFLAGS: 00000046
    RAX: 0000000000000000  RBX: 0000000000000246  RCX: 00000000000000dc
    RDX: ff39947a3f26ad00  RSI: 0000000001c80000  RDI: ff399571cff98d90
    RBP: ff399571cff98d80   R8: ff4f930938d8fb80   R9: ff39947377a82d40
    R10: 0000000000000010  R11: 0000000000000200  R12: 00000000006080c0
    R13: ff399571cff98d90  R14: ff3994764df5a288  R15: ff3994764df5a280
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
--- <NMI exception stack> ---
#12 [ff4f930938d8fb70] native_queued_spin_lock_slowpath at ffffffff96d3e6c2
#13 [ff4f930938d8fb70] _raw_spin_lock_irqsave at ffffffff97554272
#14 [ff4f930938d8fb80] dma_pool_alloc at ffffffff96eceda4
#15 [ff4f930938d8fbc0] mlx5_alloc_cmd_msg at ffffffffc07ff765 [mlx5_core]
#16 [ff4f930938d8fc08] cmd_exec at ffffffffc0802e9f [mlx5_core]
#17 [ff4f930938d8fc88] mlx5_cmd_do at ffffffffc080397e [mlx5_core]
#18 [ff4f930938d8fcb8] mlx5_cmd_exec at ffffffffc08039c7 [mlx5_core]
#19 [ff4f930938d8fcd8] mlx5_core_query_vport_counter at ffffffffc08138d9 [mlx5_core]
#20 [ff4f930938d8fd18] mlx5_ib_process_mad at ffffffffc0712f6e [mlx5_ib]
#21 [ff4f930938d8fdb0] ib_port_register_module_stat at ffffffffc2529581 [ib_core]
#22 [ff4f930938d8fe28] show_pma_counter at ffffffffc2529fc1 [ib_core]
#23 [ff4f930938d8fe50] sysfs_kf_seq_show at ffffffff96fb3a9b
#24 [ff4f930938d8fe68] seq_read at ffffffff96f45a83
#25 [ff4f930938d8fec8] vfs_read at ffffffff96f1c4f1
#26 [ff4f930938d8ff00] ksys_read at ffffffff96f1c92f
#27 [ff4f930938d8ff38] do_syscall_64 at ffffffff96c0420b
#28 [ff4f930938d8ff50] entry_SYSCALL_64_after_hwframe at ffffffff976000ad
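
The backtrace above can be summarized mechanically when comparing several vmcores. The following is a minimal sketch, assuming the frame layout shown in the `bt` output above (`#N [stack address] symbol at text address [module]`). The names `Frame`, `parse_bt`, and `modules_in_order` are hypothetical helpers introduced for illustration only; they are not part of the crash utility.

```python
import re
from typing import List, NamedTuple, Optional


class Frame(NamedTuple):
    index: int             # frame number printed after '#'
    stack_addr: str        # stack address in square brackets
    symbol: str            # function name
    text_addr: str         # return/instruction address after 'at'
    module: Optional[str]  # module name, or None for core-kernel symbols


# Matches lines such as:
#   #15 [ff4f930938d8fbc0] mlx5_alloc_cmd_msg at ffffffffc07ff765 [mlx5_core]
FRAME_RE = re.compile(
    r"^\s*#(\d+)\s+\[([0-9a-f]+)\]\s+(\S+)\s+at\s+([0-9a-f]+)"
    r"(?:\s+\[(\w+)\])?\s*$"
)


def parse_bt(text: str) -> List[Frame]:
    """Extract stack frames from crash 'bt' output, skipping other lines."""
    frames = []
    for line in text.splitlines():
        m = FRAME_RE.match(line)
        if m:
            idx, sp, sym, addr, mod = m.groups()
            frames.append(Frame(int(idx), sp, sym, addr, mod))
    return frames


def modules_in_order(frames: List[Frame]) -> List[str]:
    """Return the distinct module names in the order they first appear."""
    seen: List[str] = []
    for f in frames:
        if f.module and f.module not in seen:
            seen.append(f.module)
    return seen


# A few frames copied from the backtrace in this article.
SAMPLE = """\
#13 [ff4f930938d8fb70] _raw_spin_lock_irqsave at ffffffff97554272
#14 [ff4f930938d8fb80] dma_pool_alloc at ffffffff96eceda4
#15 [ff4f930938d8fbc0] mlx5_alloc_cmd_msg at ffffffffc07ff765 [mlx5_core]
#20 [ff4f930938d8fd18] mlx5_ib_process_mad at ffffffffc0712f6e [mlx5_ib]
#22 [ff4f930938d8fe28] show_pma_counter at ffffffffc2529fc1 [ib_core]
#23 [ff4f930938d8fe50] sysfs_kf_seq_show at ffffffff96fb3a9b
"""

if __name__ == "__main__":
    frames = parse_bt(SAMPLE)
    print(modules_in_order(frames))  # ['mlx5_core', 'mlx5_ib', 'ib_core']
```

Frames without a bracketed module name belong to the core kernel. In this backtrace the spinlock wait in `dma_pool_alloc` is reached through `mlx5_core`, `mlx5_ib`, and `ib_core` after a sysfs read issued by `node_exporter`.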

Environment

  • Red Hat Enterprise Linux 8.4 (kernel-4.18.0-305.130.1.el8_4.x86_64)
  • Mellanox ConnectX Family mlx5Gen Virtual Function (SR-IOV VF)
    • Third-party mlx5_core driver version 24.10-2.1.8
