Hard lockup panic in third-party mlx5 driver
Issue
Kernel panic - not syncing: Hard LOCKUP on a RHEL 8 system with the third-party Mellanox mlx5 driver. The panic backtrace is:
PANIC: "Kernel panic - not syncing: Hard LOCKUP"
crash> bt
PID: 8781 TASK: ff399479f9bc3c80 CPU: 113 COMMAND: "node_exporter"
...
[exception RIP: native_queued_spin_lock_slowpath+402]
RIP: ffffffff96d3e6c2 RSP: ff4f930938d8fb70 RFLAGS: 00000046
RAX: 0000000000000000 RBX: 0000000000000246 RCX: 00000000000000dc
RDX: ff39947a3f26ad00 RSI: 0000000001c80000 RDI: ff399571cff98d90
RBP: ff399571cff98d80 R8: ff4f930938d8fb80 R9: ff39947377a82d40
R10: 0000000000000010 R11: 0000000000000200 R12: 00000000006080c0
R13: ff399571cff98d90 R14: ff3994764df5a288 R15: ff3994764df5a280
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
--- <NMI exception stack> ---
#12 [ff4f930938d8fb70] native_queued_spin_lock_slowpath at ffffffff96d3e6c2
#13 [ff4f930938d8fb70] _raw_spin_lock_irqsave at ffffffff97554272
#14 [ff4f930938d8fb80] dma_pool_alloc at ffffffff96eceda4
#15 [ff4f930938d8fbc0] mlx5_alloc_cmd_msg at ffffffffc07ff765 [mlx5_core]
#16 [ff4f930938d8fc08] cmd_exec at ffffffffc0802e9f [mlx5_core]
#17 [ff4f930938d8fc88] mlx5_cmd_do at ffffffffc080397e [mlx5_core]
#18 [ff4f930938d8fcb8] mlx5_cmd_exec at ffffffffc08039c7 [mlx5_core]
#19 [ff4f930938d8fcd8] mlx5_core_query_vport_counter at ffffffffc08138d9 [mlx5_core]
#20 [ff4f930938d8fd18] mlx5_ib_process_mad at ffffffffc0712f6e [mlx5_ib]
#21 [ff4f930938d8fdb0] ib_port_register_module_stat at ffffffffc2529581 [ib_core]
#22 [ff4f930938d8fe28] show_pma_counter at ffffffffc2529fc1 [ib_core]
#23 [ff4f930938d8fe50] sysfs_kf_seq_show at ffffffff96fb3a9b
#24 [ff4f930938d8fe68] seq_read at ffffffff96f45a83
#25 [ff4f930938d8fec8] vfs_read at ffffffff96f1c4f1
#26 [ff4f930938d8ff00] ksys_read at ffffffff96f1c92f
#27 [ff4f930938d8ff38] do_syscall_64 at ffffffff96c0420b
#28 [ff4f930938d8ff50] entry_SYSCALL_64_after_hwframe at ffffffff976000ad
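When triaging a backtrace like the one above, it can help to list which kernel modules appear in the call chain, since crash tags module frames with a trailing [module] name. The following is a minimal sketch in Python, not part of the crash utility. The names parse_frames and modules_in_trace are hypothetical helpers, and the regular expression assumes the frame layout shown in this backtrace.

```python
import re

# Matches crash "bt" frame lines of the form seen in this article, e.g.:
#   #15 [ff4f930938d8fbc0] mlx5_alloc_cmd_msg at ffffffffc07ff765 [mlx5_core]
# The trailing "[module]" part is optional (core kernel frames have none).
FRAME_RE = re.compile(
    r"#(?P<num>\d+)\s+\[(?P<sp>[0-9a-f]+)\]\s+(?P<func>\S+)"
    r"\s+at\s+(?P<addr>[0-9a-f]+)(?:\s+\[(?P<module>[^\]]+)\])?"
)

def parse_frames(bt_text):
    """Return a list of dicts (num, func, module) for each frame line found."""
    frames = []
    for line in bt_text.splitlines():
        m = FRAME_RE.search(line)
        if m:
            frames.append(
                {"num": int(m["num"]), "func": m["func"], "module": m["module"]}
            )
    return frames

def modules_in_trace(frames):
    """Return module names in order of first appearance, without duplicates."""
    seen = []
    for frame in frames:
        if frame["module"] and frame["module"] not in seen:
            seen.append(frame["module"])
    return seen

# A few frames copied from the backtrace in this article.
SAMPLE = """\
#13 [ff4f930938d8fb70] _raw_spin_lock_irqsave at ffffffff97554272
#14 [ff4f930938d8fb80] dma_pool_alloc at ffffffff96eceda4
#15 [ff4f930938d8fbc0] mlx5_alloc_cmd_msg at ffffffffc07ff765 [mlx5_core]
#20 [ff4f930938d8fd18] mlx5_ib_process_mad at ffffffffc0712f6e [mlx5_ib]
#22 [ff4f930938d8fe28] show_pma_counter at ffffffffc2529fc1 [ib_core]
#23 [ff4f930938d8fe50] sysfs_kf_seq_show at ffffffff96fb3a9b
"""

if __name__ == "__main__":
    print(modules_in_trace(parse_frames(SAMPLE)))
```

The script only reports which modules appear in the trace. Whether a given module is the in-box or a third-party build still has to be checked separately, for example with the mod command in crash.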
Environment
- Red Hat Enterprise Linux 8.4 (kernel-4.18.0-305.130.1.el8_4.x86_64)
- Mellanox ConnectX Family mlx5Gen Virtual Function (SR-IOV VF)
- Third-party mlx5_core driver version 24.10-2.1.8
- Third-party