RHEL 9: Guest VM encounters a kernel panic in atomic_notifier_call_chain() called by the third-party mlx5_core module

Solution Unverified - Updated -

Environment

  • Red Hat Enterprise Linux 9
  • VM running in IBM Storage Scale System
  • 3rd party module mlx5_core

Issue

  • The system reboots with the panic string "Kernel panic - not syncing: Fatal exception in interrupt".
[   79.759826] BUG: unable to handle page fault for address: ffff89bb00746f77
[   79.761336] #PF: supervisor read access in kernel mode
[   79.762116] #PF: error_code(0x0000) - not-present page
[   79.762892] PGD 22e0c01067 P4D 22e0c01067 PUD 0 
[   79.763597] Oops: 0000 [#1] PREEMPT SMP NOPTI
[   79.764272] CPU: 28 PID: 0 Comm: swapper/28 Tainted: G           OE    --------  ---  5.14.0-284.75.1.el9_2.x86_64 #1
[   79.765808] Hardware name: IBM PROTOCOLVM-23E/RHEL, BIOS  
[   79.766632] RIP: 0010:atomic_notifier_call_chain+0x42/0x80
[   79.767464] Code: e8 f3 2c 06 00 48 8b 5b 08 48 85 db 74 4c 41 be ff ff ff ff eb 0e 41 83 ee 01 48 85 db 74 1f 45 85 f6 74 1a 48 89 df 4c 89 ea <48> 8b 5b 08 48 89 ee 48 8b 07 e8 4f bc a1 00 f6 c4 80 74 d8 89 44
[   79.770143] RSP: 0018:ffff98854065cf10 EFLAGS: 00010086
[   79.770932] RAX: 0000000000000001 RBX: ffff89bb00746f6f RCX: 000000000000080b
[   79.771991] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff89bb00746f6f
[   79.773049] RBP: 0000000000000000 R08: ffff89bb54251280 R09: 0000000000000000
[   79.774103] R10: ffff89bb8d5b2120 R11: 0000000000000000 R12: 0000000000000000
[   79.775158] R13: 0000000000000000 R14: 00000000ffffffff R15: ffff89bb534cd400
[   79.776215] FS:  0000000000000000(0000) GS:ffff8a221bd00000(0000) knlGS:0000000000000000
[   79.777401] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   79.778263] CR2: ffff89bb00746f77 CR3: 0000000103c44000 CR4: 0000000000350ee0
[   79.779320] Call Trace:
[   79.779721]  <IRQ>
[   79.780068]  irq_int_handler+0x11/0x20 [mlx5_core]
[   79.780919]  __handle_irq_event_percpu+0x3d/0x190
[   79.781638]  handle_irq_event+0x58/0xb0
[   79.782237]  handle_edge_irq+0x93/0x240
[   79.782829]  __common_interrupt+0x41/0xa0
[   79.783453]  common_interrupt+0x7b/0xa0
[   79.784055]  </IRQ>
[   79.784406]  <TASK>
[   79.784757]  asm_common_interrupt+0x22/0x40
[   79.785405] RIP: 0010:default_idle+0x10/0x20
[   79.786067] Code: 00 0f ae f0 0f ae 38 0f ae f0 eb b5 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 0f 1f 44 00 00 eb 07 0f 00 2d fe ae 2c 00 fb f4 <e9> ab 12 00 00 cc cc cc cc cc cc cc cc cc cc cc 0f 1f 44 00 00 65
[   79.788738] RSP: 0018:ffff98854017bed0 EFLAGS: 00000206
[   79.789525] RAX: ffffffff97b3fe60 RBX: ffff89bb40b41c80 RCX: 00000000000006e0
[   79.790577] RDX: 00000000000248a9 RSI: 0000000000000083 RDI: 00000000000248aa
[   79.791662] RBP: 0000000000000000 R08: 000ef39e04165812 R09: 0000000000000000
[   79.792746] R10: 0000000000000400 R11: 0000000000000000 R12: 0000000000000000
[   79.793801] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[   79.794856]  ? mwait_idle+0x80/0x80
[   79.795410]  default_idle_call+0x33/0xe0
[   79.796021]  cpuidle_idle_call+0x15d/0x1c0
[   79.796651]  ? srso_return_thunk+0x5/0x5f
[   79.797270]  do_idle+0x7b/0xe0
[   79.797756]  cpu_startup_entry+0x19/0x20
[   79.798366]  start_secondary+0x116/0x140
[   79.798979]  secondary_startup_64_no_verify+0xe5/0xeb
[   79.799747]  </TASK>
[   79.810135] CR2: ffff89bb00746f77
[   79.810657] ---[ end trace 0997f62b6459446f ]---
[   79.811361] RIP: 0010:atomic_notifier_call_chain+0x42/0x80
[   79.812195] Code: e8 f3 2c 06 00 48 8b 5b 08 48 85 db 74 4c 41 be ff ff ff ff eb 0e 41 83 ee 01 48 85 db 74 1f 45 85 f6 74 1a 48 89 df 4c 89 ea <48> 8b 5b 08 48 89 ee 48 8b 07 e8 4f bc a1 00 f6 c4 80 74 d8 89 44
[   79.814918] RSP: 0018:ffff98854065cf10 EFLAGS: 00010086
[   79.815705] RAX: 0000000000000001 RBX: ffff89bb00746f6f RCX: 000000000000080b
[   79.816771] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff89bb00746f6f
[   79.817828] RBP: 0000000000000000 R08: ffff89bb54251280 R09: 0000000000000000
[   79.818888] R10: ffff89bb8d5b2120 R11: 0000000000000000 R12: 0000000000000000
[   79.819943] R13: 0000000000000000 R14: 00000000ffffffff R15: ffff89bb534cd400
[   79.820997] FS:  0000000000000000(0000) GS:ffff8a221bd00000(0000) knlGS:0000000000000000
[   79.822180] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   79.823085] CR2: ffff89bb00746f77 CR3: 0000000103c44000 CR4: 0000000000350ee0
[   79.824167] Kernel panic - not syncing: Fatal exception in interrupt
[   79.826067] Kernel Offset: 0x16000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[   79.827683] ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---

Resolution

  • Engage NVIDIA, the vendor of the mlx5_core module, for further investigation and troubleshooting of the issue.

Root Cause

  • The kernel panicked because of an invalid pointer dereference in interrupt context. The notifier chain that caused the panic is corrupt, and the irq_int_handler() function that walked it comes from the mlx5_core module, which is shipped by a third-party vendor.
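
To illustrate how a corrupt notifier chain can end in a page fault, the following is a minimal Python model of a chain walk. It is only a sketch and not kernel code: the names NotifierBlock, call_chain, PageFault and the addresses 0x1000 and 0x2000 are hypothetical, and the dictionary standing in for memory is an assumption made for illustration. The model does not claim to show which element of the real chain was damaged; it only shows that one bad "next" pointer is enough to make the walk dereference an unmapped address.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

NOTIFY_DONE = 0x0000
NOTIFY_STOP_MASK = 0x8000
NULL = 0


class PageFault(Exception):
    """Raised when the model dereferences an address with no mapped block."""


@dataclass
class NotifierBlock:
    # Hypothetical stand-in for a chain element: a callback plus the
    # address of the next element (NULL terminates the chain).
    notifier_call: Callable[[int], int]
    next: int


def call_chain(memory: Dict[int, NotifierBlock], head: int, val: int) -> int:
    """Walk the chain starting at `head`, invoking each callback in order."""
    ret = NOTIFY_DONE
    addr = head
    while addr != NULL:
        if addr not in memory:
            # The model's equivalent of a not-present page.
            raise PageFault(f"no block mapped at {addr:#x}")
        block = memory[addr]
        next_addr = block.next  # next pointer is read before the callback runs
        ret = block.notifier_call(val)
        if ret & NOTIFY_STOP_MASK:
            break
        addr = next_addr
    return ret


def make_callback(name: str, log: List[str]) -> Callable[[int], int]:
    def callback(val: int) -> int:
        log.append(name)
        return NOTIFY_DONE
    return callback


def demo() -> List[str]:
    log: List[str] = []
    memory = {
        0x1000: NotifierBlock(make_callback("first", log), 0x2000),
        0x2000: NotifierBlock(make_callback("second", log), NULL),
    }
    call_chain(memory, 0x1000, 0)  # healthy chain: both callbacks run

    # Simulate corruption: overwrite a next pointer with the bad value
    # seen in RBX in the panic log.
    memory[0x1000].next = 0xFFFF89BB00746F6F
    try:
        call_chain(memory, 0x1000, 0)
    except PageFault as exc:
        log.append(f"fault: {exc}")
    return log


if __name__ == "__main__":
    for entry in demo():
        print(entry)
```

In the model, the first callback still runs on the corrupted chain and the fault only happens when the walk moves on to the bad address. This is why the faulting function is the generic chain walker rather than the code that damaged the chain.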

Diagnostic Steps

  • The kernel panicked because of an invalid pointer dereference in interrupt context:
crash> sys | grep PANIC
       PANIC: "Kernel panic - not syncing: Fatal exception in interrupt"

dmesg log:
[   79.759826] BUG: unable to handle page fault for address: ffff89bb00746f77
[   79.761336] #PF: supervisor read access in kernel mode
[   79.762116] #PF: error_code(0x0000) - not-present page
  • The stack trace of the CPU that panicked shows that the fault occurred in atomic_notifier_call_chain(), which was called by irq_int_handler():
crash> bt
PID: 0        TASK: ffff89bb40b41c80  CPU: 28   COMMAND: "swapper/28"
 #0 [ffff98854065cd40] panic at ffffffff97ae0fbf
 #1 [ffff98854065cdc0] oops_end.cold at ffffffff97ad9e99
 #2 [ffff98854065cde0] page_fault_oops at ffffffff9707c04b
 #3 [ffff98854065ce38] exc_page_fault at ffffffff97b31798
 #4 [ffff98854065ce60] asm_exc_page_fault at ffffffff97c00ba2
    [exception RIP: atomic_notifier_call_chain+0x42]
    RIP: ffffffff97125262  RSP: ffff98854065cf10  RFLAGS: 00010086
    RAX: 0000000000000001  RBX: ffff89bb00746f6f  RCX: 000000000000080b
    RDX: 0000000000000000  RSI: 0000000000000000  RDI: ffff89bb00746f6f
    RBP: 0000000000000000   R8: ffff89bb54251280   R9: 0000000000000000
    R10: ffff89bb8d5b2120  R11: 0000000000000000  R12: 0000000000000000
    R13: 0000000000000000  R14: 00000000ffffffff  R15: ffff89bb534cd400
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #5 [ffff98854065cf38] irq_int_handler at ffffffffc08ba821 [mlx5_core]
 #6 [ffff98854065cf40] __handle_irq_event_percpu at ffffffff971764bd
 #7 [ffff98854065cf78] handle_irq_event at ffffffff971766e8
 #8 [ffff98854065cfa8] handle_edge_irq at ffffffff9717afd3
 #9 [ffff98854065cfc8] __common_interrupt at ffffffff970275e1
#10 [ffff98854065cff0] common_interrupt at ffffffff97b2f61b
--- <IRQ stack> ---
#11 [ffff98854017be28] asm_common_interrupt at ffffffff97c00d62
    [exception RIP: default_idle+0x10]
    RIP: ffffffff97b3fe70  RSP: ffff98854017bed0  RFLAGS: 00000206
    RAX: ffffffff97b3fe60  RBX: ffff89bb40b41c80  RCX: 00000000000006e0
    RDX: 00000000000248a9  RSI: 0000000000000083  RDI: 00000000000248aa
    RBP: 0000000000000000   R8: 000ef39e04165812   R9: 0000000000000000
    R10: 0000000000000400  R11: 0000000000000000  R12: 0000000000000000
    R13: 0000000000000000  R14: 0000000000000000  R15: 0000000000000000
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
#12 [ffff98854017bed0] default_idle_call at ffffffff97b3ffd3
#13 [ffff98854017bed8] cpuidle_idle_call at ffffffff9714ffad
#14 [ffff98854017bf10] do_idle at ffffffff9715008b
#15 [ffff98854017bf28] cpu_startup_entry at ffffffff971502a9
#16 [ffff98854017bf38] start_secondary at ffffffff97061546
#17 [ffff98854017bf50] secondary_startup_64_no_verify at ffffffff9700015a
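
The register values in the trace above can be cross-checked with a little arithmetic. The sketch below uses only values copied from the log (RBX, CR2, and the four bytes marked with <48> in the Code: line). The helper decode_disp8_load is hypothetical and handles only this one instruction encoding. Reading offset 8 as the "next" member of struct notifier_block is an assumption based on the upstream structure layout; it can be confirmed on the vmcore with "struct notifier_block -o" in crash.

```python
# Values copied from the panic log.
RBX = 0xFFFF89BB00746F6F         # register used as the base of the faulting load
CR2 = 0xFFFF89BB00746F77         # faulting address reported by the CPU
FAULT_INSN = bytes.fromhex("488b5b08")  # bytes at the <48> marker in "Code:"


def decode_disp8_load(insn: bytes) -> int:
    """Return the signed 8-bit displacement of `mov disp8(%rbx),%rbx`.

    Hypothetical helper: it recognises only the encoding
    REX.W (0x48), opcode 0x8b, ModRM 0x5b, followed by one displacement byte.
    """
    if len(insn) != 4 or insn[:3] != bytes.fromhex("488b5b"):
        raise ValueError("not the expected mov disp8(%rbx),%rbx encoding")
    disp = insn[3]
    return disp - 256 if disp >= 128 else disp


offset = decode_disp8_load(FAULT_INSN)
fault_matches = (RBX + offset == CR2)
misalignment = RBX % 8

print(f"load offset from RBX : {offset:#x}")
print(f"RBX + offset == CR2  : {fault_matches}")
print(f"RBX modulo 8         : {misalignment}")
```

The load offset is 0x8 and RBX + 0x8 equals CR2, so the fault is a read through RBX itself. RBX modulo 8 is 7, meaning the value is not 8-byte aligned, which is unusual for a pointer to a structure that contains pointers and is consistent with the chain being corrupt.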
  • The irq_int_handler() function is provided by the mlx5_core module, and the mlx5_core driver in use is not the one provided by Red Hat:
crash> sym irq_int_handler
ffffffffc08ba810 (t) irq_int_handler [mlx5_core] /usr/src/debug/kernel-5.14.0-284.75.1.el9_2/linux-5.14.0-284.75.1.el9_2.x86_64/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c: 264

crash> mod -t | grep mlx5_core
mlx5_core   OE
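
The "OE" shown by mod -t matches the taint letters in the "Tainted:" field of the panic log. As a rough sketch, the Python snippet below extracts those letters from the log line and maps them to their documented kernel meanings. The function name parse_taint_flags and the regular expression are hypothetical. The pattern assumes the RHEL-style layout in which the taint field is followed by a run of dashes, and it would need adjusting for other log formats.

```python
import re
from typing import List

# Meanings of the three taint letters that appear in this log,
# as described in the kernel's tainted-kernels documentation.
TAINT_MEANINGS = {
    "G": "no proprietary module loaded",
    "O": "externally built (out-of-tree) module loaded",
    "E": "unsigned module loaded",
}

# Line copied from the panic log (timestamp removed).
LOG_LINE = (
    "CPU: 28 PID: 0 Comm: swapper/28 Tainted: G           OE    "
    "--------  ---  5.14.0-284.75.1.el9_2.x86_64 #1"
)


def parse_taint_flags(line: str) -> List[str]:
    """Return the taint letters found between 'Tainted:' and the dash run.

    Assumption: the field is followed by whitespace and at least one '-',
    as in the RHEL log line above. Returns an empty list if no match.
    """
    match = re.search(r"Tainted:\s*([A-Za-z ]+?)\s+-", line)
    if match is None:
        return []
    return [ch for ch in match.group(1) if ch != " "]


flags = parse_taint_flags(LOG_LINE)
for flag in flags:
    print(f"{flag}: {TAINT_MEANINGS.get(flag, 'not covered by this sketch')}")
```

For this log the flags are G, O and E. O and E together indicate that an out-of-tree, unsigned module was loaded, which agrees with the mod -t output for mlx5_core.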

This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.
