The kernel crashes in pde_subdir_find() due to an RCU stall. The NVIDIA driver is installed and loaded.
Issue
- The kernel crashes in pde_subdir_find() due to an RCU stall. The NVIDIA driver is installed and loaded.
[ 942.185947] Kernel panic - not syncing: RCU Stall
[ 942.260382] CPU: 46 PID: 138498 Comm: rmmod Kdump: loaded Tainted: P OE --------- - - 4.18.0-305.25.1.el8_4.x86_64 #1
[ 942.402943] Hardware name: HPE ProLiant XL675d Gen10 Plus/ProLiant XL675d Gen10 Plus, BIOS A47 02/23/2021
[ 942.518252] Call Trace:
[ 942.547614] <IRQ>
[ 942.571741] dump_stack+0x5c/0x80
[ 942.611588] panic+0xe7/0x2a9
[ 942.647247] rcu_sched_clock_irq.cold.92+0x266/0x3d3
[ 942.707006] ? timekeeping_advance+0x372/0x5a0
[ 942.760474] ? tick_sched_do_timer+0x60/0x60
[ 942.811844] update_process_times+0x24/0x60
[ 942.862166] tick_sched_handle+0x22/0x60
[ 942.909341] tick_sched_timer+0x37/0x70
[ 942.955474] __hrtimer_run_queues+0x100/0x280
[ 943.007893] hrtimer_interrupt+0x100/0x220
[ 943.057172] smp_apic_timer_interrupt+0x6a/0x130
[ 943.112735] apic_timer_interrupt+0xf/0x20
[ 943.162009] </IRQ>
[ 943.187182] RIP: 0010:pde_subdir_find+0x2d/0x70
[ 943.241696] Code: 00 00 41 55 41 54 55 53 48 8b 9f 80 00 00 00 48 85 db 74 3d 49 89 f5 89 d5 eb 0b 74 37 48 8b 5b 08 48 85 db 74 2b 0f b6 43 22 <39> c5 72 1a 77 ed 4c 8d a3 78 ff ff ff 89 ea 4c 89 ef 4c 89 e6 e8
[ 943.468107] RSP: 0018:ffffb60fb469fd58 EFLAGS: 00000282 ORIG_RAX: ffffffffffffff13
[ 943.559307] RAX: 0000000000000004 RBX: ffff8c2c9d3ad308 RCX: 0000000000000000
[ 943.645262] RDX: 000000000000000c RSI: ffff8c4d153198ab RDI: ffff8c2c9d3aeb00
[ 943.731217] RBP: 000000000000000c R08: ffff8c2cbfed86d8 R09: ffff8c2cbfed8730
[ 943.817173] R10: 0000000000000000 R11: ffffffffb985f308 R12: dead000000000200
[ 943.903135] R13: ffff8c4d153198ab R14: ffff8c4c2f258140 R15: ffff8c4c2f258140
[ 943.989097] remove_proc_subtree+0x74/0x160
[ 944.039494] nvswitch_procfs_device_remove+0x25/0x60 [nvidia]
[ 944.108776] nvswitch_remove.cold.31+0x149/0x177 [nvidia]
[ 944.173774] pci_device_remove+0x3b/0xc0
[ 944.220955] device_release_driver_internal+0x103/0x1f0
[ 944.283854] driver_detach+0x54/0x88
[ 944.326842] bus_remove_driver+0x77/0xc9
[ 944.374022] pci_unregister_driver+0x2d/0xb0
[ 944.425472] nvswitch_exit+0x2c/0x70 [nvidia]
[ 944.477964] nv_module_exit+0x47/0x60 [nvidia]
[ 944.531502] nvidia_exit_module+0x2b/0x50 [nvidia]
[ 944.589165] __x64_sys_delete_module+0x139/0x280
[ 944.644733] do_syscall_64+0x5b/0x1a0
[ 944.688766] entry_SYSCALL_64_after_hwframe+0x65/0xca
[ 944.749572] RIP: 0033:0x154fe2d9687b
[ 944.792558] Code: 73 01 c3 48 8b 0d 0d f6 2b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 b0 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d dd f5 2b 00 f7 d8 64 89 01 48
[ 945.018969] RSP: 002b:00007ffde3c4f1f8 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
[ 945.110170] RAX: ffffffffffffffda RBX: 00005562134ce7c0 RCX: 0000154fe2d9687b
[ 945.196129] RDX: 000000000000000a RSI: 0000000000000800 RDI: 00005562134ce828
[ 945.282085] RBP: 0000000000000000 R08: 00007ffde3c4e171 R09: 0000000000000000
[ 945.368040] R10: 0000154fe2e079a0 R11: 0000000000000206 R12: 00007ffde3c4f420
[ 945.454000] R13: 00007ffde3c50a53 R14: 00005562134ce2a0 R15: 00005562134ce7c0
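When triaging a trace like the one above, it can help to list the stack frames with the module each one belongs to, so that third-party frames (here tagged `[nvidia]`) stand out from core kernel frames. The following is a minimal sketch, not a supported tool: the function name `parse_frames` and the regular expression are assumptions made for illustration, and the pattern only covers the `symbol+0xoff/0xlen [module]` frame format seen in this log. Register dumps, `RIP:` lines, and `<IRQ>` markers are skipped.

```python
import re

# Hypothetical helper: matches frames such as
#   "[ 944.039494] nvswitch_procfs_device_remove+0x25/0x60 [nvidia]"
# A leading "?" marks a frame the unwinder considers unreliable.
FRAME_RE = re.compile(
    r"^\[\s*\d+\.\d+\]\s+(\?\s+)?([A-Za-z_][\w.]*)"
    r"\+0x[0-9a-f]+/0x[0-9a-f]+(?:\s+\[(\w+)\])?\s*$"
)


def parse_frames(log_text):
    """Return (symbol, module, unreliable) tuples for each stack frame line."""
    frames = []
    for line in log_text.splitlines():
        match = FRAME_RE.match(line.strip())
        if match is None:
            continue  # not a stack frame (registers, RIP, <IRQ>, ...)
        unreliable, symbol, module = match.groups()
        # Frames without a "[module]" suffix belong to the core kernel image.
        frames.append((symbol, module or "kernel", unreliable is not None))
    return frames


if __name__ == "__main__":
    sample = """\
[ 942.547614] <IRQ>
[ 942.707006] ? timekeeping_advance+0x372/0x5a0
[ 943.989097] remove_proc_subtree+0x74/0x160
[ 944.039494] nvswitch_procfs_device_remove+0x25/0x60 [nvidia]
"""
    for symbol, module, unreliable in parse_frames(sample):
        flag = " (unreliable)" if unreliable else ""
        print(f"{module:8s} {symbol}{flag}")
```

Applied to the full trace, this kind of listing makes it easy to see that the path into `remove_proc_subtree` comes from the `nvidia` module's exit routines during `rmmod`.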
Environment
- Red Hat Enterprise Linux 8.4.z - kernel-4.18.0-305.25.1.el8_4
- NVIDIA driver