Kernel panic with nvidia_open
Issue
- Kernel panic in nvidia_open (list_add corruption) after repeated RmInitAdapter failures, with the following logs:
[1065876.089887] NVRM: GPU 0000:b3:00.0: RmInitAdapter failed! (0x23:0x65:1540)
[1065876.089922] NVRM: GPU 0000:b3:00.0: rm_init_adapter failed, device minor number 7
[1065933.026967] NVRM: GPU 0000:b3:00.0: RmInitAdapter failed! (0x23:0x65:1540)
[1065933.027006] NVRM: GPU 0000:b3:00.0: rm_init_adapter failed, device minor number 7
[1065995.680473] list_add corruption. next->prev should be prev (ffff8d58b0cc47c8), but was 0000000000000000. (next=ffff8d5782264260).
[1065995.680614] ------------[ cut here ]------------
[1065995.680615] kernel BUG at lib/list_debug.c:25!
[1065995.680710] invalid opcode: 0000 [#1] SMP NOPTI
[1065995.680813] CPU: 27 PID: 3842608 Comm: python3 Kdump: loaded Tainted: P OE --------- - - 4.18.0-372.75.1.el8_6.x86_64 #1
[1065995.680928] Hardware name: Supermicro CSRL20A/X11DGO-T, BIOS 3.4a 03/11/2021
[1065995.681023] RIP: 0010:__list_add_valid.cold.0+0x12/0x28
[1065995.681118] Code: 00 48 8b 50 08 48 39 f2 0f 85 46 00 00 00 b8 01 00 00 00 e9 20 71 72 00 48 89 d1 48 c7 c7 60 80 32 aa 48 89 c2 e8 62 a0 c8 ff <0f> 0b 48 89 c1 4c 89 c6 48 c7 c7 b8 80 32 aa e8 4e a0 c8 ff 0f 0b
[1065995.681255] RSP: 0018:ffffb85f1ab0fc40 EFLAGS: 00010246
[1065995.681345] RAX: 0000000000000075 RBX: ffff8cf753088000 RCX: 0000000000000000
[1065995.681441] RDX: 0000000000000000 RSI: ffff8db3ff9d6798 RDI: ffff8db3ff9d6798
[1065995.681536] RBP: ffff8cf753088260 R08: 0000000000000000 R09: c0000000ffff7fff
[1065995.681631] R10: 0000000000000001 R11: ffffb85f1ab0fa60 R12: ffff8d5782264260
[1065995.681754] R13: ffff8d58b0cc4000 R14: ffff8d58b0cc47b0 R15: ffff8d58b0cc47c8
[1065995.681903] FS: 00007fa26315e740(0000) GS:ffff8db3ff9c0000(0000) knlGS:0000000000000000
[1065995.682057] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[1065995.682196] CR2: 00007fa25e0e1a90 CR3: 000000616ba02003 CR4: 00000000007706e0
[1065995.682346] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[1065995.682495] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[1065995.682644] PKRU: 55555554
[1065995.682750] Call Trace:
[1065995.682857] nvidia_open+0x27d/0x450 [nvidia]
[1065995.683357] chrdev_open+0xcb/0x1e0
[1065995.683469] ? cdev_default_release+0x20/0x20
[1065995.683582] do_dentry_open+0x132/0x350
[1065995.683694] path_openat+0x542/0x14f0
[1065995.683808] ? security_inode_alloc+0x24/0x90
[1065995.683923] ? kmem_cache_alloc+0x13f/0x280
[1065995.684039] ? security_inode_alloc+0x45/0x90
[1065995.684152] do_filp_open+0x93/0x100
[1065995.684263] ? getname_flags+0x4a/0x1e0
[1065995.684375] ? __check_object_size+0xac/0x173
[1065995.684491] do_sys_open+0x188/0x220
[1065995.684601] do_syscall_64+0x5b/0x1b0
[1065995.684714] entry_SYSCALL_64_after_hwframe+0x61/0xc6
[1065995.684833] RIP: 0033:0x7fa26259a20f
[1065995.684943] Code: 52 89 f0 25 00 00 41 00 3d 00 00 41 00 74 44 8b 05 46 d2 20 00 85 c0 75 65 89 f2 b8 01 01 00 00 48 89 fe bf 9c ff ff ff 0f 05 <48> 3d 00 f0 ff ff 0f 87 9d 00 00 00 48 8b 4c 24 28 64 48 33 0c 25
[1065995.685161] RSP: 002b:00007ffc91f2fdf0 EFLAGS: 00000246 ORIG_RAX: 0000000000000101
[1065995.685312] RAX: ffffffffffffffda RBX: 00007ffc91f2fe80 RCX: 00007fa26259a20f
[1065995.685462] RDX: 0000000000080802 RSI: 00007ffc91f2fe80 RDI: 00000000ffffff9c
[1065995.685611] RBP: 00007ffc91f2ff30 R08: 0000000000000000 R09: 00007ffc91f2fba7
[1065995.685761] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000080802
[1065995.685910] R13: 0000000000000007 R14: 0000000000000802 R15: 00007fa25f174ea0
[1065995.686060] Modules linked in: nf_tables nfnetlink dm_mod nvidia_uvm(OE) nfsv3 nfs_acl nfs lockd grace fscache sunrpc nvidia_drm(POE) nvidia_modeset(POE) intel_rapl_msr iTCO_wdt iTCO_vendor_support intel_rapl_common isst_if_common skx_edac nfit libnvdimm x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel rapl intel_cstate ipmi_ssif nvidia(POE) intel_uncore joydev pcspkr mei_me mei i2c_i801 lpc_ich ioatdma acpi_ipmi ipmi_si acpi_power_meter acpi_pad binfmt_misc xfs libcrc32c nouveau drm_vram_helper drm_ttm_helper mxm_wmi ttm video i2c_algo_bit drm_kms_helper ixgbe nvme syscopyarea ahci sysfillrect sysimgblt fb_sys_fops libahci nvme_core mdio drm crc32c_intel t10_pi libata dca wmi ipmi_devintf ipmi_msghandler fuse
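To check whether another system's kernel log shows the same signature, a minimal sketch along the following lines may help. It assumes the log text is available as plain text (for example from `dmesg`, or from a dmesg file saved with a crash dump). The function name `scan_kernel_log` and the three patterns are illustrative choices derived only from the log lines quoted above; they are not part of any NVIDIA or Red Hat tooling.

```python
import re
import sys

# Patterns derived from the log lines quoted in this article (illustrative only).
RM_INIT_FAILED = re.compile(r"NVRM: GPU (\S+): RmInitAdapter failed!")
LIST_CORRUPTION = re.compile(r"list_add corruption")
NVIDIA_OPEN_FRAME = re.compile(r"nvidia_open\+0x[0-9a-f]+/0x[0-9a-f]+ \[nvidia\]")


def scan_kernel_log(text):
    """Report whether the log text contains the signature shown in this article."""
    failures = {}  # PCI address -> number of RmInitAdapter failures seen
    corruption = False
    nvidia_open_in_trace = False
    for line in text.splitlines():
        match = RM_INIT_FAILED.search(line)
        if match:
            gpu = match.group(1)
            failures[gpu] = failures.get(gpu, 0) + 1
        if LIST_CORRUPTION.search(line):
            corruption = True
        if NVIDIA_OPEN_FRAME.search(line):
            nvidia_open_in_trace = True
    return {
        "rm_init_failures": failures,
        "list_add_corruption": corruption,
        "nvidia_open_in_trace": nvidia_open_in_trace,
        # All three indicators together resemble the log in this article.
        "matches": bool(failures) and corruption and nvidia_open_in_trace,
    }


if __name__ == "__main__":
    # Read the kernel log from standard input and print the summary.
    print(scan_kernel_log(sys.stdin.read()))
```

Saved under a hypothetical name such as `scan_nvidia_panic.py`, it could be run as `dmesg | python3 scan_nvidia_panic.py`. A match only indicates a similar log pattern; it does not confirm the same root cause.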
Environment
- Red Hat Enterprise Linux 8
- NVIDIA third-party kernel module 550.90.07/555.42.02