Kernel panic with nvidia_open

Solution Verified

Issue

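  • The kernel panics in nvidia_open with a list_add corruption BUG after repeated RmInitAdapter failures on the same GPU. The kernel log shows: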
[1065876.089887] NVRM: GPU 0000:b3:00.0: RmInitAdapter failed! (0x23:0x65:1540)
[1065876.089922] NVRM: GPU 0000:b3:00.0: rm_init_adapter failed, device minor number 7
[1065933.026967] NVRM: GPU 0000:b3:00.0: RmInitAdapter failed! (0x23:0x65:1540)
[1065933.027006] NVRM: GPU 0000:b3:00.0: rm_init_adapter failed, device minor number 7
[1065995.680473] list_add corruption. next->prev should be prev (ffff8d58b0cc47c8), but was 0000000000000000. (next=ffff8d5782264260).
[1065995.680614] ------------[ cut here ]------------
[1065995.680615] kernel BUG at lib/list_debug.c:25!
[1065995.680710] invalid opcode: 0000 [#1] SMP NOPTI
[1065995.680813] CPU: 27 PID: 3842608 Comm: python3 Kdump: loaded Tainted: P           OE    --------- -  - 4.18.0-372.75.1.el8_6.x86_64 #1
[1065995.680928] Hardware name: Supermicro CSRL20A/X11DGO-T, BIOS 3.4a 03/11/2021
[1065995.681023] RIP: 0010:__list_add_valid.cold.0+0x12/0x28
[1065995.681118] Code: 00 48 8b 50 08 48 39 f2 0f 85 46 00 00 00 b8 01 00 00 00 e9 20 71 72 00 48 89 d1 48 c7 c7 60 80 32 aa 48 89 c2 e8 62 a0 c8 ff <0f> 0b 48 89 c1 4c 89 c6 48 c7 c7 b8 80 32 aa e8 4e a0 c8 ff 0f 0b
[1065995.681255] RSP: 0018:ffffb85f1ab0fc40 EFLAGS: 00010246
[1065995.681345] RAX: 0000000000000075 RBX: ffff8cf753088000 RCX: 0000000000000000
[1065995.681441] RDX: 0000000000000000 RSI: ffff8db3ff9d6798 RDI: ffff8db3ff9d6798
[1065995.681536] RBP: ffff8cf753088260 R08: 0000000000000000 R09: c0000000ffff7fff
[1065995.681631] R10: 0000000000000001 R11: ffffb85f1ab0fa60 R12: ffff8d5782264260
[1065995.681754] R13: ffff8d58b0cc4000 R14: ffff8d58b0cc47b0 R15: ffff8d58b0cc47c8
[1065995.681903] FS:  00007fa26315e740(0000) GS:ffff8db3ff9c0000(0000) knlGS:0000000000000000
[1065995.682057] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[1065995.682196] CR2: 00007fa25e0e1a90 CR3: 000000616ba02003 CR4: 00000000007706e0
[1065995.682346] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[1065995.682495] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[1065995.682644] PKRU: 55555554
[1065995.682750] Call Trace:
[1065995.682857]  nvidia_open+0x27d/0x450 [nvidia]
[1065995.683357]  chrdev_open+0xcb/0x1e0
[1065995.683469]  ? cdev_default_release+0x20/0x20
[1065995.683582]  do_dentry_open+0x132/0x350
[1065995.683694]  path_openat+0x542/0x14f0
[1065995.683808]  ? security_inode_alloc+0x24/0x90
[1065995.683923]  ? kmem_cache_alloc+0x13f/0x280
[1065995.684039]  ? security_inode_alloc+0x45/0x90
[1065995.684152]  do_filp_open+0x93/0x100
[1065995.684263]  ? getname_flags+0x4a/0x1e0
[1065995.684375]  ? __check_object_size+0xac/0x173
[1065995.684491]  do_sys_open+0x188/0x220
[1065995.684601]  do_syscall_64+0x5b/0x1b0
[1065995.684714]  entry_SYSCALL_64_after_hwframe+0x61/0xc6
[1065995.684833] RIP: 0033:0x7fa26259a20f
[1065995.684943] Code: 52 89 f0 25 00 00 41 00 3d 00 00 41 00 74 44 8b 05 46 d2 20 00 85 c0 75 65 89 f2 b8 01 01 00 00 48 89 fe bf 9c ff ff ff 0f 05 <48> 3d 00 f0 ff ff 0f 87 9d 00 00 00 48 8b 4c 24 28 64 48 33 0c 25
[1065995.685161] RSP: 002b:00007ffc91f2fdf0 EFLAGS: 00000246 ORIG_RAX: 0000000000000101
[1065995.685312] RAX: ffffffffffffffda RBX: 00007ffc91f2fe80 RCX: 00007fa26259a20f
[1065995.685462] RDX: 0000000000080802 RSI: 00007ffc91f2fe80 RDI: 00000000ffffff9c
[1065995.685611] RBP: 00007ffc91f2ff30 R08: 0000000000000000 R09: 00007ffc91f2fba7
[1065995.685761] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000080802
[1065995.685910] R13: 0000000000000007 R14: 0000000000000802 R15: 00007fa25f174ea0
[1065995.686060] Modules linked in: nf_tables nfnetlink dm_mod nvidia_uvm(OE) nfsv3 nfs_acl nfs lockd grace fscache sunrpc nvidia_drm(POE) nvidia_modeset(POE) intel_rapl_msr iTCO_wdt iTCO_vendor_support intel_rapl_common isst_if_common skx_edac nfit libnvdimm x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel rapl intel_cstate ipmi_ssif nvidia(POE) intel_uncore joydev pcspkr mei_me mei i2c_i801 lpc_ich ioatdma acpi_ipmi ipmi_si acpi_power_meter acpi_pad binfmt_misc xfs libcrc32c nouveau drm_vram_helper drm_ttm_helper mxm_wmi ttm video i2c_algo_bit drm_kms_helper ixgbe nvme syscopyarea ahci sysfillrect sysimgblt fb_sys_fops libahci nvme_core mdio drm crc32c_intel t10_pi libata dca wmi ipmi_devintf ipmi_msghandler fuse
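
To check whether another system's kernel log shows the same signature, a small script can look for the three markers seen in the log above: an RmInitAdapter failure, a later list_add corruption message, and nvidia_open in the call trace that follows. This is a minimal sketch, not a supported diagnostic tool. The function name check_nvidia_open_panic and the returned dictionary keys are hypothetical; the patterns are taken only from the log excerpt above, so other driver versions may word these messages differently.

```python
import re

# Patterns derived from the log excerpt in this article (assumption: other
# driver or kernel versions may format these messages differently).
RM_INIT_FAILED = re.compile(r"NVRM: GPU ([0-9a-fA-F:.]+): RmInitAdapter failed!")
LIST_ADD_CORRUPTION = re.compile(r"list_add corruption")
NVIDIA_OPEN_FRAME = re.compile(r"\bnvidia_open\+0x[0-9a-f]+/0x[0-9a-f]+ \[nvidia\]")


def check_nvidia_open_panic(log_text):
    """Scan kernel log text for the signature described in this article.

    Hypothetical helper. Returns a dict with:
      gpus    - PCI addresses that reported RmInitAdapter failures
      matches - True only if a list_add corruption message appears after
                an RmInitAdapter failure and nvidia_open appears after that
    """
    gpus = []
    corruption_after_failure = False
    nvidia_open_after_corruption = False

    for line in log_text.splitlines():
        failure = RM_INIT_FAILED.search(line)
        if failure:
            if failure.group(1) not in gpus:
                gpus.append(failure.group(1))
            continue
        # Only count the corruption message if an init failure preceded it.
        if gpus and LIST_ADD_CORRUPTION.search(line):
            corruption_after_failure = True
            continue
        # Only count the nvidia_open frame if it follows the corruption message.
        if corruption_after_failure and NVIDIA_OPEN_FRAME.search(line):
            nvidia_open_after_corruption = True

    return {
        "gpus": gpus,
        "matches": corruption_after_failure and nvidia_open_after_corruption,
    }
```

The function takes the log as a string, so it can be fed saved dmesg output or, where kdump captured the crash, the contents of the vmcore-dmesg.txt file. A match only indicates that the log resembles the one in this article; it does not confirm the root cause.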

Environment

  • Red Hat Enterprise Linux 8
  • NVIDIA third-party kernel module, version 550.90.07 or 555.42.02
