Server with the nvidia module loaded crashed with the message "Kernel panic - not syncing: An NMI occurred, please see the Integrated Management Log for details."
Issue
- The server crashed with the following call trace in the log:
ACPI: PCI Interrupt 0000:02:00.0[A] -> GSI 16 (level, low) -> IRQ 177
hpwdt: New timer passed in is 30 seconds.
hp Watchdog Timer Driver: 1.1.3, timer margin: 30 seconds(nowayout=0), allow kernel dump: ON (default = 0/OFF) <<<-----
BUG: soft lockup - CPU#1 stuck for 60s! [CFXSlaveSF-5.76:21554]
CPU 1:
Modules linked in: sg hpwdt(U) mptctl mptbase autofs4 nfs fscache nfs_acl ipmi_devintf ipmi_si ipmi_msghandler lockd sunrpc mlx4_en bonding be2iscsi ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp bnx2i cnic ipv6 xfrm_nalgo crypto_api uio cxgb3i cxgb3 libiscsi_tcp libiscsi2 scsi_transport_iscsi2 scsi_transport_iscsi dm_multipath scsi_dh video backlight sbs power_meter hwmon i2c_ec dell_wmi wmi button battery asus_acpi acpi_memhotplug ac parport_pc lp parport joydev snd_hda_intel snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device nvidia(PU) snd_pcm_oss snd_mixer_oss tpm_tis igb shpchp snd_pcm i7core_edac snd_timer 8021q mlx4_core snd_page_alloc tpm pcspkr serio_raw edac_mc tpm_bios snd_hwdep i2c_core hpilo snd dca soundcore dm_raid45 dm_message dm_region_hash dm_mem_cache dm_snapshot dm_zero dm_mirror dm_log dm_mod usb_storage ata_piix libata cciss sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd
Pid: 21554, comm: CFXSlaveSF-5.76 Tainted: P 2.6.18-238.9.1.el5 #1 <<<------ ( Proprietary Module is loaded on the system )
RIP: 0010:[<ffffffff8841de46>] [<ffffffff8841de46>] :nvidia:_nv002298rm+0x317/0x5ab <<<------
RSP: 0018:ffff81052be6fb08 EFLAGS: 00000206
RAX: 0000000000000000 RBX: ffff8105761d33c8 RCX: ffffc20015e00000
RDX: 0000000000006858 RSI: 0000000000000000 RDI: ffff810c7b194288
RBP: 0000000000000004 R08: 0000000000000004 R09: 0000000000000004
R10: ffff8105761d32b8 R11: 0000000000000018 R12: ffff8105761d32b8
R13: 0000000000000018 R14: ffff8105761d33c0 R15: 00000000000a5078
FS: 00002ae6f16250e0(0000) GS:ffff810680042a40(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00002ae888180000 CR3: 00000006602c1000 CR4: 00000000000006e0
Call Trace:
[<ffffffff8841de01>] :nvidia:_nv002298rm+0x2d2/0x5ab
[<ffffffff88417f38>] :nvidia:_nv002162rm+0x165/0x10c9
[<ffffffff8841c796>] :nvidia:_nv002126rm+0x432/0x539
[<ffffffff8841b565>] :nvidia:_nv002147rm+0x1f6/0x257
[<ffffffff887e0f02>] :nvidia:_nv003998rm+0xa9d5/0xd0b8
[<ffffffff887df5dd>] :nvidia:_nv003998rm+0x90b0/0xd0b8
[<ffffffff8837d4be>] :nvidia:_nv009816rm+0x25/0x40
[<ffffffff88a76d01>] :nvidia:_nv014627rm+0x7c8/0x942
[<ffffffff88a77e51>] :nvidia:_nv001089rm+0x522/0x7a1
[<ffffffff88a6e84c>] :nvidia:rm_init_adapter+0xae/0x1bb
[<ffffffff88a66c90>] :nvidia:_nv014603rm+0x3f/0x6e
[<ffffffff88a930d2>] :nvidia:nv_kern_open+0x597/0x702
[<ffffffff800496d9>] chrdev_open+0x14d/0x183
[<ffffffff8004958c>] chrdev_open+0x0/0x183
[<ffffffff8001eb99>] __dentry_open+0xd9/0x1dc
[<ffffffff8002766e>] do_filp_open+0x2a/0x38
[<ffffffff8001a061>] do_sys_open+0x44/0xbe
[<ffffffff8005d116>] system_call+0x7e/0x83
Kernel panic - not syncing: An NMI occurred, please see the Integrated Management Log for details.
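
When reading a trace like the one above, it can help to tally which module each stack frame belongs to and to check the taint flag. The following is a minimal sketch, not a Red Hat tool: the function name summarize_oops, the regular expressions, and the shortened SAMPLE text are assumptions made for illustration, and they only cover the frame format shown in this log (2.6.18-era ":module:symbol+0xoff/0xlen").

```python
import re
from collections import Counter

# Matches frames such as "[<ffffffff8841de01>] :nvidia:_nv002298rm+0x2d2/0x5ab"
# and core-kernel frames such as "[<ffffffff800496d9>] chrdev_open+0x14d/0x183".
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+(?::(?P<module>\w+):)?(?P<symbol>[\w.]+)"
    r"\+0x[0-9a-f]+/0x[0-9a-f]+"
)
# Matches the taint flags that precede the kernel version on the "Pid:" line.
TAINT_RE = re.compile(r"Tainted:\s+(?P<flags>[A-Z ]+?)\s+\d")

# Shortened excerpt of the log above, used only to exercise the sketch.
SAMPLE = """\
Pid: 21554, comm: CFXSlaveSF-5.76 Tainted: P 2.6.18-238.9.1.el5 #1
RIP: 0010:[<ffffffff8841de46>] [<ffffffff8841de46>] :nvidia:_nv002298rm+0x317/0x5ab
Call Trace:
[<ffffffff8841de01>] :nvidia:_nv002298rm+0x2d2/0x5ab
[<ffffffff88a930d2>] :nvidia:nv_kern_open+0x597/0x702
[<ffffffff800496d9>] chrdev_open+0x14d/0x183
"""


def summarize_oops(text):
    """Return the RIP module, per-module frame counts, and taint flags."""
    frames = Counter()
    rip_module = None
    for line in text.splitlines():
        match = FRAME_RE.search(line)
        if match is None:
            continue
        # Frames without a ":module:" prefix belong to the core kernel.
        module = match.group("module") or "kernel"
        if line.lstrip().startswith("RIP:"):
            rip_module = module
        else:
            frames[module] += 1
    taint = TAINT_RE.search(text)
    flags = taint.group("flags").strip() if taint else ""
    return {
        "rip_module": rip_module,
        "frames": dict(frames),
        "taint_flags": flags,
        # "P" marks a proprietary module, as annotated in the log above.
        "proprietary": "P" in flags,
    }


if __name__ == "__main__":
    print(summarize_oops(SAMPLE))
```

In this log the instruction pointer and most frames fall inside the nvidia module, which is consistent with the "Tainted: P" annotation; the sketch only summarizes the trace and does not by itself establish a root cause.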
Environment
- Red Hat Enterprise Linux 5
- kernel-2.6.18-238.9.1.el5