HP server crashed with "OsKcsExecCmd: IPMI NetFN 0x6 CMD: 0x25 has timed out!" message in the logs

Solution Unverified - Updated -

Environment

  • Red Hat Enterprise Linux

  • HP ProLiant

Issue

  • The server crashes, and the following message is seen in the logs before the crash:

    Feb  2 17:24:46 server ntpd[6072]: synchronized to 209.135.35.181, stratum 2
    Feb  2 20:28:24 server hpasmlited[7505]: OsKcsExecCmd:  IPMI NetFN  0x6   CMD: 0x25 has timed out! 
    Feb  2 15:33:06 server syslogd 1.4.1: restart.
    Feb  2 15:33:06 server kernel: klogd 1.4.1, log source = /proc/kmsg started.
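
To check whether a host has logged the same timeout, the message can be matched programmatically. The following is a minimal Python sketch and not part of the original solution: the function name find_ipmi_timeouts and the regular expression are illustrative assumptions, derived only from the hpasmlited line quoted above.

```python
import re

# Pattern built from the hpasmlited line quoted in this article.
# Spacing between fields varies in the log, so whitespace is matched loosely.
TIMEOUT_RE = re.compile(
    r"hpasmlited\[\d+\]:\s*OsKcsExecCmd:\s*IPMI\s+NetFN\s+(0x[0-9a-fA-F]+)\s+"
    r"CMD:\s*(0x[0-9a-fA-F]+)\s+has timed out"
)


def find_ipmi_timeouts(lines):
    """Return a list of (line, netfn, cmd) tuples for each matching log line.

    `lines` is any iterable of strings, for example an open log file.
    This helper is hypothetical and only illustrates the search.
    """
    hits = []
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            hits.append((line.rstrip("\n"), match.group(1), match.group(2)))
    return hits
```

The function accepts any iterable of lines, so an open handle on the system log (commonly /var/log/messages on Red Hat Enterprise Linux) can be passed directly. A match only shows that the message was logged; it does not by itself confirm the firmware issue.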
    

Resolution

  • This is an HP firmware issue. HP recommends that all customers update the firmware.

Root Cause

  • The following HP support document and forum threads give some insight:

    http://h20000.www2.hp.com/bizsupport/TechSupport/Document.jsp?lang=en&cc=us&taskId=120&prodSeriesId=316587&prodTypeId=15351&objectID=c01330219
    http://forums13.itrc.hp.com/service/forums/questionanswer.do?admit=109447627+1297095468105+28353475&threadId=1135440
    http://forums13.itrc.hp.com/service/forums/questionanswer.do?admit=109447627+1297095454991+28353475&threadId=1218554

Diagnostic Steps

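  • The kernel log shows ipmi_si failing to claim IRQ 10 and falling back to polled mode. An NMI watchdog lockup is later reported on CPU 0 in the kipmi0 thread:

ipmi_si: Trying PCI-specified kcs state machine at mem address 0xf5ef0000, slave address 0x0, irq 10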
IRQ handler type mismatch for IRQ 10

Call Trace:
 [<ffffffff800b80e9>] setup_irq+0x1b7/0x1cf
 [<ffffffff885689b6>] :ipmi_si:si_irq_handler+0x0/0x5c
 [<ffffffff800b81b1>] request_irq+0xb0/0xd6
 [<ffffffff88569661>] :ipmi_si:std_irq_setup+0x6e/0xc2
 [<ffffffff8856a76c>] :ipmi_si:smi_start_processing+0x1b/0xde
 [<ffffffff885582dc>] :ipmi_msghandler:ipmi_register_smi+0x2d2/0xd86
 [<ffffffff8004aba9>] try_to_del_timer_sync+0x51/0x5a
 [<ffffffff8005b4a4>] del_timer_sync+0xc/0x16
 [<ffffffff80063964>] schedule_timeout+0x92/0xad
 [<ffffffff885691d7>] :ipmi_si:try_smi_init+0x453/0x65c
 [<ffffffff8856a23a>] :ipmi_si:ipmi_pci_probe+0xa0/0x187
 [<ffffffff8015506a>] pci_device_probe+0x104/0x184
 [<ffffffff801b6e18>] driver_probe_device+0x52/0xaa
 [<ffffffff801b6f47>] __driver_attach+0x65/0xb6
 [<ffffffff801b6ee2>] __driver_attach+0x0/0xb6
 [<ffffffff801b6761>] bus_for_each_dev+0x43/0x6e
 [<ffffffff801b63a7>] bus_add_driver+0x7e/0x130
 [<ffffffff80155242>] __pci_register_driver+0x4b/0x6c
 [<ffffffff88569ff5>] :ipmi_si:init_ipmi_si+0x5ef/0x786
 [<ffffffff800a3e6a>] sys_init_module+0xaf/0x1e8
 [<ffffffff8005d28d>] tracesys+0xd5/0xe0

ipmi_si: ipmi_si unable to claim interrupt 10, running polled
spurious 8259A interrupt: IRQ7.
ipmi: interfacing existing BMC (man_id: 0x00000b, prod_id: 0x0000, dev_id: 0x11)
 IPMI kcs interface initialized
ipmi device interface
[..]

NMI Watchdog detected LOCKUP on CPU 0
CPU 0 
Modules linked in: ipmi_devintf ipmi_si ipmi_msghandler nfsd exportfs lockd nfs_acl auth_rpcgss ipv6 xfrm_nalgo crypto_api sunrpc bonding dm_round_robin dm_multipath scsi_dh video hwmon backlight sbs i2c_ec button battery asus_acpi acpi_memhotplug ac parport_pc lp parport sg ide_cd cdrom i2c_piix4 qla2400(U) e1000e i2c_core serio_raw qla2xxx(FU) hpilo shpchp bnx2 pcspkr dm_raid45 dm_message dm_region_hash dm_mem_cache dm_snapshot dm_zero dm_mirror dm_log dm_mod cciss ext3 jbd uhci_hcd ohci_hcd ehci_hcd sd_mod scsi_mod qla2xxx_conf(FU) intermodule(U)
Pid: 6832, comm: kipmi0 Tainted: GF     2.6.18-128.1.1.el5 #1
RIP: 0010:[<ffffffff80013229>]  [<ffffffff80013229>] mask_and_ack_8259A+0x42/0xd2
RSP: 0018:ffffffff80425e38  EFLAGS: 00000002
RAX: 00000000000000a4 RBX: 0000000000000400 RCX: 000000000000000a
RDX: ffffffff8041df00 RSI: ffffffff80425e98 RDI: ffffffff802f15b8
RBP: 000000000000000a R08: ffff81032c5084f8 R09: ffff810508874500
R10: ffff81032bbd91c0 R11: ffffffff8805e5a6 R12: 0000000000000016
R13: 000000000000000a R14: ffffffff803c4dbc R15: ffffffff80425e98
FS:  00002ab91f8332b0(0000) GS:ffffffff803ac000(0000) knlGS:00000000f7eda6c0
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 00000000ff927a40 CR3: 000000031d71e000 CR4: 00000000000006e0
Process kipmi0 (pid: 6832, threadinfo ffff81062be8a000, task ffff81032fbd8100)
Stack:  ffffffff803c4d80 0000000000000a00 000000000000000a ffffffff800b7aa3
 ffff810508874500 000000000000000a ffffffff80425e98 ffffffff803c3f80
 000000000000000a 0000000000000000 ffffffff8009d916 ffffffff8006c95d
Call Trace:
 <IRQ>  [<ffffffff800b7aa3>] __do_IRQ+0x5c/0x103
 [<ffffffff8009d916>] keventd_create_kthread+0x0/0xc4
 [<ffffffff8006c95d>] do_IRQ+0xe7/0xf5
 [<ffffffff8005d615>] ret_from_intr+0x0/0xa
 [<ffffffff8805e5a6>] :scsi_mod:scsi_done+0x0/0x18
 [<ffffffff80011f84>] __do_softirq+0x51/0x133
 [<ffffffff8005e2fc>] call_softirq+0x1c/0x28
 [<ffffffff8006cada>] do_softirq+0x2c/0x85
 [<ffffffff8006c962>] do_IRQ+0xec/0xf5
 [<ffffffff8005d615>] ret_from_intr+0x0/0xa
 <EOI>  [<ffffffff80064c08>] _spin_unlock_irqrestore+0x8/0x9
 [<ffffffff88568ae6>] :ipmi_si:ipmi_thread+0x48/0x74
 [<ffffffff88568a9e>] :ipmi_si:ipmi_thread+0x0/0x74
 [<ffffffff80032360>] kthread+0xfe/0x132
 [<ffffffff8005dfb1>] child_rip+0xa/0x11
 [<ffffffff8009d916>] keventd_create_kthread+0x0/0xc4
 [<ffffffff80032262>] kthread+0x0/0x132
 [<ffffffff8005dfa7>] child_rip+0x0/0x11


Code: 40 88 e8 83 e0 07 83 c0 60 0f b6 c0 e6 a0 b0 62 eb 11 e4 21 
BUG: warning at arch/x86_64/kernel/crash.c:148/nmi_shootdown_cpus() (Tainted: GF    )
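
The indicators in the kernel output above can be picked out of a captured log with a short script. This is a minimal Python sketch under stated assumptions: the marker strings are copied from the messages shown in this article, while the names MARKERS and summarize_kernel_log are hypothetical and chosen only for illustration.

```python
# Substrings taken verbatim from the kernel messages in this article.
# The dictionary keys are illustrative labels, not kernel terminology.
MARKERS = {
    "irq_mismatch": "IRQ handler type mismatch for IRQ",
    "polled_fallback": "unable to claim interrupt",
    "nmi_lockup": "NMI Watchdog detected LOCKUP",
}


def summarize_kernel_log(lines):
    """Group log lines by which marker substring they contain.

    Returns a dict mapping each label in MARKERS to the list of
    matching lines (stripped), in the order they were seen.
    """
    found = {name: [] for name in MARKERS}
    for line in lines:
        for name, needle in MARKERS.items():
            if needle in line:
                found[name].append(line.strip())
    return found
```

A log that matches all three markers resembles the sequence shown in this article, but the match is only a hint; the firmware level still needs to be checked against HP's recommendation.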

This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.
