IPMI errors on Nehalem systems with Winbond BMC

Solution Verified - Updated -

Environment

  • Red Hat Enterprise Linux 5
  • Intel Nehalem CPUs
  • Winbond Base-board Management Controller (BMC)
  • SuperMicro motherboard

Issue

ipmitool reports an error while listing sensor information:

# ipmitool sdr
CPU1 Temp        | 0 unspecified     | ok
CPU2 Temp        | 0 unspecified     | ok
Get SDR 0004 command failed: Unspecified error
CPU2 Temp        | disabled          | ns
CPU1 Vcore       | 0.86 Volts        | ok
CPU2 Vcore       | 0.87 Volts        | ok
+5V              | 5.12 Volts        | ok

At the same time, the following message is logged to /var/log/messages:

IPMI message handler: BMC returned incorrect response, expected netfn 7 cmd 35, got netfn 5 cmd 2d

Also, the kipmiN kernel helper threads are generating high CPU load.
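
Both symptoms can be checked from a shell. The following is a minimal sketch rather than part of the original report: the function names count_bmc_mismatch and kipmi_cpu are hypothetical, and the thread-name pattern assumes the helper threads are named kipmi followed by a number (for example kipmi0).

```shell
# Count log lines where the kernel reports a mismatched BMC response.
# $1: log file to scan (assumed to be /var/log/messages, as in this article).
count_bmc_mismatch() {
    grep -c 'BMC returned incorrect response' "$1"
}

# List kipmiN kernel threads with the CPU share reported by ps.
# Note: ps reports pcpu averaged over the lifetime of the thread.
kipmi_cpu() {
    ps -eo pid,pcpu,comm | awk 'NR == 1 || $3 ~ /^kipmi[0-9]+$/'
}
```

For example, run count_bmc_mismatch /var/log/messages as root, followed by kipmi_cpu. A non-zero count together with a busy kipmi thread matches the symptoms described above.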

Resolution

Occurrences of this issue are reported to have disappeared after updating the Winbond BMC's firmware to a newer version available from the motherboard vendor, SuperMicro (at ftp.supermicro.com). Specifically, the following firmware is now in use:

# ipmitool mc info
Device ID                 : 32
Device Revision           : 1
Firmware Revision         : 1.12
IPMI Version              : 2.0
Manufacturer ID           : 47488
Manufacturer Name         : Unknown (0xB980)
Product ID                : 43707 (0xaabb)
Product Name              : Unknown (0xAABB)
Device Available          : yes
Provides Device SDRs      : no

Root Cause

Suspected BMC firmware issue.

Diagnostic Steps

# ipmitool sdr
CPU1 Temp        | 0 unspecified     | ok
CPU2 Temp        | 0 unspecified     | ok
Get SDR 0004 command failed: Unspecified error
CPU2 Temp        | disabled          | ns
CPU1 Vcore       | 0.86 Volts        | ok
CPU2 Vcore       | 0.87 Volts        | ok
+5V              | 5.12 Volts        | ok
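
On systems with many sensors the failure line is easy to miss in the listing. As a rough sketch, the filter below keeps only the failed SDR reads; sdr_failures is a hypothetical helper name, and the pattern is based solely on the error line shown above.

```shell
# Print only the "Get SDR ... command failed" lines from ipmitool sdr output.
# Reads the output on standard input.
sdr_failures() {
    grep -E '^Get SDR [0-9A-Fa-f]+ command failed'
}
```

A possible invocation is ipmitool sdr 2>&1 | sdr_failures. The 2>&1 is there on the assumption that the error line may be written to standard error rather than standard output.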

Obtain BMC details:

# ipmitool mc info
Device ID                 : 32
Device Revision           : 1
Firmware Revision         : 1.9
IPMI Version              : 2.0
Manufacturer ID           : 47488
Manufacturer Name         : Unknown (0xB980)
Product ID                : 43707 (0xaabb)
Product Name              : Unknown (0xAABB)

After updating to the newer firmware:

Device ID                 : 32
Device Revision           : 1
Firmware Revision         : 1.32
IPMI Version              : 2.0
Manufacturer ID           : 47488
Manufacturer Name         : Unknown (0xB980)
Product ID                : 43707 (0xaabb)
Product Name              : Unknown (0xAABB)
Device Available          : yes
Provides Device SDRs      : no
Additional Device Support :
    Sensor Device
    SDR Repository Device
    SEL Device
    FRU Inventory Device
    IPMB Event Receiver
    IPMB Event Generator
    Chassis Device
Aux Firmware Rev Info     :
    0x01
    0x00
    0x00
    0x00
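
To compare the firmware level before and after the update without reading the full listing, the revision field can be extracted on its own. This is a minimal sketch that assumes the "Firmware Revision" line format shown in the output above; fw_rev is a hypothetical helper name.

```shell
# Extract the value of the "Firmware Revision" field from ipmitool mc info
# output supplied on standard input.
fw_rev() {
    awk -F' *: *' '/^Firmware Revision/ { print $2 }'
}
```

For example, ipmitool mc info | fw_rev can be run once before and once after flashing the BMC, and the two values compared.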

Comments

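An upstream mailing list thread suggests that this may be a hardware, BMC firmware, or BIOS issue. The "incorrect response" message indicates that the IPMI device and the IPMI handler code have gone out of synchronisation: the handler received a reply whose network function and command (netfn 5, cmd 2d) do not match the request it was waiting on (netfn 7, cmd 35).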

Refer to "kipmi kernel helper thread kipmi0 is generating high CPU load" for general information about kipmiN kernel helper threads generating high CPU load.

  • Component: hal
