NMI messages and MCEs on HP ProLiant DL585 G5/G6 or DL685 G5

Issue

  • NMI messages are being received on HP ProLiant DL585 G6 systems:

    Jan 12 22:47:05 example kernel: Uhhuh. NMI received for unknown reason 30. 
    Jan 12 22:47:05 example kernel: Dazed and confused, but trying to continue 
    Jan 12 22:47:05 example kernel: Do you have a strange power saving mode enabled? 
    
  • The system reboots or hangs without generating a core dump after the NMI messages above.

  • The system generates a Machine-Check Exception (MCE) referencing bank 4 (which indicates a Northbridge or DRAM error on AMD processors):

    CPU 1: Machine Check Exception:             4 Bank 4: ba00000000070f0f
    TSC 622520147de MISC e00c0ffe01000000

    HARDWARE ERROR. This is *NOT* a software problem!
    Please contact your hardware vendor
    CPU 1 BANK 4 TSC 622520147de [at 1867 Mhz 0 days 1:0:13 uptime (unreliable)]
    MISC e00c0ffe01000000
    MCG status:MCIP
    MCi status:
    Uncorrected error
    Error enabled
    MCi_MISC register valid
    Processor context corrupt
    MCA: BUS Level-3 Generic Generic Other-transaction Request-no-timeout Error
    <16:7> BQ_DCU_READ_TYPE BQ_ERR_HARD_TYPE BQ_ERR_HARD_TYPE
    STATUS ba00000000070f0f MCGSTATUS 4
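
The "MCi status" lines in the MCE output above can be cross-checked against the raw STATUS value. The following is a minimal sketch, not part of the original solution, that assumes the architectural MCi_STATUS flag layout (VAL, OVER, UC, EN, MISCV, ADDRV, PCC in bits 63 down to 57). The function name `decode_mci_status` and the dictionary keys are hypothetical and chosen for illustration only. It does not interpret the vendor-specific error codes.

```python
# Bit positions of the architectural flags in an MCi_STATUS register
# (assumed layout; check the processor vendor's manual for your CPU).
STATUS_FLAGS = {
    "VAL": 63,    # register contains a valid error
    "OVER": 62,   # an earlier error was overwritten
    "UC": 61,     # uncorrected error
    "EN": 60,     # error reporting was enabled
    "MISCV": 59,  # MCi_MISC register valid
    "ADDRV": 58,  # MCi_ADDR register valid
    "PCC": 57,    # processor context corrupt
}


def decode_mci_status(status: int) -> dict:
    """Split a 64-bit MCi_STATUS value into flags and raw code fields."""
    flags = {name: bool((status >> bit) & 1) for name, bit in STATUS_FLAGS.items()}
    return {
        "flags": flags,
        # Low 16 bits: MCA error code (left undecoded here).
        "mca_error_code": status & 0xFFFF,
        # Bits 31:16: model-specific error code (left undecoded here).
        "model_specific_code": (status >> 16) & 0xFFFF,
    }


# STATUS value taken from the MCE output above.
decoded = decode_mci_status(0xBA00000000070F0F)
for name, is_set in decoded["flags"].items():
    print(f"{name:6s} {'set' if is_set else 'clear'}")
print(f"MCA error code:      {decoded['mca_error_code']:#06x}")
print(f"Model-specific code: {decoded['model_specific_code']:#06x}")
```

For the value shown, the UC, EN, MISCV, and PCC flags come out set, which agrees with the "Uncorrected error", "Error enabled", "MCi_MISC register valid", and "Processor context corrupt" lines reported in the log.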

Environment

  • Red Hat Enterprise Linux (RHEL) 4
  • Red Hat Enterprise Linux 5
  • Red Hat Enterprise Linux 6
  • HP ProLiant DL585 G5 or G6
  • HP ProLiant DL685 G5
  • Broadcom NetXtreme II 5709 NIC
