HPE ProLiant Gen8, HPE ProLiant Gen9, and HPE ProLiant Gen10 Servers - Short Durations of Throttling (TCC Activation) May Cause Operating Systems to Issue Machine Check Alerts, Which Is Expected Behavior

Solution Verified

Issue

  • The following MCE alerts are observed in the /var/log/messages file:
Mar 30 13:01:01 host.example.com host kernel: CPU21: Package temperature above threshold, cpu clock throttled (total events = 7521)
Mar 30 13:01:01 host.example.com host kernel: CPU28: Package temperature/speed normal
Mar 30 13:01:01 host.example.com host kernel: CPU25: Package temperature/speed normal
Mar 30 13:01:01 host.example.com host kernel: CPU28: Package temperature above threshold, cpu clock throttled (total events = 7527)
Mar 30 13:01:01 host.example.com host kernel: CPU17: Core temperature above threshold, cpu clock throttled (total events = 1442)
Mar 30 13:01:01 host.example.com host kernel: CPU16: Package temperature above threshold, cpu clock throttled (total events = 7514)
Mar 30 13:01:01 host.example.com host kernel: CPU30: Package temperature/speed normal
Mar 30 13:01:01 host.example.com host kernel: CPU24: Package temperature above threshold, cpu clock throttled (total events = 7525)
Mar 30 13:01:01 host.example.com host kernel: CPU23: Package temperature above threshold, cpu clock throttled (total events = 7495)
Mar 30 13:01:01 host.example.com host kernel: CPU22: Package temperature above threshold, cpu clock throttled (total events = 7523)
Mar 30 13:01:01 host.example.com host kernel: CPU19: Package temperature/speed normal
Mar 30 13:01:01 host.example.com host kernel: CPU27: Package temperature/speed normal
Mar 30 13:01:01 host.example.com host kernel: CPU26: Package temperature above threshold, cpu clock throttled (total events = 7525)
Mar 30 13:01:01 host.example.com host kernel: CPU16: Package temperature/speed normal
Mar 30 13:01:01 host.example.com host kernel: CPU30: Package temperature above threshold, cpu clock throttled (total events = 7526)
Mar 30 13:01:01 host.example.com host kernel: CPU18: Package temperature above threshold, cpu clock throttled (total events = 7519)
Mar 30 13:01:01 host.example.com host kernel: CPU29: Package temperature above threshold, cpu clock throttled (total events = 7526)
Mar 30 13:01:01 host.example.com host kernel: CPU19: Package temperature above threshold, cpu clock throttled (total events = 7504)
Mar 30 13:01:01 host.example.com host kernel: CPU27: Package temperature above threshold, cpu clock throttled (total events = 7527)
Mar 30 13:01:01 host.example.com host kernel: CPU22: Package temperature/speed normal
Mar 30 13:01:01 host.example.com host kernel: CPU23: Package temperature/speed normal
Mar 30 13:01:01 host.example.com host kernel: CPU31: Package temperature above threshold, cpu clock throttled (total events = 7526)
Mar 30 13:01:01 host.example.com host kernel: CPU20: Package temperature/speed normal
Mar 30 13:01:01 host.example.com host kernel: CPU26: Package temperature/speed normal
Mar 30 13:01:01 host.example.com host kernel: CPU20: Package temperature above threshold, cpu clock throttled (total events = 7526)
Mar 30 13:01:01 host.example.com host kernel: CPU21: Package temperature/speed normal
Mar 30 13:01:01 host.example.com host kernel: CPU24: Package temperature/speed normal
Mar 30 13:01:01 host.example.com host kernel: CPU29: Package temperature/speed normal
Mar 30 13:01:01 host.example.com host kernel: CPU31: Package temperature/speed normal
Mar 30 13:01:01 host.example.com host kernel: CPU18: Package temperature/speed normal
Mar 30 13:01:01 host.example.com host kernel: CPU17: Core temperature/speed normal
Mar 30 13:01:01 host.example.com host kernel: CPU25: Package temperature above threshold, cpu clock throttled (total events = 7526)
Mar 30 13:01:01 host.example.com host mcelog: Hardware event. This is not a software error.
Mar 30 13:01:01 host.example.com host mcelog: MCE 0
Mar 30 13:01:01 host.example.com host mcelog: CPU 17 THERMAL EVENT TSC 13bbcd79ada298
Mar 30 13:01:01 host.example.com host mcelog: TIME 1522382461 Fri Mar 30 13:01:01 2018
Mar 30 13:01:01 host.example.com host mcelog: Processor 17 heated above trip temperature. Throttling enabled.
Mar 30 13:01:01 host.example.com host mcelog: Please check your system cooling. Performance will be impacted
Mar 30 13:01:01 host.example.com host mcelog: STATUS 880003c3 MCGSTATUS 0
Mar 30 13:01:01 host.example.com host mcelog: MCGCAP f000814 APICID 22 SOCKETID 1
Mar 30 13:01:01 host.example.com host mcelog: CPUID Vendor Intel Family 6 Model 85
Mar 30 13:01:01 host.example.com host mcelog: Hardware event. This is not a software error.
Mar 30 13:01:01 host.example.com host mcelog: MCE 1
Mar 30 13:01:01 host.example.com host mcelog: CPU 17 THERMAL EVENT TSC 13bbcd79adc4c8
Mar 30 13:01:01 host.example.com host mcelog: TIME 1522382461 Fri Mar 30 13:01:01 2018
Mar 30 13:01:01 host.example.com host mcelog: Processor 17 below trip temperature. Throttling disabled
Mar 30 13:01:01 host.example.com host mcelog: STATUS 880a0282 MCGSTATUS 0
Mar 30 13:01:01 host.example.com host mcelog: MCGCAP f000814 APICID 22 SOCKETID 1
Mar 30 13:01:01 host.example.com host mcelog: CPUID Vendor Intel Family 6 Model 85
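
When an excerpt like the one above is long, it can help to reduce it to the highest "total events" counter reported for each CPU. The following is a minimal sketch in Python, not a supported tool. The function name summarize_throttle_events and the regular expression are illustrative assumptions derived only from the message format shown in this excerpt; other kernel versions may word these messages differently.

```python
import re
from collections import defaultdict

# Pattern derived from the kernel lines in the excerpt above (an assumption:
# other kernel versions may format the thermal throttle message differently).
THROTTLE_RE = re.compile(
    r"kernel: CPU(?P<cpu>\d+): (?P<scope>Core|Package) temperature above "
    r"threshold, cpu clock throttled \(total events = (?P<total>\d+)\)"
)


def summarize_throttle_events(lines):
    """Return {(cpu, scope): highest 'total events' counter seen in lines}."""
    summary = defaultdict(int)
    for line in lines:
        match = THROTTLE_RE.search(line)
        if match:
            key = (int(match.group("cpu")), match.group("scope"))
            summary[key] = max(summary[key], int(match.group("total")))
    return dict(summary)


# Four lines copied from the excerpt above, used as sample input.
sample = [
    "Mar 30 13:01:01 host.example.com host kernel: CPU21: Package temperature above threshold, cpu clock throttled (total events = 7521)",
    "Mar 30 13:01:01 host.example.com host kernel: CPU28: Package temperature/speed normal",
    "Mar 30 13:01:01 host.example.com host kernel: CPU28: Package temperature above threshold, cpu clock throttled (total events = 7527)",
    "Mar 30 13:01:01 host.example.com host kernel: CPU17: Core temperature above threshold, cpu clock throttled (total events = 1442)",
]

# Print one line per (CPU, scope) pair, sorted by CPU number.
for (cpu, scope), total in sorted(summarize_throttle_events(sample).items()):
    print(f"CPU{cpu} {scope}: {total} throttle events")
```

To run the same summary against a live system, pass an open file handle for /var/log/messages to summarize_throttle_events instead of the sample list; reading that file typically requires root privileges.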

Environment

  • Red Hat Enterprise Linux
  • HPE ProLiant Gen10 Server
