HPE ProLiant Gen10, Gen9 and Gen8 Servers - Short Durations of Throttling (TCC Activation) May Cause Operating Systems to Issue Machine Check Alerts, Which Is Expected Behavior
Issue
- The following Machine Check Exception (MCE) alerts are observed in the /var/log/messages file:

    Mar 30 13:01:01 host.example.com host kernel: CPU21: Package temperature above threshold, cpu clock throttled (total events = 7521)
    Mar 30 13:01:01 host.example.com host kernel: CPU28: Package temperature/speed normal
    Mar 30 13:01:01 host.example.com host kernel: CPU25: Package temperature/speed normal
    Mar 30 13:01:01 host.example.com host kernel: CPU28: Package temperature above threshold, cpu clock throttled (total events = 7527)
    Mar 30 13:01:01 host.example.com host kernel: CPU17: Core temperature above threshold, cpu clock throttled (total events = 1442)
    Mar 30 13:01:01 host.example.com host kernel: CPU16: Package temperature above threshold, cpu clock throttled (total events = 7514)
    Mar 30 13:01:01 host.example.com host kernel: CPU30: Package temperature/speed normal
    Mar 30 13:01:01 host.example.com host kernel: CPU24: Package temperature above threshold, cpu clock throttled (total events = 7525)
    Mar 30 13:01:01 host.example.com host kernel: CPU23: Package temperature above threshold, cpu clock throttled (total events = 7495)
    Mar 30 13:01:01 host.example.com host kernel: CPU22: Package temperature above threshold, cpu clock throttled (total events = 7523)
    Mar 30 13:01:01 host.example.com host kernel: CPU19: Package temperature/speed normal
    Mar 30 13:01:01 host.example.com host kernel: CPU27: Package temperature/speed normal
    Mar 30 13:01:01 host.example.com host kernel: CPU26: Package temperature above threshold, cpu clock throttled (total events = 7525)
    Mar 30 13:01:01 host.example.com host kernel: CPU16: Package temperature/speed normal
    Mar 30 13:01:01 host.example.com host kernel: CPU30: Package temperature above threshold, cpu clock throttled (total events = 7526)
    Mar 30 13:01:01 host.example.com host kernel: CPU18: Package temperature above threshold, cpu clock throttled (total events = 7519)
    Mar 30 13:01:01 host.example.com host kernel: CPU29: Package temperature above threshold, cpu clock throttled (total events = 7526)
    Mar 30 13:01:01 host.example.com host kernel: CPU19: Package temperature above threshold, cpu clock throttled (total events = 7504)
    Mar 30 13:01:01 host.example.com host kernel: CPU27: Package temperature above threshold, cpu clock throttled (total events = 7527)
    Mar 30 13:01:01 host.example.com host kernel: CPU22: Package temperature/speed normal
    Mar 30 13:01:01 host.example.com host kernel: CPU23: Package temperature/speed normal
    Mar 30 13:01:01 host.example.com host kernel: CPU31: Package temperature above threshold, cpu clock throttled (total events = 7526)
    Mar 30 13:01:01 host.example.com host kernel: CPU20: Package temperature/speed normal
    Mar 30 13:01:01 host.example.com host kernel: CPU26: Package temperature/speed normal
    Mar 30 13:01:01 host.example.com host kernel: CPU20: Package temperature above threshold, cpu clock throttled (total events = 7526)
    Mar 30 13:01:01 host.example.com host kernel: CPU21: Package temperature/speed normal
    Mar 30 13:01:01 host.example.com host kernel: CPU24: Package temperature/speed normal
    Mar 30 13:01:01 host.example.com host kernel: CPU29: Package temperature/speed normal
    Mar 30 13:01:01 host.example.com host kernel: CPU31: Package temperature/speed normal
    Mar 30 13:01:01 host.example.com host kernel: CPU18: Package temperature/speed normal
    Mar 30 13:01:01 host.example.com host kernel: CPU17: Core temperature/speed normal
    Mar 30 13:01:01 host.example.com host kernel: CPU25: Package temperature above threshold, cpu clock throttled (total events = 7526)
    Mar 30 13:01:01 host.example.com host mcelog: Hardware event. This is not a software error.
    Mar 30 13:01:01 host.example.com host mcelog: MCE 0
    Mar 30 13:01:01 host.example.com host mcelog: CPU 17 THERMAL EVENT TSC 13bbcd79ada298
    Mar 30 13:01:01 host.example.com host mcelog: TIME 1522382461 Fri Mar 30 13:01:01 2018
    Mar 30 13:01:01 host.example.com host mcelog: Processor 17 heated above trip temperature. Throttling enabled.
    Mar 30 13:01:01 host.example.com host mcelog: Please check your system cooling. Performance will be impacted
    Mar 30 13:01:01 host.example.com host mcelog: STATUS 880003c3 MCGSTATUS 0
    Mar 30 13:01:01 host.example.com host mcelog: MCGCAP f000814 APICID 22 SOCKETID 1
    Mar 30 13:01:01 host.example.com host mcelog: CPUID Vendor Intel Family 6 Model 85
    Mar 30 13:01:01 host.example.com host mcelog: Hardware event. This is not a software error.
    Mar 30 13:01:01 host.example.com host mcelog: MCE 1
    Mar 30 13:01:01 host.example.com host mcelog: CPU 17 THERMAL EVENT TSC 13bbcd79adc4c8
    Mar 30 13:01:01 host.example.com host mcelog: TIME 1522382461 Fri Mar 30 13:01:01 2018
    Mar 30 13:01:01 host.example.com host mcelog: Processor 17 below trip temperature. Throttling disabled
    Mar 30 13:01:01 host.example.com host mcelog: STATUS 880a0282 MCGSTATUS 0
    Mar 30 13:01:01 host.example.com host mcelog: MCGCAP f000814 APICID 22 SOCKETID 1
    Mar 30 13:01:01 host.example.com host mcelog: CPUID Vendor Intel Family 6 Model 85
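
To see how many CPUs are affected and how high the throttle counters have climbed, the kernel messages shown above can be tallied per CPU. The following is a minimal sketch, assuming the exact message wording seen in this log; the function name `summarize_throttle_events`, the regular expression, and the `sample` text are illustrative and not part of any Red Hat or HPE tool.

```python
import re
from collections import defaultdict

# Assumed pattern, derived from the kernel lines in the log excerpt above, e.g.:
#   kernel: CPU21: Package temperature above threshold, cpu clock throttled (total events = 7521)
THROTTLE_RE = re.compile(
    r"kernel: CPU(\d+): (Package|Core) temperature above threshold, "
    r"cpu clock throttled \(total events = (\d+)\)"
)


def summarize_throttle_events(text):
    """Return {(cpu, scope): highest 'total events' counter seen in text}."""
    summary = defaultdict(int)
    for match in THROTTLE_RE.finditer(text):
        cpu = int(match.group(1))
        scope = match.group(2)          # "Package" or "Core"
        total = int(match.group(3))     # running counter reported by the kernel
        summary[(cpu, scope)] = max(summary[(cpu, scope)], total)
    return dict(summary)


# Hypothetical sample built from three lines of the excerpt above.
sample = (
    "Mar 30 13:01:01 host.example.com host kernel: CPU21: Package temperature "
    "above threshold, cpu clock throttled (total events = 7521)\n"
    "Mar 30 13:01:01 host.example.com host kernel: CPU21: Package "
    "temperature/speed normal\n"
    "Mar 30 13:01:01 host.example.com host kernel: CPU17: Core temperature "
    "above threshold, cpu clock throttled (total events = 1442)\n"
)

for (cpu, scope), total in sorted(summarize_throttle_events(sample).items()):
    print(f"CPU{cpu} {scope}: {total} events")
```

To run it against a live system, pass the contents of /var/log/messages (reading that file usually requires root) to `summarize_throttle_events` instead of `sample`. Because the function scans the whole text rather than individual lines, it should also work on log output that has been pasted as a single run-on line.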
Environment
- Red Hat Enterprise Linux (RHEL)
- HPE ProLiant Gen10 Server
- HPE ProLiant Gen9 Server
- HPE ProLiant Gen8 Server