HighOverallControlPlaneCPU alert troubleshooting in OpenShift Container Platform 4
Issue
The HighOverallControlPlaneCPU alert is firing in the cluster.
This alert is triggered when CPU utilization across all three control plane nodes is higher than two control plane nodes can sustain; a single control plane node outage may cause a cascading failure.
With three control plane nodes, overall CPU utilization should stay at or below about two thirds (0.66) of the total available capacity. For the cluster to remain highly available (HA), the two remaining nodes must be able to absorb the full control plane load if one node fails. If the cluster is already using more than two thirds of its total capacity, then after a single control plane node failure the remaining two nodes are likely to be overloaded and fail in turn.
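The two-thirds reasoning can be checked with a little arithmetic. The following is a minimal sketch, not the alert's actual rule: it assumes all control plane nodes have equal CPU capacity and that the load of a failed node is redistributed evenly across the survivors. The function names `projected_utilization` and `can_survive_node_loss` are hypothetical and used only for illustration.

```python
def projected_utilization(overall: float, nodes: int = 3, failed: int = 1) -> float:
    """Estimate the utilization of the surviving nodes after a failure.

    `overall` is the current CPU utilization across all control plane
    nodes, as a fraction of total capacity (0.66 means 66%).
    Assumption: nodes are equally sized and the load spreads evenly.
    """
    if not 0 < failed < nodes:
        raise ValueError("failed must be at least 1 and less than nodes")
    # The same total load must now fit on (nodes - failed) nodes.
    return overall * nodes / (nodes - failed)


def can_survive_node_loss(overall: float, nodes: int = 3, failed: int = 1) -> bool:
    """Return True if the surviving nodes would stay at or below full capacity."""
    return projected_utilization(overall, nodes, failed) <= 1.0


if __name__ == "__main__":
    for overall in (0.50, 0.66, 0.70):
        after = projected_utilization(overall)
        verdict = "sustainable" if can_survive_node_loss(overall) else "overloaded"
        print(f"overall {overall:.0%} -> surviving nodes at about {after:.0%} ({verdict})")
```

Under these assumptions, a cluster running at two thirds of total capacity leaves the two surviving nodes close to full utilization, and anything above that pushes them past what they can serve.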
Environment
- Red Hat OpenShift Container Platform 4 (RHOCP)