Cluster Autoscaler is not scaling down nodes in OCP 4
Environment
- Red Hat OpenShift Container Platform (RHOCP) 4
- Red Hat OpenShift Service on AWS (ROSA) 4
- Red Hat OpenShift Dedicated (OSD) 4
- Azure Red Hat OpenShift (ARO) 4
- Cluster Autoscaler
Issue
- How to check why the Cluster Autoscaler is not scaling down nodes in OpenShift 4?
- There are messages in the cluster-autoscaler pod logs indicating that nodes cannot be removed by the Cluster Autoscaler:
Skipping [node_name] - node group min size reached
[...]
Fast evaluation: node [node_name] cannot be removed: pod with local storage present: [pod_name]
[...]
Fast evaluation: node [node_name] cannot be removed: openshift-marketplace/[pod_name] is not replicated
[...]
Fast evaluation: node [node_name] cannot be removed: pod annotated as not safe to evict present: [pod_name]
Resolution
Follow the Diagnostic Steps section to find why the nodes cannot be removed.
Common messages
- Message "Skipping [node_name] - node group min size reached": The node group is already at its configured minimum size, so no more nodes can be removed from that group.
- Message "Node [node_name] is not suitable for removal - cpu utilization too big": The node cannot be removed because its CPU utilization is too high.
- Message "Fast evaluation: node [node_name] cannot be removed: pod with local storage present": The node cannot be removed because it is running pods that use local storage. Change those pods so that they do not use local storage to allow the Cluster Autoscaler to scale down the nodes running them.
- Message "Fast evaluation: node [node_name] cannot be removed: [namespace_name]/[pod_name] is not replicated": Because the pod [pod_name] in the [namespace_name] namespace is not replicated, the Cluster Autoscaler cannot scale down the node without impacting the application. Ensure the application has several replicas to allow the Cluster Autoscaler to scale down the nodes running that application.
- Message "Fast evaluation: node [node_name] cannot be removed: pod annotated as not safe to evict present: [pod_name]": The pod [pod_name] running on [node_name] has the annotation cluster-autoscaler.kubernetes.io/safe-to-evict set to false.
Other reasons should be investigated individually.
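Some of the messages above can be cross-checked against the cluster objects themselves. The following is a minimal sketch, assuming a logged-in oc session with read access. The function names pod_scale_down_hints and list_autoscaler_limits are hypothetical helpers, not part of oc. Treating a pod with no owner references as "not replicated" is an assumption based on upstream Cluster Autoscaler behavior, and the second helper only applies to clusters that manage autoscaling with MachineAutoscaler resources.

```shell
# Print the pod fields related to the "not safe to evict" and
# "not replicated" messages. Hypothetical helper, not part of oc.
# Usage: pod_scale_down_hints [namespace_name] [pod_name]
pod_scale_down_hints() {
  ns="$1"
  pod="$2"

  # Value of the safe-to-evict annotation (empty if the annotation is unset).
  annotation="$(oc get pod "$pod" -n "$ns" \
    -o jsonpath='{.metadata.annotations.cluster-autoscaler\.kubernetes\.io/safe-to-evict}')"
  echo "safe-to-evict annotation: ${annotation:-<unset>}"

  # Kinds of the objects that own the pod (empty if nothing owns it).
  owners="$(oc get pod "$pod" -n "$ns" \
    -o jsonpath='{.metadata.ownerReferences[*].kind}')"
  echo "owner kinds: ${owners:-<none>}"
}

# List each MachineAutoscaler target with its minimum and maximum replicas,
# which helps with the "node group min size reached" message.
list_autoscaler_limits() {
  oc get machineautoscaler -n openshift-machine-api \
    -o jsonpath='{range .items[*]}{.spec.scaleTargetRef.name}{"\t"}{.spec.minReplicas}{"\t"}{.spec.maxReplicas}{"\n"}{end}'
}
```

For example, run pod_scale_down_hints [namespace_name] [pod_name] with the names taken from the log message. An owner kind such as ReplicaSet or StatefulSet suggests the pod is managed by a controller.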
Known issue with pods in the openshift-marketplace namespace
If the message shown is similar to the following (it could reference other pods from the same openshift-marketplace namespace):
Fast evaluation: node [node_name_4] cannot be removed: openshift-marketplace/[pod_name] is not replicated
This is a known bug in the default CatalogSource pods that prevents the nodes from being removed. The bug was tracked in BZ 2019963, and the fix was released in OCP 4.10.3 as part of BZ 1927478 and errata RHSA-2022:0056.
Note: The fix does not cover CatalogSource pods other than the default ones provided by OpenShift. For custom CatalogSource pods, it is possible to configure the scheduling as explained in: Catalog source pod scheduling.
Root Cause
There are several reasons that can prevent the cluster autoscaler from scaling down nodes. Refer to How does scale down works in Cluster Autoscaler in OCP 4? for additional information.
Diagnostic Steps
Check the cluster autoscaler logs to see why the nodes cannot be removed:
$ oc get pods -n openshift-machine-api
[...]
$ oc logs [cluster-autoscaler-pod_name] -n openshift-machine-api
[...]
I0101 00:00:15.318491 1 scale_down.go:443] Node [node_name_1] is not suitable for removal - cpu utilization too big (0.784667)
I0101 00:00:15.319167 1 scale_down.go:443] Node [node_name_2] is not suitable for removal - cpu utilization too big (0.648667)
[...]
I0101 00:00:15.319931 1 cluster.go:148] Fast evaluation: [node_name_3] for removal
I0101 00:00:15.319940 1 cluster.go:169] Fast evaluation: node [node_name_3] cannot be removed: pod with local storage present: sonarqube-lts-sonarqube-lts-0
[...]
I0101 00:00:15.320002 1 cluster.go:148] Fast evaluation: [node_name_4] for removal
I0101 00:00:15.319978 1 cluster.go:169] Fast evaluation: node [node_name_4] cannot be removed: openshift-marketplace/addon-cluster-logging-operator-catalog-xxxxx is not replicated
[...]
I0101 00:00:15.930753 1 clusterapi_controller.go:577] node "[node_name_5]" is in nodegroup "[nodegroup_name]"
I0101 00:00:15.930813 1 pre_filtering_processor.go:66] Skipping [node_name_5] - node group min size reached
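Because the autoscaler log is verbose, it can help to keep only the lines that explain why a node was not removed. The following is a minimal sketch that filters on the message fragments shown in the log excerpt above. The function name filter_scale_down_blockers is hypothetical, and the pattern list only covers the messages from this article, so other reasons will not be matched.

```shell
# Keep only the autoscaler log lines that explain why a node was not removed.
# Reads log text on stdin. Hypothetical helper, not part of oc.
filter_scale_down_blockers() {
  grep -E 'cannot be removed|is not suitable for removal|node group min size reached'
}

# Usage against a live cluster (the pod name is a placeholder):
#   oc logs [cluster-autoscaler-pod_name] -n openshift-machine-api | filter_scale_down_blockers
```

Informational lines such as "Fast evaluation: [node_name] for removal" are dropped, so the output shows one line per blocked node evaluation.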
It's also possible to increase the verbosity or log level of the Cluster Autoscaler, to see if additional information is shown.
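As a rough sketch of how the verbosity might be changed, the helpers below read and patch the ClusterAutoscaler resource. This assumes the resource is named default and that the installed OpenShift version supports the spec.logVerbosity field. Both points should be verified against the product documentation for the cluster version, and on managed offerings the resource may not be editable directly. The function names are hypothetical.

```shell
# Show the current log verbosity of the ClusterAutoscaler resource.
# Assumption: the resource is named "default" and exposes spec.logVerbosity.
get_autoscaler_verbosity() {
  oc get clusterautoscaler default -o jsonpath='{.spec.logVerbosity}'
}

# Set the log verbosity to the level given as the first argument.
# Assumption: spec.logVerbosity is supported by the installed version.
# Usage: set_autoscaler_verbosity 4
set_autoscaler_verbosity() {
  oc patch clusterautoscaler default --type merge \
    -p "{\"spec\":{\"logVerbosity\":$1}}"
}
```

After changing the level, check the cluster-autoscaler pod logs again as described above, and restore the previous value once the investigation is finished.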