Machine pool would not auto scale down even with autoscaling enabled
Environment
- Red Hat OpenShift Service on AWS
- 4
Issue
- Machine Autoscaler has been enabled on a MachinePool in a ROSA cluster, but a node with low resource utilisation does not scale down automatically.
- Two nodes with low resource utilisation do not scale down to the minimum of 1.
Resolution
- Review the remaining deployments/pods on the node with low resource utilisation.
- Specifically, review the resources > requests of any remaining Deployment/DeploymentConfig on the low resource utilising node(s).
- Check the cpu or memory under resources > requests and ensure that they are not higher than what is available on the existing nodes.
- If the requests > cpu or memory are higher, review and lower them as needed to fit the resources available on the other/remaining nodes.
- As a result, pods should now be redeployed to other available nodes and the node with low resource utilisation should scale down automatically as expected.
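The review steps above can be sketched with a few oc commands. This is a minimal sketch and not part of the original solution: the function names, the node name worker1, and the deployment, namespace, and request values in the usage comments are placeholders, and the oc client is assumed to be logged in to the cluster.

```shell
# List the CPU and memory requests of every pod scheduled on one node.
# $1 = node name (placeholder, e.g. worker1)
list_requests_on_node() {
  oc get pods --all-namespaces --field-selector "spec.nodeName=$1" \
    -o custom-columns='NAMESPACE:.metadata.namespace,NAME:.metadata.name,CPU_REQ:.spec.containers[*].resources.requests.cpu,MEM_REQ:.spec.containers[*].resources.requests.memory'
}

# Lower the requests of a Deployment so its pods can fit on the remaining nodes.
# $1 = deployment name, $2 = namespace, $3 = CPU request, $4 = memory request
lower_requests() {
  oc set resources "deployment/$1" -n "$2" --requests="cpu=$3,memory=$4"
}

# Example usage (all names and values are hypothetical):
#   list_requests_on_node worker1
#   oc describe node worker2     # compare "Allocatable" with "Allocated resources"
#   lower_requests myapp myproject 250m 512Mi
```

For a DeploymentConfig, the same oc set resources call should accept dc/<name> in place of deployment/<name>. Choose request values that the application can actually run with; the numbers shown are only illustrative.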
Root Cause
- If a MachinePool with auto-scaling enabled is scaling down nodes as expected, but some nodes remain active even with low resource utilisation, it is likely that deployments on those nodes contain resources > requests that are higher than what is available on the remaining nodes.
- As a result, the pods of the DeploymentConfig with the high resources > requests cannot be rescheduled elsewhere, and the node they run on is unable to scale down automatically.
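To make the root cause concrete, the following sketch works through the arithmetic for CPU only. It is an illustration under stated assumptions: the helper names and the example quantities are hypothetical, and real scheduling also takes memory, taints, affinity, and other constraints into account.

```shell
# Convert a Kubernetes CPU quantity ("500m", "2", "0.5") to millicores.
to_millicores() {
  case "$1" in
    *m) echo "${1%m}" ;;
    *)  awk -v c="$1" 'BEGIN { printf "%d\n", c * 1000 + 0.5 }' ;;
  esac
}

# Succeed when a pod's CPU request fits into the unreserved CPU of another node.
# $1 = pod CPU request
# $2 = allocatable CPU of the target node
# $3 = CPU already requested by pods on the target node
request_fits() {
  [ "$(to_millicores "$1")" -le $(( $(to_millicores "$2") - $(to_millicores "$3") )) ]
}

# Example usage (hypothetical values): a pod requesting 2 CPUs does not fit on
# a node with 3500m allocatable and 2100m already requested, because only
# 1400m remain unreserved.
#   request_fits 2 3500m 2100m || echo "pod cannot move; its node stays up"
```

In this sketch the node hosting such a pod would stay in place regardless of how little CPU it actually uses, which matches the behaviour described above.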
Diagnostic Steps
- Confirm that resource usage is low on the node(s).
$ oc adm top node | egrep "NAME|worker1"
NAME CPU(cores) CPU% MEMORY(bytes) MEMORY%
worker1 654m 9% 3935Mi 6%
- Check the cluster autoscaler logs for any errors. If the logs state "node group min size reached", review "Applying autoscaling to an OpenShift Container Platform cluster" to verify that the necessary conditions for node removal by the cluster autoscaler are met. Additionally, review the resolution above.
$ oc logs pod/cluster-autoscaler-default -n openshift-machine-api
...
I1025 06:07:30.079913 1 static_autoscaler.go:402] No unschedulable pods
I1025 06:07:32.478040 1 pre_filtering_processor.go:66] Skipping worker1 - node group min size reached
I1025 06:07:32.479018 1 pre_filtering_processor.go:66] Skipping worker2 - node group min size reached
I1025 06:07:32.479905 1 pre_filtering_processor.go:66] Skipping worker3 - node group min size reached
I1025 06:07:34.877804 1 scale_down.go:868] No candidates for scale down
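When the autoscaler log is long, it can help to reduce it to the nodes that were skipped and the reason given for each. The filter below is a minimal sketch that assumes the log lines follow the "Skipping <node> - <reason>" layout shown in the sample output; the function name is hypothetical. A reason of "node group min size reached" suggests that the node group is already at its configured minimum size, so that setting is worth checking as well.

```shell
# Read cluster autoscaler log lines on stdin and print "node<TAB>reason"
# for every line of the form "... Skipping <node> - <reason>".
skipped_nodes() {
  awk '{
    i = index($0, "Skipping ")
    if (i == 0) next
    rest = substr($0, i + 9)          # text after "Skipping "
    j = index(rest, " - ")
    if (j == 0) next
    printf "%s\t%s\n", substr(rest, 1, j - 1), substr(rest, j + 3)
  }'
}

# Example usage:
#   oc logs pod/cluster-autoscaler-default -n openshift-machine-api | skipped_nodes | sort -u
```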
This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.