Cluster Autoscaler not balancing nodes across Availability Zones in OpenShift 4
Environment
- Red Hat OpenShift Container Platform (RHOCP) 4
- Red Hat OpenShift Service on AWS (ROSA) 4
- Red Hat OpenShift Dedicated (OSD) 4
- Cluster Autoscaler
Issue
- With 2 Availability Zones, such as west-1a and west-1b, a MachineAutoscaler is configured for the MachineSets of both zones, but the Cluster Autoscaler does not provision worker nodes evenly across the two MachineSets.
- Nodes are scaled up unevenly across Availability Zones when using the Cluster Autoscaler in OpenShift 4.
- Is it possible to use the balanceSimilarNodeGroups option in the Cluster Autoscaler in OpenShift 4?
Resolution
Setting the balanceSimilarNodeGroups property to true in the ClusterAutoscaler resource, as shown below, helps balance OCP nodes across the different MachineSets:
apiVersion: "autoscaling.openshift.io/v1"
kind: "ClusterAutoscaler"
metadata:
  name: "default"
spec:
  balanceSimilarNodeGroups: true
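As an alternative to editing the full resource, the same field can be set with a merge patch. This is a minimal sketch, assuming a ClusterAutoscaler named default already exists and that the user is allowed to modify it (which may not hold on OSD/ROSA; see the note that follows). The enable_balancing function name is hypothetical and only wraps the command so it can be reused.

```shell
# Merge-patch only the one field; other settings under spec are left untouched.
# Assumption: the ClusterAutoscaler resource is named "default".
enable_balancing() {
  oc patch clusterautoscaler default --type merge \
    -p '{"spec":{"balanceSimilarNodeGroups":true}}'
}
```

Calling enable_balancing, or running the oc patch line directly, should leave the resource in the same state as applying the YAML above.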
Note: In OSD/ROSA clusters, balanceSimilarNodeGroups in the default ClusterAutoscaler is already set to false (see ROSA-Doc). HIVE-1976 covers allowing customers to configure it.
Root Cause
Disclaimer: Links contained herein to external website(s) are provided for convenience only. Red Hat has not reviewed the links and is not responsible for the content or its availability. The inclusion of any link to an external website does not imply endorsement by Red Hat of the website or their entities, products or services. You agree that Red Hat is not responsible or liable for any loss or expenses that may result due to your use of (or reliance on) the external site or content.
The balanceSimilarNodeGroups property enables or disables the --balance-similar-node-groups feature of the Cluster Autoscaler. This feature automatically identifies node groups with the same instance type and the same set of labels and tries to keep the respective sizes of those node groups balanced.
Note: Currently, balancing is done only at scale-up.
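The effect of balancing at scale-up can be illustrated with a small simulation. This is a simplified sketch and not the Cluster Autoscaler's actual implementation: it assumes the groups are already considered similar, ignores any MachineAutoscaler minimum and maximum limits, and hands out each new node to whichever group is currently smallest. The balance_scale_up function and the example sizes are hypothetical.

```shell
# balance_scale_up NEW_NODES SIZE_1 SIZE_2 ...
# Prints the group sizes after NEW_NODES nodes have been added one at a time,
# each to the currently smallest group. Existing nodes are never moved.
balance_scale_up() {
  new_nodes=$1
  shift
  echo "$@" | awk -v n="$new_nodes" '{
    while (n > 0) {
      s = 1                       # index of the smallest group found so far
      for (i = 2; i <= NF; i++)
        if ($i < $s) s = i
      $s = $s + 1                 # grow the smallest group by one node
      n--
    }
    print
  }'
}

# Hypothetical example: west-1a has 1 worker, west-1b has 4, and 3 more are needed.
balance_scale_up 3 1 4   # prints: 4 4
```

Because nothing is removed or moved in this model, groups that are already uneven only converge as new nodes are requested, which matches the note that balancing happens only at scale-up.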
Diagnostic Steps
Check the config of the default ClusterAutoscaler:
$ oc get clusterautoscaler default -o yaml
apiVersion: autoscaling.openshift.io/v1
kind: ClusterAutoscaler
[...]
spec:
  balanceSimilarNodeGroups: true
[...]
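To narrow the check to the single field and to see how workers are currently spread, helpers along the following lines may be useful. This is a sketch under assumptions: the function names are hypothetical, the resource is assumed to be named default, and nodes are assumed to carry the node-role.kubernetes.io/worker and topology.kubernetes.io/zone labels.

```shell
# Print the current value of the field (an empty line means it is not set).
show_balancing_setting() {
  oc get clusterautoscaler default \
    -o jsonpath='{.spec.balanceSimilarNodeGroups}{"\n"}'
}

# Count worker nodes per zone, printing one "zone count" pair per line.
count_workers_per_zone() {
  oc get nodes -l node-role.kubernetes.io/worker \
    -o jsonpath='{range .items[*]}{.metadata.labels.topology\.kubernetes\.io/zone}{"\n"}{end}' |
    awk 'NF { c[$1]++ } END { for (z in c) print z, c[z] }' | sort
}
```

A large difference between the per-zone counts, combined with the setting being false or unset, would be consistent with the issue described in this article.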
This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.