Scaling OpenShift 4.x nodes up and down, and draining and rescheduling pods, during planned maintenance causes performance issues

Environment

  • Red Hat OpenShift Container Platform 4.x

Issue


  • During upgrades or other planned maintenance, the OpenShift Cluster Autoscaler attempts to scale the number of nodes in the cluster up or down
  • Scaling nodes up and down, and draining and rescheduling pods, is time-consuming and compute-intensive

Resolution


  • A Request for Feature Enhancement (RFE) has been submitted
  • RFE-3281 requests a way for a cluster administrator to manually pause autoscaling so that scheduled maintenance can be conducted with less node churn

Root Cause

  • The OpenShift Cluster Autoscaler reacts to node maintenance (cordon, drain, reboot) by creating more nodes so that the evicted pods can be scheduled. This adds compute overhead and prolongs the planned maintenance or upgrade.
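
The behavior described above can be illustrated with a small toy model. This is only a sketch under simplifying assumptions, not the actual Cluster Autoscaler algorithm: each node is assumed to hold a fixed number of pod slots, pods are placed first-fit, and the names `Node`, `schedule`, `drain`, `scale_up_for`, and `simulate` are hypothetical and exist only for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    capacity: int              # pod slots on this node (hypothetical unit)
    schedulable: bool = True   # False once the node is cordoned
    pods: list = field(default_factory=list)


def schedule(pods, nodes):
    """Place pods first-fit on schedulable nodes; return the pods left pending."""
    pending = []
    for pod in pods:
        target = next(
            (n for n in nodes if n.schedulable and len(n.pods) < n.capacity),
            None,
        )
        if target is None:
            pending.append(pod)
        else:
            target.pods.append(pod)
    return pending


def drain(node, nodes):
    """Cordon a node, evict its pods, and try to reschedule them elsewhere."""
    node.schedulable = False
    evicted, node.pods = node.pods, []
    return schedule(evicted, nodes)


def scale_up_for(pending, nodes, capacity):
    """Toy autoscaler: add nodes until nothing is pending; return nodes added."""
    added = 0
    while pending:
        nodes.append(Node(f"autoscaled-{added}", capacity))
        added += 1
        pending = schedule(pending, nodes)
    return added


def simulate():
    # Three workers with 4 slots each, running 10 pods in total.
    nodes = [Node(f"worker-{i}", capacity=4) for i in range(3)]
    schedule([f"pod-{i}" for i in range(10)], nodes)
    # Planned maintenance on one worker leaves some evicted pods pending.
    pending = drain(nodes[0], nodes)
    # The pending pods trigger a scale-up, even though worker-0 will return.
    added = scale_up_for(pending, nodes, capacity=4)
    return len(pending), added


if __name__ == "__main__":
    print(simulate())
```

In this model, draining one of three nearly full workers leaves two pods pending, and the toy autoscaler adds one node to place them, so the script prints `(2, 1)`. The added node is the kind of churn that a manual pause during maintenance is meant to avoid.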

