Cluster operator image-registry degraded in a region that does not support multiple availability zones

Environment

  • Azure Red Hat OpenShift
  • Region that does not support multiple availability zones


Issue

  • The image-registry cluster operator is degraded because one of the image-registry pods is stuck in Pending status with the following warning:
0/9 nodes are available: 3 node(s) didn't match pod topology spread constraints (missing required label), 3 node(s) had untolerated taint { }, 3 node(s) had untolerated taint { }, 5 node(s) didn't match pod topology spread constraints. preemption: 0/9 nodes are available: 3 node(s) didn't match pod topology spread constraints, 6 Preemption is not helpful for scheduling.


Resolution

Step 1
Scale up the machine set to create new machines. The new machines are added to the availability set, and at least one of them is assigned a faultDomain of 1. The node for that machine then carries the missing topology.kubernetes.io/zone label value of 1, which allows the pod to be scheduled.

Step 2
Delete the old machines so that the only remaining workers are part of the availabilitySet.

Step 3
Scale the machine set back down to its original replica count.
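
The three steps above could be sketched with the small shell helpers below. This is only an illustration under stated assumptions: the function names are hypothetical, the machine sets and machines are assumed to live in the default openshift-machine-api namespace, and the machine set name, machine names, and replica counts must be taken from your own cluster.

```shell
# Sketch of the three resolution steps. Function names are hypothetical.
# Assumption: machine sets and machines live in openshift-machine-api.
NS=openshift-machine-api

# Step 1 and Step 3: set the replica count of a machine set.
scale_machineset() {
  oc scale machineset "$1" --replicas="$2" -n "$NS"
}

# Helper for Step 2: list machines with their nodes to identify the old ones.
list_machines() {
  oc get machines -n "$NS" -o wide
}

# Step 2: delete one old machine by name.
delete_machine() {
  oc delete machine "$1" -n "$NS"
}
```

For example, with a machine set currently at 3 replicas, one possible sequence is scale_machineset <machineset-name> 4, then list_machines to pick the old machines once the new node is Ready, then delete_machine <old-machine-name> for each of them, and finally scale_machineset <machineset-name> 3.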

Root Cause

  • In clusters installed with OpenShift v4.10 or later in a region with a single availability zone, the cluster-api-azure plugin in the machine-api controller sets one worker to have a topology.kubernetes.io/zone label of "0" and another to "1".
  • This happens because machines that are added to the machine set are automatically added to an availabilitySet.
  • The issue is that in clusters created prior to OpenShift v4.10, the existing nodes in the machine set have not been deleted and recreated, so they were never added to an availability set. As a result, the only nodes that these pods can be scheduled on all have a topology.kubernetes.io/zone label of 0.
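
To see which value each node currently carries, a check along the following lines may help. It is a sketch that assumes the value in question is exposed through the topology.kubernetes.io/zone node label; the function name is hypothetical.

```shell
# Hypothetical helper: list nodes with their zone label as an extra column.
# Assumption: the relevant label is topology.kubernetes.io/zone.
show_node_zones() {
  oc get nodes -L topology.kubernetes.io/zone
}
```

If every worker node shows the same value in that column, the pods have only a single topology domain to be spread across, which matches the situation described above.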

Diagnostic Steps

  • Check the cluster operator status:
$ oc get co
image-registry                             4.11.28        True        True          True       2y
  • Check the pods in the openshift-image-registry project:
$ oc get pod -n openshift-image-registry |grep -i image
cluster-image-registry-operator-fd7c9fb9d-fbx5c   1/1     Running     0          20d
image-registry-7df957fc6b-l4sws                   0/1     Pending     0          6d
image-registry-7df957fc6b-l5dqb                   1/1     Running     0          19d
  • Check the events in the openshift-image-registry project:
$ oc get event -n openshift-image-registry
LAST SEEN   TYPE      REASON              OBJECT                                MESSAGE
13d         Warning   FailedScheduling    pod/image-registry-7df957fc6b-crcdk   0/9 nodes are available: 3 node(s) didn't match pod topology spread constraints (missing required label), 3 node(s) had untolerated taint { }, 3 node(s) had untolerated taint { }, 5 node(s) didn't match pod topology spread constraints. preemption: 0/9 nodes are available: 3 node(s) didn't match pod topology spread constraints, 6 Preemption is not helpful for scheduling.
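
As an optional extra check, the helpers below sketch how the Pending pods and the spread constraints that the scheduler is enforcing could be inspected. The function names are hypothetical, and the sketch assumes the registry runs as a deployment named image-registry in the openshift-image-registry project, as the pod names above suggest.

```shell
# Hypothetical helper: list only the registry pods that are stuck in Pending.
pending_registry_pods() {
  oc get pods -n openshift-image-registry --field-selector=status.phase=Pending
}

# Hypothetical helper: print the topology spread constraints of the registry.
# Assumption: the deployment is named image-registry.
registry_spread_constraints() {
  oc get deployment image-registry -n openshift-image-registry \
    -o jsonpath='{.spec.template.spec.topologySpreadConstraints}'
}
```

The topologyKey entries in that output show which node labels the scheduler compares when it reports that nodes did not match the pod topology spread constraints.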

