Pods are in ImagePullBackOff status in the openshift-marketplace namespace within a private ARO cluster

Solution Verified

Environment

  • Azure Red Hat OpenShift (ARO)
    • 4

Issue

  • The Egress Lockdown feature is enabled, but the OpenShift Logging Operator cannot be installed from OperatorHub in a private ARO cluster that has no public IP address.
    • Background:
      1. The pull secret is already set up.
      2. The error messages are shown below:
$ oc -n openshift-marketplace get po
NAME                                    READY   STATUS             RESTARTS      AGE
certified-operators-xxxxx               0/1     ImagePullBackOff   0             32m
certified-operators-xxxxx               0/1     ImagePullBackOff   0             42m
community-operators-xxxxx               0/1     ImagePullBackOff   0             42m
community-operators-xxxxx               0/1     ImagePullBackOff   0             32m
marketplace-operator-xxxxxxxxxx-xxxxx   1/1     Running            4 (29m ago)   50m
redhat-marketplace-xxxxx                0/1     ImagePullBackOff   0             32m
redhat-marketplace-xxxxx                0/1     ImagePullBackOff   0             42m
redhat-operators-xxxxx                  0/1     ImagePullBackOff   0             32m
redhat-operators-xxxxx                  0/1     ImagePullBackOff   0             32m
redhat-operators-xxxxx                  0/1     ImagePullBackOff   0             42m
redhat-operators-xxxxx                  0/1     ImagePullBackOff   0             32m
redhat-operators-xxxxx                  0/1     ImagePullBackOff   0             32m
redhat-operators-xxxxx                  0/1     ImagePullBackOff   0             32m
redhat-operators-xxxxx                  0/1     ImagePullBackOff   0             32m
redhat-operators-xxxxx                  0/1     ImagePullBackOff   0             32m
redhat-operators-xxxxx                  0/1     ImagePullBackOff   0             32m
redhat-operators-xxxxx                  0/1     ImagePullBackOff   0             32m
$ oc -n openshift-marketplace describe po certified-operators-xxxxx
Name:                 certified-operators-xxxxx
Namespace:            openshift-marketplace
:
Events:
  Type     Reason          Age                   From               Message
  ----     ------          ----                  ----               -------
  Normal   Scheduled       32m                   default-scheduler  Successfully assigned openshift-marketplace/certified-operators-xxxxx to test-xxxxx-master-2 by test-xxxxx-master-1
  Normal   AddedInterface  31m                   multus             Add eth0 [XX.XXX.X.XX/XX] from ovn-kubernetes
  Normal   Pulling         27m (x4 over 31m)     kubelet            Pulling image 'registry.redhat.io/redhat/certified-operator-index:v4.11'
  Warning  Failed          26m (x4 over 30m)     kubelet            Error: ErrImagePull
  Warning  Failed          25m (x7 over 30m)     kubelet            Error: ImagePullBackOff
  Normal   BackOff         6m25s (x73 over 30m)  kubelet            Back-off pulling image 'registry.redhat.io/redhat/certified-operator-index:v4.11'
  Warning  Failed          91s (x9 over 30m)     kubelet            Failed to pull image 'registry.redhat.io/redhat/certified-operator-index:v4.11': rpc error: code = Unknown desc = pinging container registry registry.redhat.io: Get 'https://registry.redhat.io/v2/': dial tcp XX.XXX.XX.XX:443: i/o timeout
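
The `i/o timeout` in the last event suggests that the node cannot reach registry.redhat.io at all. As a rough sketch, the same request can be attempted directly from a node with `oc debug`. The function name `probe_registry_from_node` is hypothetical, and the sketch assumes that `curl` is available on the node host.

```shell
# Hypothetical helper (not part of the original article): request a registry
# endpoint from a cluster node and print only the HTTP status code.
# curl reports 000 when no HTTP response is received, for example on a timeout.
probe_registry_from_node() {
  # $1 = node name, $2 = URL to probe (optional)
  oc debug "node/$1" -- chroot /host \
    curl -s -o /dev/null -m 10 -w '%{http_code}\n' \
    "${2:-https://registry.redhat.io/v2/}"
}

# Usage (the node name is a placeholder taken from the events above):
#   probe_registry_from_node test-xxxxx-master-2
```

Any HTTP status code other than 000 would indicate that the endpoint is reachable at the network level, while 000 would be consistent with the timeout shown in the events.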

Resolution

Disclaimer: Links contained herein to external website(s) are provided for convenience only. Red Hat has not reviewed the links and is not responsible for the content or its availability. The inclusion of any link to an external website does not imply endorsement by Red Hat of the website or their entities, products or services. You agree that Red Hat is not responsible or liable for any loss or expenses that may result due to your use of (or reliance on) the external site or content.

  • As explained in "Control egress traffic for your Azure Red Hat OpenShift (ARO) cluster", only the domains listed in the "Minimum FQDN" section are routed through the egress lockdown gateway.

  • All other domains, including registry.redhat.io, do not use the egress lockdown gateway.

  • Because the cluster is a disconnected one, there is no outbound rule and therefore no default egress through a public IP address, so traffic to registry.redhat.io necessarily goes through user-defined routing (UDR). Update the UDR (and virtual appliances, if any) to ensure that traffic to registry.redhat.io is correctly routed and not blocked.
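
To review the current UDR, one option is to look up the route table attached to a cluster subnet and list its routes with the Azure CLI. This is only a sketch: the function name `show_subnet_routes` is hypothetical, the resource group, virtual network, and subnet names are placeholders, and the sketch assumes that an empty result from the first query means that no route table is attached.

```shell
# Hypothetical helper (not part of the original article): print the route
# table attached to a subnet, followed by the routes it contains.
show_subnet_routes() {
  # $1 = resource group, $2 = virtual network name, $3 = subnet name
  rt_id="$(az network vnet subnet show \
    --resource-group "$1" --vnet-name "$2" --name "$3" \
    --query routeTable.id --output tsv)"
  if [ -z "$rt_id" ]; then
    # Assumption: empty output means the subnet has no route table.
    echo "No route table is attached to subnet $3" >&2
    return 1
  fi
  echo "Route table: $rt_id"
  az network route-table show --ids "$rt_id" \
    --query 'routes[].{name:name, prefix:addressPrefix, nextHopType:nextHopType, nextHopIp:nextHopIpAddress}' \
    --output table
}

# Usage (all three names are placeholders):
#   show_subnet_routes my-resource-group my-vnet my-worker-subnet
```

The next hop of the route that covers registry.redhat.io shows which appliance, if any, also needs to allow the traffic.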

NOTE: At the time of writing, there appears to be no plan to add registry.redhat.io to the Minimum Required FQDN list, because SRE mirrors all of the minimum required OpenShift Container Platform images into Azure Container Registry.
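
After the routing is corrected, the kubelet keeps retrying the image pulls, so the catalog pods would be expected to leave the ImagePullBackOff state without further changes. A minimal sketch for tracking this is shown below; the function name `count_backoff_pods` is hypothetical, and the sketch assumes that the pod status is the third column of the default `oc get pods` output, as in the listing in the Issue section.

```shell
# Hypothetical helper (not part of the original article): count pods in a
# namespace whose STATUS column is ImagePullBackOff or ErrImagePull.
count_backoff_pods() {
  # $1 = namespace (optional, defaults to openshift-marketplace)
  oc -n "${1:-openshift-marketplace}" get pods --no-headers \
    | awk '$3 == "ImagePullBackOff" || $3 == "ErrImagePull" { n++ } END { print n + 0 }'
}

# Usage:
#   count_backoff_pods
```

A count that drops to 0 would suggest that the catalog images can be pulled again.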

Diagnostic Steps

  • Check if the pull secret is added to the cluster.
  • Check if the Egress Lockdown feature is enabled.
$ oc get cluster.aro.openshift.io cluster -o go-template='{{ if .spec.gatewayDomains }}{{ "Egress Lockdown Feature Enabled" }}{{ else }}{{ "Egress Lockdown Feature Disabled" }}{{ end }}{{ "\n" }}'
Egress Lockdown Feature Enabled
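
For the pull secret check, one possible approach is to test whether the cluster-wide pull secret contains an entry for registry.redhat.io without printing its contents. This is a sketch under assumptions: the function name `pull_secret_has_registry` is hypothetical, and it assumes that the pull secret is stored as `pull-secret` in the `openshift-config` namespace, which is the usual location in OpenShift Container Platform 4.

```shell
# Hypothetical helper (not part of the original article): succeed if the
# cluster-wide pull secret contains an entry for the given registry.
# The decoded secret is piped to grep -q, so credentials are never printed.
pull_secret_has_registry() {
  # $1 = registry hostname (optional, defaults to registry.redhat.io)
  oc get secret/pull-secret -n openshift-config \
    -o go-template='{{ index .data ".dockerconfigjson" | base64decode }}' \
    | grep -qF "\"${1:-registry.redhat.io}\""
}

# Usage:
#   pull_secret_has_registry && echo "registry.redhat.io entry found"
```

The check only confirms that an entry exists. It does not validate the credentials themselves.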

