Troubleshooting OpenShift Container Platform 4.x: Image Registry Operator


Environment

  • Red Hat OpenShift Container Platform
    • 4.x

Issue

  • How do I troubleshoot issues with the image registry in OpenShift 4?

Diagnostic Steps

  • The registry operator reports status in two places:
    • The ClusterOperator resource, defined at cluster scope, reflects the state of the registry operator at a high level.
    • The image-registry config resource has its own status section, with detailed conditions describing the state of the managed registry.
  • To view both resources, run the commands for your version:

OpenShift 4.0-4.3

# oc get clusteroperators.config.openshift.io/cluster-image-registry-operator -o yaml
# oc get configs.imageregistry.operator.openshift.io/instance -o yaml

OpenShift 4.4-4.x

# oc get clusteroperators.config.openshift.io/image-registry -o yaml
# oc get configs.imageregistry.operator.openshift.io/cluster -o yaml
  • If you cannot access the registry, check for a registry deployment and its corresponding pod in the openshift-image-registry namespace:
# oc get deployment image-registry -n openshift-image-registry
# oc get pods -n openshift-image-registry | grep image-registry | grep -v operator
  • If there is no registry pod, check the deployment for any error conditions:
# oc get deployment image-registry -o yaml -n openshift-image-registry
  • If there is no registry deployment, check the image-registry resource instance for any error conditions:

OpenShift 4.0-4.3

# oc get configs.imageregistry.operator.openshift.io/instance -o yaml

OpenShift 4.4-4.x

# oc get configs.imageregistry.operator.openshift.io/cluster -o yaml
  • If there is no image-registry resource at all, check if the image-registry operator deployment exists:
# oc get deployment/cluster-image-registry-operator -n openshift-image-registry
  • If the operator deployment exists, check for the corresponding pod and, if it exists, check its logs:
# POD=$(oc get pods -n openshift-image-registry | awk '/cluster-image-registry-operator/ {print $1}')
# oc logs "${POD}" -n openshift-image-registry
  • If the operator pod does not exist, inspect the deployment to determine why the operator pod was not created:
# oc get deployment cluster-image-registry-operator -o yaml -n openshift-image-registry
  • If the operator deployment does not exist either, the problem lies at the installer/CVO level, which did not deploy the image-registry operator.
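
The order of the checks above can be sketched as a small shell function. This is only an illustration and not part of the original solution: the helper names found and triage are hypothetical, and the sketch assumes that oc is logged in with sufficient privileges on OpenShift 4.4 or later, where the config resource is named "cluster". It reports only the first missing piece and leaves the detailed inspection to the commands listed above.

```shell
#!/bin/sh
# Sketch only: walks the checks from the registry deployment down to the
# operator deployment and reports where to look next.
# Assumption: OpenShift 4.4+ (config resource named "cluster").
NS=openshift-image-registry

# found KIND/NAME: succeeds if the object exists (hypothetical helper).
# -n is ignored by oc for cluster-scoped resources, so it is safe to pass here.
found() {
    oc get "$1" -n "$NS" >/dev/null 2>&1
}

# triage: prints one line naming the next thing to inspect (hypothetical helper).
triage() {
    if found deployment/image-registry; then
        echo "registry deployment exists: check its pods and conditions"
    elif found configs.imageregistry.operator.openshift.io/cluster; then
        echo "no registry deployment: check conditions on the image-registry config"
    elif found deployment/cluster-image-registry-operator; then
        echo "no image-registry config: check the operator pod and its logs"
    else
        echo "no operator deployment: investigate the installer/CVO"
    fi
}
```

After sourcing the file, run triage with no arguments. On OpenShift 4.0-4.3 the config name in the second check would need to be "instance" instead.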

Health Checking

  • A basic health check is to request the /healthz path at the internal registry's service address, which verifies that the registry is running and responding. Under normal circumstances this returns an HTTP 200 response. Run the check from a master or any other node in the cluster.
# RegistryAddr=$(oc get svc image-registry -n openshift-image-registry -o 'jsonpath={.spec.clusterIP}:{.spec.ports[0].port}')

# curl -vk $RegistryAddr/healthz
OR
# curl -vk https://$RegistryAddr/healthz
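  • If the request to the service address fails, check the registry pod IP directly:
# oc get pods -n openshift-image-registry -l docker-registry=default -o wide
# POD_IP=$(oc get pods -n openshift-image-registry -l docker-registry=default -o 'jsonpath={.items[0].status.podIP}')
# curl -vk https://$POD_IP:5000/healthz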
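
The two health checks can be combined into one script that compares the HTTP status code against 200 instead of reading the verbose curl output. This is a minimal sketch under assumptions: the function names registry_healthy and check_registry are hypothetical, the addresses are passed in as host:port, and only the HTTPS variant of the request is tried.

```shell
#!/bin/sh
# Sketch only: try the service address first, then the pod address.

# registry_healthy ADDR: succeeds only when https://ADDR/healthz returns 200.
# -k skips certificate verification, as in the curl commands above.
registry_healthy() {
    code=$(curl -sk -o /dev/null -w '%{http_code}' --max-time 5 "https://$1/healthz")
    [ "$code" = "200" ]
}

# check_registry SVC_ADDR POD_ADDR: prints which of the two endpoints responded.
check_registry() {
    if registry_healthy "$1"; then
        echo "service OK"
    elif registry_healthy "$2"; then
        echo "service failed, pod OK"
    else
        echo "service and pod failed"
    fi
}
```

For example, with the variables set as in the commands above: check_registry "$RegistryAddr" "$POD_IP:5000"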

Related Articles:
- image-registry operator reports that it is degraded due to StorageNotConfigured
- Failed to Configure NFS Share for image-registry Storage
