Image pulls sometimes fail with "i/o timeout" on OpenShift nodes

Solution Verified

Issue

  • Intermittently, pulling an image from the Red Hat Container Registry fails with an i/o timeout error.
  • Even when manually pulling a container image using skopeo, we see the same error:

    $ skopeo --debug inspect docker://registry.redhat.io/openshift3/ose-pod:v3.11
    DEBU[0000] reference rewritten from 'registry.redhat.io/openshift3/ose-pod:v3.11' to 'registry.redhat.io/openshift3/ose-pod:v3.11' 
    DEBU[0000] Trying to pull "registry.redhat.io/openshift3/ose-pod:v3.11" 
    DEBU[0000] Returning credentials from /root/.docker/config.json 
    DEBU[0000] Using registries.d directory /etc/containers/registries.d for sigstore configuration 
    DEBU[0000]  Using "default-docker" configuration        
    DEBU[0000]  No signature storage configuration found for registry.redhat.io/openshift3/ose-pod:v3.11 
    DEBU[0000] Looking for TLS certificates and private keys in /etc/docker/certs.d/registry.redhat.io 
    DEBU[0000] GET https://registry.redhat.io/v2/           
    DEBU[0000] Ping https://registry.redhat.io/v2/ status 401 
    DEBU[0000] GET https://registry.redhat.io/auth/realms/rhcc/protocol/redhat-docker-v2/auth?account=example&scope=repository%3Aopenshift3%2Fose-pod%3Apull&service=docker-registry 
    DEBU[0001] GET https://registry.redhat.io/v2/openshift3/ose-pod/manifests/v3.11 
    DEBU[0001] GET https://registry.redhat.io/v2/openshift3/ose-pod/manifests/sha256:b40212147173e580b997654e68c41720e0ec9c588f5b71bad78aa5cc5b514678 
    DEBU[0002] Downloading /v2/openshift3/ose-pod/blobs/sha256:3ecf70fa97ed935771a86af924d074082be79211cb3c4cf0fc4c9a5fe5841efa 
    DEBU[0002] GET https://registry.redhat.io/v2/openshift3/ose-pod/blobs/sha256:3ecf70fa97ed935771a86af924d074082be79211cb3c4cf0fc4c9a5fe5841efa 
    FATA[0032] Get https://registry.redhat.io/v2/openshift3/ose-pod/blobs/sha256:3ecf70fa97ed935771a86af924d074082be79211cb3c4cf0fc4c9a5fe5841efa: dial tcp 10.0.0.2:443: i/o timeout
    
  • Only a single node hits image pull back-off errors; the remaining nodes pull images without any issue:

    DEBU[0030] Ping https://public.ecr.aws/v2/ err Get "https://public.ecr.aws/v2/": dial tcp xx.xx.xx.xx:443: i/o timeout (&url.Error{Op:"Get", URL:"https://public.ecr.aws/v2/", Err:(*net.OpError)(0xc000b73770)})
    DEBU[0030] GET https://public.ecr.aws/v1/_ping
    DEBU[0060] Ping https://public.ecr.aws/v1/_ping err Get "https://public.ecr.aws/v1/_ping": dial tcp xx.xx.xx.xx:443: i/o timeout (&url.Error{Op:"Get", URL:"https://public.ecr.aws/v1/_ping", Err:(*net.OpError)(0xc0009b3ef0)})
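
To narrow down which endpoints time out on the affected node, the debug output above can be filtered for the addresses reported in the "dial tcp ... i/o timeout" messages, and each address can then be probed directly. The following is a minimal sketch, not an official Red Hat tool. The function names timed_out_endpoints and probe_endpoint are hypothetical, the 5-second limit is an arbitrary choice, and the probe assumes IPv4 endpoints and a bash build with /dev/tcp support.

```shell
#!/usr/bin/env bash

# Hypothetical helper: read skopeo --debug output on stdin and print the
# unique IP:port endpoints that appear in "dial tcp ...: i/o timeout" errors.
timed_out_endpoints() {
  grep -oE 'dial tcp [^ ]+: i/o timeout' \
    | awk '{print $3}' \
    | sed 's/:$//' \
    | sort -u
}

# Hypothetical helper: attempt a plain TCP connection to an IP:port endpoint.
# Assumes IPv4 (no bracketed IPv6 literals) and bash with /dev/tcp support.
# The 5-second limit is an arbitrary choice for this sketch.
probe_endpoint() {
  host=${1%:*}
  port=${1##*:}
  if timeout 5 bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null; then
    echo "$1 reachable"
  else
    echo "$1 unreachable"
  fi
}
```

One possible usage, run on both the affected node and a healthy node so the results can be compared:

    $ skopeo --debug inspect docker://registry.redhat.io/openshift3/ose-pod:v3.11 2>&1 | timed_out_endpoints | while read -r ep; do probe_endpoint "$ep"; done

An endpoint that is unreachable only from the affected node suggests a node-level network problem (routing, firewall, or proxy) rather than a registry outage, although this check alone does not confirm the cause.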
    

Environment

  • Red Hat OpenShift Container Platform (RHOCP)
    • 3.11
    • 4
  • Red Hat Container Registry (registry.redhat.io)
