Image pulls sometimes fail with "i/o timeout" on OpenShift nodes
Issue
- Intermittently, pulling an image from the Red Hat Container Registry fails with an "i/o timeout" error.
- Even when manually pulling a container image using skopeo, the same error appears:

    $ skopeo --debug inspect docker://registry.redhat.io/openshift3/ose-pod:v3.11
    DEBU[0000] reference rewritten from 'registry.redhat.io/openshift3/ose-pod:v3.11' to 'registry.redhat.io/openshift3/ose-pod:v3.11'
    DEBU[0000] Trying to pull "registry.redhat.io/openshift3/ose-pod:v3.11"
    DEBU[0000] Returning credentials from /root/.docker/config.json
    DEBU[0000] Using registries.d directory /etc/containers/registries.d for sigstore configuration
    DEBU[0000] Using "default-docker" configuration
    DEBU[0000] No signature storage configuration found for registry.redhat.io/openshift3/ose-pod:v3.11
    DEBU[0000] Looking for TLS certificates and private keys in /etc/docker/certs.d/registry.redhat.io
    DEBU[0000] GET https://registry.redhat.io/v2/
    DEBU[0000] Ping https://registry.redhat.io/v2/ status 401
    DEBU[0000] GET https://registry.redhat.io/auth/realms/rhcc/protocol/redhat-docker-v2/auth?account=example&scope=repository%3Aopenshift3%2Fose-pod%3Apull&service=docker-registry
    DEBU[0001] GET https://registry.redhat.io/v2/openshift3/ose-pod/manifests/v3.11
    DEBU[0001] GET https://registry.redhat.io/v2/openshift3/ose-pod/manifests/sha256:b40212147173e580b997654e68c41720e0ec9c588f5b71bad78aa5cc5b514678
    DEBU[0002] Downloading /v2/openshift3/ose-pod/blobs/sha256:3ecf70fa97ed935771a86af924d074082be79211cb3c4cf0fc4c9a5fe5841efa
    DEBU[0002] GET https://registry.redhat.io/v2/openshift3/ose-pod/blobs/sha256:3ecf70fa97ed935771a86af924d074082be79211cb3c4cf0fc4c9a5fe5841efa
    FATA[0032] Get https://registry.redhat.io/v2/openshift3/ose-pod/blobs/sha256:3ecf70fa97ed935771a86af924d074082be79211cb3c4cf0fc4c9a5fe5841efa: dial tcp 10.0.0.2:443: i/o timeout

- Only a single node is facing image pull back-off errors; the rest of the nodes do not observe any image pull issue:

    DEBU[0030] Ping https://public.ecr.aws/v2/ err Get "https://public.ecr.aws/v2/": dial tcp xx.xx.xx.xx:443: i/o timeout (&url.Error{Op:"Get", URL:"https://public.ecr.aws/v2/", Err:(*net.OpError)(0xc000b73770)})
    DEBU[0030] GET https://public.ecr.aws/v1/_ping
    DEBU[0060] Ping https://public.ecr.aws/v1/_ping err Get "https://public.ecr.aws/v1/_ping": dial tcp xx.xx.xx.xx:443: i/o timeout (&url.Error{Op:"Get", URL:"https://public.ecr.aws/v1/_ping", Err:(*net.OpError)(0xc0009b3ef0)})
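The "dial tcp ...:443: i/o timeout" messages above indicate that a TCP connection to the registry address was not established within the client's timeout. As a rough way to reproduce that step outside of skopeo, the following sketch resolves a hostname and attempts one TCP connection to each resolved address. The function name probe and the 5-second default timeout are assumptions made for illustration; they do not come from skopeo or OpenShift, and the sketch only checks TCP reachability, not TLS or registry authentication.

```python
import socket
import time


def probe(host, port=443, timeout=5.0):
    """Try one TCP connection to every address that `host` resolves to.

    Hypothetical helper for illustration only. Returns a list of
    (address, ok, detail) tuples, one per resolved address.
    """
    results = []
    # Resolve all addresses for the host, similar to what a client does before dialing.
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    for family, socktype, proto, _canonname, sockaddr in infos:
        start = time.monotonic()
        try:
            with socket.socket(family, socktype, proto) as sock:
                sock.settimeout(timeout)
                sock.connect(sockaddr)
            elapsed = time.monotonic() - start
            results.append((sockaddr[0], True, f"connected in {elapsed:.2f}s"))
        except OSError as exc:
            # Timeouts and refused connections are both OSError subclasses.
            results.append((sockaddr[0], False, f"{type(exc).__name__}: {exc}"))
    return results
```

Calling probe("registry.redhat.io") several times on the affected node and on a healthy node, then comparing which addresses fail, may help show whether the timeout is tied to a specific node or a specific resolved address.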
Environment
- Red Hat OpenShift Container Platform (RHOCP)
  - 3.11
  - 4
- Red Hat Container Registry (registry.redhat.io)