[RHOCP 4] Pods stuck in CreateContainerError because requests are timing out in CRI-O due to system load
Issue
- Pods get stuck in the ContainerCreating state and liveness probes fail. Example:
Liveness probe failed: Get "https://x.x.x.x:8443/actuator/health": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
- Kubelet logs on the affected node show requests timing out in CRI-O due to system load:
E0103 05:27:32.784317 3571 pod_workers.go:951] "Error syncing pod, skipping" err="failed to \"StartContainer\" for \"....\" with CreateContainerError: \"Kubelet may be retrying requests that are timing out in CRI-O due to system load: context deadline exceeded: error reserving ctr name xxxxxxxx for id cddf2233764f814833e691820903408bb59f3ad1e4e3e00a2fda1dd33cf0bba1: name is reserved\"" pod="xxxx" podUID=yyyy
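To check whether a node is hitting this signature, kubelet log lines can be scanned for the "name is reserved" message. The Python sketch below is only an illustration, not a Red Hat tool: the function name find_reserved_name_errors and both regular expressions are assumptions derived from the single log line quoted above, and other RHOCP versions may format the message differently.

```python
import re

# Assumed pattern, modelled on the one kubelet log line quoted in this article.
RESERVED_NAME = re.compile(
    r"error reserving ctr name (?P<name>\S+) "
    r"for id (?P<ctr_id>[0-9a-f]+): name is reserved"
)
# Assumed pattern for the trailing pod="..." field of the kubelet log line.
POD_FIELD = re.compile(r'pod="(?P<pod>[^"]*)"')


def find_reserved_name_errors(lines):
    """Return one dict per log line that reports a reserved container name."""
    hits = []
    for line in lines:
        match = RESERVED_NAME.search(line)
        if match is None:
            continue  # line does not carry the CRI-O timeout signature
        pod = POD_FIELD.search(line)
        hits.append({
            "ctr_name": match.group("name"),
            "ctr_id": match.group("ctr_id"),
            "pod": pod.group("pod") if pod else None,
        })
    return hits
```

The function accepts any iterable of lines, for example an open file holding kubelet logs saved from the affected node (one way to collect them is `oc adm node-logs <node> -u kubelet`). Many hits for the same pod would suggest that the kubelet keeps retrying a container creation that CRI-O has not yet finished.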
Environment
- Red Hat OpenShift Container Platform (RHOCP)
- 4