[RHOCP 4] Pods stuck in CreateContainerError because requests are timing out in CRI-O due to system load

Solution Verified

Issue

  • Pods get stuck in the ContainerCreating state, and liveness probes fail with an error such as:

     Liveness probe failed: Get "https://x.x.x.x:8443/actuator/health": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
    
  • Kubelet logs on the affected node show that requests are timing out in CRI-O due to system load:

    E0103 05:27:32.784317    3571 pod_workers.go:951] "Error syncing pod, skipping" err="failed to \"StartContainer\" for \"....\" with CreateContainerError: \"Kubelet may be retrying requests that are timing out in CRI-O due to system load: context deadline exceeded: error reserving ctr name xxxxxxxx for id cddf2233764f814833e691820903408bb59f3ad1e4e3e00a2fda1dd33cf0bba1: name is reserved\"" pod="xxxx" podUID=yyyy
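
To check how many pods on a node are hitting this condition, the kubelet log can be filtered for the message shown above. The following is a minimal sketch and not part of the original solution: the function name find_crio_timeouts and both regular expressions are assumptions made for illustration, based only on the layout of the sample log line (the text "for id <hex id>" and the trailing pod="..." field). Other kubelet versions may format the message differently.

```python
import re

# Signature text copied from the kubelet message quoted in this article.
SIGNATURE = "timing out in CRI-O due to system load"

# Assumed patterns, derived from the sample line only:
#   "... for id <hex container id>: name is reserved ..."
#   "... pod="<pod name>" podUID=..."
CTR_ID_RE = re.compile(r"for id ([0-9a-f]+)")
POD_RE = re.compile(r'pod="([^"]*)"')


def find_crio_timeouts(lines):
    """Return (pod, container_id) tuples for log lines carrying the signature.

    Either element is None when the corresponding field is not found.
    """
    hits = []
    for line in lines:
        if SIGNATURE not in line:
            continue  # unrelated kubelet message
        ctr = CTR_ID_RE.search(line)
        pod = POD_RE.search(line)
        hits.append((
            pod.group(1) if pod else None,
            ctr.group(1) if ctr else None,
        ))
    return hits
```

The function accepts any iterable of log lines, for example an open file holding kubelet output saved from `oc adm node-logs <node> -u kubelet`. Many hits for the same pod suggest that the kubelet keeps retrying a container creation that CRI-O has not yet finished.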
    

Environment

  • Red Hat OpenShift Container Platform (RHOCP)
    • 4
