When using Persistent Volumes with high file counts in OpenShift, why do pods fail to start or take an excessive amount of time to reach the "Ready" state?

Solution Verified

Issue

  • When attaching volumes to pods in Red Hat OpenShift Container Platform, why do pods sometimes fail to start, or take an excessive amount of time to start?
  • The volumes themselves have very high file counts, often measured in the tens of thousands of files and directories (or higher).
  • Starting the pods without the high file count volumes allows them to become "Ready" quickly (but without access to the data the volumes provide).
  • Entire nodes are sometimes marked as "NotReady" because of this issue: the container runtime (docker or cri-o) becomes unresponsive, as seen when docker ps or crictl ps commands hang.
  • Pods that are unable to start fall into the CreateContainerError status:

    # oc get pod
    NAME                    READY   STATUS                 RESTARTS   AGE
    mypod-5-1111a           0/1     CreateContainerError   0          7m29s
    
  • Pod deployments fail with the following message: Error: Failed to create pod sandbox: rpc error: code = Unknown desc = Kubelet may be retrying requests that are timing out in CRI-O due to system load: context deadline exceeded
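
To judge whether a volume falls into the "tens of thousands of files and directories" range described above, it can help to count its entries. The following is a minimal sketch, not part of the original solution: the function name count_entries and the mount path in the example comment are hypothetical, and the command needs to run somewhere the volume is already mounted.

```shell
#!/bin/sh
# Count every file and directory below the given directory
# (the directory itself is not counted).
# -xdev keeps find on the same filesystem as the starting point.
# -print0 emits one NUL byte per entry; tr keeps only those NUL bytes and
# wc -c counts them, so names containing newlines are counted only once.
count_entries() {
    find "$1" -mindepth 1 -xdev -print0 | tr -cd '\0' | wc -c
}

# Example with a hypothetical mount path:
#   count_entries /data
```

A result in the tens of thousands or higher matches the volume profile listed in this issue. On very large volumes the count itself can take a while to complete.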

Environment

  • Red Hat OpenShift Container Platform 3
  • Red Hat OpenShift Container Platform 4.7+
  • Docker Container Engine
  • CRI-O Container Engine
