NVIDIA GPU Operator DaemonSet pod stuck in Init:ContainerCreating state in RHOCP 4

Issue

  • The nvidia-driver-daemonset pods never leave the init phase (reported as Init:ContainerCreating or Init:CrashLoopBackOff) and restart repeatedly:

    $ oc get pods
    NAME                                                  READY   STATUS                  RESTARTS   AGE
    nvidia-driver-daemonset-xxxx                          0/2     Init:CrashLoopBackOff   6          15m
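
    To pick the affected pods out of a longer listing, the STATUS column can be filtered for values beginning with Init:. The following is a minimal sketch, not part of the original solution: the helper name stuck_init_pods is hypothetical, and the sample input simply reuses the output shown above.

```shell
#!/bin/sh
# Hypothetical helper (not from the original article): reads `oc get pods`
# style output on stdin and prints the names of pods whose STATUS column
# starts with "Init:", i.e. pods still held up in an init container.
stuck_init_pods() {
  # NR > 1 skips the header row; $1 is NAME, $3 is STATUS.
  awk 'NR > 1 && $3 ~ /^Init:/ { print $1 }'
}

# Demonstration using the sample listing from this article.
stuck_init_pods <<'EOF'
NAME                                                  READY   STATUS                  RESTARTS   AGE
nvidia-driver-daemonset-xxxx                          0/2     Init:CrashLoopBackOff   6          15m
EOF
```

    On a live cluster the same function could be fed from `oc get pods -n <namespace>`, where the namespace is whichever project the GPU Operator was installed into (an assumption to verify locally). For each pod it prints, `oc describe pod <pod>` and `oc logs <pod> -c <init-container>` are the usual next places to look for the failing init container's events and logs.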
    

Environment

  • Red Hat OpenShift Container Platform (RHOCP)
    • 4
  • NVIDIA GPU Operator
