NVIDIA GPU operator validator pod fails on OpenShift 4 using crun

Solution Verified - Updated -

Issue

  • When using the NVIDIA GPU Operator v25.3.2 on Red Hat OpenShift Container Platform 4.18.22 or 4.18.23, the nvidia-operator-validator pod fails to initialize with the following error message:

    The nvidia-operator-validator pod in Init:CreateContainerError - error executing hook `/usr/local/nvidia/toolkit/nvidia-container-runtime-hook` (exit code: 1)  
    

    This issue does not occur on clusters running OpenShift Container Platform versions earlier than 4.18.22 or later than 4.18.23. Only these two versions are affected.

  • Only clusters that use crun as the container runtime are affected. Clusters that use runc are not.
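
The two conditions above (an affected OpenShift version and the crun runtime) can be sketched as a small check. This is an illustrative sketch only: the function and constant names are hypothetical and are not part of any Red Hat or NVIDIA tooling, and it assumes you have already looked up the cluster version (for example from `oc get clusterversion`) and the container runtime in use on the GPU nodes.

```python
# Hypothetical helper for illustration; not part of any Red Hat or NVIDIA tool.
# The affected versions and the crun condition come from the Issue section above.
AFFECTED_VERSIONS = {"4.18.22", "4.18.23"}


def is_affected(ocp_version: str, container_runtime: str) -> bool:
    """Return True only when the version is affected AND the runtime is crun."""
    version_hit = ocp_version.strip() in AFFECTED_VERSIONS
    runtime_hit = container_runtime.strip().lower() == "crun"
    return version_hit and runtime_hit


if __name__ == "__main__":
    # Sample inputs supplied by hand; replace with the values from your cluster.
    samples = [
        ("4.18.21", "crun"),
        ("4.18.22", "crun"),
        ("4.18.23", "runc"),
        ("4.18.24", "crun"),
    ]
    for version, runtime in samples:
        print(version, runtime, is_affected(version, runtime))
```

Both conditions must hold at the same time: a 4.18.22 cluster that uses runc, or a crun cluster on any other version, is outside the scope of this issue.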

Environment

  • Red Hat OpenShift Container Platform (RHOCP)
    • 4.18.22
    • 4.18.23
  • NVIDIA GPU Operator
  • crun
