Workaround for model deployment failure when using hardware profiles
Issue
Model deployments that use hardware profiles fail because the Red Hat OpenShift AI Operator does not inject the tolerations, nodeSelector, or identifiers from the hardware profile into the underlying InferenceService resource when the InferenceService is created manually. As a result, the model deployment pods cannot be scheduled onto suitable nodes and the deployment never reaches a ready state. Workbenches that use the same hardware profile deploy successfully.
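Until a fix is available, a practical workaround is to copy the scheduling settings from the hardware profile into the InferenceService yourself, since KServe's predictor spec accepts nodeSelector and tolerations directly. The following is a minimal sketch, not the official resolution: the resource names, node label, taint key, and model details are placeholders, and the actual values must come from your hardware profile and deployment.

apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: example-model              # placeholder: your model deployment name
  namespace: example-project       # placeholder: your data science project
spec:
  predictor:
    # Copied manually from the hardware profile, because the Operator
    # does not inject these fields into the InferenceService:
    nodeSelector:
      nvidia.com/gpu.present: "true"     # placeholder: your profile's nodeSelector
    tolerations:
      - key: nvidia.com/gpu              # placeholder: your profile's toleration
        operator: Exists
        effect: NoSchedule
    model:
      modelFormat:
        name: onnx                       # placeholder: your model format
      runtime: example-runtime           # placeholder: your serving runtime
      storageUri: pvc://example-pvc/model  # placeholder: your model location

After applying the edited InferenceService, confirm that the predictor pod is scheduled onto a matching node, for example with oc get pods -o wide -n example-project.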
Environment
OpenShift AI 2.23
OpenShift AI 2.24