Workaround for model deployment failure when using hardware profiles
Issue
When you create InferenceService resources manually, model deployments that use hardware profiles fail because the Red Hat OpenShift AI Operator does not inject the tolerations, nodeSelector, or identifiers from the hardware profile into the underlying InferenceService. As a result, the model deployment pods cannot be scheduled onto suitable nodes and the deployment never reaches a ready state. Workbenches that use the same hardware profile deploy successfully.
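Until the Operator performs this injection automatically, a possible workaround is to copy the scheduling fields from the hardware profile into the InferenceService yourself. The following is a minimal sketch, not a definitive fix: the resource name, the hardware profile name, the storage URI, the toleration key, and the node selector label are placeholders, so substitute the values defined in your own hardware profile.

    apiVersion: serving.kserve.io/v1beta1
    kind: InferenceService
    metadata:
      name: example-model                                   # placeholder name
      annotations:
        opendatahub.io/hardware-profile-name: gpu-profile   # your hardware profile
    spec:
      predictor:
        model:
          modelFormat:
            name: onnx                                      # placeholder model format
          storageUri: pvc://models/example                  # placeholder storage location
        # Copy these values from the hardware profile spec, because the
        # Operator does not inject them into the InferenceService:
        nodeSelector:
          nvidia.com/gpu.present: "true"                    # example selector label
        tolerations:
          - key: nvidia.com/gpu                             # example taint key
            operator: Exists
            effect: NoSchedule

After applying the manifest, confirm that the predictor pod is scheduled onto a matching node, for example with oc get pods -o wide in the project that contains the deployment.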
Environment
Red Hat OpenShift AI 2.23
Red Hat OpenShift AI 2.24