Red Hat OpenShift AI: Supported Configurations

This article lists the Red Hat OpenShift AI (RHOAI) offerings (Self-Managed and Cloud Service), the RHOAI components, their current support phase, and their compatibility with the underlying platforms.

Red Hat OpenShift AI Self-Managed

You install OpenShift AI Self-Managed by installing the Red Hat OpenShift AI Operator and then configuring the Operator to manage standalone components of the product.
RHOAI Self-Managed is supported on OpenShift Container Platform running on x86_64, ppc64le, and s390x architectures. This includes the following providers:

  • Bare Metal
  • Hosted control planes on Bare Metal
  • IBM Cloud
  • Red Hat OpenStack
  • Amazon Web Services
  • Google Cloud Platform
  • Microsoft Azure
  • VMware vSphere
  • Oracle Cloud
  • IBM Power (Technology Preview)
  • IBM Z (Technology Preview)

This also includes support for RHOAI Self-Managed on managed OpenShift offerings such as OpenShift Dedicated, Red Hat OpenShift Service on AWS (ROSA with HCP), Red Hat OpenShift Service on AWS (classic architecture), and Microsoft Azure Red Hat OpenShift. Currently, RHOAI Self-Managed is not supported on OpenShift running on the ARM architecture, or on other platforms such as OpenShift Kubernetes Engine and MicroShift.
For a full overview of the RHOAI Self-Managed life cycle and the currently supported releases, see the Red Hat OpenShift AI Self-Managed life cycle page.
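
For example, once the Operator is installed, component selection is driven by the DataScienceCluster resource. The following minimal sketch, using the Kubernetes Python client, assumes a DataScienceCluster named default-dsc and the component keys shown; both can vary between releases, so check your cluster's CRD first.

    # Minimal sketch: toggle RHOAI components by patching the DataScienceCluster.
    # The resource name ("default-dsc") and component keys are assumptions.
    from kubernetes import config, dynamic
    from kubernetes.client import api_client

    config.load_kube_config()  # use load_incluster_config() when running in a pod
    client = dynamic.DynamicClient(api_client.ApiClient())

    dsc_api = client.resources.get(
        api_version="datasciencecluster.opendatahub.io/v1",
        kind="DataScienceCluster",
    )

    # Enable the dashboard and workbenches; leave KServe unmanaged.
    patch = {
        "spec": {
            "components": {
                "dashboard": {"managementState": "Managed"},
                "workbenches": {"managementState": "Managed"},
                "kserve": {"managementState": "Removed"},
            }
        }
    }
    dsc_api.patch(name="default-dsc", body=patch,
                  content_type="application/merge-patch+json")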

x86_64 architecture

Operator Version | CodeFlare | Dashboard | Data science pipelines | Feature Store | KServe | Kubeflow Training | Kuberay | Kueue | Llama Stack | Model Mesh Serving | Model Registry | TrustyAI | Workbenches | OpenShift Version | Chipset Architecture
--- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | ---
2.23 | GA | GA | GA | TP | GA | GA | GA | GA | DP | Deprecated | TP | GA | GA | 4.15, 4.16, 4.17, 4.18, 4.19 | x86_64
2.22 | GA | GA | GA | TP | GA | GA | GA | GA | - | Deprecated | TP | GA | GA | 4.15, 4.16, 4.17, 4.18, 4.19 | x86_64
2.19 | GA | GA | GA | - | GA | GA | GA | GA | - | Deprecated | TP | GA | GA | 4.14, 4.15, 4.16, 4.17, 4.18 | x86_64
2.16 | GA | GA | GA | - | GA | TP | GA | GA | - | GA | TP | GA | GA | 4.14, 4.15, 4.16, 4.17 | x86_64
2.8 | TP | EUS | EOL (v1) | - | EUS | - | TP | - | - | GA | - | - | GA | 4.12, 4.14, 4.15 | x86_64

IBM Z (s390x) architecture

Operator Version | CodeFlare | Dashboard | Data science pipelines | Feature Store | KServe | Kubeflow Training | Kuberay | Kueue | Llama Stack | Model Mesh Serving | Model Registry | TrustyAI | Workbenches | OpenShift Version | Chipset Architecture
--- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | ---
2.23 | - | TP | - | - | TP | - | - | - | - | - | - | - | - | 4.18, 4.19 | s390x
2.22 | - | TP | - | - | TP | - | - | - | - | - | - | - | - | 4.18, 4.19 | s390x
2.19 | - | TP | - | - | TP | - | - | - | - | - | - | - | - | 4.18 | s390x

IBM Power (ppc64le) architecture

Operator Version | CodeFlare | Dashboard | Data science pipelines | Feature Store | KServe | Kubeflow Training | Kuberay | Kueue | Llama Stack | Model Mesh Serving | Model Registry | TrustyAI | Workbenches | OpenShift Version | Chipset Architecture
--- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | ---
2.23 | - | TP | - | - | TP | - | - | - | - | - | - | - | - | 4.18, 4.19 | ppc64le
2.22 | - | TP | - | - | TP | - | - | - | - | - | - | - | - | 4.18, 4.19 | ppc64le
2.19 | - | TP | - | - | TP | - | - | - | - | - | - | - | - | 4.18 | ppc64le

Red Hat OpenShift AI Cloud Service

You install OpenShift AI Cloud Service by installing the Red Hat OpenShift AI Add-on and then using the add-on to manage standalone components of the product. The add-on has a single version, reflecting the latest update of the cloud service.
RHOAI Cloud Service is supported on OpenShift Dedicated (AWS and GCP) and on Red Hat OpenShift Service on AWS (classic architecture). Currently, RHOAI Cloud Service is not supported on Microsoft Azure Red Hat OpenShift or on platform services such as ROSA with HCP.
For a full overview of the RHOAI Cloud Service life cycle and the currently supported releases, see the Red Hat OpenShift AI Cloud Service life cycle page.

Add-on Version | CodeFlare | Dashboard | Data science pipelines | Feature Store | KServe | Kubeflow Training | Kuberay | Kueue | Llama Stack | Model Mesh Serving | Model Registry | TrustyAI | Workbenches | OpenShift Version | Chipset Architecture
--- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | ---
2.23 | GA | GA | GA | TP | GA | GA | GA | GA | DP | Deprecated | TP | GA | GA | 4.15, 4.16, 4.17, 4.18, 4.19 | x86_64



TP: Technology Preview
DP: Developer Preview
For a comparison, see Developer and Technology Previews: How they compare.
LA: Limited Availability. During this phase you can install and receive support for the feature only with specific approval from Red Hat. Without such approval, the feature is unsupported.
GA: General Availability.
EUS: Extended Update Support. During the EUS phase, Red Hat maintains component-specific support.
EOL: End of Life. During this phase, the component will no longer be supported.

RHOAI and vLLM version compatibility

The following table shows the version of the vLLM model-serving runtime that is included with each version of Red Hat OpenShift AI.

RHOAI Version | vLLM CUDA | vLLM ROCm | vLLM Power/Z | vLLM Gaudi
--- | --- | --- | --- | ---
RHOAI-2.24 | v0.10.0.2 | v0.10.0.2 | v0.10.0.2 | v0.8.5+Gaudi-1.21.3
RHOAI-2.23 | v0.9.2.1 | v0.9.2.1 | v0.9.2.1 | v0.7.2+Gaudi-1.21.0
RHOAI-2.22 | v0.9.1.0 | v0.8.4.3 | v0.9.1.0 | v0.7.2+Gaudi-1.21.0
RHOAI-2.21 | v0.9.0.1 | v0.8.4.3 | v0.8.5 | v0.6.4.post2+Gaudi-1.19.0
RHOAI-2.20 | v0.8.4.0 | v0.8.4.0 | v0.8.4.0 | v0.6.4.post2+Gaudi-1.19.0
RHOAI-2.19 | v0.8.4.0 | v0.8.4.0 | v0.8.4.0 | v0.7.2+Gaudi-1.21.0
RHOAI-2.18 | v0.7.1 | v0.7.1 | - | v0.6.4.post2+Gaudi-1.19.2
RHOAI-2.17 | v0.6.6.post1 | v0.6.6.post1 | - | v0.6.4.post2+Gaudi-1.19.0
RHOAI-2.16 | v0.6.3.post1 | v0.6.3.post1 | - | v0.6.6.post1+Gaudi-1.20.0
RHOAI-2.15 | v0.6.3 | - | - | -
RHOAI-2.14 | v0.6.2 | - | - | -
RHOAI-2.13 | v0.6.2 | - | - | -
RHOAI-2.12 | v0.5.3.post1 | - | - | -
RHOAI-2.11 | v0.5.0.post1 | - | - | -
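
For reference, a deployed vLLM runtime reports its own build through the /version route that the vLLM OpenAI-compatible server exposes, which is a quick way to cross-check a deployment against this table. The route URL and token below are placeholders.

    # Query a vLLM endpoint for its version; the URL and token are hypothetical.
    import requests

    base_url = "https://my-model-route.apps.example.com"  # your model route
    headers = {"Authorization": "Bearer <token>"}          # if token auth is enabled

    resp = requests.get(f"{base_url}/version", headers=headers, timeout=10)
    resp.raise_for_status()
    print(resp.json())  # for example: {"version": "0.9.2.1"}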

Red Hat OpenShift AI Operator Dependencies

For information on the compatibility and supported versions of Red Hat OpenShift AI Operator dependencies, see the documentation for each dependency Operator. Note the following exceptions and clarifications:

Currently, OpenShift Service Mesh v3 is not supported.

Currently, Red Hat - Authorino Operator is the only Red Hat Connectivity Link component that is supported in Red Hat OpenShift AI. To install or upgrade the Red Hat - Authorino Operator, follow the instructions in the Red Hat OpenShift AI documentation.

In Red Hat OpenShift AI, the CodeFlare Operator is included in the base product and is not shipped as a separate Operator. Separately installed instances of the CodeFlare Operator from Red Hat or the community are not supported. For more information, see the Red Hat Knowledgebase solution How to migrate from a separately installed CodeFlare Operator in your data science cluster.

Red Hat OpenShift AI does not directly support any specific accelerators. To use accelerator functionality in OpenShift AI, you must install the relevant accelerator Operators. OpenShift AI supports integration with these Operators and provides many images across the product that include the libraries needed to work with NVIDIA GPUs, AMD GPUs, and Intel Gaudi AI accelerators. For more information about which devices an Operator supports, see the documentation for that Operator.
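
As a quick sanity check from inside a workbench, the preinstalled libraries can confirm whether the relevant Operator has exposed a device to your pod. This sketch uses PyTorch, which is included in the PyTorch workbench images (the ROCm builds report through the same torch.cuda interface):

    # Check that an accelerator is visible to the preinstalled PyTorch stack.
    import torch

    print("PyTorch:", torch.__version__)
    print("Accelerator visible:", torch.cuda.is_available())
    if torch.cuda.is_available():
        print("Device:", torch.cuda.get_device_name(0))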

Support requirements and limitations

Review this section to understand the requirements for Red Hat support and any limitations to Red Hat support of Red Hat OpenShift AI.

Supported browsers

  • Google Chrome
  • Mozilla Firefox
  • Safari

Supported services

Red Hat OpenShift AI supports the following services:

Service Name Description
Elasticsearch Elasticsearch is a distributed, RESTful search and analytics engine. It centrally stores data for lightning fast search, fine‑tuned relevancy, and powerful analytics that scale with ease.
IBM Watson Studio IBM Watson Studio is a platform for embedding AI and machine learning into your business and creating custom models with your own data.
Intel® oneAPI AI Analytics Toolkit Container The AI Kit is a set of AI software tools to accelerate end-to-end data science and analytics pipelines on Intel® architectures.
Jupyter Jupyter is a multi-user version of the notebook designed for companies, classrooms, and research labs.
OpenVINO OpenVINO is an open source toolkit to help optimize deep learning performance and deploy using an inference engine onto Intel hardware.
Pachyderm Use Pachyderm’s data versioning, pipeline and lineage capabilities to automate the machine learning life cycle and optimize machine learning operations.
Starburst Enterprise Starburst Enterprise platform (SEP) is the commercial distribution of Trino, which is an open-source, Massively Parallel Processing (MPP) ANSI SQL query engine.

Supported workbench images

The latest supported workbench images in Red Hat OpenShift AI are installed with Python by default.
You can install any package that is compatible with the supported Python version on any workbench that includes the binaries the package requires. If the binaries you need are not included in the workbench image you want to use, contact Red Hat Support to request that they be considered for inclusion.
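
For example, from a notebook cell or workbench terminal, you can confirm the image's Python version and install an additional package against that interpreter; the package and version pin here are illustrative only.

    # Confirm the workbench Python and install an extra, compatible package.
    import subprocess
    import sys

    print(sys.version)  # the image's Python, for example 3.11 or 3.12

    # Use the interpreter's own pip so the package lands in the right environment.
    subprocess.check_call(
        [sys.executable, "-m", "pip", "install", "pyarrow==17.0.0"]  # example pin
    )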

To provide a consistent, stable platform for your model development, select workbench images that contain the same version of Python. Workbench images available on OpenShift AI are pre-built and ready for you to use immediately after OpenShift AI is installed or upgraded.
Workbench images are supported for a minimum of one year. Major updates to pre-configured workbench images occur about every six months, so two supported image versions are typically available at any given time. You can use this support period to update your code to use components from the latest available workbench image.

Workbench image versions older than the two most recent ones might still be available for selection. These legacy image versions include a label that indicates the image is out of date. To use the latest package versions, Red Hat recommends the most recently added workbench image. If necessary, you can still access unsupported older workbench images from the registry and add them as custom workbench images to meet your project's specific requirements.

Workbench images denoted with Technology Preview in the following table are not supported with Red Hat production service level agreements (SLAs) and might not be functionally complete. Red Hat does not recommend using Technology Preview features in production. These features provide early access to upcoming product features, enabling customers to test functionality and provide feedback during the development process.

  • Code Server | Data Science | CPU | Python 3.11
      2025.1 (Recommended): code-server 4.98, Python 3.11, Boto3: 1.37, Kafka-Python-ng: 2.2, Matplotlib: 3.10, Numpy: 2.2, Pandas: 2.2, Scikit-learn: 1.6, Scipy: 1.15, Sklearn-onnx: 1.18, ipykernel: 6.29, Kubeflow-Training: 1.9
      2024.2: code-server 4.92, Python 3.11, Boto3: 1.34, Kafka-Python: 2.0, Matplotlib: 3.8, Numpy: 1.26, Pandas: 2.2, Scikit-learn: 1.4, Scipy: 1.12, Sklearn-onnx: 1.16, ipykernel: 6.29, Kubeflow-Training: 1.8
  • Code Server | Data Science | CPU | Python 3.12
      2025.1: code-server 4.98, Python 3.12, Boto3: 1.37, Kafka-Python-ng: 2.2, Matplotlib: 3.10, Numpy: 2.2, Pandas: 2.2, Scikit-learn: 1.6, Scipy: 1.15, Sklearn-onnx: 1.18, ipykernel: 6.29, Kubeflow-Training: 1.9
  • Jupyter | Data Science | CPU | Python 3.11
      2025.1 (Recommended): Python 3.11, JupyterLab: 4.4, Boto3: 1.37, Kafka-Python-ng: 2.2, Kfp: 2.12, Matplotlib: 3.10, Numpy: 2.2, Pandas: 2.2, Scikit-learn: 1.6, Scipy: 1.15, Odh-Elyra: 4.2, PyMongo: 4.11, Pyodbc: 5.2, Codeflare-SDK: 0.30, Sklearn-onnx: 1.18, Psycopg: 3.2, MySQL Connector/Python: 9.3, Kubeflow-Training: 1.9
      2024.2: Python 3.11, JupyterLab: 4.2, Boto3: 1.35, Kafka-Python-ng: 2.2, Kfp: 2.9, Matplotlib: 3.9, Numpy: 2.1, Pandas: 2.2, Scikit-learn: 1.5, Scipy: 1.14, Odh-Elyra: 4.2, PyMongo: 4.8, Pyodbc: 5.1, Codeflare-SDK: 0.26, Sklearn-onnx: 1.17, Psycopg: 3.2, MySQL Connector/Python: 9.0, Kubeflow-Training: 1.8
  • Jupyter | Data Science | CPU | Python 3.12
      2025.1: Python 3.12, JupyterLab: 4.4, Boto3: 1.37, Kafka-Python-ng: 2.2, Kfp: 2.12, Matplotlib: 3.10, Numpy: 2.2, Pandas: 2.2, Scikit-learn: 1.6, Scipy: 1.15, Odh-Elyra: 4.2, PyMongo: 4.11, Pyodbc: 5.2, Codeflare-SDK: 0.29, Sklearn-onnx: 1.18, Psycopg: 3.2, MySQL Connector/Python: 9.3, Kubeflow-Training: 1.9
  • Jupyter | Minimal | CPU | Python 3.11
      2025.1 (Recommended): Python 3.11, JupyterLab: 4.4
      2024.2: Python 3.11, JupyterLab: 4.2
  • Jupyter | Minimal | CPU | Python 3.12
      2025.1: Python 3.12, JupyterLab: 4.4
  • Jupyter | Minimal | CUDA | Python 3.11
      2025.1 (Recommended): CUDA 12.6, Python 3.11, JupyterLab: 4.4
      2024.2: CUDA 12.4, Python 3.11, JupyterLab: 4.2
  • Jupyter | Minimal | CUDA | Python 3.12
      2025.1: CUDA 12.6, Python 3.12, JupyterLab: 4.4
  • Jupyter | Minimal | ROCm | Python 3.11
      2025.1 (Recommended): ROCm 6.2, Python 3.11, JupyterLab: 4.4
      2024.2: ROCm 6.1, Python 3.11, JupyterLab: 4.2
  • Jupyter | Minimal | ROCm | Python 3.12
      2025.1: ROCm 6.2, Python 3.12, JupyterLab: 4.4
  • Jupyter | PyTorch LLM Compressor | CUDA | Python 3.12
      2025.1: CUDA 12.6, Python 3.12, JupyterLab: 4.4, PyTorch: 2.6, LLM-Compressor: 0.6, Tensorboard: 2.19, Boto3: 1.37, Kafka-Python-ng: 2.2, Kfp: 2.12, Matplotlib: 3.10, Numpy: 1.26, Pandas: 2.2, Scikit-learn: 1.6, Scipy: 1.15, Odh-Elyra: 4.2, PyMongo: 4.11, Pyodbc: 5.2, Codeflare-SDK: 0.29, Sklearn-onnx: 1.18, Psycopg: 3.2, MySQL Connector/Python: 9.3, Kubeflow-Training: 1.9
  • Jupyter | PyTorch | CUDA | Python 3.11
      2025.1 (Recommended): CUDA 12.6, Python 3.11, JupyterLab: 4.4, PyTorch: 2.6, Tensorboard: 2.19, Boto3: 1.37, Kafka-Python-ng: 2.2, Kfp: 2.12, Matplotlib: 3.10, Numpy: 2.2, Pandas: 2.2, Scikit-learn: 1.6, Scipy: 1.15, Odh-Elyra: 4.2, PyMongo: 4.11, Pyodbc: 5.2, Codeflare-SDK: 0.30, Sklearn-onnx: 1.18, Psycopg: 3.2, MySQL Connector/Python: 9.3, Kubeflow-Training: 1.9
      2024.2: CUDA 12.4, Python 3.11, JupyterLab: 4.2, PyTorch: 2.4, Tensorboard: 2.17, Boto3: 1.35, Kafka-Python-ng: 2.2, Kfp: 2.9, Matplotlib: 3.9, Numpy: 2.1, Pandas: 2.2, Scikit-learn: 1.5, Scipy: 1.14, Odh-Elyra: 4.2, PyMongo: 4.8, Pyodbc: 5.1, Codeflare-SDK: 0.26, Sklearn-onnx: 1.17, Psycopg: 3.2, MySQL Connector/Python: 9.0, Kubeflow-Training: 1.8
  • Jupyter | PyTorch | CUDA | Python 3.12
      2025.1: CUDA 12.6, Python 3.12, JupyterLab: 4.4, PyTorch: 2.6, Tensorboard: 2.19, Boto3: 1.37, Kafka-Python-ng: 2.2, Kfp: 2.12, Matplotlib: 3.10, Numpy: 2.2, Pandas: 2.2, Scikit-learn: 1.6, Scipy: 1.15, Odh-Elyra: 4.2, PyMongo: 4.11, Pyodbc: 5.2, Codeflare-SDK: 0.29, Sklearn-onnx: 1.18, Psycopg: 3.2, MySQL Connector/Python: 9.3, Kubeflow-Training: 1.9
  • Jupyter | PyTorch | ROCm | Python 3.11
      2025.1 (Recommended): Python 3.11, JupyterLab: 4.4, ROCm-PyTorch: 2.6, Tensorboard: 2.18, Kafka-Python-ng: 2.2, Matplotlib: 3.10, Numpy: 2.2, Pandas: 2.2, Scikit-learn: 1.6, Scipy: 1.15, Odh-Elyra: 4.2, PyMongo: 4.11, Pyodbc: 5.2, Codeflare-SDK: 0.30, Sklearn-onnx: 1.18, Psycopg: 3.2, MySQL Connector/Python: 9.3, Kubeflow-Training: 1.9
      2024.2: Python 3.11, JupyterLab: 4.2, ROCm-PyTorch: 2.4, Tensorboard: 2.16, Kafka-Python-ng: 2.2, Matplotlib: 3.9, Numpy: 2.1, Pandas: 2.2, Scikit-learn: 1.5, Scipy: 1.14, Odh-Elyra: 4.2, PyMongo: 4.8, Pyodbc: 5.1, Codeflare-SDK: 0.26, Sklearn-onnx: 1.17, Psycopg: 3.2, MySQL Connector/Python: 9.0, Kubeflow-Training: 1.8
  • Jupyter | PyTorch | ROCm | Python 3.12
      2025.1: Python 3.12, JupyterLab: 4.4, ROCm-PyTorch: 2.6, Tensorboard: 2.18, Kafka-Python-ng: 2.2, Matplotlib: 3.10, Numpy: 2.2, Pandas: 2.2, Scikit-learn: 1.6, Scipy: 1.15, Odh-Elyra: 4.2, PyMongo: 4.11, Pyodbc: 5.2, Codeflare-SDK: 0.29, Sklearn-onnx: 1.18, Psycopg: 3.2, MySQL Connector/Python: 9.3, Kubeflow-Training: 1.9
  • Jupyter | TensorFlow | CUDA | Python 3.11
      2025.1 (Recommended): CUDA 12.6, Python 3.11, JupyterLab: 4.4, TensorFlow: 2.18, Tensorboard: 2.18, Nvidia-CUDA-CU12-Bundle: 12.5, Boto3: 1.37, Kafka-Python-ng: 2.2, Kfp: 2.12, Matplotlib: 3.10, Numpy: 1.26, Pandas: 2.2, Scikit-learn: 1.6, Scipy: 1.15, Odh-Elyra: 4.2, PyMongo: 4.11, Pyodbc: 5.2, Codeflare-SDK: 0.30, Sklearn-onnx: 1.18, Psycopg: 3.2, MySQL Connector/Python: 9.3
      2024.2: CUDA 12.4, Python 3.11, JupyterLab: 4.2, TensorFlow: 2.17, Tensorboard: 2.17, Nvidia-CUDA-CU12-Bundle: 12.3, Boto3: 1.35, Kafka-Python-ng: 2.2, Kfp: 2.5, Matplotlib: 3.9, Numpy: 1.26, Pandas: 2.2, Scikit-learn: 1.5, Scipy: 1.14, Odh-Elyra: 4.2, PyMongo: 4.8, Pyodbc: 5.1, Codeflare-SDK: 0.24, Sklearn-onnx: 1.17, Psycopg: 3.2, MySQL Connector/Python: 9.0
  • Jupyter | TensorFlow | CUDA | Python 3.12
      2025.1: CUDA 12.6, Python 3.12, JupyterLab: 4.4, TensorFlow: 2.19, Tensorboard: 2.19, Nvidia-CUDA-CU12-Bundle: 12.5, Boto3: 1.37, Kafka-Python-ng: 2.2, Kfp: 2.12, Matplotlib: 3.10, Numpy: 1.26, Pandas: 2.2, Scikit-learn: 1.6, Scipy: 1.15, Odh-Elyra: 4.2, PyMongo: 4.11, Pyodbc: 5.2, Codeflare-SDK: 0.29, Sklearn-onnx: 1.18, Psycopg: 3.2, MySQL Connector/Python: 9.3
  • Jupyter | TensorFlow | ROCm | Python 3.11
      2025.1 (Recommended): Python 3.11, JupyterLab: 4.4, ROCm-TensorFlow: 2.14, Tensorboard: 2.14, Kafka-Python-ng: 2.2, Matplotlib: 3.10, Numpy: 1.26, Pandas: 2.2, Scikit-learn: 1.6, Scipy: 1.15, Odh-Elyra: 4.2, PyMongo: 4.11, Pyodbc: 5.2, Codeflare-SDK: 0.30, Sklearn-onnx: 1.17, Psycopg: 3.2, MySQL Connector/Python: 9.3
      2024.2: Python 3.11, JupyterLab: 4.2, ROCm-TensorFlow: 2.14, Tensorboard: 2.14, Kafka-Python-ng: 2.2, Matplotlib: 3.9, Numpy: 1.26, Pandas: 2.2, Scikit-learn: 1.5, Scipy: 1.14, Odh-Elyra: 4.2, PyMongo: 4.8, Pyodbc: 5.1, Codeflare-SDK: 0.24, Sklearn-onnx: 1.17, Psycopg: 3.2, MySQL Connector/Python: 9.0
  • Jupyter | TrustyAI | CPU | Python 3.11
      2025.1 (Recommended): Python 3.11, JupyterLab: 4.4, TrustyAI: 0.6, Transformers: 4.53, Datasets: 3.4, Accelerate: 1.5, Torch: 2.6, Boto3: 1.37, Kafka-Python-ng: 2.2, Kfp: 2.12, Matplotlib: 3.6, Numpy: 1.24, Pandas: 1.5, Scikit-learn: 1.5, Scipy: 1.15, Odh-Elyra: 4.2, PyMongo: 4.11, Pyodbc: 5.2, Codeflare-SDK: 0.30, Sklearn-onnx: 1.18, Psycopg: 3.2, MySQL Connector/Python: 9.3, Kubeflow-Training: 1.9
      2024.2: Python 3.11, JupyterLab: 4.2, TrustyAI: 0.6, Transformers: 4.38, Datasets: 2.21, Accelerate: 0.34, Torch: 2.2, Boto3: 1.35, Kafka-Python-ng: 2.2, Kfp: 2.9, Matplotlib: 3.6, Numpy: 1.24, Pandas: 1.5, Scikit-learn: 1.2, Scipy: 1.14, Odh-Elyra: 4.2, PyMongo: 4.8, Pyodbc: 5.1, Codeflare-SDK: 0.26, Sklearn-onnx: 1.17, Psycopg: 3.2, MySQL Connector/Python: 9.0, Kubeflow-Training: 1.8
  • Jupyter | TrustyAI | CPU | Python 3.12
      2025.1: Python 3.12, JupyterLab: 4.4, TrustyAI: 0.6, Transformers: 4.55, Datasets: 3.4, Accelerate: 1.5, Torch: 2.6, Boto3: 1.37, Kafka-Python-ng: 2.2, Kfp: 2.12, Matplotlib: 3.10, Numpy: 1.26, Pandas: 1.5, Scikit-learn: 1.7, Scipy: 1.15, Odh-Elyra: 4.2, PyMongo: 4.11, Pyodbc: 5.2, Codeflare-SDK: 0.29, Sklearn-onnx: 1.18, Psycopg: 3.2, MySQL Connector/Python: 9.3, Kubeflow-Training: 1.9

Supported model-serving runtimes

Runtime name | Description | Exported model format
--- | --- | ---
Caikit Text Generation Inference Server (Caikit-TGIS) ServingRuntime for KServe (1) | A composite runtime for serving models in the Caikit format | Caikit Text Generation
Caikit Standalone ServingRuntime for KServe (2) | A runtime for serving models in the Caikit embeddings format for embeddings tasks | Caikit Embeddings
OpenVINO Model Server | A scalable, high-performance runtime for serving models that are optimized for Intel architectures | PyTorch, TensorFlow, OpenVINO IR, PaddlePaddle, MXNet, Caffe, Kaldi
[Deprecated] Text Generation Inference Server (TGIS) Standalone ServingRuntime for KServe (3) | A runtime for serving TGI-enabled models | PyTorch Model Formats
vLLM NVIDIA GPU ServingRuntime for KServe | A high-throughput and memory-efficient inference and serving runtime for large language models that supports NVIDIA GPU accelerators | Supported models
vLLM Intel Gaudi Accelerator ServingRuntime for KServe | A high-throughput and memory-efficient inference and serving runtime that supports Intel Gaudi accelerators | Supported models
vLLM AMD GPU ServingRuntime for KServe | A high-throughput and memory-efficient inference and serving runtime that supports AMD GPU accelerators | Supported models
vLLM CPU ServingRuntime for KServe | A high-throughput and memory-efficient inference and serving runtime that supports IBM Power (ppc64le) and IBM Z (s390x) | Supported models

(1) The composite Caikit-TGIS runtime is based on Caikit and Text Generation Inference Server (TGIS). To use this runtime, you must convert your models to Caikit format. For an example, see Converting Hugging Face Hub models to Caikit format in the caikit-tgis-serving repository.

(2) The Caikit Standalone runtime is based on Caikit NLP. To use this runtime, you must convert your models to the Caikit embeddings format. For an example, see Tests for text embedding module.

(3) The Text Generation Inference Server (TGIS) Standalone ServingRuntime for KServe is deprecated. For more information, see Red Hat OpenShift AI release notes.
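
As a rough outline of the conversion step in footnote (1), the example in the caikit-tgis-serving repository bootstraps a Hugging Face checkpoint and saves it in Caikit format. The module path and API can differ between caikit-nlp releases, so treat this as a sketch rather than a recipe.

    # Hedged sketch of Caikit model conversion; verify the import path against
    # your installed caikit-nlp version.
    from caikit_nlp.modules.text_generation import TextGeneration

    model = TextGeneration.bootstrap("path/to/hf-checkpoint")  # Hugging Face model
    model.save("path/to/caikit-model")                         # Caikit-format output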

Deployment requirements for supported model-serving runtimes

Runtime name | Default protocol | Additional protocol | Model mesh support | Single node OpenShift support | Deployment mode
--- | --- | --- | --- | --- | ---
Caikit Text Generation Inference Server (Caikit-TGIS) ServingRuntime for KServe | REST | gRPC | No | Yes | Raw and serverless
Caikit Standalone ServingRuntime for KServe | REST | gRPC | No | Yes | Raw and serverless
OpenVINO Model Server | REST | None | Yes | Yes | Raw and serverless
[Deprecated] Text Generation Inference Server (TGIS) Standalone ServingRuntime for KServe | gRPC | None | No | Yes | Raw and serverless
vLLM NVIDIA GPU ServingRuntime for KServe | REST | None | No | Yes | Raw and serverless
vLLM Intel Gaudi Accelerator ServingRuntime for KServe | REST | None | No | Yes | Raw and serverless
vLLM AMD GPU ServingRuntime for KServe | REST | None | No | Yes | Raw and serverless
vLLM CPU ServingRuntime for KServe (1) | REST | None | No | Yes | Raw

(1) For the vLLM CPU ServingRuntime for KServe, if you are using the IBM Z or IBM Power architectures, you can only deploy models in standard (raw) deployment mode.
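
The deployment mode is selected per InferenceService. As an illustration, the sketch below sets the KServe serving.kserve.io/deploymentMode annotation to RawDeployment; the name, namespace, runtime, and storage URI are placeholders.

    # Minimal sketch: create an InferenceService pinned to raw deployment mode.
    from kubernetes import config, dynamic
    from kubernetes.client import api_client

    config.load_kube_config()
    client = dynamic.DynamicClient(api_client.ApiClient())
    isvc_api = client.resources.get(api_version="serving.kserve.io/v1beta1",
                                    kind="InferenceService")

    isvc = {
        "apiVersion": "serving.kserve.io/v1beta1",
        "kind": "InferenceService",
        "metadata": {
            "name": "demo-llm",                      # placeholder
            "namespace": "my-project",               # placeholder
            "annotations": {"serving.kserve.io/deploymentMode": "RawDeployment"},
        },
        "spec": {
            "predictor": {
                "model": {
                    "modelFormat": {"name": "vLLM"},
                    "runtime": "vllm-cpu-runtime",   # placeholder runtime name
                    "storageUri": "oci://registry.example.com/models/demo:latest",
                }
            }
        },
    }
    isvc_api.create(body=isvc, namespace="my-project")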

Tested and verified model-serving runtimes

Name | Description | Exported model format
--- | --- | ---
NVIDIA Triton Inference Server | Open-source inference-serving software for fast and scalable AI in applications | TensorRT, TensorFlow, PyTorch, ONNX, OpenVINO, Python, RAPIDS FIL, and more
Seldon MLServer | An open-source inference server designed to simplify the deployment of machine learning models | Scikit-Learn (sklearn), XGBoost, LightGBM, CatBoost, HuggingFace, and MLflow

Deployment requirements for tested and verified model-serving runtimes

Name | Default protocol | Additional protocol | Model mesh support | Single node OpenShift support | Deployment mode
--- | --- | --- | --- | --- | ---
NVIDIA Triton Inference Server | gRPC | REST | Yes | Yes | Standard and advanced
Seldon MLServer | gRPC | REST | No | Yes | Standard and advanced

Note: The alibi-detect and alibi-explain libraries from Seldon are licensed under the Business Source License 1.1 (BSL 1.1). These libraries are not tested, verified, or supported by Red Hat as part of the certified Seldon MLServer runtime, and Red Hat does not recommend using them in production environments with the runtime.

Training images

To run distributed training jobs in OpenShift AI, you can use one of the following types of training images:

  • A Ray-based training image that is tested and verified for the documented use cases and configurations
  • A training image that Red Hat supports for use with the Kubeflow Training Operator (KFTO)

Ray-based training images

The following table provides information about the latest available Ray-based training images in Red Hat OpenShift AI. These images are AMD64 images, which might not work on other architectures.

You can use the provided images as base images, and install additional packages to create custom images, as described in the product documentation. If the required packages are not included in the training image you want to use, contact Red Hat Support to request that the package be considered for inclusion.

The images are updated periodically with new versions of the installed packages. They are tested and verified for the use cases and configurations that are documented in the corresponding product documentation. Bug fixes and CVE fixes, once available in upstream packages, are delivered only in newer versions of these images; they are not backported to earlier image versions.

Image type | RHOAI version | Image version | URL | Preinstalled packages
--- | --- | --- | --- | ---
CUDA | 2.23 | 2.47.1-py311-cu121 | quay.io/modh/ray:2.47.1-py311-cu121 | Ray 2.47.1, CUDA 12.1, Python 3.11
CUDA | 2.22 | 2.46.0-py311-cu121 | quay.io/modh/ray:2.46.0-py311-cu121 | Ray 2.46.0, CUDA 12.1, Python 3.11
CUDA | 2.16, 2.19 | 2.35.0-py311-cu121 | quay.io/modh/ray:2.35.0-py311-cu121 | Ray 2.35, CUDA 12.1, Python 3.11
CUDA | 2.16, 2.19 | 2.35.0-py39-cu121 | quay.io/modh/ray:2.35.0-py39-cu121 | Ray 2.35, CUDA 12.1, Python 3.9
CUDA | 2.8 | latest-py39-cu118 | quay.io/project-codeflare/ray:latest-py39-cu118 | Ray 2.7.1, CUDA 11.8, Python 3.9
ROCm | 2.23 | 2.47.1-py311-rocm62 | quay.io/modh/ray:2.47.1-py311-rocm62 | Ray 2.47.1, ROCm 6.2, Python 3.11
ROCm | 2.22 | 2.46.0-py311-rocm62 | quay.io/modh/ray:2.46.0-py311-rocm62 | Ray 2.46.0, ROCm 6.2, Python 3.11
ROCm | 2.19 | 2.35.0-py311-rocm62 | quay.io/modh/ray:2.35.0-py311-rocm62 | Ray 2.35, ROCm 6.2, Python 3.11
ROCm | 2.19 | 2.35.0-py39-rocm62 | quay.io/modh/ray:2.35.0-py39-rocm62 | Ray 2.35, ROCm 6.2, Python 3.9
ROCm | 2.16 | 2.35.0-py311-rocm61 | quay.io/modh/ray:2.35.0-py311-rocm61 | Ray 2.35, ROCm 6.1, Python 3.11
ROCm | 2.16 | 2.35.0-py39-rocm61 | quay.io/modh/ray:2.35.0-py39-rocm61 | Ray 2.35, ROCm 6.1, Python 3.9
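
To use one of these images, point your Ray cluster definition at the tag that matches your RHOAI version. A minimal sketch with the CodeFlare SDK follows; the sizing values are illustrative, and parameter names can vary between SDK releases.

    # Minimal sketch: request a Ray cluster that uses a tested Ray image.
    from codeflare_sdk import Cluster, ClusterConfiguration, TokenAuthentication

    auth = TokenAuthentication(token="sha256~<token>",            # placeholder
                               server="https://api.cluster.example.com:6443")
    auth.login()

    cluster = Cluster(ClusterConfiguration(
        name="raytest",
        namespace="my-project",                                   # placeholder
        num_workers=2,
        worker_cpu_requests=1,
        worker_cpu_limits=2,
        worker_memory_requests=4,
        worker_memory_limits=8,
        image="quay.io/modh/ray:2.47.1-py311-cu121",  # match your RHOAI version
    ))
    cluster.up()
    cluster.wait_ready()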

Training images for use with KFTO

The following table provides information about the training images that Red Hat supports for use with the Kubeflow Training Operator (KFTO) in Red Hat OpenShift AI. These images are AMD64 images, which might not work on other architectures.

You can use the provided images as base images, and install additional packages to create custom images, as described in the product documentation.

Image type | RHOAI version | Image version | URL | Preinstalled packages
--- | --- | --- | --- | ---
CUDA | 2.16, 2.19, 2.22 | py311-cuda124-torch251 | quay.io/modh/training:py311-cuda124-torch251 | CUDA 12.4, Python 3.11, PyTorch 2.5.1
CUDA | 2.16, 2.19, 2.22 | py311-cuda121-torch241 | quay.io/modh/training:py311-cuda121-torch241 | CUDA 12.1, Python 3.11, PyTorch 2.4.1
ROCm | 2.16, 2.19, 2.22 | py311-rocm62-torch251 | quay.io/modh/training:py311-rocm62-torch251 | ROCm 6.2, Python 3.11, PyTorch 2.5.1
ROCm | 2.16, 2.19, 2.22 | py311-rocm62-torch241 | quay.io/modh/training:py311-rocm62-torch241 | ROCm 6.2, Python 3.11, PyTorch 2.4.1
ROCm | 2.16, 2.19 | py311-rocm61-torch241 | quay.io/modh/training:py311-rocm61-torch241 | ROCm 6.1, Python 3.11, PyTorch 2.4.1
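
With the Kubeflow Training SDK (the kubeflow-training package preinstalled in recent workbench images), a training job can reference one of these images as its base. The training function and resource sizing below are placeholders, and SDK signatures can vary by release.

    # Hedged sketch: launch a PyTorchJob from one of the supported KFTO images.
    from kubeflow.training import TrainingClient

    def train_func():
        import torch  # available in the training image
        print("CUDA available:", torch.cuda.is_available())

    client = TrainingClient()
    client.create_job(
        name="demo-train",                                        # placeholder
        train_func=train_func,
        base_image="quay.io/modh/training:py311-cuda124-torch251",
        num_workers=2,
        resources_per_worker={"gpu": 1},
    )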
