How to configure an AMD GPU for use in Podman containers on RHEL 9?
Environment
- Red Hat Enterprise Linux 9.4 (RHEL)
- A machine with AMD GPU
Issue
- AMD GPU configuration to use with Podman containers
Resolution
Disclaimer: Links contained herein to external website(s) are provided for convenience only. Red Hat has not reviewed the links and is not responsible for the content or its availability. The inclusion of any link to an external website does not imply endorsement by Red Hat of the website or their entities, products or services. You agree that Red Hat is not responsible or liable for any loss or expenses that may result due to your use of (or reliance on) the external site or content.
Add the Codeready builder repo
$ sudo subscription-manager repos --enable codeready-builder-for-rhel-9-$(arch)-rpms
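The `$(arch)` substitution keeps the repository id architecture-correct. A minimal sketch of what it expands to (`uname -m` is equivalent to `arch`; the repo id shown is for illustration):

```shell
# Sketch: $(arch) expands to the machine architecture (same as `uname -m`),
# so on an x86_64 host the enabled repository id becomes
# codeready-builder-for-rhel-9-x86_64-rpms.
REPO_ID="codeready-builder-for-rhel-9-$(uname -m)-rpms"
echo "$REPO_ID"
```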
Add the AMD-provided repos as described on the AMD website. Be sure to use the correct RHEL 9 minor version in the repository URLs.
$ sudo tee /etc/yum.repos.d/amdgpu.repo <<EOF
[amdgpu]
name=amdgpu
baseurl=https://repo.radeon.com/amdgpu/6.1.2/rhel/9.4/main/x86_64/
enabled=1
priority=50
gpgcheck=1
gpgkey=https://repo.radeon.com/rocm/rocm.gpg.key
EOF
$ sudo yum clean all
$ sudo tee --append /etc/yum.repos.d/rocm.repo <<EOF
[ROCm-6.1.2]
name=ROCm6.1.2
baseurl=https://repo.radeon.com/rocm/rhel9/6.1.2/main
enabled=1
priority=50
gpgcheck=1
gpgkey=https://repo.radeon.com/rocm/rocm.gpg.key
EOF
$ sudo yum clean all
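The two baseurl lines must agree on the ROCm release and the RHEL minor version. A minimal sketch that composes both URLs from a single pair of variables so they cannot drift apart (the version values are examples; check repo.radeon.com for the release you intend to install):

```shell
# Sketch: derive both repo baseurls from one ROCm release and one RHEL
# minor version. The values below are examples, not recommendations.
ROCM_VER=6.1.2      # ROCm release to install
RHEL_MINOR=9.4      # must match your RHEL minor version (see /etc/os-release)
AMDGPU_URL="https://repo.radeon.com/amdgpu/${ROCM_VER}/rhel/${RHEL_MINOR}/main/x86_64/"
ROCM_URL="https://repo.radeon.com/rocm/rhel9/${ROCM_VER}/main"
echo "$AMDGPU_URL"
echo "$ROCM_URL"
```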
Install the driver and reboot the system
$ sudo yum install amdgpu-dkms
$ sudo reboot
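After the reboot, the driver should have created the device nodes that are later passed to Podman. A hedged check (the `check_node` helper is illustrative, not part of ROCm):

```shell
# Sketch: confirm the compute and render device nodes exist after reboot.
# check_node is an illustrative helper, not an AMD-provided tool.
check_node() {
  if [ -e "$1" ]; then echo "$1: present"; else echo "$1: missing"; fi
}
check_node /dev/kfd   # ROCm compute (Kernel Fusion Driver) interface
check_node /dev/dri   # Direct Rendering Infrastructure devices
```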
Install ROCm packages
$ sudo yum install rocm
Configure SELinux to allow the containers to use the host system devices
$ sudo setsebool -P container_use_devices 1
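On an SELinux-enabled host, `getsebool` reports the boolean as a line like `container_use_devices --> on`. A hedged sketch of checking that output (the `bool_is_on` helper is illustrative, not part of the SELinux tooling):

```shell
# Sketch: getsebool prints lines like "container_use_devices --> on".
# bool_is_on is an illustrative helper that checks the reported state.
bool_is_on() {
  case "$1" in *"--> on") return 0 ;; *) return 1 ;; esac
}
# On an SELinux host you would feed it real output:
#   bool_is_on "$(getsebool container_use_devices)" && echo "containers may use devices"
bool_is_on "container_use_devices --> on" && echo "containers may use devices"
```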
To check GPU availability within a container, the ROCm-provided PyTorch image can be used:
$ podman run -it --device /dev/kfd --device /dev/dri --net=host --security-opt=no-new-privileges --cap-drop=ALL docker.io/rocm/pytorch:latest python3
>>> import torch
>>> torch.cuda.is_available()
True
>>> torch.cuda.current_device()
0
>>> torch.cuda.get_device_name(0)
'AMD Radeon Graphics'
Root Cause
For some workloads, data scientists and AI/ML engineers might prefer using a GPU over a CPU because it is more efficient. Additional configuration is required to use the GPU for containerized workloads.
This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.