Issue with GPU passthrough from a RHEL 9.4 host to a KVM guest

Hello Team,

Greetings for the day!

This is regarding a GPU passthrough issue from a Red Hat Enterprise Linux 9.4 host to a KVM guest. I installed the GPU drivers on the base RHEL 9.4 OS and followed the steps below to pass the GPU through to KVM. However, when I attach the NVIDIA PCI device to a VM and start the VM, it throws the error "Host doesn't support passthrough of host PCI devices." Moreover, the kernel driver in use still shows as "nvidia" instead of "vfio-pci".

Kindly provide your inputs on next steps for further troubleshooting.

1.) lspci -nn | grep -E "NVIDIA"
21:00.0 VGA compatible controller [0300]: NVIDIA Corporation **** [Quadro T400 Mobile] [****:****] (rev a1)
21:00.1 Audio device [0403]: NVIDIA Corporation Device [****:****] (rev a1)
2.) Added the vendor and device IDs to the /etc/default/grub file in the format below.
rd.driver.blacklist=nouveau nouveau.modeset=0 intel_iommu=on iommu=pt vfio-pci.ids=****:****,****:****
Generated a new GRUB configuration to include the above changes
sudo grub2-mkconfig -o /boot/grub2/grub.cfg
and rebooted the machine.
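One check that may be worth running after this reboot is whether the parameters from step 2 actually reached the running kernel. The sketch below is only an illustration: the function name missing_params is hypothetical, and it assumes the parameters are separated by single spaces, as they normally are in /proc/cmdline.

```shell
# Hypothetical helper: print every expected parameter that is absent
# from a kernel command line string.
# $1 = command line text, remaining arguments = expected parameters
missing_params() {
    mp_cmdline=" $1 "          # pad so each parameter can be matched as a whole word
    shift
    for mp_param in "$@"; do
        case "$mp_cmdline" in
            *" $mp_param "*) ;;                  # present: nothing to report
            *) printf '%s\n' "$mp_param" ;;      # absent: print it
        esac
    done
}
```

On the host it could be called as: missing_params "$(cat /proc/cmdline)" intel_iommu=on iommu=pt
Any parameter it prints was not on the command line of the kernel that booted, which would mean the boot entry in use did not pick up the change made in /etc/default/grub.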
3.) Created a new file, /etc/modprobe.d/vfio.conf, and added the vendor and device IDs in the format below.
[root@localhost ~]# sudo cat /etc/modprobe.d/vfio.conf
options vfio-pci ids=****:****,****:****
softdep nvidia pre: vfio-pci
4.) Regenerated the initramfs (initial RAM file system) for the currently running kernel version by executing the command below
[root@localhost ~]# sudo dracut -f --kver $(uname -r)
and rebooted the machine.
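To rule out a typo in the modprobe file from step 3, the IDs on the options line can be extracted and compared with the lspci -nn output from step 1. This is a minimal sketch with hypothetical function names. It assumes each ID uses the plain vendor:device form (four hex digits each), not the longer form with subsystem or class fields.

```shell
# Hypothetical helper: print the IDs from the "options vfio-pci ids=..."
# line of a modprobe config file, one ID per line.
vfio_conf_ids() {
    sed -n 's/^[[:space:]]*options[[:space:]][[:space:]]*vfio[-_]pci[[:space:]][[:space:]]*ids=\([^[:space:]]*\).*$/\1/p' "$1" |
        tr ',' '\n'
}

# Hypothetical helper: succeed only if at least one ID was found and
# every ID looks like vvvv:dddd (assumption: plain vendor:device form).
vfio_conf_ids_valid() {
    vc_ids=$(vfio_conf_ids "$1")
    [ -n "$vc_ids" ] || return 1
    # grep -v selects lines that do NOT match; any such line is a bad ID
    ! printf '%s\n' "$vc_ids" | grep -Evq '^[0-9a-fA-F]{4}:[0-9a-fA-F]{4}$'
}
```

On the host it could be called as: vfio_conf_ids /etc/modprobe.d/vfio.conf
The printed IDs should match the values in square brackets at the end of each NVIDIA line in step 1.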
5.) [root@localhost ~]# lspci -k | grep -E "vfio-pci|NVIDIA"
21:00.0 VGA compatible controller: NVIDIA Corporation ******* [Quadro T400 Mobile] (rev a1)
Subsystem: Hewlett-Packard Company Device 1489
Kernel driver in use: nvidia
Kernel modules: nouveau, nvidia_drm, nvidia
21:00.1 Audio device: NVIDIA Corporation Device **** (rev a1)
Subsystem: Hewlett-Packard Company Device ****
Kernel driver in use: snd_hda_intel
Kernel modules: snd_hda_intel
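The same check can be reduced to one line per device, which makes it easier to compare before and after each change. The sketch below is an illustration only: lspci_drivers is a hypothetical name, and it assumes the default lspci slot format (bus:device.function without a PCI domain prefix), as in the output above.

```shell
# Hypothetical helper: read "lspci -k" output on stdin and print
# "<slot> <driver>" for every device that has a kernel driver bound.
lspci_drivers() {
    awk '
        # A device header starts with a slot such as 21:00.0
        /^[0-9a-f][0-9a-f]:[0-9a-f][0-9a-f]\.[0-7] / { slot = $1; next }
        # The driver line belongs to the most recent device header
        /Kernel driver in use:/ {
            sub(/.*Kernel driver in use:[ \t]*/, "")
            print slot, $0
        }
    '
}
```

On the host it could be called as: lspci -k | lspci_drivers
For a working passthrough setup, both 21:00.0 and 21:00.1 would be expected to show vfio-pci here.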
6.) However, when I run the command below, the output does not show that the IOMMU is enabled.
[root@localhost ~]# dmesg | grep -i -e DMAR -e IOMMU
[ 0.007309] ACPI: DMAR 0x0000****CA91000 ****** (v01 HPQOEM SLIC-WKS 00000001 INTL *******)
[ 0.007337] ACPI: Reserving DMAR table memory at [mem 0x6ca91000-0x6ca91117]
[ 0.073454] DMAR: Host address width 46
[ 0.073454] DMAR: DRHD base: 0x00****07fc000 flags: 0x0
[ 0.073460] DMAR: dmar0: reg_base_addr 90****00 ver 1:0 cap 8d****78****466 ecap f020df
[ 0.073462] DMAR: DRHD base: 0x000****00 flags: 0x0
[ 0.073465] DMAR: dmar1: reg_base_addr b3bfc000 ver 1:0 cap 8****06f0466 ecap f020df
[ 0.073466] DMAR: DRHD base: 0x000****fc000 flags: 0x0
[ 0.073470] DMAR: dmar2: reg_base_addr fbffc000 ver 1:0 cap 8d207****f0466 ecap f020df
[ 0.073472] DMAR: DRHD base: 0x00****fc000 flags: 0x1
[ 0.073474] DMAR: dmar3: reg_base_addr 903fc000 ver 1:0 cap 8d2078c****0466 ecap f020df
[ 0.073475] DMAR: RMRR base: 0x0000****3000 end: 0x0000006c1f5fff
[ 0.073477] DMAR: ATSR flags: 0x0
[ 0.073479] DMAR-IR: IOAPIC id 12 under DRHD base **** IOMMU 2
[ 0.073481] DMAR-IR: IOAPIC id 11 under DRHD base **** IOMMU 1
[ 0.073481] DMAR-IR: IOAPIC id 10 under DRHD base **** IOMMU 0
[ 0.073482] DMAR-IR: IOAPIC id 8 under DRHD base **** IOMMU 3
[ 0.073483] DMAR-IR: IOAPIC id 9 under DRHD base **** IOMMU 3
[ 0.073484] DMAR-IR: HPET id 0 under DRHD base ****
[ 0.073485] DMAR-IR: Queued invalidation will be enabled to support x2apic and Intr-remapping.
[ 0.074487] DMAR-IR: Enabled IRQ remapping in x2apic mode
[ 0.286455] iommu: Default domain type: Translated
[ 0.286455] iommu: DMA domain TLB invalidation policy: lazy mode
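Two further checks could narrow this down; they are sketched below with hypothetical function names. The first looks for the "DMAR: IOMMU enabled" line, which the Intel IOMMU driver logs when DMA remapping is switched on (assuming the message text is the same in this kernel version). The second counts the entries under /sys/kernel/iommu_groups, where the kernel exposes IOMMU groups once the IOMMU is active.

```shell
# Hypothetical helper: succeed if the kernel log text on stdin contains
# the message logged when Intel DMA remapping is enabled.
iommu_enabled_in_log() {
    grep -q 'DMAR: IOMMU enabled'
}

# Hypothetical helper: print the number of IOMMU groups found under a
# sysfs directory (default: /sys/kernel/iommu_groups).
count_iommu_groups() {
    cg_dir=${1:-/sys/kernel/iommu_groups}
    if [ ! -d "$cg_dir" ]; then
        echo 0
        return
    fi
    # Each group is a numbered subdirectory directly under cg_dir
    find "$cg_dir" -mindepth 1 -maxdepth 1 -type d | wc -l | tr -d ' '
}
```

On the host they could be called as: dmesg | iommu_enabled_in_log && echo enabled
and: count_iommu_groups
The output pasted above contains no "IOMMU enabled" line, and a group count of 0 would point the same way.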

Regards,
Raviteja Kundanala

Responses