16.7. Assigning GPU Devices

To assign a GPU to a guest, use one of the following methods:
  • GPU PCI Device Assignment - Using this method, it is possible to remove a GPU device from the host and assign it to a single guest.
  • NVIDIA vGPU Assignment - This method makes it possible to create multiple mediated devices from a physical GPU, and assign these devices as virtual GPUs to multiple guests. This is only supported on selected NVIDIA GPUs, and only one mediated device can be assigned to a single guest.

16.7.1. GPU PCI Device Assignment

Red Hat Enterprise Linux 7 supports PCI device assignment of the following PCIe-based GPU devices as non-VGA graphics devices:
  • NVIDIA Quadro K-Series, M-Series, and P-Series (models 2000 series or higher)
  • NVIDIA GRID K-Series
  • NVIDIA Tesla K-Series and M-Series
Currently, up to two GPUs may be attached to the virtual machine, in addition to one of the standard emulated VGA interfaces. The emulated VGA is used for pre-boot and installation and the NVIDIA GPU takes over when the NVIDIA graphics drivers are loaded.
To assign a GPU to a guest virtual machine, you must enable the I/O Memory Management Unit (IOMMU) on the host machine, identify the GPU device by using the lspci command, detach the device from the host, attach it to the guest, and configure Xorg on the guest - as described in the following procedures:

Procedure 16.13. Enable IOMMU support in the host machine kernel

  1. Edit the kernel command line

    For an Intel VT-d system, IOMMU is activated by adding the intel_iommu=on and iommu=pt parameters to the kernel command line. For an AMD-Vi system, only the iommu=pt option is needed. To enable this option, edit or add the GRUB_CMDLINE_LINUX line to the /etc/sysconfig/grub configuration file as follows:
    GRUB_CMDLINE_LINUX="rd.lvm.lv=vg_VolGroup00/LogVol01
    vconsole.font=latarcyrheb-sun16 rd.lvm.lv=vg_VolGroup_1/root
    vconsole.keymap=us $([ -x /usr/sbin/rhcrashkernel-param ]  &&
    /usr/sbin/rhcrashkernel-param || :) rhgb quiet intel_iommu=on iommu=pt"
    

    Note

    For further information on IOMMU, see Appendix E, Working with IOMMU Groups.
  2. Regenerate the boot loader configuration

    For the changes to the kernel command line to apply, regenerate the boot loader configuration using the grub2-mkconfig command:
    # grub2-mkconfig -o /etc/grub2.cfg
    Note that if you are using a UEFI-based host, the target file should be /etc/grub2-efi.cfg.
  3. Reboot the host

    For the changes to take effect, reboot the host machine:
    # reboot
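
After the host starts up again, you can optionally verify that IOMMU support is active before continuing. One simple check is to search the kernel log for IOMMU-related messages, for example:
# dmesg | grep -i -e DMAR -e IOMMU
If the output contains messages indicating that the IOMMU has been detected and enabled, the kernel command-line change has taken effect.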

Procedure 16.14. Excluding the GPU device from binding to the host physical machine driver

For GPU assignment, it is recommended to exclude the device from binding to host drivers, as these drivers often do not support dynamic unbinding of the device.
  1. Identify the PCI bus address

    To identify the PCI bus address and IDs of the device, run the following lspci command. In this example, a VGA controller such as an NVIDIA Quadro or GRID card is used:
    # lspci -Dnn | grep VGA
    0000:02:00.0 VGA compatible controller [0300]: NVIDIA Corporation GK106GL [Quadro K4000] [10de:11fa] (rev a1)
    
    The resulting search reveals that the PCI bus address of this device is 0000:02:00.0 and the PCI IDs for the device are 10de:11fa.
  2. Prevent the native host machine driver from using the GPU device

    To prevent the native host machine driver from using the GPU device, you can use a PCI ID with the pci-stub driver. To do this, append the pci-stub.ids option, with the PCI IDs as its value, to the GRUB_CMDLINE_LINUX line located in the /etc/sysconfig/grub configuration file, for example as follows:
    GRUB_CMDLINE_LINUX="rd.lvm.lv=vg_VolGroup00/LogVol01
    vconsole.font=latarcyrheb-sun16 rd.lvm.lv=vg_VolGroup_1/root
    vconsole.keymap=us $([ -x /usr/sbin/rhcrashkernel-param ]  &&
    /usr/sbin/rhcrashkernel-param || :) rhgb quiet intel_iommu=on iommu=pt pci-stub.ids=10de:11fa"
    
    To add additional PCI IDs for pci-stub, separate them with a comma.
  3. Regenerate the boot loader configuration

    Regenerate the boot loader configuration using the grub2-mkconfig command to include this option:
    # grub2-mkconfig -o /etc/grub2.cfg
    Note that if you are using a UEFI-based host, the target file should be /etc/grub2-efi.cfg.
  4. Reboot the host machine

    In order for the changes to take effect, reboot the host machine:
    # reboot
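
Once the host is running again, you can confirm that the GPU is bound to the pci-stub driver rather than to the native host driver. A check using the PCI IDs identified in step 1 produces output similar to the following when the exclusion is working:
# lspci -nnk -d 10de:11fa
02:00.0 VGA compatible controller [0300]: NVIDIA Corporation GK106GL [Quadro K4000] [10de:11fa] (rev a1)
	Kernel driver in use: pci-stub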

Procedure 16.15. Optional: Editing the GPU IOMMU configuration

Prior to attaching the GPU device, editing its IOMMU configuration may be needed for the GPU to work properly on the guest.
  1. Display the XML information of the GPU

    To display the settings of the GPU in XML form, you first need to convert its PCI bus address to a libvirt-compatible format by appending pci_ and converting delimiters to underscores. In this example, the GPU PCI device identified with the 0000:02:00.0 bus address (as obtained in the previous procedure) becomes pci_0000_02_00_0. Use the libvirt address of the device with the virsh nodedev-dumpxml command to display its XML configuration:
    # virsh nodedev-dumpxml pci_0000_02_00_0
    
    <device>
     <name>pci_0000_02_00_0</name>
     <path>/sys/devices/pci0000:00/0000:00:03.0/0000:02:00.0</path>
     <parent>pci_0000_00_03_0</parent>
     <driver>
      <name>pci-stub</name>
     </driver>
     <capability type='pci'>
      <domain>0</domain>
      <bus>2</bus>
      <slot>0</slot>
      <function>0</function>
      <product id='0x11fa'>GK106GL [Quadro K4000]</product>
      <vendor id='0x10de'>NVIDIA Corporation</vendor>
         <!-- pay attention to the following lines -->
      <iommuGroup number='13'>
       <address domain='0x0000' bus='0x02' slot='0x00' function='0x0'/>
       <address domain='0x0000' bus='0x02' slot='0x00' function='0x1'/>
      </iommuGroup>
      <pci-express>
       <link validity='cap' port='0' speed='8' width='16'/>
       <link validity='sta' speed='2.5' width='16'/>
      </pci-express>
     </capability>
    </device>
    Note the <iommuGroup> element of the XML. The iommuGroup indicates a set of devices that are considered isolated from other devices due to IOMMU capabilities and PCI bus topologies. All of the endpoint devices within the iommuGroup (meaning devices that are not PCIe root ports, bridges, or switch ports) need to be unbound from the native host drivers in order to be assigned to a guest. In the example above, the group is composed of the GPU device (0000:02:00.0) as well as the companion audio device (0000:02:00.1). For more information, see Appendix E, Working with IOMMU Groups.
  2. Adjust IOMMU settings

    In this example, assignment of NVIDIA audio functions is not supported due to hardware issues with legacy interrupt support. In addition, the GPU audio function is generally not useful without the GPU itself. Therefore, in order to assign the GPU to a guest, the audio function must first be detached from native host drivers. This can be done in one of the following ways: either identify the PCI ID of the audio function with lspci -Dnn and append it to the pci-stub.ids option in the /etc/sysconfig/grub file, as described in Procedure 16.14, or detach the audio function from the host using the virsh nodedev-detach command, as shown in the example that follows.
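
    For example, to detach the companion audio function identified earlier (0000:02:00.1) from the host with virsh, use a command similar to the following. The libvirt name of the device is derived from its PCI bus address in the same way as for the GPU:
    # virsh nodedev-detach pci_0000_02_00_1
    Device pci_0000_02_00_1 detached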

Procedure 16.16. Attaching the GPU

The GPU can be attached to the guest using any of the following methods:
  1. Using the Virtual Machine Manager interface. For details, see Section 16.1.2, “Assigning a PCI Device with virt-manager”.
  2. Creating an XML configuration fragment for the GPU and attaching it with the virsh attach-device command:
    1. Create an XML file for the device, similar to the following:
      
      <hostdev mode='subsystem' type='pci' managed='yes'>
       <driver name='vfio'/>
       <source>
        <address domain='0x0000' bus='0x02' slot='0x00' function='0x0'/>
       </source>
      </hostdev>
    2. Save this to a file and run virsh attach-device [domain] [file] --persistent to include the XML in the guest configuration (see the usage sketch after this list). Note that the assigned GPU is added in addition to the existing emulated graphics device in the guest machine. The assigned GPU is handled as a secondary graphics device in the virtual machine. Assignment as a primary graphics device is not supported and emulated graphics devices in the guest's XML should not be removed.
  3. Editing the guest XML configuration using the virsh edit command and adding the appropriate XML segment manually.
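
As a usage sketch for the second method, assuming the XML fragment above is saved in a file named gpu-hostdev.xml and the guest is named guest1 (both names are only examples), the attach command and its expected output would be:
# virsh attach-device guest1 gpu-hostdev.xml --persistent
Device attached successfully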

Procedure 16.17. Modifying the Xorg configuration on the guest

The GPU's PCI bus address on the guest is different from its address on the host. To enable the guest to use the GPU properly, configure the guest's Xorg display server to use the assigned GPU address:
  1. In the guest, use the lspci command to determine the PCI bus address of the GPU:
    # lspci | grep VGA
    00:02.0 VGA compatible controller: Device 1234:111
    00:09.0 VGA compatible controller: NVIDIA Corporation GK106GL [Quadro K4000] (rev a1)
    
    In this example, the bus address is 00:09.0.
  2. In the /etc/X11/xorg.conf file on the guest, add a BusID option with the detected address adjusted as follows:
    		Section "Device"
    		    Identifier     "Device0"
    		    Driver         "nvidia"
    		    VendorName     "NVIDIA Corporation"
    		    BusID          "PCI:0:9:0"
    		EndSection
    

    Important

    If the bus address detected in Step 1 is hexadecimal, you need to convert the values between delimiters to the decimal system. For example, 00:0a.0 should be converted into PCI:0:10:0.
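
After the Xorg configuration is in place and the NVIDIA drivers described in the following note are installed, you can optionally confirm on the guest that the driver has claimed the assigned GPU, for example by running the nvidia-smi utility that ships with the NVIDIA drivers:
# nvidia-smi
If the assigned GPU (in this example, the Quadro K4000) is listed in the output, the assignment and the driver setup are working.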

Note

When using an assigned NVIDIA GPU in the guest, only the NVIDIA drivers are supported. Other drivers may not work and may generate errors. For a Red Hat Enterprise Linux 7 guest, the nouveau driver can be blacklisted using the option modprobe.blacklist=nouveau on the kernel command line during installation (an example for an already installed guest follows this note). For information on other guest virtual machines, see the operating system's specific documentation.
Depending on the guest operating system, with the NVIDIA drivers loaded, the guest may support using both the emulated graphics and assigned graphics simultaneously or may disable the emulated graphics. Note that access to the assigned graphics framebuffer is not provided by applications such as virt-manager. If the assigned GPU is not connected to a physical display, guest-based remoting solutions may be necessary to access the GPU desktop. As with all PCI device assignment, migration of a guest with an assigned GPU is not supported and each GPU is owned exclusively by a single guest. Depending on the guest operating system, hot plug support of GPUs may be available.
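
As mentioned in the note above, the nouveau driver must not be used for the assigned GPU. For a Red Hat Enterprise Linux 7 guest that is already installed, one possible approach (the file name below is only an example) is to create a modprobe configuration file on the guest, for instance /etc/modprobe.d/blacklist-nouveau.conf, containing the following lines:
blacklist nouveau
options nouveau modeset=0
Then regenerate the guest's initial ramdisk and reboot the guest:
# dracut --force
# reboot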

16.7.2. NVIDIA vGPU Assignment

The NVIDIA vGPU feature makes it possible to divide a physical GPU device into multiple virtual devices referred to as mediated devices. These mediated devices can then be assigned to multiple guests as virtual GPUs. As a result, these guests share the performance of a single physical GPU.

Important

This feature is only available on a limited set of NVIDIA GPUs. For an up-to-date list of these devices, see the NVIDIA GPU Software Documentation.

NVIDIA vGPU Setup

To set up the vGPU feature, you first need to obtain NVIDIA vGPU drivers for your GPU device, then create mediated devices, and assign them to the intended guest machines:
  1. Obtain the NVIDIA vGPU drivers and install them on your system. For instructions, see the NVIDIA documentation.
  2. If the NVIDIA software installer did not create the /etc/modprobe.d/nvidia-installer-disable-nouveau.conf file, create a .conf file (of any name) in the /etc/modprobe.d/ directory. Add the following lines in the file:
    blacklist nouveau
    options nouveau modeset=0
    
  3. Regenerate the initial ramdisk for the current kernel, then reboot:
    # dracut --force
    # reboot
    If you need to use a prior supported kernel version with mediated devices, regenerate the initial ramdisk for all installed kernel versions:
    # dracut --regenerate-all --force
    # reboot
  4. Check that the nvidia_vgpu_vfio module has been loaded by the kernel and that the nvidia-vgpu-mgr.service service is running.
    # lsmod | grep nvidia_vgpu_vfio
    nvidia_vgpu_vfio 45011 0
    nvidia 14333621 10 nvidia_vgpu_vfio
    mdev 20414 2 vfio_mdev,nvidia_vgpu_vfio
    vfio 32695 3 vfio_mdev,nvidia_vgpu_vfio,vfio_iommu_type1
    # systemctl status nvidia-vgpu-mgr.service
    nvidia-vgpu-mgr.service - NVIDIA vGPU Manager Daemon
       Loaded: loaded (/usr/lib/systemd/system/nvidia-vgpu-mgr.service; enabled; vendor preset: disabled)
       Active: active (running) since Fri 2018-03-16 10:17:36 CET; 5h 8min ago
     Main PID: 1553 (nvidia-vgpu-mgr)
     [...]
    
  5. Write a device UUID to /sys/class/mdev_bus/pci_dev/mdev_supported_types/type-id/create, where pci_dev is the PCI address of the host GPU, and type-id is an ID of the host GPU type.
    The following example shows how to create a mediated device of nvidia-63 vGPU type on an NVIDIA Tesla P4 card:
    # uuidgen
    30820a6f-b1a5-4503-91ca-0c10ba58692a
    # echo "30820a6f-b1a5-4503-91ca-0c10ba58692a" > /sys/class/mdev_bus/0000:01:00.0/mdev_supported_types/nvidia-63/create
    For type-id values for specific devices, see section 1.3.1. Virtual GPU Types in Virtual GPU software documentation. Note that only Q-series NVIDIA vGPUs, such as GRID P4-2Q, are supported as mediated device GPU types on Linux guests. A way to verify that the mediated device has been created is shown after these setup steps.
  6. Add the following lines to the <devices/> sections in the XML configurations of guests with which you want to share the vGPU resources. Use the UUID value generated by the uuidgen command in the previous step. Each UUID can only be assigned to one guest at a time.
    
    <hostdev mode='subsystem' type='mdev' managed='no' model='vfio-pci'>
      <source>
        <address uuid='30820a6f-b1a5-4503-91ca-0c10ba58692a'/>
      </source>
    </hostdev>
    

    Important

    For the vGPU mediated devices to work properly on the assigned guests, NVIDIA vGPU guest software licensing needs to be set up for the guests. For further information and instructions, see the NVIDIA virtual GPU software documentation.
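
To confirm that a mediated device has been created, list the mdev device directory in sysfs. An entry named after the UUID that was written to the create file indicates that the device exists and can be referenced from guest XML configurations:
# ls /sys/bus/mdev/devices/
30820a6f-b1a5-4503-91ca-0c10ba58692a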

Removing NVIDIA vGPU Devices

To remove a mediated vGPU device, use the following command when the device is inactive, and replace uuid with the UUID of the device, for example 30820a6f-b1a5-4503-91ca-0c10ba58692a.
# echo 1 > /sys/bus/mdev/devices/uuid/remove
Note that attempting to remove a vGPU device that is currently in use by a guest triggers the following error:
echo: write error: Device or resource busy

Querying NVIDIA vGPU Capabilities

To obtain additional information about the mediated devices on your system, such as how many mediated devices of a given type can be created, use the virsh nodedev-list --cap mdev_types and virsh nodedev-dumpxml commands. For example, the following displays available vGPU types on a Tesla P4 card:

$ virsh nodedev-list --cap mdev_types
pci_0000_01_00_0
$ virsh nodedev-dumpxml pci_0000_01_00_0
<...>
  <capability type='mdev_types'>
    <type id='nvidia-70'>
      <name>GRID P4-8A</name>
      <deviceAPI>vfio-pci</deviceAPI>
      <availableInstances>1</availableInstances>
    </type>
    <type id='nvidia-69'>
      <name>GRID P4-4A</name>
      <deviceAPI>vfio-pci</deviceAPI>
      <availableInstances>2</availableInstances>
    </type>
    <type id='nvidia-67'>
      <name>GRID P4-1A</name>
      <deviceAPI>vfio-pci</deviceAPI>
      <availableInstances>8</availableInstances>
    </type>
    <type id='nvidia-65'>
      <name>GRID P4-4Q</name>
      <deviceAPI>vfio-pci</deviceAPI>
      <availableInstances>2</availableInstances>
    </type>
    <type id='nvidia-63'>
      <name>GRID P4-1Q</name>
      <deviceAPI>vfio-pci</deviceAPI>
      <availableInstances>8</availableInstances>
    </type>
    <type id='nvidia-71'>
      <name>GRID P4-1B</name>
      <deviceAPI>vfio-pci</deviceAPI>
      <availableInstances>8</availableInstances>
    </type>
    <type id='nvidia-68'>
      <name>GRID P4-2A</name>
      <deviceAPI>vfio-pci</deviceAPI>
      <availableInstances>4</availableInstances>
    </type>
    <type id='nvidia-66'>
      <name>GRID P4-8Q</name>
      <deviceAPI>vfio-pci</deviceAPI>
      <availableInstances>1</availableInstances>
    </type>
    <type id='nvidia-64'>
      <name>GRID P4-2Q</name>
      <deviceAPI>vfio-pci</deviceAPI>
      <availableInstances>4</availableInstances>
    </type>
  </capability>
</...>

Remote Desktop Streaming Services for NVIDIA vGPU

The following remote desktop streaming services have been successfully tested for use with the NVIDIA vGPU feature on Red Hat Enterprise Linux 7:
  • HP-RGS
  • Mechdyne TGX - It is currently not possible to use Mechdyne TGX with Windows Server 2016 guests.
  • NICE DCV - When using this streaming service, Red Hat recommends using fixed resolution settings, as using dynamic resolution in some cases results in a black screen.