D.3. How to Identify and Assign IOMMU Groups

This example demonstrates how to identify and assign the PCI devices that are present on the target system. For additional examples and information, refer to Section 17.7, “Assigning GPU Devices”.

Procedure D.1. IOMMU groups

  1. List the devices

    Identify the devices in your system by running the virsh nodev-list device-type command. This example demonstrates how to locate the PCI devices. The output has been truncated for brevity.
    # virsh nodedev-list pci
    
    pci_0000_00_00_0
    pci_0000_00_01_0
    pci_0000_00_03_0
    pci_0000_00_07_0
    [...]
    pci_0000_00_1c_0
    pci_0000_00_1c_4
    [...]
    pci_0000_01_00_0
    pci_0000_01_00_1
    [...]
    pci_0000_03_00_0
    pci_0000_03_00_1
    pci_0000_04_00_0
    pci_0000_05_00_0
    pci_0000_06_0d_0
    
  2. Locate the IOMMU grouping of a device

    For each device listed, further information about the device, including the IOMMU grouping, can be found using the virsh nodedev-dumpxml name-of-device command. For example, to find the IOMMU grouping for the PCI device named pci_0000_04_00_0 (PCI address 0000:04:00.0), use the following command:
    # virsh nodedev-dumpxml pci_0000_04_00_0
    This command generates a XML dump similar to the one shown.
    
    <device>
      <name>pci_0000_04_00_0</name>
      <path>/sys/devices/pci0000:00/0000:00:1c.0/0000:04:00.0</path>
      <parent>pci_0000_00_1c_0</parent>
      <capability type='pci'>
        <domain>0</domain>
        <bus>4</bus>
        <slot>0</slot>
        <function>0</function>
        <product id='0x10d3'>82574L Gigabit Network Connection</product>
        <vendor id='0x8086'>Intel Corporation</vendor>
        <iommuGroup number='8'>   <!--This is the element block you will need to use-->
          <address domain='0x0000' bus='0x00' slot='0x1c' function='0x0'/>
          <address domain='0x0000' bus='0x00' slot='0x1c' function='0x4'/>
          <address domain='0x0000' bus='0x04' slot='0x00' function='0x0'/>
          <address domain='0x0000' bus='0x05' slot='0x00' function='0x0'/>
        </iommuGroup>
        <pci-express>
          <link validity='cap' port='0' speed='2.5' width='1'/>
          <link validity='sta' speed='2.5' width='1'/>
        </pci-express>
      </capability>
    </device>

    Figure D.1. IOMMU Group XML

  3. View the PCI data

    In the output collected above, there is one IOMMU group with 4 devices. This is an example of a multi-function PCIe root port without ACS support. The two functions in slot 0x1c are PCIe root ports, which can be identified by running the lspci command (from the pciutils package):
    # lspci -s 1c
    
    00:1c.0 PCI bridge: Intel Corporation 82801JI (ICH10 Family) PCI Express Root Port 1
    00:1c.4 PCI bridge: Intel Corporation 82801JI (ICH10 Family) PCI Express Root Port 5
    
    
    Repeat this step for the two PCIe devices on buses 0x04 and 0x05, which are endpoint devices.
    # lspci -s 4
    04:00.0 Ethernet controller: Intel Corporation 82574L Gigabit Network Connection This is used in the next step and is called 04:00.0
    # lspci -s 5 This is used in the next step and is called 05:00.0
    05:00.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5755 Gigabit Ethernet PCI Express (rev 02)
    
  4. Assign the endpoints to the guest virtual machine

    In order to assign either one of the endpoints to a virtual machine, the endpoint which you are not assigning at the moment, must be bound to a VFIO compatible driver so that the IOMMU group is not split between user and host drivers. If for example, using the output received above, you were to configuring a virtual machine with only 04:00.0, the virtual machine will fail to start unless 05:00.0 is detached from host drivers. To detach 05:00.0, run the virsh nodedev-detach command as root:
    # virsh nodedev-detach pci_0000_05_00_0
    Device pci_0000_05_00_0 detached
    
    Assigning both endpoints to the virtual machine is another option for resolving this issue. Note that libvirt will automatically perform this operation for the attached devices when using the yes value for the managed attribute within the <hostdev> element. For example: <hostdev mode='subsystem' type='pci' managed='yes'>. Refer to Note for more information.

Note

libvirt has two ways to handle PCI devices. They can be either managed or unmanaged. This is determined by the value given to the managed attribute within the <hostdev> element. When the device is managed, libvirt automatically detaches the device from the existing driver and then assigns it to the virtual machine by binding it to vfio-pci on boot (for the virtual machine). When the virtual machine is shutdown or deleted or the PCI device is detached from the virtual machine, libvirt unbinds the device from vfio-pci and rebinds it to the original driver. If the device is unmanaged, libvirt will not automate the process and you will have to ensure all of these management aspects as described are done before assigning the device to a virtual machine, and after the device is no longer used by the virtual machine you will have to reassign the devices as well. Failure to do these actions in an unmanaged device will cause the virtual machine to fail. Therefore, it may be easier to make sure that libvirt manages the device.