Red Hat Training

A Red Hat training course is available for Red Hat Enterprise Linux

Chapter 16. Guest Virtual Machine Device Configuration

Red Hat Enterprise Linux 7 supports three classes of devices for guest virtual machines:
  • Emulated devices are purely virtual devices that mimic real hardware, allowing unmodified guest operating systems to work with them using their standard in-box drivers.
  • Virtio devices (also known as paravirtualized) are purely virtual devices designed to work optimally in a virtual machine. Virtio devices are similar to emulated devices, but non-Linux virtual machines do not include the drivers they require by default. Virtualization management software like the Virtual Machine Manager (virt-manager) and the Red Hat Virtualization Hypervisor install these drivers automatically for supported non-Linux guest operating systems. Red Hat Enterprise Linux 7 supports up to 216 virtio devices. For more information, see Chapter 5, KVM Paravirtualized (virtio) Drivers.
  • Assigned devices are physical devices that are exposed to the virtual machine. This method is also known as passthrough. Device assignment allows virtual machines exclusive access to PCI devices for a range of tasks, and allows PCI devices to appear and behave as if they were physically attached to the guest operating system. Red Hat Enterprise Linux 7 supports up to 32 assigned devices per virtual machine.
    Device assignment is supported on PCIe devices, including select graphics devices. Parallel PCI devices may be supported as assigned devices, but they have severe limitations due to security and system configuration conflicts.
Red Hat Enterprise Linux 7 supports PCI hot plug of devices exposed as single-function slots to the virtual machine. Single-function host devices and individual functions of multi-function host devices may be configured to enable this. Configurations exposing devices as multi-function PCI slots to the virtual machine are recommended only for non-hotplug applications.
For more information on specific devices and related limitations, see Section 23.17, “Devices”.

Note

Platform support for interrupt remapping is required to fully isolate a guest with assigned devices from the host. Without such support, the host may be vulnerable to interrupt injection attacks from a malicious guest. In an environment where guests are trusted, the admin may opt-in to still allow PCI device assignment using the allow_unsafe_interrupts option to the vfio_iommu_type1 module. This may either be done persistently by adding a .conf file (for example local.conf) to /etc/modprobe.d containing the following:
options vfio_iommu_type1 allow_unsafe_interrupts=1
or dynamically using the sysfs entry to do the same:
# echo 1 > /sys/module/vfio_iommu_type1/parameters/allow_unsafe_interrupts

16.1. PCI Devices

PCI device assignment is only available on hardware platforms supporting either Intel VT-d or AMD IOMMU. These Intel VT-d or AMD IOMMU specifications must be enabled in the host BIOS for PCI device assignment to function.

Procedure 16.1. Preparing an Intel system for PCI device assignment

  1. Enable the Intel VT-d specifications

    The Intel VT-d specifications provide hardware support for directly assigning a physical device to a virtual machine. These specifications are required to use PCI device assignment with Red Hat Enterprise Linux.
    The Intel VT-d specifications must be enabled in the BIOS. Some system manufacturers disable these specifications by default. The terms used to see these specifications can differ between manufacturers; consult your system manufacturer's documentation for the appropriate terms.
  2. Activate Intel VT-d in the kernel

    Activate Intel VT-d in the kernel by adding the intel_iommu=on and iommu=pt parameters to the end of the GRUB_CMDLINX_LINUX line, within the quotes, in the /etc/sysconfig/grub file.
    The example below is a modified grub file with Intel VT-d activated.
    GRUB_CMDLINE_LINUX="rd.lvm.lv=vg_VolGroup00/LogVol01
    vconsole.font=latarcyrheb-sun16 rd.lvm.lv=vg_VolGroup_1/root
    vconsole.keymap=us $([ -x /usr/sbin/rhcrashkernel-param ] && /usr/sbin/
    rhcrashkernel-param || :) rhgb quiet intel_iommu=on iommu=pt"
  3. Regenerate config file

    Regenerate /etc/grub2.cfg by running:
    grub2-mkconfig -o /etc/grub2.cfg
    Note that if you are using a UEFI-based host, the target file should be /etc/grub2-efi.cfg.
  4. Ready to use

    Reboot the system to enable the changes. Your system is now capable of PCI device assignment.

Procedure 16.2. Preparing an AMD system for PCI device assignment

  1. Enable the AMD IOMMU specifications

    The AMD IOMMU specifications are required to use PCI device assignment in Red Hat Enterprise Linux. These specifications must be enabled in the BIOS. Some system manufacturers disable these specifications by default.
  2. Enable IOMMU kernel support

    Append iommu=pt to the end of the GRUB_CMDLINX_LINUX line, within the quotes, in /etc/sysconfig/grub so that AMD IOMMU specifications are enabled at boot.
  3. Regenerate config file

    Regenerate /etc/grub2.cfg by running:
    grub2-mkconfig -o /etc/grub2.cfg
    Note that if you are using a UEFI-based host, the target file should be /etc/grub2-efi.cfg.
  4. Ready to use

    Reboot the system to enable the changes. Your system is now capable of PCI device assignment.

Note

For further information on IOMMU, see Appendix E, Working with IOMMU Groups.

16.1.1. Assigning a PCI Device with virsh

These steps cover assigning a PCI device to a virtual machine on a KVM hypervisor.
This example uses a PCIe network controller with the PCI identifier code, pci_0000_01_00_0, and a fully virtualized guest machine named guest1-rhel7-64.

Procedure 16.3. Assigning a PCI device to a guest virtual machine with virsh

  1. Identify the device

    First, identify the PCI device designated for device assignment to the virtual machine. Use the lspci command to list the available PCI devices. You can refine the output of lspci with grep.
    This example uses the Ethernet controller highlighted in the following output:
    # lspci | grep Ethernet
    00:19.0 Ethernet controller: Intel Corporation 82567LM-2 Gigabit Network Connection
    01:00.0 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
    01:00.1 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
    This Ethernet controller is shown with the short identifier 00:19.0. We need to find out the full identifier used by virsh in order to assign this PCI device to a virtual machine.
    To do so, use the virsh nodedev-list command to list all devices of a particular type (pci) that are attached to the host machine. Then look at the output for the string that maps to the short identifier of the device you wish to use.
    This example shows the string that maps to the Ethernet controller with the short identifier 00:19.0. Note that the : and . characters are replaced with underscores in the full identifier.
    # virsh nodedev-list --cap pci
    pci_0000_00_00_0
    pci_0000_00_01_0
    pci_0000_00_03_0
    pci_0000_00_07_0
    pci_0000_00_10_0
    pci_0000_00_10_1
    pci_0000_00_14_0
    pci_0000_00_14_1
    pci_0000_00_14_2
    pci_0000_00_14_3
    pci_0000_00_19_0
    pci_0000_00_1a_0
    pci_0000_00_1a_1
    pci_0000_00_1a_2
    pci_0000_00_1a_7
    pci_0000_00_1b_0
    pci_0000_00_1c_0
    pci_0000_00_1c_1
    pci_0000_00_1c_4
    pci_0000_00_1d_0
    pci_0000_00_1d_1
    pci_0000_00_1d_2
    pci_0000_00_1d_7
    pci_0000_00_1e_0
    pci_0000_00_1f_0
    pci_0000_00_1f_2
    pci_0000_00_1f_3
    pci_0000_01_00_0
    pci_0000_01_00_1
    pci_0000_02_00_0
    pci_0000_02_00_1
    pci_0000_06_00_0
    pci_0000_07_02_0
    pci_0000_07_03_0
    Record the PCI device number that maps to the device you want to use; this is required in other steps.
  2. Review device information

    Information on the domain, bus, and function are available from output of the virsh nodedev-dumpxml command:
    
    # virsh nodedev-dumpxml pci_0000_00_19_0
    <device>
      <name>pci_0000_00_19_0</name>
      <parent>computer</parent>
      <driver>
        <name>e1000e</name>
      </driver>
      <capability type='pci'>
        <domain>0</domain>
        <bus>0</bus>
        <slot>25</slot>
        <function>0</function>
        <product id='0x1502'>82579LM Gigabit Network Connection</product>
        <vendor id='0x8086'>Intel Corporation</vendor>
        <iommuGroup number='7'>
          <address domain='0x0000' bus='0x00' slot='0x19' function='0x0'/>
        </iommuGroup>
      </capability>
    </device>
    

    Figure 16.1. Dump contents

    Note

    An IOMMU group is determined based on the visibility and isolation of devices from the perspective of the IOMMU. Each IOMMU group may contain one or more devices. When multiple devices are present, all endpoints within the IOMMU group must be claimed for any device within the group to be assigned to a guest. This can be accomplished either by also assigning the extra endpoints to the guest or by detaching them from the host driver using virsh nodedev-detach. Devices contained within a single group may not be split between multiple guests or split between host and guest. Non-endpoint devices such as PCIe root ports, switch ports, and bridges should not be detached from the host drivers and will not interfere with assignment of endpoints.
    Devices within an IOMMU group can be determined using the iommuGroup section of the virsh nodedev-dumpxml output. Each member of the group is provided in a separate "address" field. This information may also be found in sysfs using the following:
    $ ls /sys/bus/pci/devices/0000:01:00.0/iommu_group/devices/
    An example of the output from this would be:
    0000:01:00.0  0000:01:00.1
    To assign only 0000.01.00.0 to the guest, the unused endpoint should be detached from the host before starting the guest:
    $ virsh nodedev-detach pci_0000_01_00_1
  3. Determine required configuration details

    See the output from the virsh nodedev-dumpxml pci_0000_00_19_0 command for the values required for the configuration file.
    The example device has the following values: bus = 0, slot = 25 and function = 0. The decimal configuration uses those three values:
    bus='0'
    slot='25'
    function='0'
  4. Add configuration details

    Run virsh edit, specifying the virtual machine name, and add a device entry in the <devices> section to assign the PCI device to the guest virtual machine. For example:
    # virsh edit guest1-rhel7-64
    
    <devices>
    	[...]
     <hostdev mode='subsystem' type='pci' managed='yes'>
       <source>
          <address domain='0' bus='0' slot='25' function='0'/>
       </source>
     </hostdev>
     [...]
    </devices>
    

    Figure 16.2. Add PCI device

    Alternately, run virsh attach-device, specifying the virtual machine name and the guest's XML file:
    virsh attach-device guest1-rhel7-64 file.xml

    Note

    PCI devices may include an optional read-only memory (ROM) module, also known as an option ROM or expansion ROM, for delivering device firmware or pre-boot drivers (such as PXE) for the device. Generally, these option ROMs also work in a virtualized environment when using PCI device assignment to attach a physical PCI device to a VM.
    However, in some cases, the option ROM can be unnecessary, which may cause the VM to boot more slowly, or the pre-boot driver delivered by the device can be incompatible with virtualization, which may cause the guest OS boot to fail. In such cases, Red Hat recommends masking the option ROM from the VM. To do so:
    1. On the host, verify that the device to assign has an expansion ROM base address register (BAR). To do so, use the lspci -v command for the device, and check the output for a line that includes the following:
      Expansion ROM at
    2. Add the <rom bar='off'/> element as a child of the <hostdev> element in the guest's XML configuration:
      <hostdev mode='subsystem' type='pci' managed='yes'>
        <source>
           <address domain='0' bus='0' slot='25' function='0'/>
        </source>
        <rom bar='off'/>
      </hostdev>
      
  5. Start the virtual machine

    # virsh start guest1-rhel7-64
The PCI device should now be successfully assigned to the virtual machine, and accessible to the guest operating system.

16.1.2. Assigning a PCI Device with virt-manager

PCI devices can be added to guest virtual machines using the graphical virt-manager tool. The following procedure adds a Gigabit Ethernet controller to a guest virtual machine.

Procedure 16.4. Assigning a PCI device to a guest virtual machine using virt-manager

  1. Open the hardware settings

    Open the guest virtual machine and click the Add Hardware button to add a new device to the virtual machine.
    The virtual machine hardware window with the Information button selected on the top taskbar and Overview selected on the left menu pane.

    Figure 16.3. The virtual machine hardware information window

  2. Select a PCI device

    Select PCI Host Device from the Hardware list on the left.
    Select an unused PCI device. Note that selecting PCI devices presently in use by another guest causes errors. In this example, a spare audio controller is used. Click Finish to complete setup.
    The Add new virtual hardware wizard with PCI Host Device selected on the left menu pane, showing a list of host devices for selection in the right menu pane.

    Figure 16.4. The Add new virtual hardware wizard

  3. Add the new device

    The setup is complete and the guest virtual machine now has direct access to the PCI device.
    The virtual machine hardware window with the Information button selected on the top taskbar and Overview selected on the left menu pane, displaying the newly added PCI Device in the list of virtual machine devices in the left menu pane.

    Figure 16.5. The virtual machine hardware information window

Note

If device assignment fails, there may be other endpoints in the same IOMMU group that are still attached to the host. There is no way to retrieve group information using virt-manager, but virsh commands can be used to analyze the bounds of the IOMMU group and if necessary sequester devices.
See the Note in Section 16.1.1, “Assigning a PCI Device with virsh” for more information on IOMMU groups and how to detach endpoint devices using virsh.

16.1.3. PCI Device Assignment with virt-install

It is possible to assign a PCI device when installing a guest using the virt-install command. To do this, use the --host-device parameter.

Procedure 16.5. Assigning a PCI device to a virtual machine with virt-install

  1. Identify the device

    Identify the PCI device designated for device assignment to the guest virtual machine.
    # lspci | grep Ethernet
    00:19.0 Ethernet controller: Intel Corporation 82567LM-2 Gigabit Network Connection
    01:00.0 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
    01:00.1 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
    The virsh nodedev-list command lists all devices attached to the system, and identifies each PCI device with a string. To limit output to only PCI devices, enter the following command:
    # virsh nodedev-list --cap pci
    pci_0000_00_00_0
    pci_0000_00_01_0
    pci_0000_00_03_0
    pci_0000_00_07_0
    pci_0000_00_10_0
    pci_0000_00_10_1
    pci_0000_00_14_0
    pci_0000_00_14_1
    pci_0000_00_14_2
    pci_0000_00_14_3
    pci_0000_00_19_0
    pci_0000_00_1a_0
    pci_0000_00_1a_1
    pci_0000_00_1a_2
    pci_0000_00_1a_7
    pci_0000_00_1b_0
    pci_0000_00_1c_0
    pci_0000_00_1c_1
    pci_0000_00_1c_4
    pci_0000_00_1d_0
    pci_0000_00_1d_1
    pci_0000_00_1d_2
    pci_0000_00_1d_7
    pci_0000_00_1e_0
    pci_0000_00_1f_0
    pci_0000_00_1f_2
    pci_0000_00_1f_3
    pci_0000_01_00_0
    pci_0000_01_00_1
    pci_0000_02_00_0
    pci_0000_02_00_1
    pci_0000_06_00_0
    pci_0000_07_02_0
    pci_0000_07_03_0
    Record the PCI device number; the number is needed in other steps.
    Information on the domain, bus and function are available from output of the virsh nodedev-dumpxml command:
    # virsh nodedev-dumpxml pci_0000_01_00_0
    
    <device>
      <name>pci_0000_01_00_0</name>
      <parent>pci_0000_00_01_0</parent>
      <driver>
        <name>igb</name>
      </driver>
      <capability type='pci'>
        <domain>0</domain>
        <bus>1</bus>
        <slot>0</slot>
        <function>0</function>
        <product id='0x10c9'>82576 Gigabit Network Connection</product>
        <vendor id='0x8086'>Intel Corporation</vendor>
        <iommuGroup number='7'>
          <address domain='0x0000' bus='0x00' slot='0x19' function='0x0'/>
        </iommuGroup>
      </capability>
    </device>
    

    Figure 16.6. PCI device file contents

    Note

    If there are multiple endpoints in the IOMMU group and not all of them are assigned to the guest, you will need to manually detach the other endpoint(s) from the host by running the following command before you start the guest:
    $ virsh nodedev-detach pci_0000_00_19_1
    See the Note in Section 16.1.1, “Assigning a PCI Device with virsh” for more information on IOMMU groups.
  2. Add the device

    Use the PCI identifier output from the virsh nodedev command as the value for the --host-device parameter.
    virt-install \
    --name=guest1-rhel7-64 \
    --disk path=/var/lib/libvirt/images/guest1-rhel7-64.img,size=8 \
    --vcpus=2 --ram=2048 \
    --location=http://example1.com/installation_tree/RHEL7.0-Server-x86_64/os \
    --nonetworks \
    --os-type=linux \
    --os-variant=rhel7
    --host-device=pci_0000_01_00_0
  3. Complete the installation

    Complete the guest installation. The PCI device should be attached to the guest.

16.1.4. Detaching an Assigned PCI Device

When a host PCI device has been assigned to a guest machine, the host can no longer use the device. If the PCI device is in managed mode (configured using the managed='yes' parameter in the domain XML file), it attaches to the guest machine and detaches from the guest machine and re-attaches to the host machine as necessary. If the PCI device is not in managed mode, you can detach the PCI device from the guest machine and re-attach it using virsh or virt-manager.

Procedure 16.6. Detaching a PCI device from a guest with virsh

  1. Detach the device

    Use the following command to detach the PCI device from the guest by removing it in the guest's XML file:
    # virsh detach-device name_of_guest file.xml
  2. Re-attach the device to the host (optional)

    If the device is in managed mode, skip this step. The device will be returned to the host automatically.
    If the device is not using managed mode, use the following command to re-attach the PCI device to the host machine:
    # virsh nodedev-reattach device
    For example, to re-attach the pci_0000_01_00_0 device to the host:
    # virsh nodedev-reattach pci_0000_01_00_0
    The device is now available for host use.

Procedure 16.7. Detaching a PCI Device from a guest with virt-manager

  1. Open the virtual hardware details screen

    In virt-manager, double-click the virtual machine that contains the device. Select the Show virtual hardware details button to display a list of virtual hardware.
    The Show virtual hardware details button.

    Figure 16.7. The virtual hardware details button

  2. Select and remove the device

    Select the PCI device to be detached from the list of virtual devices in the left panel.
    The PCI device details and the Remove button.

    Figure 16.8. Selecting the PCI device to be detached

    Click the Remove button to confirm. The device is now available for host use.

16.1.5. PCI Bridges

Peripheral Component Interconnects (PCI) bridges are used to attach to devices such as network cards, modems and sound cards. Just like their physical counterparts, virtual devices can also be attached to a PCI Bridge. In the past, only 31 PCI devices could be added to any guest virtual machine. Now, when a 31st PCI device is added, a PCI bridge is automatically placed in the 31st slot, moving the additional PCI device to the PCI bridge. Each PCI bridge has 31 slots for 31 additional devices, all of which can be bridges. In this manner, over 900 devices can be available for guest virtual machines.
For an example of an XML configuration for PCI bridges, see Domain XML example for PCI Bridge. Note that this configuration is set up automatically, and it is not recommended to adjust manually.

16.1.6. PCI Device Assignment Restrictions

PCI device assignment (attaching PCI devices to virtual machines) requires host systems to have AMD IOMMU or Intel VT-d support to enable device assignment of PCIe devices.
Red Hat Enterprise Linux 7 has limited PCI configuration space access by guest device drivers. This limitation could cause drivers that are dependent on device capabilities or features present in the extended PCI configuration space, to fail configuration.
There is a limit of 32 total assigned devices per Red Hat Enterprise Linux 7 virtual machine. This translates to 32 total PCI functions, regardless of the number of PCI bridges present in the virtual machine or how those functions are combined to create multi-function slots.
Platform support for interrupt remapping is required to fully isolate a guest with assigned devices from the host. Without such support, the host may be vulnerable to interrupt injection attacks from a malicious guest. In an environment where guests are trusted, the administrator may opt-in to still allow PCI device assignment using the allow_unsafe_interrupts option to the vfio_iommu_type1 module. This may either be done persistently by adding a .conf file (for example local.conf) to /etc/modprobe.d containing the following:
options vfio_iommu_type1 allow_unsafe_interrupts=1
or dynamically using the sysfs entry to do the same:
# echo 1 > /sys/module/vfio_iommu_type1/parameters/allow_unsafe_interrupts