Virtualization Deployment and Administration Guide

Red Hat Enterprise Linux 7

Installing, configuring, and managing virtual machines on a Red Hat Enterprise Linux physical machine

Jiri Herrmann

Red Hat Customer Content Services

Yehuda Zimmerman

Red Hat Customer Content Services

Laura Novich

Red Hat Customer Content Services

Dayle Parker

Red Hat Customer Content Services

Scott Radvan

Red Hat Customer Content Services

Tahlia Richardson

Red Hat Customer Content Services

Abstract

This guide covers how to configure a Red Hat Enterprise Linux 7 machine to act as a virtualization host system, and how to install and configure guest virtual machines using the KVM hypervisor. Other topics include PCI device configuration, SR-IOV, networking, storage, device and guest virtual machine management, as well as troubleshooting, compatibility and restrictions. Procedures that need to be run on the guest virtual machine are explicitly marked as such.

Important

All procedures described in this guide are intended to be performed on an AMD64 or Intel 64 host machine, unless otherwise stated. For using Red Hat Enterprise Linux 7 virtualization on architectures other than AMD64 and Intel 64, see Appendix B, Using KVM Virtualization on Multiple Architectures.
For a more general introduction into virtualization solutions provided by Red Hat, see the Red Hat Enterprise Linux 7 Virtualization Getting Started Guide.

Part I. Deployment

Chapter 1. System Requirements

Virtualization is available with the KVM hypervisor for Red Hat Enterprise Linux 7 on the Intel 64 and AMD64 architectures. This chapter lists system requirements for running virtual machines, also referred to as VMs.
For information on installing the virtualization packages, see Chapter 2, Installing the Virtualization Packages.

1.1. Host System Requirements

Minimum host system requirements

  • 6 GB free disk space.
  • 2 GB RAM.

Recommended system requirements

  • One core or thread for each virtualized CPU and one for the host.
  • 2 GB of RAM, plus additional RAM for virtual machines.
  • 6 GB disk space for the host, plus the required disk space for the virtual machine(s).
    Most guest operating systems require at least 6 GB of disk space. Additional storage space for each guest depends on their workload.

    Swap space

    Swap space in Linux is used when the physical memory (RAM) is full. If the system needs more memory resources and the RAM is full, inactive pages in memory are moved to the swap space. While swap space can help machines with a small amount of RAM, it should not be considered a replacement for more RAM. Swap space is located on hard drives, which have a slower access time than physical memory. The size of your swap partition can be calculated from the physical RAM of the host. The Red Hat Customer Portal contains an article on safely and efficiently determining the size of the swap partition: https://access.redhat.com/site/solutions/15244.
    • When using raw image files, the total disk space required is equal to or greater than the sum of the space required by the image files, the 6 GB of space required by the host operating system, and the swap space for the guest.
      For qcow images, you must also calculate the expected maximum storage requirements of the guest (total for qcow format), because qcow and qcow2 images can grow as required. To allow for this expansion, multiply the expected maximum storage requirements of the guest (expected maximum guest storage) by 1.01, then add the space required by the host (host) and the necessary swap space (swap). A worked example follows this list.
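      For example, assuming a hypothetical guest with an expected maximum storage requirement of 20 GB, the 6 GB required by the host, and 4 GB of swap (the guest and swap figures are illustrative only), the calculation is:
      total for qcow format = (20 GB × 1.01) + 6 GB + 4 GB ≈ 30.2 GB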
Guest virtual machine requirements are further outlined in Chapter 7, Overcommitting with KVM.

1.2. KVM Hypervisor Requirements

The KVM hypervisor requires:
  • an Intel processor with the Intel VT-x and Intel 64 virtualization extensions for x86-based systems; or
  • an AMD processor with the AMD-V and the AMD64 virtualization extensions.
Virtualization extensions (Intel VT-x or AMD-V) are required for full virtualization. Enter the following commands to determine whether your system has the hardware virtualization extensions and whether they are enabled.

Procedure 1.1. Verifying virtualization extensions

  1. Verify the CPU virtualization extensions are available

    Enter the following command to verify that the CPU virtualization extensions are available:
    $ grep -E 'svm|vmx' /proc/cpuinfo
  2. Analyze the output

    • The following example output contains a vmx entry, indicating an Intel processor with the Intel VT-x extension:
      flags   : fpu tsc msr pae mce cx8 vmx apic mtrr mca cmov pat pse36 clflush
      dts acpi mmx fxsr sse sse2 ss ht tm syscall lm constant_tsc pni monitor ds_cpl
      vmx est tm2 cx16 xtpr lahf_lm
      
    • The following example output contains an svm entry, indicating an AMD processor with the AMD-V extensions:
      flags   :  fpu tsc msr pae mce cx8 apic mtrr mca cmov pat pse36 clflush
      mmx fxsr sse sse2 ht syscall nx mmxext svm fxsr_opt lm 3dnowext 3dnow pni cx16
      lahf_lm cmp_legacy svm cr8legacy ts fid vid ttp tm stc
      
    If the grep -E 'svm|vmx' /proc/cpuinfo command returns any output, the processor contains the hardware virtualization extensions. In some circumstances, manufacturers disable the virtualization extensions in the BIOS. If the extensions do not appear, or full virtualization does not work, see Procedure A.3, “Enabling virtualization extensions in BIOS” for instructions on enabling the extensions in your BIOS configuration utility.
  3. Ensure the KVM kernel modules are loaded

    As an additional check, verify that the kvm modules are loaded in the kernel with the following command:
    # lsmod | grep kvm
    If the output includes kvm_intel or kvm_amd, the kvm hardware virtualization modules are loaded.

Note

The virsh utility (provided by the libvirt-client package) can output a full list of your system's virtualization capabilities with the following command:
# virsh capabilities

1.3. KVM Guest Virtual Machine Compatibility

Red Hat Enterprise Linux 7 servers have certain support limits.
For details on the processor and memory limits for Red Hat Enterprise Linux, and for the list of guest operating systems certified to run on a Red Hat Enterprise Linux KVM host, see the relevant articles on the Red Hat Customer Portal.

Note

For additional information on the KVM hypervisor's restrictions and support limits, see Appendix C, Virtualization Restrictions.

1.4. Supported Guest CPU Models

Every hypervisor has its own policy for which CPU features the guest will see by default. The set of CPU features presented to the guest by the hypervisor depends on the CPU model chosen in the guest virtual machine configuration.

1.4.1. Listing the Guest CPU Models

To view a full list of the CPU models supported for an architecture type, run the virsh cpu-models architecture command. For example:
$ virsh cpu-models x86_64
486
pentium
pentium2
pentium3
pentiumpro
coreduo
n270
core2duo
qemu32
kvm32
cpu64-rhel5
cpu64-rhel6
kvm64
qemu64
Conroe
Penryn
Nehalem
Westmere
SandyBridge
Haswell
athlon
phenom
Opteron_G1
Opteron_G2
Opteron_G3
Opteron_G4
Opteron_G5
$ virsh cpu-models ppc64
POWER7
POWER7_v2.1
POWER7_v2.3
POWER7+_v2.1
POWER8_v1.0
The full list of supported CPU models and features is contained in the cpu_map.xml file, located in /usr/share/libvirt/:
# cat /usr/share/libvirt/cpu_map.xml
A guest's CPU model and features can be changed in the <cpu> section of the domain XML file. See Section 23.12, “CPU Models and Topology” for more information.
The host model can be configured to use a specified feature set as needed. For more information, see Section 23.12.1, “Changing the Feature Set for a Specified CPU”.
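For illustration, the following is a minimal sketch of a <cpu> element that selects one of the models listed above and explicitly requires an additional CPU feature; the model and feature names are examples only and should be adjusted to your hardware and workload:
<cpu mode='custom' match='exact'>
  <model fallback='allow'>SandyBridge</model>
  <feature policy='require' name='aes'/>
</cpu>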

Chapter 2. Installing the Virtualization Packages

To use virtualization, Red Hat virtualization packages must be installed on your computer. Virtualization packages can be installed when installing Red Hat Enterprise Linux or after installation using the yum command and Subscription Manager.
The KVM hypervisor uses the default Red Hat Enterprise Linux kernel with the kvm kernel module.

2.1. Installing Virtualization Packages During a Red Hat Enterprise Linux Installation

This section provides information about installing virtualization packages while installing Red Hat Enterprise Linux.

Note

For detailed information about installing Red Hat Enterprise Linux, see the Red Hat Enterprise Linux 7 Installation Guide.

Important

The Anaconda interface offers the option to install Red Hat virtualization packages only during the installation of Red Hat Enterprise Linux Server.
When installing a Red Hat Enterprise Linux Workstation, the Red Hat virtualization packages can only be installed after the workstation installation is complete. See Section 2.2, “Installing Virtualization Packages on an Existing Red Hat Enterprise Linux System”.

Procedure 2.1. Installing virtualization packages

  1. Select software

    Follow the installation procedure until the Installation Summary screen.

    Figure 2.1. The Installation Summary screen

    In the Installation Summary screen, click Software Selection. The Software Selection screen opens.
  2. Select the server type and package groups

    You can install Red Hat Enterprise Linux 7 with only the basic virtualization packages or with packages that allow management of guests through a graphical user interface. Do one of the following:
    • Install a minimal virtualization host
      Select the Virtualization Host radio button in the Base Environment pane and the Virtualization Platform check box in the Add-Ons for Selected Environment pane. This installs a basic virtualization environment which can be run with virsh or remotely over the network.

      Figure 2.2. Virtualization Host selected in the Software Selection screen

    • Install a virtualization host with a graphical user interface
      Select the Server with GUI radio button in the Base Environment pane and the Virtualization Client, Virtualization Hypervisor, and Virtualization Tools check boxes in the Add-Ons for Selected Environment pane. This installs a virtualization environment along with graphical tools for installing and managing guest virtual machines.

      Figure 2.3. Server with GUI selected in the software selection screen

  3. Finalize installation

    Click Done and continue with the installation.

Important

You need a valid Red Hat Enterprise Linux subscription to receive updates for the virtualization packages.

2.1.1. Installing KVM Packages with Kickstart Files

To use a Kickstart file to install Red Hat Enterprise Linux with the virtualization packages, append the following package groups in the %packages section of your Kickstart file:
@virtualization-hypervisor
@virtualization-client
@virtualization-platform
@virtualization-tools
For more information about installing with Kickstart files, see the Red Hat Enterprise Linux 7 Installation Guide.
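For reference, a minimal sketch of a %packages section containing these groups is shown below; the rest of the Kickstart file is omitted:
%packages
@virtualization-hypervisor
@virtualization-client
@virtualization-platform
@virtualization-tools
%end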

2.2. Installing Virtualization Packages on an Existing Red Hat Enterprise Linux System

This section describes the steps for installing the KVM hypervisor on an existing Red Hat Enterprise Linux 7 system.
To install the packages, your machine must be registered and subscribed to the Red Hat Customer Portal. To register using Red Hat Subscription Manager, run the subscription-manager register command and follow the prompts. Alternatively, run the Red Hat Subscription Manager application from Applications → System Tools on the desktop to register.
If you do not have a valid Red Hat subscription, visit the Red Hat online store to obtain one. For more information on registering and subscribing a system to the Red Hat Customer Portal, see https://access.redhat.com/solutions/253273.
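For example, registering a system and attaching an available subscription from the command line might look like the following sketch; the exact prompts and subscriptions depend on your account:
# subscription-manager register
# subscription-manager attach --auto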

2.2.1. Installing Virtualization Packages Manually

To use virtualization on Red Hat Enterprise Linux, at minimum, you need to install the following packages:
  • qemu-kvm: This package provides the user-level KVM emulator and facilitates communication between hosts and guest virtual machines.
  • qemu-img: This package provides disk management for guest virtual machines.

    Note

    The qemu-img package is installed as a dependency of the qemu-kvm package.
  • libvirt: This package provides the server and host-side libraries for interacting with hypervisors and host systems, and the libvirtd daemon that handles the library calls, manages virtual machines, and controls the hypervisor.
To install these packages, enter the following command:
# yum install qemu-kvm libvirt
Several additional virtualization management packages are also available and are recommended when using virtualization:
  • virt-install: This package provides the virt-install command for creating virtual machines from the command line.
  • libvirt-python: This package contains a module that permits applications written in the Python programming language to use the interface supplied by the libvirt API.
  • virt-manager: This package provides the virt-manager tool, also known as Virtual Machine Manager. This is a graphical tool for administering virtual machines. It uses the libvirt-client library as the management API.
  • libvirt-client: This package provides the client-side APIs and libraries for accessing libvirt servers. The libvirt-client package includes the virsh command-line tool to manage and control virtual machines and hypervisors from the command line or a special virtualization shell.
You can install all of these recommended virtualization packages with the following command:
# yum install virt-install libvirt-python virt-manager libvirt-client

2.2.2. Installing Virtualization Package Groups

The virtualization packages can also be installed from package groups. The following table describes the virtualization package groups and what they provide.

Table 2.1. Virtualization Package Groups

Package Group | Description | Mandatory Packages | Optional Packages
Virtualization Hypervisor | Smallest possible virtualization host installation | libvirt, qemu-kvm, qemu-img | qemu-kvm-tools
Virtualization Client | Clients for installing and managing virtualization instances | gnome-boxes, virt-install, virt-manager, virt-viewer, qemu-img | virt-top, libguestfs-tools, libguestfs-tools-c
Virtualization Platform | Provides an interface for accessing and controlling virtual machines and containers | libvirt, libvirt-client, virt-who, qemu-img | fence-virtd-libvirt, fence-virtd-multicast, fence-virtd-serial, libvirt-cim, libvirt-java, libvirt-snmp, perl-Sys-Virt
Virtualization Tools | Tools for offline virtual image management | libguestfs, qemu-img | libguestfs-java, libguestfs-tools, libguestfs-tools-c
To install a package group, run the yum groupinstall package_group command. Use the --optional option to install the optional packages in the package group. For example, to install the Virtualization Tools package group with all of its optional packages, run:
# yum groupinstall "Virtualization Tools" --optional

Chapter 3. Creating a Virtual Machine

After you have installed the virtualization packages on your Red Hat Enterprise Linux 7 host system, you can create virtual machines and install guest operating systems using the virt-manager interface. Alternatively, you can use the virt-install command-line utility with a list of parameters or with a script. Both methods are covered in this chapter.

3.1. Guest Virtual Machine Deployment Considerations

Various factors should be considered before creating any guest virtual machines. The role of a virtual machine should be evaluated before deployment, but regular monitoring and assessment based on variable factors (load, amount of clients) should also be performed. The factors include:
Performance
Guest virtual machines should be deployed and configured based on their intended tasks. Some guest systems (for instance, guests running a database server) may require special performance considerations. Guests may require more assigned CPUs or memory based on their role and projected system load.
Input/Output requirements and types of Input/Output
Some guest virtual machines may have a particularly high I/O requirement or may require further considerations or projections based on the type of I/O (for instance, typical disk block size access, or the amount of clients).
Storage
Some guest virtual machines may require higher priority access to storage or faster disk types, or may require exclusive access to areas of storage. The amount of storage used by guests should also be regularly monitored and taken into account when deploying and maintaining storage. Make sure to read all the considerations outlined in Red Hat Enterprise Linux 7 Virtualization Security Guide. It is also important to understand that your physical storage may limit your options in your virtual storage.
Networking and network infrastructure
Depending upon your environment, some guest virtual machines could require faster network links than other guests. Bandwidth or latency are often factors when deploying and maintaining guests, especially as requirements or load changes.
Request requirements
SCSI requests can only be issued to guest virtual machines on virtio drives if the virtio drives are backed by whole disks, and the disk device parameter is set to lun in the domain XML file, as shown in the following example:
<devices>
  <emulator>/usr/libexec/qemu-kvm</emulator>
  <disk type='block' device='lun'>
    ...
  </disk>
</devices>

3.2. Creating Guests with virt-install

You can use the virt-install command to create virtual machines and install operating systems on those virtual machines from the command line. virt-install can be used either interactively or as part of a script to automate the creation of virtual machines. If you are using an interactive graphical installation, you must have virt-viewer installed before you run virt-install. In addition, you can start an unattended installation of virtual machine operating systems using virt-install with Kickstart files.

Note

You might need root privileges in order for some virt-install commands to complete successfully.
The virt-install utility uses a number of command-line options. However, most virt-install options are not required.
The main required options for virtual guest machine installations are:
--name
The name of the virtual machine.
--memory
The amount of memory (RAM) to allocate to the guest, in MiB.
Guest storage
Use one of the following guest storage options:
  • --disk
    The storage configuration details for the virtual machine. If you use the --disk none option, the virtual machine is created with no disk space.
  • --filesystem
    The path to the file system for the virtual machine guest.
Installation method
Use one of the following installation methods:
  • --location
    The location of the installation media.
  • --cdrom
    The file or device used as a virtual CD-ROM device. It can be a path to an ISO image or to a CD-ROM device. It can also be a URL from which to fetch or access a minimal boot ISO image.
  • --pxe
    Uses the PXE boot protocol to load the initial ramdisk and kernel for starting the guest installation process.
  • --import
    Skips the OS installation process and builds a guest around an existing disk image. The device used for booting is the first device specified by the disk or filesystem option.
  • --boot
    The post-installation VM boot configuration. This option allows specifying a boot device order, permanently booting from a kernel and initrd (with optional kernel arguments), and enabling a BIOS boot menu.
To see a complete list of options, enter the following command:
# virt-install --help
To see a complete list of attributes for an option, enter the following command:
# virt-install --option=?
The virt-install man page also documents each command option, important variables, and examples.
Prior to running virt-install, you may also need to use qemu-img to configure storage options. For instructions on using qemu-img, see Chapter 14, Using qemu-img.

3.2.1. Installing a virtual machine from an ISO image

The following example installs a virtual machine from an ISO image:
# virt-install \ 
  --name guest1-rhel7 \ 
  --memory 2048 \ 
  --vcpus 2 \ 
  --disk size=8 \ 
  --cdrom /path/to/rhel7.iso \ 
  --os-variant rhel7 
The --cdrom /path/to/rhel7.iso option specifies that the virtual machine will be installed from the CD or DVD image at the specified location.

3.2.2. Importing a virtual machine image

The following example imports a virtual machine from a virtual disk image:
# virt-install \ 
  --name guest1-rhel7 \ 
  --memory 2048 \ 
  --vcpus 2 \ 
  --disk /path/to/imported/disk.qcow \ 
  --import \ 
  --os-variant rhel7 
The --import option specifies that the virtual machine will be imported from the virtual disk image specified by the --disk /path/to/imported/disk.qcow option.

3.2.3. Installing a virtual machine from the network

The following example installs a virtual machine from a network location:
# virt-install \ 
  --name guest1-rhel7 \ 
  --memory 2048 \ 
  --vcpus 2 \ 
  --disk size=8 \ 
  --location http://example.com/path/to/os \ 
  --os-variant rhel7 
The --location http://example.com/path/to/os option specifies that the installation tree is at the specified network location.

3.2.4. Installing a virtual machine using PXE

When installing a virtual machine using the PXE boot protocol, both the --network option specifying a bridged network and the --pxe option must be specified.
The following example installs a virtual machine using PXE:
# virt-install \ 
  --name guest1-rhel7 \ 
  --memory 2048 \ 
  --vcpus 2 \ 
  --disk size=8 \ 
  --network=bridge:br0 \ 
  --pxe \ 
  --os-variant rhel7 

3.2.5. Installing a virtual machine with Kickstart

The following example installs a virtual machine using a kickstart file:
# virt-install \ 
  --name guest1-rhel7 \ 
  --memory 2048 \ 
  --vcpus 2 \ 
  --disk size=8 \ 
  --location http://example.com/path/to/os \ 
  --os-variant rhel7 \
  --initrd-inject /path/to/ks.cfg \ 
  --extra-args="ks=file:/ks.cfg console=tty0 console=ttyS0,115200n8" 
The --initrd-inject and --extra-args options specify that the virtual machine will be installed using a Kickstart file.

3.2.6. Configuring the guest virtual machine network during guest creation

When creating a guest virtual machine, you can specify and configure the network for the virtual machine. This section provides the options for each of the guest virtual machine main network types.

Default network with NAT

The default network uses libvirtd's network address translation (NAT) virtual network switch. For more information about NAT, see Section 6.1, “Network Address Translation (NAT) with libvirt”.
Before creating a guest virtual machine with the default network with NAT, ensure that the libvirt-daemon-config-network package is installed.
To configure a NAT network for the guest virtual machine, use the following option for virt-install:
--network default

Note

If no network option is specified, the guest virtual machine is configured with a default network with NAT.

Bridged network with DHCP

When configured for bridged networking, the guest uses an external DHCP server. This option should be used if the host has a static networking configuration and the guest requires full inbound and outbound connectivity with the local area network (LAN). It should also be used if live migration will be performed with the guest virtual machine. To configure a bridged network with DHCP for the guest virtual machine, use the following option:
--network br0

Note

The bridge must be created separately, prior to running virt-install. For details on creating a network bridge, see Section 6.4.1, “Configuring Bridged Networking on a Red Hat Enterprise Linux 7 Host”.

Bridged network with a static IP address

Bridged networking can also be used to configure the guest to use a static IP address. To configure a bridged network with a static IP address for the guest virtual machine, use the following options:
--network br0 \
--extra-args "ip=192.168.1.2::192.168.1.1:255.255.255.0:test.example.com:eth0:none"
For more information on network booting options, see the Red Hat Enterprise Linux 7 Virtualization Guide.

No network

To configure a guest virtual machine with no network interface, use the following option:
--network=none
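As an illustration, the following sketch combines the default NAT network with the ISO-based installation shown in Section 3.2.1; the guest name and ISO path are placeholders:
# virt-install \
  --name guest1-rhel7 \
  --memory 2048 \
  --vcpus 2 \
  --disk size=8 \
  --cdrom /path/to/rhel7.iso \
  --network default \
  --os-variant rhel7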

3.3. Creating Guests with virt-manager

The Virtual Machine Manager, also known as virt-manager, is a graphical tool for creating and managing guest virtual machines.
This section covers how to install a Red Hat Enterprise Linux 7 guest virtual machine on a Red Hat Enterprise Linux 7 host using virt-manager.
These procedures assume that the KVM hypervisor and all other required packages are installed and the host is configured for virtualization. For more information on installing the virtualization packages, see Chapter 2, Installing the Virtualization Packages.

3.3.1. virt-manager installation overview

The New VM wizard breaks down the virtual machine creation process into five steps:
  1. Choosing the hypervisor and installation type
  2. Locating and configuring the installation media
  3. Configuring memory and CPU options
  4. Configuring the virtual machine's storage
  5. Configuring virtual machine name, networking, architecture, and other hardware settings
Ensure that virt-manager can access the installation media (whether locally or over the network) before you continue.

3.3.2. Creating a Red Hat Enterprise Linux 7 Guest with virt-manager

This procedure covers creating a Red Hat Enterprise Linux 7 guest virtual machine with a locally stored installation DVD or DVD image. Red Hat Enterprise Linux 7 DVD images are available from the Red Hat Customer Portal.

Procedure 3.1. Creating a Red Hat Enterprise Linux 7 guest virtual machine with virt-manager using local installation media

  1. Optional: Preparation

    Prepare the storage environment for the virtual machine. For more information on preparing storage, see Chapter 13, Managing Storage for Virtual Machines.

    Important

    Various storage types may be used for storing guest virtual machines. However, for a virtual machine to be able to use migration features, the virtual machine must be created on networked storage.
    Red Hat Enterprise Linux 7 requires at least 1 GB of storage space. However, Red Hat recommends at least 5 GB of storage space for a Red Hat Enterprise Linux 7 installation and for the procedures in this guide.
  2. Open virt-manager and start the wizard

    Open virt-manager by executing the virt-manager command as root or by opening Applications → System Tools → Virtual Machine Manager.

    Figure 3.1. The Virtual Machine Manager window

    Optionally, open a remote hypervisor by selecting the hypervisor and clicking the Connect button.
    Click the Create a new virtual machine button to start the new virtualized guest wizard.
    The New VM window opens.
  3. Specify installation type

    Select the installation type:
    Local install media (ISO image or CDROM)
    This method uses a CD-ROM, DVD, or image of an installation disk (for example, .iso).
    Network Install (HTTP, FTP, or NFS)
    This method involves the use of a mirrored Red Hat Enterprise Linux or Fedora installation tree to install a guest. The installation tree must be accessible through either HTTP, FTP, or NFS.
    If you select Network Install, provide the installation URL and also Kernel options, if required.
    Network Boot (PXE)
    This method uses a Preboot eXecution Environment (PXE) server to install the guest virtual machine. Setting up a PXE server is covered in the Red Hat Enterprise Linux 7 Installation Guide. To install via network boot, the guest must have a routable IP address or shared network device.
    If you select Network Boot, continue to step 5. After all steps are completed, a DHCP request is sent, and if a valid PXE server is found, the guest virtual machine's installation process starts.
    Import existing disk image
    This method allows you to create a new guest virtual machine and import a disk image (containing a pre-installed, bootable operating system) to it.

    Figure 3.2. Virtual machine installation method

    Click Forward to continue.
  4. Select the installation source

    1. If you selected Local install media (ISO image or CDROM), specify your intended local installation media.

      Figure 3.3. Local ISO image installation

      • If you wish to install from a CD-ROM or DVD, select the Use CDROM or DVD radio button, and select the appropriate disk drive from the drop-down list of drives available.
      • If you wish to install from an ISO image, select Use ISO image, and then click the Browse... button to open the Locate media volume window.
        Select the installation image you wish to use, and click Choose Volume.
        If no images are displayed in the Locate media volume window, click the Browse Local button to browse the host machine for the installation image or DVD drive containing the installation disk. Select the installation image or DVD drive containing the installation disk and click Open; the volume is selected for use and you are returned to the Create a new virtual machine wizard.

        Important

        For ISO image files and guest storage images, the recommended location to use is /var/lib/libvirt/images/. Any other location may require additional configuration by SELinux. See the Red Hat Enterprise Linux Virtualization Security Guide or the Red Hat Enterprise Linux SELinux User's and Administrator's Guide for more details on configuring SELinux.
    2. If you selected Network Install, input the URL of the installation source and also the required Kernel options, if any. The URL must point to the root directory of an installation tree, which must be accessible through either HTTP, FTP, or NFS.
      To perform a kickstart installation, specify the URL of a kickstart file in Kernel options, starting with ks=

      Figure 3.4. Network kickstart installation

      Note

      For a complete list of kernel options, see the Red Hat Enterprise Linux 7 Installation Guide.
    Next, configure the OS type and Version of the installation. Ensure that you select the appropriate operating system type for your virtual machine. This can be specified manually or by selecting the Automatically detect operating system based on install media check box.
    Click Forward to continue.
  5. Configure memory (RAM) and virtual CPUs

    Specify the number of CPUs and amount of memory (RAM) to allocate to the virtual machine. The wizard shows the number of CPUs and amount of memory you can allocate; these values affect the host's and guest's performance.
    Virtual machines require sufficient physical memory (RAM) to run efficiently and effectively. Red Hat supports a minimum of 512 MB of RAM for a virtual machine. Red Hat recommends at least 1024 MB of RAM for each logical core.
    Assign sufficient virtual CPUs for the virtual machine. If the virtual machine runs a multi-threaded application, assign the number of virtual CPUs the guest virtual machine will require to run efficiently.
    You cannot assign more virtual CPUs than there are physical processors (or hyper-threads) available on the host system. The number of virtual CPUs available is noted in the Up to X available field.

    Figure 3.5. Configuring Memory and CPU

    After you have configured the memory and CPU settings, click Forward to continue.

    Note

    Memory and virtual CPUs can be overcommitted. For more information on overcommitting, see Chapter 7, Overcommitting with KVM.
  6. Configure storage

    Enable and assign sufficient space for your virtual machine and any applications it requires. Assign at least 5 GB for a desktop installation or at least 1 GB for a minimal installation.

    Figure 3.6. Configuring virtual storage

    Note

    Live and offline migrations require virtual machines to be installed on shared network storage. For information on setting up shared storage for virtual machines, see Section 15.4, “Shared Storage Example: NFS for a Simple Migration”.
    1. With the default local storage

      Select the Create a disk image on the computer's hard drive radio button to create a file-based image in the default storage pool, the /var/lib/libvirt/images/ directory. Enter the size of the disk image to be created. If the Allocate entire disk now check box is selected, a disk image of the size specified will be created immediately. If not, the disk image will grow as it becomes filled.

      Note

      Although the storage pool is a virtual container, it is limited by two factors: the maximum size allowed to it by qemu-kvm and the size of the disk on the host physical machine. Storage pools may not exceed the size of the disk on the host physical machine. The maximum sizes are as follows:
      • virtio-blk = 2^63 bytes or 8 exabytes (using raw files or disk)
      • ext4 = ~16 TB (using 4 KB block size)
      • XFS = ~8 exabytes
      • qcow2 and host file systems keep their own metadata, and scalability should be evaluated and tuned when trying very large image sizes. Using raw disks means fewer layers that could affect scalability or maximum size.
      Click Forward to create a disk image on the local hard drive. Alternatively, select Select managed or other existing storage, then select Browse to configure managed storage.
    2. With a storage pool

      If you select Select managed or other existing storage to use a storage pool, click Browse to open the Locate or create storage volume window.

      Figure 3.7. The Choose Storage Volume window

      1. Select a storage pool from the Storage Pools list.
      2. Optional: Click the plus (+) button to create a new storage volume. The Add a Storage Volume screen appears. Enter the name of the new storage volume.
        Choose a format option from the Format drop-down menu. Format options include raw, qcow2, and qed. Adjust other fields as needed. Note that the qcow2 version used here is version 3. To change the qcow version, see Section 23.20.2, “Setting Target Elements”.

        Figure 3.8. The Add a Storage Volume window

    Select the new volume and click Choose volume. Next, click Finish to return to the New VM wizard. Click Forward to continue.
  7. Name and final configuration

    Name the virtual machine. Virtual machine names can contain letters, numbers and the following characters: underscores (_), periods (.), and hyphens (-). Virtual machine names must be unique for migration and cannot consist only of numbers.
    By default, the virtual machine will be created with network address translation (NAT) for a network called 'default'. To change the network selection, click Network selection and select a host device and source mode.
    Verify the settings of the virtual machine and click Finish when you are satisfied; this will create the virtual machine with specified networking settings, virtualization type, and architecture.

    Figure 3.9. Verifying the configuration

    Or, to further configure the virtual machine's hardware, check the Customize configuration before install check box to change the guest's storage or network devices, to use the paravirtualized (virtio) drivers or to add additional devices. This opens another wizard that will allow you to add, remove, and configure the virtual machine's hardware settings.

    Note

    Red Hat Enterprise Linux 4 or Red Hat Enterprise Linux 5 guest virtual machines cannot be installed using graphical mode. As such, you must select "Cirrus" instead of "QXL" as a video card.
    After configuring the virtual machine's hardware, click Apply. virt-manager will then create the virtual machine with your specified hardware settings.

    Warning

    When installing a Red Hat Enterprise Linux 7 guest virtual machine from a remote medium but without a configured TCP/IP connection, the installation fails. However, when installing a guest virtual machine of Red Hat Enterprise Linux 5 or 6 in such circumstances, the installer opens a "Configure TCP/IP" interface.
    For further information about this difference, see the related knowledgebase article.
    Click Finish to continue into the Red Hat Enterprise Linux installation sequence. For more information on installing Red Hat Enterprise Linux 7, see the Red Hat Enterprise Linux 7 Installation Guide.
A Red Hat Enterprise Linux 7 guest virtual machine is now created from an ISO installation disk image.

3.4. Comparison of virt-install and virt-manager Installation options

This table provides a quick reference to compare equivalent virt-install and virt-manager installation options for when installing a virtual machine.
Most virt-install options are not required. The minimum requirements are --name, --memory, guest storage (--disk, --filesystem, or --disk none), and an install method (--location, --cdrom, --pxe, --import, or --boot). These options are further specified with arguments; to see a complete list of command options and related arguments, enter the following command:
# virt-install --help
In virt-manager, at minimum, a name, an installation method, memory (RAM), vCPUs, and storage are required.

Table 3.1. virt-install and virt-manager configuration comparison for guest installations

Configuration on virtual machine | virt-install option | virt-manager installation wizard label and step number
Virtual machine name | --name, -n | Name (step 5)
RAM to allocate (MiB) | --ram, -r | Memory (RAM) (step 3)
Storage - specify storage media | --disk | Enable storage for this virtual machine → Create a disk image on the computer's hard drive, or Select managed or other existing storage (step 4)
Storage - export a host directory to the guest | --filesystem | Enable storage for this virtual machine → Select managed or other existing storage (step 4)
Storage - configure no local disk storage on the guest | --nodisks | Deselect the Enable storage for this virtual machine check box (step 4)
Installation media location (local install) | --file | Local install media → Locate your install media (steps 1-2)
Installation via distribution tree (network install) | --location | Network install → URL (steps 1-2)
Install guest with PXE | --pxe | Network boot (step 1)
Number of vCPUs | --vcpus | CPUs (step 3)
Host network | --network | Advanced options drop-down menu (step 5)
Operating system variant/version | --os-variant | Version (step 2)
Graphical display method | --graphics, --nographics | virt-manager provides GUI installation only

Chapter 4. Cloning Virtual Machines

There are two types of guest virtual machine instances used in creating guest copies:
  • Clones are instances of a single virtual machine. Clones can be used to set up a network of identical virtual machines, and they can also be distributed to other destinations.
  • Templates are instances of a virtual machine that are designed to be used as a source for cloning. You can create multiple clones from a template and make minor modifications to each clone. This is useful in seeing the effects of these changes on the system.
Both clones and templates are virtual machine instances. The difference between them is in how they are used.
For the created clone to work properly, information and configurations unique to the virtual machine that is being cloned usually have to be removed before cloning. The information that needs to be removed differs based on how the clones will be used.
The information and configurations to be removed may be on any of the following levels:
  • Platform level information and configurations include anything assigned to the virtual machine by the virtualization solution. Examples include the number of Network Interface Cards (NICs) and their MAC addresses.
  • Guest operating system level information and configurations include anything configured within the virtual machine. Examples include SSH keys.
  • Application level information and configurations include anything configured by an application installed on the virtual machine. Examples include activation codes and registration information.

    Note

    This chapter does not include information about removing the application level, because the information and approach are specific to each application.
As a result, some of the information and configurations must be removed from within the virtual machine, while other information and configurations must be removed from the virtual machine using the virtualization environment (for example, Virtual Machine Manager or VMware).

Note

For information on cloning storage volumes, see Section 13.3.2.1, “Creating Storage Volumes with virsh”.

4.1. Preparing Virtual Machines for Cloning

Before cloning a virtual machine, it must be prepared by running the virt-sysprep utility on its disk image, or by using the following steps:
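For example, assuming the guest is shut down and the virt-sysprep utility (typically provided by the libguestfs-tools-c package) is installed on the host, its disk image can be prepared with a command such as the following; the image path is a placeholder:
# virt-sysprep -a /var/lib/libvirt/images/guest1-rhel7.img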

Procedure 4.1. Preparing a virtual machine for cloning

  1. Set up the virtual machine

    1. Build the virtual machine that is to be used for the clone or template.
      • Install any software needed on the clone.
      • Configure any non-unique settings for the operating system.
      • Configure any non-unique application settings.
  2. Remove the network configuration

    1. Remove any persistent udev rules using the following command:
      # rm -f /etc/udev/rules.d/70-persistent-net.rules

      Note

      If udev rules are not removed, the name of the first NIC may be eth1 instead of eth0.
    2. Remove unique network details from ifcfg scripts by making the following edits to /etc/sysconfig/network-scripts/ifcfg-eth[x]:
      1. Remove the HWADDR and Static lines

        Note

        If the HWADDR does not match the new guest's MAC address, the ifcfg file will be ignored. Therefore, it is important to remove the HWADDR from the file.
        DEVICE=eth[x]
        BOOTPROTO=none
        ONBOOT=yes
        #NETWORK=10.0.1.0       <- REMOVE
        #NETMASK=255.255.255.0  <- REMOVE
        #IPADDR=10.0.1.20       <- REMOVE
        #HWADDR=xx:xx:xx:xx:xx:xx  <- REMOVE
        #USERCTL=no             <- REMOVE
        # Remove any other *unique* or non-desired settings, such as UUID.
        
      2. Ensure that a DHCP configuration remains that does not include a HWADDR or any unique information.
        DEVICE=eth[x]
        BOOTPROTO=dhcp
        ONBOOT=yes
        
      3. Ensure that the file includes the following lines:
        DEVICE=eth[x]
        ONBOOT=yes
        
    3. If the following files exist, ensure that they contain the same content:
      • /etc/sysconfig/networking/devices/ifcfg-eth[x]
      • /etc/sysconfig/networking/profiles/default/ifcfg-eth[x]

      Note

      If NetworkManager or any special settings were used with the virtual machine, ensure that any additional unique information is removed from the ifcfg scripts.
  3. Remove registration details

    1. Remove registration details using one of the following:
      • For Red Hat Network (RHN) registered guest virtual machines, run the following command:
        # rm /etc/sysconfig/rhn/systemid
      • For Red Hat Subscription Manager (RHSM) registered guest virtual machines:
        • If the original virtual machine will not be used, run the following commands:
          # subscription-manager unsubscribe --all
          # subscription-manager unregister
          # subscription-manager clean
        • If the original virtual machine will be used, run only the following command:
          # subscription-manager clean

          Note

          The original RHSM profile remains in the portal.
  4. Removing other unique details

    1. Remove any sshd public/private key pairs using the following command:
      # rm -rf /etc/ssh/ssh_host_*

      Note

      Removing ssh keys prevents problems with ssh clients not trusting these hosts.
    2. Remove any other application-specific identifiers or configurations that may cause conflicts if running on multiple machines.
  5. Configure the virtual machine to run configuration wizards on the next boot

    1. Configure the virtual machine to run the relevant configuration wizards the next time it is booted by doing one of the following:
      • For Red Hat Enterprise Linux 6 and below, create an empty file on the root file system called .unconfigured using the following command:
        # touch /.unconfigured
      • For Red Hat Enterprise Linux 7, enable the first boot and initial-setup wizards by running the following commands:
        # sed -ie 's/RUN_FIRSTBOOT=NO/RUN_FIRSTBOOT=YES/' /etc/sysconfig/firstboot
        # systemctl enable firstboot-graphical
        # systemctl enable initial-setup-graphical

      Note

      The wizards that run on the next boot depend on the configurations that have been removed from the virtual machine. In addition, on the first boot of the clone, it is recommended that you change the hostname.

4.2. Cloning a Virtual Machine

Before proceeding with cloning, shut down the virtual machine. You can clone the virtual machine using virt-clone or virt-manager.

4.2.1. Cloning Guests with virt-clone

You can use virt-clone to clone virtual machines from the command line.
Note that you need root privileges for virt-clone to complete successfully.
The virt-clone command provides a number of options that can be passed on the command line. These include general options, storage configuration options, networking configuration options, and miscellaneous options. Only the --original option is required. To see a complete list of options, enter the following command:
# virt-clone --help
The virt-clone man page also documents each command option, important variables, and examples.
The following example shows how to clone a guest virtual machine called "demo" on the default connection, automatically generating a new name and disk clone path.

Example 4.1. Using virt-clone to clone a guest

# virt-clone --original demo --auto-clone
The following example shows how to clone a QEMU guest virtual machine called "demo" with multiple disks.

Example 4.2. Using virt-clone to clone a guest with multiple disks

# virt-clone --connect qemu:///system --original demo --name newdemo --file /var/lib/libvirt/images/newdemo.img --file /var/lib/libvirt/images/newdata.img

4.2.2. Cloning Guests with virt-manager

This procedure describes cloning a guest virtual machine using the virt-manager utility.

Procedure 4.2. Cloning a Virtual Machine with virt-manager

  1. Open virt-manager

    Start virt-manager. Launch the Virtual Machine Manager application from the Applications menu and System Tools submenu. Alternatively, run the virt-manager command as root.
    Select the guest virtual machine you want to clone from the list of guest virtual machines in Virtual Machine Manager.
    Right-click the guest virtual machine you want to clone and select Clone. The Clone Virtual Machine window opens.
    Clone Virtual Machine window

    Figure 4.1. Clone Virtual Machine window

  2. Configure the clone

    • To change the name of the clone, enter a new name for the clone.
    • To change the networking configuration, click Details.
      Enter a new MAC address for the clone.
      Click OK.
      Change MAC Address window

      Figure 4.2. Change MAC Address window

    • For each disk in the cloned guest virtual machine, select one of the following options:
      • Clone this disk - The disk will be cloned for the cloned guest virtual machine
      • Share disk with guest virtual machine name - The disk will be shared by the guest virtual machine that will be cloned and its clone
      • Details - Opens the Change storage path window, which enables selecting a new path for the disk
        Change storage path window

        Figure 4.3. Change storage path window

  3. Clone the guest virtual machine

    Click Clone.

Chapter 5. KVM Paravirtualized (virtio) Drivers

Paravirtualized drivers enhance the performance of guests, decreasing guest I/O latency and increasing throughput almost to bare-metal levels. It is recommended to use the paravirtualized drivers for fully virtualized guests running I/O-heavy tasks and applications.
Virtio drivers are KVM's paravirtualized device drivers, available for guest virtual machines running on KVM hosts. These drivers are included in the virtio package. The virtio package supports block (storage) devices and network interface controllers.

Note

PCI devices are limited by the virtualized system architecture. See Chapter 16, Guest Virtual Machine Device Configuration for additional limitations when using assigned devices.

5.1. Using KVM virtio Drivers for Existing Storage Devices

You can modify an existing hard disk device attached to the guest to use the virtio driver instead of the virtualized IDE driver. The example shown in this section edits libvirt configuration files. Note that the guest virtual machine does not need to be shut down to perform these steps; however, the change will not be applied until the guest is completely shut down and rebooted.

Procedure 5.1. Using KVM virtio drivers for existing devices

  1. Ensure that you have installed the appropriate driver (viostor) before continuing with this procedure.
  2. Run the virsh edit guestname command as root to edit the XML configuration file for your device. For example, virsh edit guest1. The configuration files are located in the /etc/libvirt/qemu/ directory.
  3. Below is a file-based block device using the virtualized IDE driver. This is a typical entry for a virtual machine not using the virtio drivers.
    <disk type='file' device='disk'>
    	 ...
       <source file='/var/lib/libvirt/images/disk1.img'/>
       <target dev='hda' bus='ide'/>
    	 <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
    </disk>
  4. Change the entry to use the virtio device by modifying the bus= entry to virtio. Note that if the disk was previously IDE, it has a target similar to hda, hdb, or hdc. When changing to bus=virtio the target needs to be changed to vda, vdb, or vdc accordingly.
    <disk type='file' device='disk'>
       ...
       <source file='/var/lib/libvirt/images/disk1.img'/>
       <target dev='vda' bus='virtio'/>
    	 <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
    </disk>
  5. Remove the address tag inside the disk tags. This must be done for this procedure to work. Libvirt will regenerate the address tag appropriately the next time the virtual machine is started.
Alternatively, virt-manager, virsh attach-disk or virsh attach-interface can add a new device using the virtio drivers.
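For instance, a hedged sketch of attaching an additional disk image over the virtio bus with virsh attach-disk follows; the guest name and image path are placeholders:
# virsh attach-disk guest1-rhel7 /var/lib/libvirt/images/newdisk.img vdb --targetbus virtio --persistent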
See the KVM project's Virtio page for more details on using virtio: http://www.linux-kvm.org/page/Virtio

5.2. Using KVM virtio Drivers for New Storage Devices

This procedure covers creating new storage devices using the KVM virtio drivers with virt-manager.
Alternatively, the virsh attach-disk or virsh attach-interface commands can be used to attach devices using the virtio drivers.

Important

Ensure the drivers have been installed on the guest before proceeding to install new devices. If the drivers are unavailable the device will not be recognized and will not work.

Procedure 5.2. Adding a storage device using the virtio storage driver

  1. Open the guest virtual machine by double clicking the name of the guest in virt-manager.
  2. Open the Show virtual hardware details tab by clicking the lightbulb icon.
  3. In the Show virtual hardware details tab, click the Add Hardware button.
  4. Select hardware type

    Select Storage as the Hardware type.

    Figure 5.1. The Add new virtual hardware wizard

  5. Select the storage device and driver

    Create a new disk image or select a storage pool volume.
    Set the Device type to Disk device and the Bus type to VirtIO to use the virtio drivers.

    Figure 5.2. The Add New Virtual Hardware wizard

    Click Finish to complete the procedure.

Procedure 5.3. Adding a network device using the virtio network driver

  1. Open the guest virtual machine by double clicking the name of the guest in virt-manager.
  2. Open the Show virtual hardware details tab by clicking the lightbulb icon.
  3. In the Show virtual hardware details tab, click the Add Hardware button.
  4. Select hardware type

    Select Network as the Hardware type.

    Figure 5.3. The Add new virtual hardware wizard

  5. Select the network device and driver

    Set the Device model to virtio to use the virtio drivers. Choose the required Host device.

    Figure 5.4. The Add new virtual hardware wizard

    Click Finish to complete the procedure.
Once all new devices are added, reboot the virtual machine. Virtual machines may not recognize the devices until the guest is rebooted.

5.3. Using KVM virtio Drivers for Network Interface Devices

When network interfaces use KVM virtio drivers, KVM does not emulate networking hardware, which removes processing overhead and can increase guest performance. In Red Hat Enterprise Linux 7, virtio is used as the default network interface type. However, if this is configured differently on your system, you can use the following procedures:
  • To attach a virtio network device to a guest, use the virsh attach-interface command with the --model virtio option, as shown in the sketch after this list.
    Alternatively, in the virt-manager interface, navigate to the guest's Virtual hardware details screen and click Add Hardware. In the Add New Virtual Hardware screen, select Network, and change Device model to virtio:
  • To change the type of an existing interface to virtio, use the virsh edit command to edit the XML configuration of the intended guest, and change the model type attribute to virtio, for example as follows:
      <devices>
        <interface type='network'>
          <source network='default'/>
          <target dev='vnet1'/>
          <model type='virtio'/>
          <driver name='vhost' txmode='iothread' ioeventfd='on' event_idx='off'/>
        </interface>
      </devices>
      ...
    Alternatively, in the virt-manager interface, navigate to the guest's Virtual hardware details screen, select the NIC item, and change Model type to virtio:
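For example, a sketch of attaching a virtio interface connected to the default network with virsh attach-interface; the guest name is a placeholder and the --config option makes the change persistent:
# virsh attach-interface guest1-rhel7 network default --model virtio --config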

Note

If the naming of network interfaces inside the guest is not consistent across reboots, ensure all interfaces presented to the guest are of the same device model, preferably virtio-net. For details, see the Red Hat KnowledgeBase.

Chapter 6. Network Configuration

This chapter provides an introduction to the common networking configurations used by libvirt-based guest virtual machines.
Red Hat Enterprise Linux 7 supports the following networking setups for virtualization:
  • virtual networks using Network Address Translation (NAT)
  • directly allocated physical devices using PCI device assignment
  • directly allocated virtual functions using PCIe SR-IOV
  • bridged networks
You must enable NAT or network bridging, or directly assign a PCI device, to allow external hosts to access network services on guest virtual machines.

6.1. Network Address Translation (NAT) with libvirt

One of the most common methods for sharing network connections is to use Network Address Translation (NAT) forwarding (also known as virtual networks).
Host Configuration

Every standard libvirt installation provides NAT-based connectivity to virtual machines as the default virtual network. Verify that it is available with the virsh net-list --all command.

# virsh net-list --all
Name                 State      Autostart
-----------------------------------------
default              active     yes
If the default network is missing, check the host's libvirt configuration directory for the network definition:
# ll /etc/libvirt/qemu/
total 12
drwx------. 3 root root 4096 Nov  7 23:02 networks
-rw-------. 1 root root 2205 Nov 20 01:20 r6.4.xml
-rw-------. 1 root root 2208 Nov  8 03:19 r6.xml
The default network is defined in the /etc/libvirt/qemu/networks/default.xml file.
Mark the default network to automatically start:
# virsh net-autostart default
Network default marked as autostarted
Start the default network:
# virsh net-start default
Network default started
Once the libvirt default network is running, you will see an isolated bridge device. This device does not have any physical interfaces added; it uses NAT and IP forwarding to connect to the physical network. Do not add physical interfaces to this bridge.
# brctl show
bridge name     bridge id               STP enabled     interfaces
virbr0          8000.000000000000       yes
libvirt adds iptables rules which allow traffic to and from guest virtual machines attached to the virbr0 device in the INPUT, FORWARD, OUTPUT and POSTROUTING chains. libvirt then attempts to enable the ip_forward parameter. Some other applications may disable ip_forward, so the best option is to add the following to /etc/sysctl.conf.
 net.ipv4.ip_forward = 1
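To apply the setting immediately without rebooting, you can also set the parameter at run time. For example:
# sysctl -w net.ipv4.ip_forward=1
Alternatively, running sysctl -p reloads the values from /etc/sysctl.conf.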
Guest Virtual Machine Configuration

Once the host configuration is complete, a guest virtual machine can be connected to the virtual network based on its name. To connect a guest to the 'default' virtual network, the following can be used in the XML configuration file (such as /etc/libvirt/qemu/myguest.xml) for the guest:

<interface type='network'>
   <source network='default'/>
</interface>

Note

Defining a MAC address is optional. If you do not define one, a MAC address is automatically generated for the guest's network interface. Manually setting the MAC address may be useful to maintain consistency or easy reference throughout your environment, or to avoid the very small chance of a conflict.
<interface type='network'>
  <source network='default'/>
  <mac address='00:16:3e:1a:b3:4a'/>
</interface>

6.2. Disabling vhost-net

The vhost-net module is a kernel-level back end for virtio networking that reduces virtualization overhead by moving virtio packet processing tasks out of user space (the QEMU process) and into the kernel (the vhost-net driver). vhost-net is only available for virtio network interfaces. If the vhost-net kernel module is loaded, it is enabled by default for all virtio interfaces, but can be disabled in the interface configuration if a particular workload experiences a degradation in performance when vhost-net is in use.
Specifically, when UDP traffic is sent from a host machine to a guest virtual machine on that host, performance degradation can occur if the guest virtual machine processes incoming data at a rate slower than the host machine sends it. In this situation, enabling vhost-net causes the UDP socket's receive buffer to overflow more quickly, which results in greater packet loss. It is therefore better to disable vhost-net in this situation to slow the traffic, and improve overall performance.
To disable vhost-net, edit the <interface> sub-element in the guest virtual machine's XML configuration file and define the network as follows:
<interface type="network">
   ...
   <model type="virtio"/>
   <driver name="qemu"/>
   ...
</interface>
Setting the driver name to qemu forces packet processing into QEMU user space, effectively disabling vhost-net for that interface.
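For example, assuming a guest named rhel7 (an illustrative name), you can make the change with virsh edit and then confirm the driver setting in the active configuration:

# virsh edit rhel7
# virsh dumpxml rhel7 | grep -A4 '<interface'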

6.3. Enabling vhost-net zero-copy

In Red Hat Enterprise Linux 7, vhost-net zero-copy is disabled by default. To enable it permanently, add a new file vhost-net.conf to /etc/modprobe.d/ with the following content:
options vhost_net  experimental_zcopytx=1
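For example, the file can be created with a single command run as root; the file name is arbitrary, as long as it ends in .conf:
# echo "options vhost_net experimental_zcopytx=1" > /etc/modprobe.d/vhost-net.conf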
To disable zero-copy again without rebooting, run the following:
modprobe -r vhost_net
modprobe vhost_net experimental_zcopytx=0
The first command unloads the vhost_net module, and the second reloads it with zero-copy disabled. You can use the same approach to enable zero-copy at run time, but the change will not be permanent.
To confirm the current setting, check the output of cat /sys/module/vhost_net/parameters/experimental_zcopytx. It shows 1 when zero-copy is enabled and 0 when it is disabled:
$ cat /sys/module/vhost_net/parameters/experimental_zcopytx
0

6.4. Bridged Networking

Bridged networking (also known as network bridging or virtual network switching) is used to place virtual machine network interfaces on the same network as the physical interface. Bridges require minimal configuration and make a virtual machine appear on an existing network, which reduces management overhead and network complexity. As bridges contain few components and configuration variables, they provide a transparent setup which is straightforward to understand and troubleshoot, if required.
Bridging can be configured in a virtualized environment using standard Red Hat Enterprise Linux tools, virt-manager, or libvirt, and is described in the following sections.
However, even in a virtualized environment, bridges may be more easily created using the host operating system's networking tools. More information about this bridge creation method can be found in the Red Hat Enterprise Linux 7 Networking Guide.

6.4.1. Configuring Bridged Networking on a Red Hat Enterprise Linux 7 Host

Bridged networking can be configured for virtual machines on a Red Hat Enterprise Linux host, independent of the virtualization management tools. This configuration is mainly recommended when the virtualization bridge is the host's only network interface, or is the host's management network interface.
For instructions on configuring network bridging without using virtualization tools, see the Red Hat Enterprise Linux 7 Networking Guide.

6.4.2. Bridged Networking with Virtual Machine Manager

This section provides instructions on creating a bridge from a host machine's interface to a guest virtual machine using virt-manager.

Note

Depending on your environment, setting up a bridge with libvirt tools in Red Hat Enterprise Linux 7 may require disabling Network Manager, which is not recommended by Red Hat. A bridge created with libvirt also requires libvirtd to be running for the bridge to maintain network connectivity.
It is recommended to configure bridged networking on the physical Red Hat Enterprise Linux host as described in the Red Hat Enterprise Linux 7 Networking Guide, while using libvirt after bridge creation to add virtual machine interfaces to the bridges.

Procedure 6.1. Creating a bridge with virt-manager

  1. From the virt-manager main menu, click Edit ⇒ Connection Details to open the Connection Details window.
  2. Click the Network Interfaces tab.
  3. Click the + at the bottom of the window to configure a new network interface.
  4. In the Interface type drop-down menu, select Bridge, and then click Forward to continue.
    Adding a bridge

    Figure 6.1. Adding a bridge

    1. In the Name field, enter a name for the bridge, such as br0.
    2. Select a Start mode from the drop-down menu. Choose from one of the following:
      • none - deactivates the bridge
      • onboot - activates the bridge on the next guest virtual machine reboot
      • hotplug - activates the bridge even if the guest virtual machine is running
    3. Check the Activate now check box to activate the bridge immediately.
    4. To configure either the IP settings or Bridge settings, click the appropriate Configure button. A separate window will open to specify the required settings. Make any necessary changes and click OK when done.
    5. Select the physical interface to connect to your virtual machines. If the interface is currently in use by another guest virtual machine, you will receive a warning message.
  5. Click Finish and the wizard closes, taking you back to the Connections menu.
    Adding a bridge

    Figure 6.2. Adding a bridge

Select the bridge to use, and click Apply to exit the wizard.
To stop the interface, click the Stop Interface button. Once the bridge is stopped, to delete the interface, click the Delete Interface button.

6.4.3. Bridged Networking with libvirt

Depending on your environment, setting up a bridge with libvirt in Red Hat Enterprise Linux 7 may require disabling Network Manager, which is not recommended by Red Hat. This also requires libvirtd to be running for the bridge to operate.
It is recommended to configure bridged networking on the physical Red Hat Enterprise Linux host as described in the Red Hat Enterprise Linux 7 Networking Guide.

Important

libvirt is now able to take advantage of new kernel tunable parameters to manage host bridge forwarding database (FDB) entries, thus potentially improving system network performance when bridging multiple virtual machines. Set the macTableManager attribute of a network's <bridge> element to 'libvirt' in the host's XML configuration file:
<bridge name='br0' macTableManager='libvirt'/>
This will turn off learning (flood) mode on all bridge ports, and libvirt will add or remove entries to the FDB as necessary. Along with removing the overhead of learning the proper forwarding ports for MAC addresses, this also allows the kernel to disable promiscuous mode on the physical device that connects the bridge to the network, which further reduces overhead.
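For illustration, the following sketch shows where the attribute sits in a libvirt network definition, edited with virsh net-edit; the network and bridge names are examples only:
# virsh net-edit default
<network>
  <name>default</name>
  <forward mode='nat'/>
  <bridge name='virbr0' macTableManager='libvirt'/>
  ...
</network>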

Chapter 7. Overcommitting with KVM

7.1. Introduction

The KVM hypervisor automatically overcommits CPUs and memory. This means that more virtualized CPUs and memory can be allocated to virtual machines than there are physical resources on the system. This is possible because most processes do not access 100% of their allocated resources all the time.
As a result, under-utilized virtualized servers or desktops can run on fewer hosts, which saves a number of system resources, with the net effect of less power, cooling, and investment in server hardware.

7.2. Overcommitting Memory

Guest virtual machines running on a KVM hypervisor do not have dedicated blocks of physical RAM assigned to them. Instead, each guest virtual machine functions as a Linux process where the host physical machine's Linux kernel allocates memory only when requested. In addition, the host's memory manager can move the guest virtual machine's memory between its own physical memory and swap space.
Overcommitting requires allotting sufficient swap space on the host physical machine to accommodate all guest virtual machines as well as enough memory for the host physical machine's processes. As a basic rule, the host physical machine's operating system requires a maximum of 4 GB of memory, along with a minimum of 4 GB of swap space. For advanced instructions on determining an appropriate size for the swap partition, see the Red Hat KnowledgeBase.

Important

Overcommitting is not an ideal solution for general memory issues. The recommended methods to deal with memory shortage are to allocate less memory per guest, add more physical memory to the host, or utilize swap space.
A virtual machine will run slower if it is swapped frequently. In addition, overcommitting can cause the system to run out of memory (OOM), which may lead to the Linux kernel shutting down important system processes. If you decide to overcommit memory, ensure sufficient testing is performed. Contact Red Hat support for assistance with overcommitting.
Overcommitting does not work with all virtual machines, but has been found to work in a desktop virtualization setup with minimal intensive usage or running several identical guests with KSM. For more information on KSM and overcommitting, see the Red Hat Enterprise Linux 7 Virtualization Tuning and Optimization Guide.

Important

When device assignment is in use, all virtual machine memory must be statically pre-allocated to enable direct memory access (DMA) with the assigned device. Memory overcommit is therefore not supported with device assignment.

7.3. Overcommitting Virtualized CPUs

The KVM hypervisor supports overcommitting virtualized CPUs (vCPUs). Virtualized CPUs can be overcommitted as far as load limits of guest virtual machines allow. Use caution when overcommitting vCPUs, as loads near 100% may cause dropped requests or unusable response times.
In Red Hat Enterprise Linux 7, it is possible to overcommit guests with more than one vCPU, known as symmetric multiprocessing (SMP) virtual machines. However, you may experience performance deterioration when running more cores on the virtual machine than are present on your physical CPU.
For example, a virtual machine with four vCPUs should not be run on a host machine with a dual core processor, but on a quad core host. Overcommitting SMP virtual machines beyond the physical number of processing cores causes significant performance degradation, due to programs getting less CPU time than required. In addition, it is not recommended to have more than 10 total allocated vCPUs per physical processor core.
With SMP guests, some processing overhead is inherent. CPU overcommitting can increase the SMP overhead, because using time-slicing to allocate resources to guests can make inter-CPU communication inside a guest slower. This overhead increases with guests that have a larger number of vCPUs, or a larger overcommit ratio.
Virtualized CPUs are overcommitted best when a single host has multiple guests, and each guest has a small number of vCPUs, compared to the number of host CPUs. KVM should safely support guests with loads under 100% at a ratio of five vCPUs (on 5 virtual machines) to one physical CPU on a single host. The KVM hypervisor will switch between all of the virtual machines, making sure that the load is balanced.
For best performance, Red Hat recommends assigning guests only as many vCPUs as are required to run the programs that are inside each guest.

Important

Applications that use 100% of memory or processing resources may become unstable in overcommitted environments. Do not overcommit memory or CPUs in a production environment without extensive testing, as the CPU overcommit ratio and the amount of SMP are workload-dependent.

Chapter 8. KVM Guest Timing Management

Virtualization involves several challenges for time keeping in guest virtual machines.
  • Interrupts cannot always be delivered simultaneously and instantaneously to all guest virtual machines. This is because interrupts in virtual machines are not true interrupts. Instead, they are injected into the guest virtual machine by the host machine.
  • The host may be running another guest virtual machine, or a different process. Therefore, the precise timing typically required by interrupts may not always be possible.
Guest virtual machines without accurate time keeping may experience issues with network applications and processes, as session validity, migration, and other network activities rely on timestamps to remain correct.
KVM avoids these issues by providing guest virtual machines with a paravirtualized clock (kvm-clock). However, it is still important to test timing before attempting activities that may be affected by time keeping inaccuracies, such as guest migration.

Important

To avoid the problems described above, the Network Time Protocol (NTP) should be configured on the host and the guest virtual machines. On guests using Red Hat Enterprise Linux 6 and earlier, NTP is implemented by the ntpd service. For more information, see the Red Hat Enterprise Linux 6 Deployment Guide.
On systems using Red Hat Enterprise Linux 7, the NTP time synchronization service can be provided by ntpd or by the chronyd service. Note that chrony has some advantages on virtual machines. For more information, see the Configuring NTP Using the chrony Suite and Configuring NTP Using ntpd sections in the Red Hat Enterprise Linux 7 System Administrator's Guide.
The mechanics of guest virtual machine time synchronization

By default, the guest synchronizes its time with the hypervisor as follows:

  • When the guest system boots, the guest reads the time from the emulated Real Time Clock (RTC).
  • When the NTP protocol is initiated, it automatically synchronizes the guest clock. Afterwards, during normal guest operation, NTP performs clock adjustments in the guest.
  • When a guest is resumed after a pause or a restoration process, a command to synchronize the guest clock to a specified value should be issued by the management software (such as virt-manager). This synchronization works only if the QEMU guest agent is installed in the guest and supports the feature. The value to which the guest clock synchronizes is usually the host clock value.

Constant Time Stamp Counter (TSC)

Modern Intel and AMD CPUs provide a constant Time Stamp Counter (TSC). The count frequency of the constant TSC does not vary when the CPU core itself changes frequency, for example to comply with a power-saving policy. A CPU with a constant TSC frequency is necessary in order to use the TSC as a clock source for KVM guests.

Your CPU has a constant Time Stamp Counter if the constant_tsc flag is present. To determine if your CPU has the constant_tsc flag, enter the following command:
$ cat /proc/cpuinfo | grep constant_tsc
If any output is given, your CPU has the constant_tsc bit. If no output is given, follow the instructions below.
Configuring Hosts without a Constant Time Stamp Counter

Systems without a constant TSC frequency cannot use the TSC as a clock source for virtual machines, and require additional configuration. Power management features interfere with accurate time keeping and must be disabled for guest virtual machines to accurately keep time with KVM.

Important

These instructions are for AMD revision F CPUs only.
If the CPU lacks the constant_tsc bit, disable all power management features. Each system has several timers it uses to keep time. The TSC is not stable on the host, which is sometimes caused by cpufreq changes, deep C states, or migration to a host with a faster TSC. Deep C sleep states can stop the TSC. To prevent the kernel from using deep C states, append processor.max_cstate=1 to the kernel boot options. To make this change persistent, edit the value of the GRUB_CMDLINE_LINUX key in the /etc/default/grub file. For example, to apply this setting on each boot, edit the entry as follows:
GRUB_CMDLINE_LINUX="processor.max_cstate=1"
Note that you can specify multiple parameters for the GRUB_CMDLINE_LINUX key, similarly to adding the parameters in the GRUB 2 boot menu.
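After editing /etc/default/grub, regenerate the GRUB 2 configuration so that the change takes effect on the next boot. The following is a minimal sketch assuming a BIOS-based host; on UEFI hosts the generated file is located under /boot/efi/EFI/redhat/ instead:
# grub2-mkconfig -o /boot/grub2/grub.cfg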
To disable cpufreq (only necessary on hosts without the constant_tsc flag), install kernel-tools and enable the cpupower.service (systemctl enable cpupower.service). To change the options that this service applies when the host boots and shuts down, edit the CPUPOWER_START_OPTS and CPUPOWER_STOP_OPTS values in the /etc/sysconfig/cpupower configuration file. Valid governors can be found in the /sys/devices/system/cpu/cpu[id]/cpufreq/scaling_available_governors files. For more information on this package or on power management and governors, see the Red Hat Enterprise Linux 7 Power Management Guide.
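For illustration, the following is a minimal sketch of the /etc/sysconfig/cpupower file that pins the governor to performance while the service is running; the exact default values shipped with kernel-tools may differ:
CPUPOWER_START_OPTS="frequency-set -g performance"
CPUPOWER_STOP_OPTS="frequency-set -g ondemand"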

8.1. Host-wide Time Synchronization

Virtual network devices in KVM guests do not support hardware timestamping, which means it is difficult to synchronize the clocks of guests that use a network protocol like NTP or PTP with better accuracy than tens of microseconds.
When a more accurate synchronization of the guests is required, it is recommended to synchronize the clock of the host using NTP or PTP with hardware timestamping, and to synchronize the guests to the host directly. Red Hat Enterprise Linux 7.5 and later provide a virtual PTP hardware clock (PHC), which enables the guests to synchronize to the host with a sub-microsecond accuracy.
To enable the PHC device, do the following:
  1. Set the ptp_kvm module to load after reboot.
    # echo ptp_kvm > /etc/modules-load.d/ptp_kvm.conf
  2. Add the /dev/ptp0 clock as a reference to the chrony configuration:
    # echo "refclock PHC /dev/ptp0 poll 2" >> /etc/chrony.conf
  3. Restart the chrony daemon:
    # systemctl restart chronyd
  4. To verify the host-guest time synchronization has been configured correctly, use the chronyc sources command on a guest. The output should look similar to the following:
    # chronyc sources
    210 Number of sources = 1
    MS Name/IP address         Stratum Poll Reach LastRx Last sample
    ===============================================================================
    #* PHC0                          0   2   377     4     -6ns[   -6ns] +/-  726ns
    

8.2. Required Time Management Parameters for Red Hat Enterprise Linux Guests

For certain Red Hat Enterprise Linux guest virtual machines, additional kernel parameters are required for their system time to be synchronized correctly. These parameters can be set by appending them to the end of the kernel line in the /etc/grub2.cfg file of the guest virtual machine.

Note

Red Hat Enterprise Linux 5.5 and later, Red Hat Enterprise Linux 6.0 and later, and Red Hat Enterprise Linux 7 use kvm-clock as their default clock source. Running kvm-clock avoids the need for additional kernel parameters, and is recommended by Red Hat.
The table below lists versions of Red Hat Enterprise Linux and the parameters required on the specified systems.

Table 8.1. Kernel parameter requirements

Red Hat Enterprise Linux version | Additional guest kernel parameters
7.0 and later on AMD64 and Intel 64 systems with kvm-clock | Additional parameters are not required
6.1 and later on AMD64 and Intel 64 systems with kvm-clock | Additional parameters are not required
6.0 on AMD64 and Intel 64 systems with kvm-clock | Additional parameters are not required
6.0 on AMD64 and Intel 64 systems without kvm-clock | notsc lpj=n

Note

The lpj parameter requires a numeric value equal to the loops per jiffy value of the specific CPU on which the guest virtual machine runs. If you do not know this value, do not set the lpj parameter.

8.3. Steal Time Accounting

Steal time is the amount of CPU time needed by a guest virtual machine that is not provided by the host. Steal time occurs when the host allocates these resources elsewhere: for example, to another guest.
Steal time is reported in the CPU time fields in /proc/stat. It is automatically reported by utilities such as top and vmstat. It is displayed as "%st", or in the "st" column. Note that it cannot be switched off.
Large amounts of steal time indicate CPU contention, which can reduce guest performance. To relieve CPU contention, increase the guest's CPU priority or CPU quota, or run fewer guests on the host.
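For example, to observe steal time from inside a guest, either of the following quick checks can be used (output values vary with load):
$ vmstat 1 3
$ grep -w cpu /proc/stat
In the vmstat output, the st column reports the percentage of steal time; in /proc/stat, the eighth value after the cpu label is the cumulative steal time in clock ticks.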

Chapter 9. Network Booting with libvirt

Guest virtual machines can be booted with PXE enabled. PXE allows guest virtual machines to boot and load their configuration off the network itself. This section demonstrates some basic configuration steps to configure PXE guests with libvirt.
This section does not cover the creation of boot images or PXE servers. Instead, it explains how to configure libvirt, in a private or bridged network, to boot a guest virtual machine with PXE booting enabled.

Warning

These procedures are provided only as an example. Ensure that you have sufficient backups before proceeding.

9.1. Preparing the Boot Server

To perform the steps in this chapter you will need:
  • A PXE Server (DHCP and TFTP) - This can be a libvirt internal server, manually-configured dhcpd and tftpd, dnsmasq, a server configured by Cobbler, or some other server.
  • Boot images - for example, PXELINUX configured manually or by Cobbler.

9.1.1. Setting up a PXE Boot Server on a Private libvirt Network

This example uses the default network. Perform the following steps:

Procedure 9.1. Configuring the PXE boot server

  1. Place the PXE boot images and configuration in /var/lib/tftpboot.
  2. Enter the following commands:
    # virsh net-destroy default
    # virsh net-edit default
  3. Edit the <ip> element in the configuration file for the default network to include the appropriate address, network mask, DHCP address range, and boot file, where BOOT_FILENAME represents the file name you are using to boot the guest virtual machine.
    <ip address='192.168.122.1' netmask='255.255.255.0'>
       <tftp root='/var/lib/tftpboot' />
       <dhcp>
          <range start='192.168.122.2' end='192.168.122.254' />
          <bootp file='BOOT_FILENAME' />
       </dhcp>
    </ip>
  4. Run:
    # virsh net-start default
  5. Boot the guest using PXE (refer to Section 9.2, “Booting a Guest Using PXE”).

9.2. Booting a Guest Using PXE

This section demonstrates how to boot a guest virtual machine with PXE.

9.2.1. Using bridged networking

Procedure 9.2. Booting a guest using PXE and bridged networking

  1. Ensure bridging is enabled such that the PXE boot server is available on the network.
  2. Boot a guest virtual machine with PXE booting enabled. You can use the virt-install command to create a new virtual machine with PXE booting enabled, as shown in the following example command:
    virt-install --pxe --network bridge=breth0 --prompt
    Alternatively, ensure that the guest network is configured to use your bridged network, and that the XML guest configuration file has a <boot dev='network'/> element inside the <os> element, as shown in the following example:
    <os>
       <type arch='x86_64' machine='pc-i440fx-rhel7.0.0'>hvm</type>
       <boot dev='network'/>
       <boot dev='hd'/>
    </os>
    <interface type='bridge'>
       <mac address='52:54:00:5a:ad:cb'/>
       <source bridge='breth0'/>
       <target dev='vnet0'/>
       <alias name='net0'/>
       <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </interface>

9.2.2. Using a Private libvirt Network

Procedure 9.3. Using a private libvirt network

  1. Boot a guest virtual machine using libvirt with PXE booting enabled. You can use the virt-install command to create/install a new virtual machine using PXE:
    virt-install --pxe --network network=default --prompt
Alternatively, ensure that the guest network is configured to use your private libvirt network, and that the XML guest configuration file has a <boot dev='network'/> element inside the <os> element, as shown in the following example:
<os>
   <type arch='x86_64' machine='pc-i440fx-rhel7.0.0'>hvm</type>
   <boot dev='network'/>
   <boot dev='hd'/>
</os>
Also ensure that the guest virtual machine is connected to the private network:
<interface type='network'>
   <mac address='52:54:00:66:79:14'/>
   <source network='default'/>
   <target dev='vnet0'/>
   <alias name='net0'/>
   <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
</interface>

Chapter 10. Registering the Hypervisor and Virtual Machine

Red Hat Enterprise Linux 6 and 7 require that every guest virtual machine is mapped to a specific hypervisor in order to ensure that every guest is allocated the same level of subscription service. To do this, you need to install a subscription agent that automatically detects all guest virtual machines (VMs) on each KVM hypervisor that is installed and registered, which in turn creates a mapping file on the host. This mapping file ensures that all guest VMs receive the following benefits:
  • Subscriptions specific to virtual systems are readily available and can be applied to all of the associated guest VMs.
  • All subscription benefits that can be inherited from the hypervisor are readily available and can be applied to all of the associated guest VMs.

Note

The information provided in this chapter is specific to Red Hat Enterprise Linux subscriptions only. If you also have a Red Hat Virtualization subscription, or a Red Hat Satellite subscription, you should also consult the virt-who information provided with those subscriptions. More information on Red Hat Subscription Management can also be found in the Red Hat Subscription Management Guide found on the customer portal.

10.1. Installing virt-who on the Host Physical Machine

  1. Register the KVM hypervisor

    Register the KVM hypervisor by running the subscription-manager register [options] command in a terminal as the root user on the host physical machine. More options are available using the subscription-manager register --help command. In cases where you are using a user name and password, use the credentials that are known to Subscription Manager. If this is your very first time subscribing and you do not have a user account, contact customer support. For example, to register the hypervisor as user 'admin' with the password 'secret', you would send the following command:
    [root@rhel-server ~]# subscription-manager register --username=admin --password=secret --auto-attach --type=hypervisor
  2. Install the virt-who packages

    Install the virt-who packages, by running the following command on the host physical machine:
    # yum install virt-who
  3. Create a virt-who configuration file

    For each hypervisor, add a configuration file in the /etc/virt-who.d/ directory. At a minimum, the file must contain the following snippet:
    [libvirt]
    type=libvirt
    
    For more detailed information on configuring virt-who, see Section 10.1.1, “Configuring virt-who”.
  4. Start the virt-who service

    Start the virt-who service by running the following command on the host physical machine:
    # systemctl start virt-who.service
    # systemctl enable virt-who.service
  5. Confirm virt-who service is receiving guest information

    At this point, the virt-who service will start collecting a list of domains from the host. Check the /var/log/rhsm/rhsm.log file on the host physical machine to confirm that the file contains a list of the guest VMs. For example:
    2015-05-28 12:33:31,424 DEBUG: Libvirt domains found: [{'guestId': '58d59128-cfbb-4f2c-93de-230307db2ce0', 'attributes': {'active': 0, 'virtWhoType': 'libvirt', 'hypervisorType': 'QEMU'}, 'state': 5}]
    

Procedure 10.1. Managing the subscription on the customer portal

  1. Subscribing the hypervisor

    As the virtual machines will be receiving the same subscription benefits as the hypervisor, it is important that the hypervisor has a valid subscription and that the subscription is available for the VMs to use.
    1. Login to the customer portal

      Log in to the Red Hat Customer Portal at https://access.redhat.com/ and click the Subscriptions button at the top of the page.
    2. Click the Systems link

      In the Subscriber Inventory section (towards the bottom of the page), click the Systems link.
    3. Select the hypervisor

      On the Systems page, there is a table of all subscribed systems. Click the name of the hypervisor (localhost.localdomain for example). In the details page that opens, click Attach a subscription and select all the subscriptions listed. Click Attach Selected. This will attach the host's physical subscription to the hypervisor so that the guests can benefit from the subscription.
  2. Subscribing the guest virtual machines - first time use

    This step is for those who have a new subscription and have never subscribed a guest virtual machine before. If you are adding virtual machines, skip this step. To consume the subscription assigned to the hypervisor profile on the machine running the virt-who service, auto subscribe by running the following command in a terminal, on the guest virtual machine as root.
    [root@virt-who ~]# subscription-manager attach --auto
  3. Subscribing additional guest virtual machines

    If you just subscribed for the first time, skip this step. If you are adding additional virtual machines, note that running this command will not necessarily re-attach the same subscriptions to the guest virtual machine. This is because removing all subscriptions and then allowing auto-attach to resolve what is necessary for a given guest virtual machine may result in different subscriptions being consumed than before. This may not have any effect on your system, but it is something you should be aware of. If you used a manual attachment procedure to attach the virtual machine, which is not described below, you will need to re-attach those virtual machines manually, as auto-attach will not work. Use the following command to first remove the subscriptions for the old guests, and then use auto-attach to attach subscriptions to all the guests. Run these commands on the guest virtual machine.
    [root@virt-who ~]# subscription-manager remove --all
    [root@virt-who ~]# subscription-manager attach --auto
  4. Confirm subscriptions are attached

    Confirm that the subscription is attached to the hypervisor by running the following command on the guest virtual machine:
    [root@virt-who ~]# subscription-manager list --consumed
    Output similar to the following will be displayed. Pay attention to the Subscription Details. It should say 'Subscription is current'.
    [root@virt-who ~]#  subscription-manager list --consumed
    +-------------------------------------------+
       Consumed Subscriptions
    +-------------------------------------------+
    Subscription Name:	Awesome OS with unlimited virtual guests
    Provides: 		Awesome OS Server Bits
    SKU: 			awesomeos-virt-unlimited
    Contract: 		0
    Account: 		######### Your account number #####
    Serial: 		######### Your serial number ######
    Pool ID: 		XYZ123                                             1
    Provides Management: 	No
    Active: 		True
    Quantity Used: 		1
    Service Level:
    Service Type:
    Status Details:		Subscription is current                       2
    Subscription Type:
    Starts: 		01/01/2015
    Ends: 			12/31/2015
    System Type: 		Virtual
    

    1

    The ID for the subscription to attach to the system is displayed here. You will need this ID if you need to attach the subscription manually.

    2

    Indicates if your subscription is current. If your subscription is not current, an error message appears. One example is Guest has not been reported on any host and is using a temporary unmapped guest subscription. In this case the guest needs to be subscribed. In other cases, use the information as indicated in Section 10.5.2, “I have subscription status errors, what do I do?”.
  5. Register additional guests

    When you install new guest VMs on the hypervisor, you must register the new VM and use the subscription attached to the hypervisor, by running the following commands on the guest virtual machine:
    # subscription-manager register
    # subscription-manager attach --auto
    # subscription-manager list --consumed

10.1.1. Configuring virt-who

The virt-who service is configured using the following files:
  • /etc/virt-who.conf - Contains general configuration information including the interval for checking connected hypervisors for changes.
  • /etc/virt-who.d/hypervisor_name.conf - Contains configuration information for a specific hypervisor.
A web-based wizard is provided to generate hypervisor configuration files and the snippets required for virt-who.conf. To run the wizard, browse to Red Hat Virtualization Agent (virt-who) Configuration Helper on the Customer Portal.
On the second page of the wizard, select the following options:
  • Where does your virt-who report to?: Subscription Asset Manager
  • Hypervisor Type: libvirt
Follow the wizard to complete the configuration. If the configuration is performed correctly, virt-who will automatically provide the selected subscriptions to existing and future guests on the specified hypervisor.
For more information on hypervisor configuration files, see the virt-who-config man page.
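For reference, the following is a minimal sketch of what a hypervisor configuration file in /etc/virt-who.d/ might contain for a remote libvirt hypervisor. All values are placeholders, and the exact options required depend on where virt-who reports to; see the virt-who-config man page for the authoritative list:
[hypervisor1]
type=libvirt
server=libvirt.example.com
username=admin
encrypted_password=placeholder_hash
owner=example_org
hypervisor_id=hostname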

10.2. Registering a New Guest Virtual Machine

In cases where a new guest virtual machine is to be created on a host that is already registered and running, the virt-who service must also be running. This ensures that the virt-who service maps the guest to a hypervisor, so the system is properly registered as a virtual system. To register the virtual machine, enter the following command:
[root@virt-server ~]# subscription-manager register --username=admin --password=secret --auto-attach

10.3. Removing a Guest Virtual Machine Entry

If the guest virtual machine is running, unregister the system, by running the following command in a terminal window as root on the guest:
[root@virt-guest ~]# subscription-manager unregister
If the system has been deleted, however, the virt-who service cannot tell whether the guest has been deleted or is only paused. In that case, you must manually remove the system from the server side, using the following steps:
  1. Login to the Subscription Manager

    The Subscription Manager is located on the Red Hat Customer Portal. Log in to the Customer Portal using your user name and password, by clicking the login icon at the top of the screen.
  2. Click the Subscriptions tab

    Click the Subscriptions tab.
  3. Click the Systems link

    Scroll down the page and click the Systems link.
  4. Delete the system

    To delete the system profile, locate the specified system's profile in the table, select the check box beside its name and click Delete.

10.4. Installing virt-who Manually

This section describes how to manually attach a subscription provided by the hypervisor.

Procedure 10.2. How to attach a subscription manually

  1. List subscription information and find the Pool ID

    First you need to list the available subscriptions which are of the virtual type. Enter the following command:
    [root@server1 ~]# subscription-manager list --avail --match-installed | grep 'Virtual' -B12
    Subscription Name: Red Hat Enterprise Linux ES (Basic for Virtualization)
    Provides:          Red Hat Beta
                       Oracle Java (for RHEL Server)
                       Red Hat Enterprise Linux Server
    SKU:               -------
    Pool ID:           XYZ123
    Available:         40
    Suggested:         1
    Service Level:     Basic
    Service Type:      L1-L3
    Multi-Entitlement: No
    Ends:              01/02/2017
    System Type:       Virtual
    
    Note the Pool ID displayed. Copy this ID as you will need it in the next step.
  2. Attach the subscription with the Pool ID

    Using the Pool ID you copied in the previous step, run the attach command. Replace the Pool ID XYZ123 with the Pool ID you retrieved, and enter the following command:
    [root@server1 ~]# subscription-manager attach --pool=XYZ123
    
    Successfully attached a subscription for: Red Hat Enterprise Linux ES (Basic for Virtualization)
    

10.5. Troubleshooting virt-who

10.5.1. Why is the hypervisor status red?

Scenario: On the server side, you deploy a guest on a hypervisor that does not have a subscription. 24 hours later, the hypervisor displays its status as red. To remedy this situation, either obtain a subscription for that hypervisor, or permanently migrate the guest to a hypervisor with a subscription.

10.5.2. I have subscription status errors, what do I do?

Scenario: Any of the following error messages display:
  • System not properly subscribed
  • Status unknown
  • Late binding of a guest to a hypervisor through virt-who (host/guest mapping)
To find the reason for the error, open the virt-who log file, named rhsm.log, located in the /var/log/rhsm/ directory.

Chapter 11. Enhancing Virtualization with the QEMU Guest Agent and SPICE Agent

Agents in Red Hat Enterprise Linux such as the QEMU guest agent and the SPICE agent can be deployed to help the virtualization tools run more optimally on your system. These agents are described in this chapter.

Note

To further optimize and tune host and guest performance, see the Red Hat Enterprise Linux 7 Virtualization Tuning and Optimization Guide.

11.1. QEMU Guest Agent

The QEMU guest agent runs inside the guest and allows the host machine to issue commands to the guest operating system using libvirt, helping with functions such as freezing and thawing filesystems. The guest operating system then responds to those commands asynchronously. The QEMU guest agent package, qemu-guest-agent, is installed by default in Red Hat Enterprise Linux 7.
This section covers the libvirt commands and options available to the guest agent.

Important

Note that it is only safe to rely on the QEMU guest agent when run by trusted guests. An untrusted guest may maliciously ignore or abuse the guest agent protocol, and although built-in safeguards exist to prevent a denial of service attack on the host, the host requires guest co-operation for operations to run as expected.
Note that the QEMU guest agent can be used to enable and disable virtual CPUs (vCPUs) while the guest is running, thus adjusting the number of vCPUs without using the hot plug and hot unplug features. For more information, see Section 20.36.6, “Configuring Virtual CPU Count”.

11.1.1. Setting up Communication between the QEMU Guest Agent and Host

The host machine communicates with the QEMU guest agent through a VirtIO serial connection between the host and guest machines. A VirtIO serial channel is connected to the host via a character device driver (typically a Unix socket), and the guest listens on this serial channel.

Note

The qemu-guest-agent does not detect if the host is listening to the VirtIO serial channel. However, as the current use for this channel is to listen for host-to-guest events, the probability of a guest virtual machine running into problems by writing to the channel with no listener is very low. Additionally, the qemu-guest-agent protocol includes synchronization markers that allow the host physical machine to force a guest virtual machine back into sync when issuing a command, and libvirt already uses these markers, so that guest virtual machines are able to safely discard any earlier pending undelivered responses.

11.1.1.1. Configuring the QEMU Guest Agent on a Linux Guest

The QEMU guest agent can be configured on a running or shut down virtual machine. If configured on a running guest, the guest will start using the guest agent immediately. If the guest is shut down, the QEMU guest agent will be enabled at next boot.
Either virsh or virt-manager can be used to configure communication between the guest and the QEMU guest agent. The following instructions describe how to configure the QEMU guest agent on a Linux guest.

Procedure 11.1. Setting up communication between guest agent and host with virsh on a shut down Linux guest

  1. Shut down the virtual machine

    Ensure the virtual machine (named rhel7 in this example) is shut down before configuring the QEMU guest agent:
    # virsh shutdown rhel7 
  2. Add the QEMU guest agent channel to the guest XML configuration

    Edit the guest's XML file to add the QEMU guest agent details:
    # virsh edit rhel7
    Add the following to the guest's XML file and save the changes:
    <channel type='unix'>
       <target type='virtio' name='org.qemu.guest_agent.0'/>
    </channel>
  3. Start the virtual machine

    # virsh start rhel7
  4. Install the QEMU guest agent on the guest

    Install the QEMU guest agent if not yet installed in the guest virtual machine:
    # yum install qemu-guest-agent
  5. Start the QEMU guest agent in the guest

    Start the QEMU guest agent service in the guest:
    # systemctl start qemu-guest-agent
Alternatively, the QEMU guest agent can be configured on a running guest with the following steps:

Procedure 11.2. Setting up communication between guest agent and host on a running Linux guest

  1. Create an XML file for the QEMU guest agent

    # cat agent.xml
    <channel type='unix'>
       <target type='virtio' name='org.qemu.guest_agent.0'/>
    </channel>
  2. Attach the QEMU guest agent to the virtual machine

    Attach the QEMU guest agent to the running virtual machine (named rhel7 in this example) with this command:
    # virsh attach-device rhel7 agent.xml
  3. Install the QEMU guest agent on the guest

    Install the QEMU guest agent if not yet installed in the guest virtual machine:
    # yum install qemu-guest-agent
  4. Start the QEMU guest agent in the guest

    Start the QEMU guest agent service in the guest:
    # systemctl start qemu-guest-agent

Procedure 11.3. Setting up communication between the QEMU guest agent and host with virt-manager

  1. Shut down the virtual machine

    Ensure the virtual machine is shut down before configuring the QEMU guest agent.
    To shut down the virtual machine, select it from the list of virtual machines in Virtual Machine Manager, then click the light switch icon from the menu bar.
  2. Add the QEMU guest agent channel to the guest

    Open the virtual machine's hardware details by clicking the lightbulb icon at the top of the guest window.
    Click the Add Hardware button to open the Add New Virtual Hardware window, and select Channel.
    Select the QEMU guest agent from the Name drop-down list and click Finish:
    Selecting the QEMU guest agent channel device

    Figure 11.1. Selecting the QEMU guest agent channel device

  3. Start the virtual machine

    To start the virtual machine, select it from the list of virtual machines in Virtual Machine Manager, then click the run icon on the menu bar.
  4. Install the QEMU guest agent on the guest

    Open the guest with virt-manager and install the QEMU guest agent if not yet installed in the guest virtual machine:
    # yum install qemu-guest-agent
  5. Start the QEMU guest agent in the guest

    Start the QEMU guest agent service in the guest:
    # systemctl start qemu-guest-agent
The QEMU guest agent is now configured on the rhel7 virtual machine.

11.2. Using the QEMU Guest Agent with libvirt

Installing the QEMU guest agent allows various libvirt commands to become more powerful. The guest agent enhances the following virsh commands:
  • virsh shutdown --mode=agent - This shutdown method is more reliable than virsh shutdown --mode=acpi, as virsh shutdown used with the QEMU guest agent is guaranteed to shut down a cooperative guest in a clean state. If the agent is not present, libvirt must instead rely on injecting an ACPI shutdown event, but some guests ignore that event and thus will not shut down.
    The same syntax can be used with virsh reboot.
  • virsh snapshot-create --quiesce - Allows the guest to flush its I/O into a stable state before the snapshot is created, which allows use of the snapshot without having to perform a fsck or losing partial database transactions. The guest agent allows a high level of disk contents stability by providing guest co-operation.
  • virsh domfsfreeze and virsh domfsthaw - Quiesces the guest filesystem in isolation.
  • virsh domfstrim - Instructs the guest to trim its filesystem.
  • virsh domtime - Queries or sets the guest's clock.
  • virsh setvcpus --guest - Instructs the guest to take CPUs offline.
  • virsh domifaddr --source agent - Queries the guest operating system's IP address via the guest agent.
  • virsh domfsinfo - Shows a list of mounted filesystems within the running guest.
  • virsh set-user-password - Sets the password for a user account in the guest.
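For example, assuming a guest named rhel7 with the guest agent running (the guest name is illustrative), the following invocations exercise some of these commands:
# virsh shutdown rhel7 --mode=agent
# virsh domtime rhel7
# virsh domifaddr rhel7 --source agent
# virsh domfsinfo rhel7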

11.2.1. Creating a Guest Disk Backup

libvirt can communicate with qemu-guest-agent to ensure that snapshots of guest virtual machine file systems are consistent internally and ready to use as needed. Guest system administrators can write and install application-specific freeze/thaw hook scripts. Before freezing the filesystems, the qemu-guest-agent invokes the main hook script (included in the qemu-guest-agent package). The freezing process temporarily deactivates all guest virtual machine applications.
The snapshot process is comprised of the following steps:
  • File system applications / databases flush working buffers to the virtual disk and stop accepting client connections
  • Applications bring their data files into a consistent state
  • Main hook script returns
  • qemu-guest-agent freezes the filesystems and the management stack takes a snapshot
  • Snapshot is confirmed
  • Filesystem function resumes
Thawing happens in reverse order.
To create a snapshot of the guest's file system, run the virsh snapshot-create --quiesce --disk-only command (alternatively, run virsh snapshot-create-as guest_name --quiesce --disk-only, explained in further detail in Section 20.39.2, “Creating a Snapshot for the Current Guest Virtual Machine”).
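For example, assuming a guest named rhel7 (the guest and snapshot names are illustrative):
# virsh snapshot-create-as rhel7 backup-snap --quiesce --disk-only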

Note

An application-specific hook script might need various SELinux permissions in order to run correctly, for example when the script needs to connect to a socket in order to talk to a database. In general, local SELinux policies should be developed and installed for such purposes. Accessing file system nodes should work out of the box, after issuing the restorecon -FvvR command listed in Table 11.1, “QEMU guest agent package contents” in the table row labeled /etc/qemu-ga/fsfreeze-hook.d/.
The qemu-guest-agent binary RPM includes the following files:

Table 11.1. QEMU guest agent package contents

File name | Description
/usr/lib/systemd/system/qemu-guest-agent.service | Service control script (start/stop) for the QEMU guest agent.
/etc/sysconfig/qemu-ga | Configuration file for the QEMU guest agent, as it is read by the /usr/lib/systemd/system/qemu-guest-agent.service control script. The settings are documented in the file with shell script comments.
/usr/bin/qemu-ga | QEMU guest agent binary file.
/etc/qemu-ga | Root directory for hook scripts.
/etc/qemu-ga/fsfreeze-hook | Main hook script. No modifications are needed here.
/etc/qemu-ga/fsfreeze-hook.d | Directory for individual, application-specific hook scripts. The guest system administrator should copy hook scripts manually into this directory, ensure proper file mode bits for them, and then run restorecon -FvvR on this directory.
/usr/share/qemu-kvm/qemu-ga/ | Directory with sample scripts (for example purposes only). The scripts contained here are not executed.
The main hook script, /etc/qemu-ga/fsfreeze-hook, logs its own messages, as well as the application-specific script's standard output and error messages, in the following log file: /var/log/qemu-ga/fsfreeze-hook.log. For more information, see the libvirt upstream website.

11.3. SPICE Agent

The SPICE agent helps run graphical applications such as virt-manager more smoothly, by helping integrate the guest operating system with the SPICE client.
For example, when resizing a window in virt-manager, the SPICE agent allows for automatic X session resolution adjustment to the client resolution. The SPICE agent also provides support for copy and paste between the host and guest, and prevents mouse cursor lag.
For system-specific information on the SPICE agent's capabilities, see the spice-vdagent package's README file.

11.3.1. Setting up Communication between the SPICE Agent and Host

The SPICE agent can be configured on a running or shut down virtual machine. If configured on a running guest, the guest will start using the guest agent immediately. If the guest is shut down, the SPICE agent will be enabled at next boot.
Either virsh or virt-manager can be used to configure communication between the guest and the SPICE agent. The following instructions describe how to configure the SPICE agent on a Linux guest.

Procedure 11.4. Setting up communication between guest agent and host with virsh on a Linux guest

  1. Shut down the virtual machine

    Ensure the virtual machine (named rhel7 in this example) is shut down before configuring the SPICE agent:
    # virsh shutdown rhel7 
  2. Add the SPICE agent channel to the guest XML configuration

    Edit the guest's XML file to add the SPICE agent details:
    # virsh edit rhel7
    Add the following to the guest's XML file and save the changes:
    <channel type='spicevmc'>
       <target type='virtio' name='com.redhat.spice.0'/>
    </channel>
  3. Start the virtual machine

    # virsh start rhel7
  4. Install the SPICE agent on the guest

    Install the SPICE agent if not yet installed in the guest virtual machine:
    # yum install spice-vdagent
  5. Start the SPICE agent in the guest

    Start the SPICE agent service in the guest:
    # systemctl start spice-vdagent
Alternatively, the SPICE agent can be configured on a running guest with the following steps:

Procedure 11.5. Setting up communication between SPICE agent and host on a running Linux guest

  1. Create an XML file for the SPICE agent

    # cat agent.xml
    <channel type='spicevmc'>
       <target type='virtio' name='com.redhat.spice.0'/>
    </channel>
  2. Attach the SPICE agent to the virtual machine

    Attach the SPICE agent to the running virtual machine (named rhel7 in this example) with this command:
    # virsh attach-device rhel7 agent.xml
  3. Install the SPICE agent on the guest

    Install the SPICE agent if not yet installed in the guest virtual machine:
    # yum install spice-vdagent
  4. Start the SPICE agent in the guest

    Start the SPICE agent service in the guest:
    # systemctl start spice-vdagent

Procedure 11.6. Setting up communication between the SPICE agent and host with virt-manager

  1. Shut down the virtual machine

    Ensure the virtual machine is shut down before configuring the SPICE agent.
    To shut down the virtual machine, select it from the list of virtual machines in Virtual Machine Manager, then click the light switch icon from the menu bar.
  2. Add the SPICE agent channel to the guest

    Open the virtual machine's hardware details by clicking the lightbulb icon at the top of the guest window.
    Click the Add Hardware button to open the Add New Virtual Hardware window, and select Channel.
    Select the SPICE agent from the Name drop-down list, edit the channel address, and click Finish:
    Selecting the SPICE agent channel device

    Figure 11.2. Selecting the SPICE agent channel device

  3. Start the virtual machine

    To start the virtual machine, select it from the list of virtual machines in Virtual Machine Manager, then click the run icon on the menu bar.
  4. Install the SPICE agent on the guest

    Open the guest with virt-manager and install the SPICE agent if not yet installed in the guest virtual machine:
    # yum install spice-vdagent
  5. Start the SPICE agent in the guest

    Start the SPICE agent service in the guest:
    # systemctl start spice-vdagent
The SPICE agent is now configured on the rhel7 virtual machine.

Chapter 12. Nested Virtualization

12.1. Overview

As of Red Hat Enterprise Linux 7.5, nested virtualization is available as a Technology Preview for KVM guest virtual machines. With this feature, a guest virtual machine (also referred to as level 1 or L1) that runs on a physical host (level 0 or L0) can act as a hypervisor, and create its own guest virtual machines (L2).
Nested virtualization is useful in a variety of scenarios, such as debugging hypervisors in a constrained environment and testing larger virtual deployments on a limited amount of physical resources. However, note that nested virtualization is not supported or recommended in production user environments, and is primarily intended for development and testing.
Nested virtualization relies on host virtualization extensions to function, and it should not be confused with running guests in a virtual environment using the QEMU Tiny Code Generator (TCG) emulation, which is not supported in Red Hat Enterprise Linux.

12.2. Setup

Follow these steps to enable, configure, and start using nested virtualization:
  1. Enable: The feature is disabled by default. To enable it, use the following procedure on the L0 host physical machine.
    For Intel:
    1. Check whether nested virtualization is available on your host system.
      $ cat /sys/module/kvm_intel/parameters/nested
      If this command returns Y or 1, the feature is enabled.
      If the command returns 0 or N, continue with the following two steps to enable the feature.
    2. Unload the kvm_intel module:
      # modprobe -r kvm_intel
    3. Activate the nesting feature:
      # modprobe kvm_intel nested=1
    4. The nesting feature is now enabled only until the next reboot of the L0 host. To enable it permanently, add the following line to the /etc/modprobe.d/kvm.conf file:
      options kvm_intel nested=1
    For AMD:
    1. Check whether nested virtualization is available on your system:
      $ cat /sys/module/kvm_amd/parameters/nested
      If this command returns Y or 1, the feature is enabled.
      If the command returns 0 or N, continue with the following two steps to enable the feature.
    2. Unload the kvm_amd module
      # modprobe -r kvm_amd
    3. Activate the nesting feature
      # modprobe kvm_amd nested=1
    4. The nesting feature is now enabled only until the next reboot of the L0 host. To enable it permanently, add the following line to the /etc/modprobe.d/kvm.conf file:
      options kvm_amd nested=1
  2. Configure your L1 virtual machine for nested virtualization using one of the following methods:
    virt-manager
    1. Open the GUI of the intended guest and click the Show Virtual Hardware Details icon.
    2. Select the Processor menu, and in the Configuration section, type host-passthrough in the Model field (do not use the drop-down selection), and click Apply.
    Domain XML
    Add the following line to the domain XML file of the guest:
    <cpu mode='host-passthrough'/>
    If the guest's XML configuration file already contains a <cpu> element, rewrite it.
  3. To start using nested virtualization, install an L2 guest within the L1 guest. To do this, follow the same procedure as when installing the L1 guest - see Chapter 3, Creating a Virtual Machine for more information.
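The following is a condensed sketch of the Intel variant of step 1, combining the temporary and the persistent enablement. It assumes that no virtual machines are running on the L0 host, because the kvm_intel module cannot be unloaded while it is in use:
# modprobe -r kvm_intel
# modprobe kvm_intel nested=1
# echo "options kvm_intel nested=1" >> /etc/modprobe.d/kvm.conf
# cat /sys/module/kvm_intel/parameters/nested
Y
On AMD systems, replace kvm_intel with kvm_amd in each command.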

12.3. Restrictions and Limitations

  • It is strongly recommended to run Red Hat Enterprise Linux 7.2 or later in the L0 host and the L1 guests. L2 guests can contain any guest system supported by Red Hat.
  • It is not supported to migrate L1 or L2 guests.
  • Using L2 guests as hypervisors or creating L3 guests is not supported.
  • Not all features available on the host can be used by the L1 hypervisor. For instance, IOMMU/VT-d or APICv cannot be used by the L1 hypervisor.
  • To use nested virtualization, the host CPU must have the necessary feature flags. To determine if the L0 and L1 hypervisors are set up correctly, use the cat /proc/cpuinfo command on both L0 and L1, and make sure that the following flags are listed for the respective CPUs on both hypervisors:
    • For Intel - vmx (Hardware Virtualization) and ept (Extended Page Tables)
    • For AMD - svm (equivalent to vmx) and npt (equivalent to ept)
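For example, the following commands offer a quick way to check for the flags on either hypervisor level; a result greater than 0 indicates that the flag is present in /proc/cpuinfo:
# grep -c -E 'vmx|svm' /proc/cpuinfo
# grep -c -E 'ept|npt' /proc/cpuinfo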

Part II. Administration

Chapter 13. Managing Storage for Virtual Machines

This chapter provides information about storage for virtual machines. Virtual storage is abstracted from the physical storage allocated to a virtual machine connection. The storage is attached to the virtual machine using paravirtualized or emulated block device drivers.

13.1. Storage Concepts

A storage pool is a quantity of storage set aside for use by guest virtual machines. Storage pools are divided into storage volumes. Each storage volume is assigned to a guest virtual machine as a block device on a guest bus.
Storage pools and volumes are managed using libvirt. With libvirt's remote protocol, it is possible to manage all aspects of a guest virtual machine's life cycle, as well as the configuration of the resources required by the guest virtual machine. These operations can be performed on a remote host. As a result, a management application, such as the Virtual Machine Manager, using libvirt can enable a user to perform all the required tasks for configuring the host physical machine for a guest virtual machine. These include allocating resources, running the guest virtual machine, shutting it down, and de-allocating the resources, without requiring shell access or any other control channel.
The libvirt API can be used to query the list of volumes in the storage pool or to get information regarding the capacity, allocation, and available storage in the storage pool. A storage volume in the storage pool may be queried to get information such as allocation and capacity, which may differ for sparse volumes.

Note

For more information about sparse volumes, see the Virtualization Getting Started Guide.
For storage pools that support it, the libvirt API can be used to create, clone, resize, and delete storage volumes. The APIs can also be used to upload data to storage volumes, download data from storage volumes, or wipe data from storage volumes.
Once a storage pool is started, a storage volume can be assigned to a guest using the storage pool name and storage volume name instead of the host path to the volume in the domain XML.

Note

For more information about the domain XML, see Chapter 23, Manipulating the Domain XML.
Storage pools can be stopped (destroyed). This removes the abstraction of the data, but keeps the data intact.
For example, consider an NFS server whose share is mounted with mount -t nfs nfs.example.com:/path/to/share /path/to/data. A storage administrator could define an NFS storage pool on the virtualization host to describe the exported server path and the client target path. This allows libvirt to perform the mount either automatically when libvirt is started or as needed while libvirt is running. Files in the NFS server's exported directory are listed as storage volumes within the NFS storage pool.
When the storage volume is added to the guest, the administrator does not need to add the target path to the volume, only the storage pool and storage volume by name. Therefore, if the target client path changes, it does not affect the virtual machine.
When the storage pool is started, libvirt mounts the share on the specified directory, just as if the system administrator logged in and executed mount nfs.example.com:/path/to/share /vmdata. If the storage pool is configured to autostart, libvirt ensures that the NFS shared disk is mounted on the directory specified when libvirt is started.
Once the storage pool is started, the files in the NFS shared disk are reported as storage volumes, and the storage volumes' paths may be queried using the libvirt API. The storage volumes' paths can then be copied into the section of a guest virtual machine's XML definition that describes the source storage for the guest virtual machine's block devices. In the case of NFS, an application that uses the libvirt API can create and delete storage volumes in the storage pool (files in the NFS share) up to the limit of the size of the pool (the storage capacity of the share).
Not all storage pool types support creating and deleting volumes. Stopping the storage pool (pool-destroy) undoes the start operation, in this case, unmounting the NFS share. The data on the share is not modified by the destroy operation, despite what the name of the command suggests. For more details, see man virsh.

Procedure 13.1. Creating and Assigning Storage

This procedure provides a high-level understanding of the steps needed to create and assign storage for virtual machine guests.
  1. Create storage pools

    Create one or more storage pools from available storage media. For more information, see Section 13.2, “Using Storage Pools”.
  2. Create storage volumes

    Create one or more storage volumes from the available storage pools. For more information, see Section 13.3, “Using Storage Volumes”.
  3. Assign storage devices to a virtual machine.

    Assign one or more storage devices abstracted from storage volumes to a guest virtual machine. For more information, see Section 13.3.6, “Adding Storage Devices to Guests”.
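The following is a minimal end-to-end sketch of these three steps using virsh and a directory-based storage pool. The pool name guest_pool, the volume name guest1.qcow2, the guest name rhel7, and the target device vdb are placeholder values; each command is described in detail in the sections referenced above:
# virsh pool-define-as guest_pool dir --target /var/lib/libvirt/images/guest_pool
# virsh pool-build guest_pool
# virsh pool-start guest_pool
# virsh vol-create-as guest_pool guest1.qcow2 8G --format qcow2
# virsh attach-disk rhel7 /var/lib/libvirt/images/guest_pool/guest1.qcow2 vdb --persistent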

13.2. Using Storage Pools

This section provides information about using storage pools with virtual machines. It provides conceptual information, as well as detailed instructions on creating, configuring, and deleting storage pools using virsh commands and the Virtual Machine Manager.

13.2.1. Storage Pool Concepts

A storage pool is a file, directory, or storage device, managed by libvirt to provide storage to virtual machines. Storage pools are divided into storage volumes that store virtual machine images or are attached to virtual machines as additional storage. Multiple guests can share the same storage pool, allowing for better allocation of storage resources.
Storage pools can be either local or network-based (shared):
Local storage pools
Local storage pools are attached directly to the host server. They include local directories, directly attached disks, physical partitions, and Logical Volume Management (LVM) volume groups on local devices. Local storage pools are useful for development, testing, and small deployments that do not require migration or large numbers of virtual machines. Local storage pools may not be suitable for many production environments, because they cannot be used for live migration.
Networked (shared) storage pools
Networked storage pools include storage devices shared over a network using standard protocols. Networked storage is required when migrating virtual machines between hosts with virt-manager, but is optional when migrating with virsh.
For more information on migrating virtual machines, see Chapter 15, KVM Migration.
The following is a list of storage pool types supported by Red Hat Enterprise Linux:
  • Directory-based storage pools
  • Disk-based storage pools
  • Partition-based storage pools
  • GlusterFS storage pools
  • iSCSI-based storage pools
  • LVM-based storage pools
  • NFS-based storage pools
  • vHBA-based storage pools with SCSI devices
The following is a list of libvirt storage pool types that are not supported by Red Hat Enterprise Linux:
  • Multipath-based storage pool
  • RBD-based storage pool
  • Sheepdog-based storage pool
  • Vstorage-based storage pool
  • ZFS-based storage pool

Note

Some of the unsupported storage pool types appear in the Virtual Machine Manager interface. However, they should not be used.

13.2.2. Creating Storage Pools

This section provides general instructions for creating storage pools using virsh and the Virtual Machine Manager. Using virsh enables you to specify all parameters, whereas using Virtual Machine Manager provides a graphic method for creating simpler storage pools.

13.2.2.1. Creating Storage Pools with virsh

Note

This section shows the creation of a partition-based storage pool as an example.

Procedure 13.2. Creating Storage Pools with virsh

  1. Read recommendations and ensure that all prerequisites are met

    For some storage pools, this guide recommends that you follow certain practices. In addition, there are prerequisites for some types of storage pools. To see the recommendations and prerequisites, if any, see Section 13.2.3, “Storage Pool Specifics”.
  2. Define the storage pool

    Storage pools can be persistent or transient. A persistent storage pool survives a system restart of the host machine. A transient storage pool only exists until the host reboots.
    Do one of the following:
    • Define the storage pool using an XML file.
      a. Create a temporary XML file containing the storage pool information required for the new device.
      The XML file must contain specific fields, based on the storage pool type. For more information, see Section 13.2.3, “Storage Pool Specifics”.
      The following shows an example of a storage pool definition XML file. In this example, the file is saved to ~/guest_images.xml.
      <pool type='fs'>
        <name>guest_images_fs</name>
        <source>
          <device path='/dev/sdc1'/>
        </source>
        <target>
          <path>/guest_images</path>
        </target>
      </pool> 
      b. Use the virsh pool-define command to create a persistent storage pool or the virsh pool-create command to create and start a transient storage pool.
      # virsh pool-define ~/guest_images.xml
      Pool defined from guest_images_fs
      
      or
      # virsh pool-create ~/guest_images.xml
      Pool created from guest_images_fs
      c. Delete the XML file created in step a.
    • Use the virsh pool-define-as command to create a persistent storage pool or the virsh pool-create-as command to create a transient storage pool.
      The following examples create a persistent and then a transient file system-based storage pool mapped from the /dev/sdc1 partition to the /guest_images directory.
      # virsh pool-define-as guest_images_fs fs - - /dev/sdc1 - "/guest_images"
      Pool guest_images_fs defined
      or
      # virsh pool-create-as guest_images_fs fs - - /dev/sdc1 - "/guest_images"
      Pool guest_images_fs created

      Note

      When using the virsh interface, option names in the commands are optional. If option names are not used, use dashes for fields that do not need to be specified.
  3. Verify that the pool was created

    List all existing storage pools using the virsh pool-list --all command.
    # virsh pool-list --all
    Name                 State      Autostart
    -----------------------------------------
    default              active     yes
    guest_images_fs      inactive   no
    
  4. Define the storage pool target path

    Use the virsh pool-build command to create a storage pool target path for a pre-formatted file system storage pool, initialize the storage source device, and define the format of the data. Then use the virsh pool-list command to ensure that the storage pool is listed.
    # virsh pool-build guest_images_fs
    Pool guest_images_fs built
    # ls -la /guest_images
    total 8
    drwx------.  2 root root 4096 May 31 19:38 .
    dr-xr-xr-x. 25 root root 4096 May 31 19:38 ..
    # virsh pool-list --all
    Name                 State      Autostart
    -----------------------------------------
    default              active     yes
    guest_images_fs      inactive   no
    

    Note

    Building the target path is only necessary for disk-based, file system-based, and logical storage pools. If libvirt detects that the source storage device's data format differs from the selected storage pool type, the build fails, unless the overwrite option is specified.
  5. Start the storage pool

    Use the virsh pool-start command to prepare the source device for usage.
    The action taken depends on the storage pool type. For example, for a file system-based storage pool, the virsh pool-start command mounts the file system. For an LVM-based storage pool, the virsh pool-start command activates the volume group using the vgchange command.
    Then use the virsh pool-list command to ensure that the storage pool is active.
    # virsh pool-start guest_images_fs
    Pool guest_images_fs started
    # virsh pool-list --all
    Name                 State      Autostart
    -----------------------------------------
    default              active     yes
    guest_images_fs      active     no
    

    Note

    The virsh pool-start command is only necessary for persistent storage pools. Transient storage pools are automatically started when they are created.
  6. Turn on autostart (optional)

    By default, a storage pool defined with virsh is not set to automatically start each time libvirtd starts. You can configure the storage pool to start automatically using the virsh pool-autostart command.
    # virsh pool-autostart guest_images_fs
    Pool guest_images_fs marked as autostarted
    
    # virsh pool-list --all
    Name                 State      Autostart
    -----------------------------------------
    default              active     yes
    guest_images_fs      active     yes
    
    The storage pool is now automatically started each time libvirtd starts.
  7. Verify the storage pool

    Verify that the storage pool was created correctly, the sizes reported are as expected, and the state is reported as running. Verify there is a "lost+found" directory in the target path on the file system, indicating that the device is mounted.
    # virsh pool-info guest_images_fs
    Name:           guest_images_fs
    UUID:           c7466869-e82a-a66c-2187-dc9d6f0877d0
    State:          running
    Persistent:     yes
    Autostart:      yes
    Capacity:       458.39 GB
    Allocation:     197.91 MB
    Available:      458.20 GB
    # mount | grep /guest_images
    /dev/sdc1 on /guest_images type ext4 (rw)
    # ls -la /guest_images
    total 24
    drwxr-xr-x.  3 root root  4096 May 31 19:47 .
    dr-xr-xr-x. 25 root root  4096 May 31 19:38 ..
    drwx------.  2 root root 16384 May 31 14:18 lost+found
    

13.2.2.2. Creating storage pools with Virtual Machine Manager

Note

This section shows the creation of a disk-based storage pool as an example.

Procedure 13.3. Creating Storage Pools with Virtual Machine Manager

  1. Prepare the medium on which the storage pool will be created

    This will differ for different types of storage pools. For details, see Section 13.2.3, “Storage Pool Specifics”.
    In this example, you may need to relabel the disk with a GUID Partition Table.
  2. Open the storage settings

    1. In Virtual Machine Manager, select the host connection you want to configure.
      Open the Edit menu and select Connection Details.
    2. Click the Storage tab in the Connection Details window.
      Storage tab

      Figure 13.1. Storage tab

  3. Create a new storage pool

    Note

    Using Virtual Machine Manager, you can only create persistent storage pools. Transient storage pools can only be created using virsh.
    1. Add a new storage pool (part 1)

      Click the + button at the bottom of the window. The Add a New Storage Pool wizard appears.
      Enter a Name for the storage pool. This example uses the name guest_images_fs.
      Select a storage pool type to create from the Type drop-down list. This example uses fs: Pre-Formatted Block Device.
      Storage pool name and type

      Figure 13.2. Storage pool name and type

      Click the Forward button to continue.
    2. Add a new pool (part 2)

      Storage pool path

      Figure 13.3. Storage pool path

      Configure the storage pool with the relevant parameters. For information on the parameters for each type of storage pool, see Section 13.2.3, “Storage Pool Specifics”.
      For some types of storage pools, a Build Pool check box appears in the dialog. If you want to build the storage pool from the storage, check the Build Pool check box.
      Verify the details and click the Finish button to create the storage pool.

13.2.3. Storage Pool Specifics

This section provides information specific to each type of storage pool, including prerequisites, parameters, and additional information. It includes the following topics:

13.2.3.1. Directory-based storage pools

Parameters
The following table provides a list of required parameters for the XML file, the virsh pool-define-as command, and the Virtual Machine Manager application, for creating a directory-based storage pool.

Table 13.1. Directory-based storage pool parameters

Description | XML | pool-define-as | Virtual Machine Manager
The type of storage pool <pool type='dir'> [type] directory dir: Filesystem Directory
The name of the storage pool <name>name</name> [name] name Name
The path specifying the target. This will be the path used for the storage pool.

<target>
  <path>target_path</path>
</target>

target path_to_pool Target Path
If you are using virsh to create the storage pool, continue by verifying that the pool was created.
Examples
The following is an example of an XML file for a storage pool based on the /guest_images directory:
<pool type='dir'>
  <name>dirpool</name>
  <target>
    <path>/guest_images</path>
  </target>
</pool>  
The following is an example of a command for creating a storage pool based on the /guest_images directory:
# virsh pool-define-as dirpool dir --target "/guest_images"
Pool FS_directory defined
The following images show an example of the Virtual Machine Manager Add a New Storage Pool dialog boxes for creating a storage pool based on the /guest_images directory:
Add a new directory-based storage pool example

Figure 13.4. Add a new directory-based storage pool example

13.2.3.2. Disk-based storage pools

Recommendations
Be aware of the following before creating a disk-based storage pool:
  • Depending on the version of libvirt being used, dedicating a disk to a storage pool may reformat and erase all data currently stored on the disk device. It is strongly recommended that you back up the data on the storage device before creating a storage pool.
  • Guests should not be given write access to whole disks or block devices (for example, /dev/sdb). Use partitions (for example, /dev/sdb1) or LVM volumes.
    If you pass an entire block device to the guest, the guest will likely partition it or create its own LVM groups on it. This can cause the host physical machine to detect these partitions or LVM groups and cause errors.
Prerequisites

Note

The steps in this section are only required if you do not run the virsh pool-build command.
Before a disk-based storage pool can be created on a host disk, the disk must be relabeled with a GUID Partition Table (GPT) disk label. GPT disk labels allow for creating up to 128 partitions on each device.
# parted /dev/sdb
GNU Parted 2.1
Using /dev/sdb
Welcome to GNU Parted! Type 'help' to view a list of commands.
(parted) mklabel
New disk label type? gpt
(parted) quit
Information: You may need to update /etc/fstab.
#
After relabeling the disk, continue creating the storage pool by defining the storage pool.
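If you prefer a non-interactive command, parted can apply the GPT label in a single step. This is a minimal sketch; /dev/sdb is a placeholder for the disk being dedicated to the pool, and the command destroys the existing partition table on that disk:
# parted --script /dev/sdb mklabel gpt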
Parameters
The following table provides a list of required parameters for the XML file, the virsh pool-define-as command, and the Virtual Machine Manager application, for creating a disk-based storage pool.

Table 13.2. Disk-based storage pool parameters

Description | XML | pool-define-as | Virtual Machine Manager
The type of storage pool <pool type='disk'> [type] disk disk: Physical Disk Device
The name of the storage pool <name>name</name> [name] name Name
The path specifying the storage device. For example, /dev/sdb

<source>
  <device path='/dev/sdb'/>
</source>

source-dev path_to_disk Source Path
The path specifying the target. This will be the path used for the storage pool.

<target>
  <path>/path_to_pool</path>
</target>

target path_to_pool Target Path
If you are using virsh to create the storage pool, continue with defining the storage pool.
Examples
The following is an example of an XML file for a disk-based storage pool:
<pool type='disk'>
  <name>phy_disk</name>
  <source>
    <device path='/dev/sdb'/>
    <format type='gpt'/>
  </source>
  <target>
    <path>/dev</path>
  </target>
</pool>  
The following is an example of a command for creating a disk-based storage pool:
# virsh pool-define-as phy_disk disk gpt --source-dev=/dev/sdb --target /dev
Pool phy_disk defined
The following images show an example of the Virtual Machine Manager Add a New Storage Pool dialog boxes for creating a disk-based storage pool:
Add a new disk-based storage pool example

Figure 13.5. Add a new disk-based storage pool example

13.2.3.3. Filesystem-based storage pools

Recommendations
Do not use the procedures in this section to assign an entire disk as a storage pool (for example, /dev/sdb). Guests should not be given write access to whole disks or block devices. This method should only be used to assign partitions (for example, /dev/sdb1) to storage pools.
Prerequisites

Note

This is only required if you do not run the virsh pool-build command.
To create a storage pool from a partition, format the partition with the ext4 file system.
# mkfs.ext4 /dev/sdc1
After formatting the file system, continue creating the storage pool by defining the storage pool.
Parameters
The following table provides a list of required parameters for the XML file, the virsh pool-define-as command, and the Virtual Machine Manager application, for creating a filesystem-based storage pool from a partition.

Table 13.3. Filesystem-based storage pool parameters

Description | XML | pool-define-as | Virtual Machine Manager
The type of storage pool <pool type='fs'> [type] fs fs: Pre-Formatted Block Device
The name of the storage pool <name>name</name> [name] name Name
The path specifying the partition. For example, /dev/sdc1

<source>
  <device path='source_path' />

[source] path_to_partition Source Path
The filesystem type, for example ext4

  <format type='fs_type' />
</source>

[source format] FS-format N/A
The path specifying the target. This will be the path used for the storage pool.

<target>
  <path>/path_to_pool</path>
</target>

[target] path_to_pool Target Path
If you are using virsh to create the storage pool, continue with verifying that the storage pool was created.
Examples
The following is an example of an XML file for a filesystem-based storage pool:
<pool type='fs'>
  <name>guest_images_fs</name>
  <source>
    <device path='/dev/sdc1'/>
    <format type='auto'/>
  </source>
  <target>
    <path>/guest_images</path>
  </target>
</pool>  
The following is an example of a command for creating a partition-based storage pool:
# virsh pool-define-as guest_images_fs fs --source-dev /dev/sdc1 --target /guest_images
Pool guest_images_fs defined
The following images show an example of the Virtual Machine Manager Add a New Storage Pool dialog boxes for creating a filesystem-based storage pool:
Add a new filesystem-based storage pool example

Figure 13.6. Add a new filesystem-based storage pool example

13.2.3.4. GlusterFS-based storage pools

Recommendations
GlusterFS is a user space file system that uses File System in Userspace (FUSE).
Prerequisites
Before a GlusterFS-based storage pool can be created on a host, a Gluster server must be prepared.

Procedure 13.4. Preparing a Gluster server

  1. Obtain the IP address of the Gluster server by listing its status with the following command:
    # gluster volume status
    Status of volume: gluster-vol1
    Gluster process						Port	Online	Pid
    ------------------------------------------------------------------------------
    Brick 222.111.222.111:/gluster-vol1 			49155	Y	18634
    
    Task Status of Volume gluster-vol1
    ------------------------------------------------------------------------------
    There are no active volume tasks
    
  2. If not installed, install the glusterfs-fuse package.
  3. If not enabled, enable the virt_use_fusefs boolean. Check that it is enabled.
    # setsebool virt_use_fusefs on
    # getsebool virt_use_fusefs
    virt_use_fusefs --> on
    
After ensuring that the required packages are installed and the boolean is enabled, continue creating the storage pool by defining the storage pool.
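For reference, the client package mentioned in step 2 can be installed with yum. This assumes that a repository providing glusterfs-fuse is enabled on the host:
# yum install glusterfs-fuse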
Parameters
The following table provides a list of required parameters for the XML file, the virsh pool-define-as command, and the Virtual Machine Manager application, for creating a GlusterFS-based storage pool.

Table 13.4. GlusterFS-based storage pool parameters

Description | XML | pool-define-as | Virtual Machine Manager
The type of storage pool <pool type='gluster'> [type] gluster Gluster: Gluster Filesystem
The name of the storage pool <name>name</name> [name] name Name
The hostname or IP address of the Gluster server

<source>
  <host name='hostname' />

source-host hostname Host Name
The name of the Gluster volume   <name>Gluster-name</name> source-name Gluster-name Source Name
The path on the Gluster server used for the storage pool

  <dir path='Gluster-path' />
</source>

source-path Gluster-path Source Path
If you are using virsh to create the storage pool, continue with verifying that the storage pool was created.
Examples
The following is an example of an XML file for a GlusterFS-based storage pool:
<pool type='gluster'>
  <name>Gluster_pool</name>
  <source>
    <host name='111.222.111.222'/>
    <dir path='/'/>
    <name>gluster-vol1</name>
  </source>
</pool>  
The following is an example of a command for creating a GlusterFS-based storage pool:
# virsh pool-define-as --name Gluster_pool --type gluster --source-host 111.222.111.222 --source-name gluster-vol1 --source-path /
Pool Gluster_pool defined
The following images show an example of the Virtual Machine Manager Add a New Storage Pool dialog boxes for creating a GlusterFS-based storage pool:
Add a new GlusterFS-based storage pool example

Figure 13.7. Add a new GlusterFS-based storage pool example

13.2.3.5. iSCSI-based storage pools

Recommendations
Internet Small Computer System Interface (iSCSI) is a network protocol for sharing storage devices. iSCSI connects initiators (storage clients) to targets (storage servers) using SCSI instructions over the IP layer.
Using iSCSI-based devices to store guest virtual machines allows for more flexible storage options, such as using iSCSI as a block storage device. The iSCSI devices use a Linux-IO (LIO) target. This is a multi-protocol SCSI target for Linux. In addition to iSCSI, LIO also supports Fibre Channel and Fibre Channel over Ethernet (FCoE).
Prerequisites
Before an iSCSI-based storage pool can be created, iSCSI targets must be created. iSCSI targets are created with the targetcli package, which provides a command set for creating software-backed iSCSI targets.

Procedure 13.5. Creating an iSCSI target

  1. Install the targetcli package

    # yum install targetcli
  2. Launch the targetcli command set

    # targetcli
  3. Create storage objects

    Create three storage objects to back the iSCSI target: a block object, a fileio object, and a ramdisk object.
    1. Create a block storage object
      1. Navigate to the /backstores/block directory.
      2. Run the create command.
        # create [block-name] [filepath]
        For example:
        # create block1 dev=/dev/sdb1
    2. Create a fileio object
      1. Navigate to the /fileio directory.
      2. Run the create command.
        # create [fileio-name] [image-name] [image-size]
        For example:
        # create fileio1 /foo.img 50M
    3. Create a ramdisk object
      1. Navigate to the /ramdisk directory.
      2. Run the create command.
        # create [ramdisk-name] [ramdisk-size]
        For example:
        # create ramdisk1 1M
    4. Make note of the names of the disks created in this step. They will be used later.
  4. Create an iSCSI target

    1. Navigate to the /iscsi directory.
    2. Create the target in one of two ways:
      • Run the create command with no parameters.
        The iSCSI qualified name (IQN) is generated automatically.
      • Run the create command specifying the IQN and the server. For example:
        # create iqn.2010-05.com.example.server1:iscsirhel7guest
  5. Define the portal IP address

    To export the block storage over iSCSI, the portal, LUNs, and access control lists (ACLs) must first be configured.
    The portal includes the IP address and TCP port that the target monitors, and the initiators to which it connects. iSCSI uses port 3260. This port is configured by default.
    To connect to port 3260:
    1. Navigate to the /tpg directory.
    2. Run the following:
      # portals/ create
      This command makes all available IP addresses listen on port 3260.
      If you want only a single IP address to listen to port 3260, add the IP address to the end of the command. For example:
      # portals/ create 143.22.16.33
  6. Configure the LUNs and assign storage objects to the fabric

    This step uses the storage objects created in creating storage objects.
    1. Navigate to the luns directory for the TPG created in defining the portal IP address. For example:
      # iscsi>iqn.2010-05.com.example.server1:iscsirhel7guest
    2. Assign the first LUN to the ramdisk. For example:
      # create /backstores/ramdisk/ramdisk1
    3. Assign the second LUN to the block disk. For example:
      # create /backstores/block/block1
    4. Assign the third LUN to the fileio disk. For example:
      # create /backstores/fileio/fileio1
    5. List the resulting LUNs.
      /iscsi/iqn.20...csirhel7guest ls
      
      o- tpg1 ............................................................[enabled, auth]
        o- acls...................................................................[0 ACL]
        o- luns..................................................................[3 LUNs]
        | o- lun0......................................................[ramdisk/ramdisk1]
        | o- lun1...............................................[block/block1 (dev/vdb1)]
        | o- lun2................................................[fileio/file1 (foo.img)]
        o- portals.............................................................[1 Portal]
          o- IP-ADDRESS:3260.........................................................[OK]
      
  7. Create ACLs for each initiator

    Enable authentication when the initiator connects. You can also restrict specified LUNs to specified initiators. Targets and initiators have unique names. iSCSI initiators use IQNs.
    1. Find the IQN of the iSCSI initiator, using the initiator name. For example:
      # cat /etc/iscsi/initiator2.iscsi
      InitiatorName=iqn.2010-05.com.example.server1:iscsirhel7guest
      This IQN is used to create the ACLs.
    2. Navigate to the acls directory.
    3. Create ACLs by doing one of the following:
      • Create ACLs for all LUNs and initiators by running the create command with no parameters.
        # create
      • To create an ACL for a specific LUN and initiator, run the create command, specifying the IQN of the iSCSI initiator. For example:
        # create iqn.2010-05.com.example.server1:888
      • Configure the kernel target to use a single user ID and password for all initiators.
        # set auth userid=user_ID
        # set auth password=password
        # set attribute authentication=1
        # set attribute generate_node_acls=1
    After completing this procedure, continue by securing the storage pool.
  8. Save the configuration

    Make the configuration persistent by overwriting the previous boot settings.
    # saveconfig
  9. Enable the service

    To apply the saved settings on the next boot, enable the service.
    # systemctl enable target.service
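Before continuing, you can review the resulting target configuration. In the interactive targetcli shell, run ls from the root; most targetcli versions also accept the command directly on the command line, for example:
# targetcli ls /iscsi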
Optional procedures
There are a number of optional procedures that you can perform with the iSCSI targets before creating the iSCSI-based storage pool.

Procedure 13.6. Configuring a logical volume on a RAID array

  1. Create a RAID5 array

    For information on creating a RAID5 array, see the Red Hat Enterprise Linux 7 Storage Administration Guide.
  2. Create an LVM logical volume on the RAID5 array

    For information on creating an LVM logical volume on a RAID5 array, see the Red Hat Enterprise Linux 7 Logical Volume Manager Administration Guide.

Procedure 13.7. Testing discoverability

  • Ensure that the new iSCSI device is discoverable.

    # iscsiadm --mode discovery --type sendtargets --portal server1.example.com
    143.22.16.33:3260,1 iqn.2010-05.com.example.server1:iscsirhel7guest

Procedure 13.8. Testing device attachment

  1. Attach the new iSCSI device

    Attach the new device (iqn.2010-05.com.example.server1:iscsirhel7guest) to determine whether the device can be attached.
    # iscsiadm -d2 -m node --login
    scsiadm: Max file limits 1024 1024
    
    Logging in to [iface: default, target: iqn.2010-05.com.example.server1:iscsirhel7guest, portal: 143.22.16.33,3260]
    Login to [iface: default, target: iqn.2010-05.com.example.server1:iscsirhel7guest, portal: 143.22.16.33,3260] successful.
    
  2. Detach the device

    # iscsiadm -d2 -m node --logout
    scsiadm: Max file limits 1024 1024
    
    Logging out of session [sid: 2, target: iqn.2010-05.com.example.server1:iscsirhel7guest, portal: 143.22.16.33,3260]
    Logout of [sid: 2, target: iqn.2010-05.com.example.server1:iscsirhel7guest, portal: 143.22.16.33,3260] successful.

Procedure 13.9. Using libvirt secrets for an iSCSI storage pool

Note

This procedure is required if a user_ID and password were defined when creating an iSCSI target.
User name and password parameters can be configured with virsh to secure an iSCSI storage pool. This can be configured before or after the pool is defined, but the pool must be started for the authentication settings to take effect.
  1. Create a libvirt secret file

    Create a libvirt secret file with a challenge-handshake authentication protocol (CHAP) user name. For example:
    <secret ephemeral='no' private='yes'>
        <description>Passphrase for the iSCSI example.com server</description>
        <usage type='iscsi'>
            <target>iscsirhel7secret</target>
        </usage>
    </secret>    
  2. Define the secret

    # virsh secret-define secret.xml
  3. Verify the UUID

    # virsh secret-list
    UUID                                  Usage
    --------------------------------------------------------------------------------
    2d7891af-20be-4e5e-af83-190e8a922360  iscsi iscsirhel7secret
  4. Assign a secret to the UUID

    Use the following commands to assign a secret to the UUID in the output of the previous step. This ensures that the CHAP username and password are in a libvirt-controlled secret list.
    # MYSECRET=`printf %s "password123" | base64`
    # virsh secret-set-value 2d7891af-20be-4e5e-af83-190e8a922360 $MYSECRET
  5. Add an authentication entry to the storage pool

    Modify the <source> entry in the storage pool's XML file using virsh edit, and add an <auth> element, specifying authentication type, username, and secret usage.
    For example:
    <pool type='iscsi'>
      <name>iscsirhel7pool</name>
        <source>
           <host name='192.168.122.1'/>
           <device path='iqn.2010-05.com.example.server1:iscsirhel7guest'/>
           <auth type='chap' username='redhat'>
              <secret usage='iscsirhel7secret'/>
           </auth>
        </source>
      <target>
        <path>/dev/disk/by-path</path>
      </target>
    </pool>     

    Note

    The <auth> sub-element exists in different locations within the guest XML's <pool> and <disk> elements. For a <pool>, <auth> is specified within the <source> element, as this describes where to find the pool sources, since authentication is a property of some pool sources (iSCSI and RBD). For a <disk>, which is a sub-element of a domain, the authentication to the iSCSI or RBD disk is a property of the disk.
    In addition, the <auth> sub-element for a disk differs from that of a storage pool.
    <auth username='redhat'>
      <secret type='iscsi' usage='iscsirhel7secret'/>
    </auth>  
  6. Activate the changes

    The storage pool must be started to activate these changes.
    • If the storage pool has not yet been started, follow the steps in Creating Storage Pools with virsh to define and start the storage pool.
    • If the pool has already been started, enter the following commands to stop and restart the storage pool:
      # virsh pool-destroy iscsirhel7pool
      # virsh pool-start iscsirhel7pool
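As an optional check after step 4, you can read the stored value back and decode it. The UUID is the one listed in step 3:
# virsh secret-get-value 2d7891af-20be-4e5e-af83-190e8a922360 | base64 --decode
password123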
Parameters
The following table provides a list of required parameters for the XML file, the virsh pool-define-as command, and the Virtual Machine Manager application, for creating an iSCSI-based storage pool.

Table 13.5. iSCSI-based storage pool parameters

Description | XML | pool-define-as | Virtual Machine Manager
The type of storage pool <pool type='iscsi'> [type] iscsi iscsi: iSCSI Target
The name of the storage pool <name>name</name> [name] name Name
The name of the host.

<source>
  <host name='hostname' />

source-host hostname Host Name
The iSCSI IQN.

  <device path="iSCSI_IQN" />
</source>

source-dev iSCSI_IQN Source IQN
The path specifying the target. This will be the path used for the storage pool.

<target>
  <path>/dev/disk/by-path</path>
</target>

target path_to_pool Target Path
(Optional) The IQN of the iSCSI initiator. This is only needed when the ACL restricts the LUN to a particular initiator.

<initiator>
  <iqn name='initiator0' />
</initiator>

See the note below. Initiator IQN

Note

The IQN of the iSCSI initiator can be determined using the virsh find-storage-pool-sources-as iscsi command.
If you are using virsh to create the storage pool, continue with verifying that the storage pool was created.
Examples
The following is an example of an XML file for an iSCSI-based storage pool:
<pool type='iscsi'>
  <name>iSCSI_pool</name>
  <source>
    <host name='server1.example.com'/>
    <device path='iqn.2010-05.com.example.server1:iscsirhel7guest'/>
  </source>
  <target>
    <path>/dev/disk/by-path</path>
  </target>
</pool>
  
The following is an example of a command for creating an iSCSI-based storage pool:
# virsh pool-define-as --name iSCSI_pool --type iscsi --source-host server1.example.com --source-dev iqn.2010-05.com.example.server1:iscsirhel7guest --target /dev/disk/by-path
Pool iSCSI_pool defined
The following images show an example of the Virtual Machine Manager Add a New Storage Pool dialog boxes for creating an iSCSI-based storage pool:
Add a new iSCSI-based storage pool example

Figure 13.8. Add a new iSCSI-based storage pool example

13.2.3.6. LVM-based storage pools

Recommendations
Be aware of the following before creating an LVM-based storage pool:
  • LVM-based storage pools do not provide the full flexibility of LVM.
  • libvirt supports thin logical volumes, but does not provide the features of thin storage pools.
  • LVM-based storage pools are volume groups. You can create volume groups using Logical Volume Manager commands or virsh commands. To manage volume groups using the virsh interface, use the virsh commands to create volume groups.
    For more information about volume groups, see the Red Hat Enterprise Linux Logical Volume Manager Administration Guide.
  • LVM-based storage pools require a full disk partition. If activating a new partition or device with these procedures, the partition will be formatted and all data will be erased. If using the host's existing Volume Group (VG) nothing will be erased. It is recommended to back up the storage device before commencing the following procedure.
    For information on creating LVM volume groups, see the Red Hat Enterprise Linux Logical Volume Manager Administration Guide.
  • If you create an LVM-based storage pool on an existing VG, you should not run the pool-build command.
After ensuring that the VG is prepared, continue creating the storage pool by defining the storage pool.
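If you prepare the volume group manually instead of relying on virsh pool-build, a minimal sketch using the same device and volume group names as the example below might look as follows. The /dev/sdc device is a placeholder, and initializing it as a physical volume erases the data on it:
# pvcreate /dev/sdc
# vgcreate libvirt_lvm /dev/sdc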
Parameters
The following table provides a list of required parameters for the XML file, the virsh pool-define-as command, and the Virtual Machine Manager application, for creating an LVM-based storage pool.

Table 13.6. LVM-based storage pool parameters

Description | XML | pool-define-as | Virtual Machine Manager
The type of storage pool <pool type='logical'> [type] logical logical: LVM Volume Group
The name of the storage pool <name>name</name> [name] name Name
The path to the device for the storage pool

<source>
  <device path='device_path' />

source-dev device_path Source Path
The name of the volume group   <name='VG-name' /> source-name VG-name Source Path
The virtual group format

  <format type='lvm2' />
</source>

source-format lvm2 N/A
The target path

<target>
  <path>target-path</path>
</target>

target target-path Target Path

Note

If the logical volume group is made of multiple disk partitions, there may be multiple source devices listed. For example:
<source>
  <device path='/dev/sda1'/>
  <device path='/dev/sdb3'/>
  <device path='/dev/sdc2'/>
  ...
  </source> 
If you are using virsh to create the storage pool, continue with verifying that the storage pool was created.
Examples
The following is an example of an XML file for an LVM-based storage pool:
<pool type='logical'>
  <name>guest_images_lvm</name>
  <source>
    <device path='/dev/sdc'/>
    <name>libvirt_lvm</name>
    <format type='lvm2'/>
  </source>
  <target>
    <path>/dev/libvirt_lvm</path>
  </target>
</pool>  
The following is an example of a command for creating an LVM-based storage pool:
# virsh pool-define-as guest_images_lvm logical --source-dev=/dev/sdc --source-name libvirt_lvm --target /dev/libvirt_lvm
Pool guest_images_lvm defined
The following images show an example of the Virtual Machine Manager Add a New Storage Pool dialog boxes for creating an LVM-based storage pool:
Add a new LVM-based storage pool example

Figure 13.9. Add a new LVM-based storage pool example

13.2.3.7. NFS-based storage pools

Prerequisites
To create a Network File System (NFS)-based storage pool, an NFS server should already be configured to be used by the host machine. For more information about NFS, see the Red Hat Enterprise Linux Storage Administration Guide.
After ensuring that the NFS server is properly configured, continue creating the storage pool by defining the storage pool.
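As a quick check, you can list the exports that the NFS server makes available to the host. This assumes the nfs-utils package is installed on the host, and nfs.example.com is a placeholder for the server name:
# showmount -e nfs.example.com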
Parameters
The following table provides a list of required parameters for the XML file, the virsh pool-define-as command, and the Virtual Machine Manager application, for creating an NFS-based storage pool.

Table 13.7. NFS-based storage pool parameters

Description | XML | pool-define-as | Virtual Machine Manager
The type of storage pool <pool type='netfs'> [type] netfs netfs: Network Exported Directory
The name of the storage pool <name>name</name> [name] name Name
The hostname of the NFS server where the mount point is located. This can be a hostname or an IP address.

<source>
  <host name='host_name' />

source-host host_name Host Name
The directory used on the NFS server

  <dir path='source_path' />
</source>

source-path source_path Source Path
The path specifying the target. This will be the path used for the storage pool.

<target>
  <path>/target_path</path>
</target>

target target_path Target Path
If you are using virsh to create the storage pool, continue with verifying that the storage pool was created.
Examples
The following is an example of an XML file for an NFS-based storage pool:
<pool type='netfs'>
  <name>nfspool</name>
  <source>
    <host name='localhost'/>
    <dir path='/home/net_mount'/>
  </source>
  <target>
    <path>/var/lib/libvirt/images/nfspool</path>
  </target>
</pool>  
The following is an example of a command for creating an NFS-based storage pool:
# virsh pool-define-as nfspool netfs --source-host localhost --source-path /home/net_mount --target /var/lib/libvirt/images/nfspool
Pool nfspool defined
The following images show an example of the Virtual Machine Manager Add a New Storage Pool dialog boxes for creating an NFS-based storage pool:
Add a new NFS-based storage pool example

Figure 13.10. Add a new NFS-based storage pool example

13.2.3.8. vHBA-based storage pools using SCSI devices

Note

You cannot use Virtual Machine Manager to create vHBA-based storage pools using SCSI devices.
Recommendations
N_Port ID Virtualization (NPIV) is a software technology that allows sharing of a single physical Fibre Channel host bus adapter (HBA). This allows multiple guests to see the same storage from multiple physical hosts, and thus allows for easier migration paths for the storage. As a result, there is no need for the migration to create or copy storage, as long as the correct storage path is specified.
In virtualization, the virtual host bus adapter, or vHBA, controls the Logical Unit Numbers (LUNs) for virtual machines. For a host to share one Fibre Channel device path between multiple KVM guests, a vHBA must be created for each virtual machine. A single vHBA must not be used by multiple KVM guests.
Each vHBA for NPIV is identified by its parent HBA and its own World Wide Node Name (WWNN) and World Wide Port Name (WWPN). The path to the storage is determined by the WWNN and WWPN values. The parent HBA can be defined as scsi_host# or as a WWNN/WWPN pair.

Note

If a parent HBA is defined as scsi_host# and hardware is added to the host machine, the scsi_host# assignment may change. Therefore, it is recommended that you define a parent HBA using a WWNN/WWPN pair.
It is recommended that you define a libvirt storage pool based on the vHBA, because this preserves the vHBA configuration.
Using a libvirt storage pool has two primary advantages:
  • The libvirt code can easily find the LUN's path via virsh command output.
  • Virtual machine migration requires only defining and starting a storage pool with the same vHBA name on the target machine. To do this, the vHBA LUN, libvirt storage pool and volume name must be specified in the virtual machine's XML configuration. Refer to Section 13.2.3.8, “vHBA-based storage pools using SCSI devices” for an example.

Note

Before creating a vHBA, it is recommended that you configure storage array (SAN)-side zoning in the host LUN to provide isolation between guests and prevent the possibility of data corruption.
To create a persistent vHBA configuration, first create a libvirt 'scsi' storage pool XML file using the format below. When creating a single vHBA that uses a storage pool on the same physical HBA, it is recommended to use a stable location for the <path> value, such as one of the /dev/disk/by-{path|id|uuid|label} locations on your system.
When creating multiple vHBAs that use storage pools on the same physical HBA, the value of the <path> field must be only /dev/, otherwise storage pool volumes are visible only to one of the vHBAs, and devices from the host cannot be exposed to multiple guests with the NPIV configuration.
For more information on <path> and the elements in <target>, see upstream libvirt documentation.
Prerequisites
Before creating a vHBA-based storage pool with SCSI devices, create a vHBA:

Procedure 13.10. Creating a vHBA

  1. Locate HBAs on the host system

    To locate the HBAs on your host system, use the virsh nodedev-list --cap vports command.
    The following example shows a host that has two HBAs that support vHBA:
    # virsh nodedev-list --cap vports
    scsi_host3
    scsi_host4
    
  2. Check the HBA's details

    Use the virsh nodedev-dumpxml HBA_device command to see the HBA's details.
    # virsh nodedev-dumpxml scsi_host3
    The output from the command lists the <name>, <wwnn>, and <wwpn> fields, which are used to create a vHBA. <max_vports> shows the maximum number of supported vHBAs. For example:
    <device>
      <name>scsi_host3</name>
      <path>/sys/devices/pci0000:00/0000:00:04.0/0000:10:00.0/host3</path>
      <parent>pci_0000_10_00_0</parent>
      <capability type='scsi_host'>
        <host>3</host>
        <unique_id>0</unique_id>
        <capability type='fc_host'>
          <wwnn>20000000c9848140</wwnn>
          <wwpn>10000000c9848140</wwpn>
          <fabric_wwn>2002000573de9a81</fabric_wwn>
        </capability>
        <capability type='vport_ops'>
          <max_vports>127</max_vports>
          <vports>0</vports>
        </capability>
      </capability>
    </device>   
    In this example, the <max_vports> value shows there are a total 127 virtual ports available for use in the HBA configuration. The <vports> value shows the number of virtual ports currently being used. These values update after creating a vHBA.
  3. Create a vHBA host device

    Create an XML file similar to one of the following for the vHBA host. In these examples, the file is named vhba_host3.xml.
    This example uses scsi_host3 to describe the parent vHBA.
    # cat vhba_host3.xml
    <device>
      <parent>scsi_host3</parent>
      <capability type='scsi_host'>
        <capability type='fc_host'>
        </capability>
      </capability>
    </device>   
    This example uses a WWNN/WWPN pair to describe the parent vHBA.
    # cat vhba_host3.xml
    <device>
      <name>vhba</name>
      <parent wwnn='20000000c9848140' wwpn='10000000c9848140'/>
      <capability type='scsi_host'>
        <capability type='fc_host'>
        </capability>
      </capability>
    </device>   

    Note

    The WWNN and WWPN values must match those in the HBA details seen in Procedure 13.10, “Creating a vHBA”.
    The <parent> field specifies the HBA device to associate with this vHBA device. The details in the <device> tag are used in the next step to create a new vHBA device for the host. For more information on the nodedev XML format, see the libvirt upstream pages.
  4. Create a new vHBA on the vHBA host device

    To create a vHBA on the basis of vhba_host3, use the virsh nodedev-create command:
    # virsh nodedev-create vhba_host3.xml
    Node device scsi_host5 created from vhba_host3.xml
  5. Verify the vHBA

    Verify the new vHBA's details (scsi_host5) with the virsh nodedev-dumpxml command:
    # virsh nodedev-dumpxml scsi_host5
    <device>
      <name>scsi_host5</name>
      <path>/sys/devices/pci0000:00/0000:00:04.0/0000:10:00.0/host3/vport-3:0-0/host5</path>
      <parent>scsi_host3</parent>
      <capability type='scsi_host'>
        <host>5</host>
        <unique_id>2</unique_id>
        <capability type='fc_host'>
          <wwnn>5001a4a93526d0a1</wwnn>
          <wwpn>5001a4ace3ee047d</wwpn>
          <fabric_wwn>2002000573de9a81</fabric_wwn>
        </capability>
      </capability>
    </device>  
After verifying the vHBA, continue creating the storage pool by defining the storage pool.
Parameters
The following table provides a list of required parameters for the XML file, the virsh pool-define-as command, and the Virtual Machine Manager application, for creating a vHBA-based storage pool.

Table 13.8. vHBA-based storage pool parameters

Description | XML | pool-define-as
The type of storage pool <pool type='scsi'> scsi
The name of the storage pool <name>name</name> --adapter-name name
The identifier of the vHBA. The parent attribute is optional.

<source>
  <adapter type='fc_host'
  [parent=parent_scsi_device]
  wwnn='WWNN'
  wwpn='WWPN' />
</source>

[--adapter-parent parent]
--adapter-wwnn wwnn
--adapter-wwpn wwpn

The path specifying the target. This will be the path used for the storage pool.

<target>
  <path>target_path</path>
</target>

target path_to_pool

Important

When the <path> field is /dev/, libvirt generates a unique short device path for the volume device path. For example, /dev/sdc. Otherwise, the physical host path is used. For example, /dev/disk/by-path/pci-0000:10:00.0-fc-0x5006016044602198-lun-0. The unique short device path allows the same volume to be listed in multiple guests by multiple storage pools. If the physical host path is used by multiple guests, duplicate device type warnings may occur.

Note

The parent attribute can be used in the <adapter> field to identify the physical HBA parent from which the NPIV LUNs can be used by varying paths. This field, scsi_hostN, is combined with the vports and max_vports attributes to complete the parent identification. The parent, parent_wwnn, parent_wwpn, or parent_fabric_wwn attributes provide varying degrees of assurance that the same HBA is used after the host reboots.
  • If no parent is specified, libvirt uses the first scsi_hostN adapter that supports NPIV.
  • If only the parent is specified, problems can arise if additional SCSI host adapters are added to the configuration.
  • If parent_wwnn or parent_wwpn is specified, after the host reboots the same HBA is used.
  • If parent_fabric_wwn is used, after the host reboots an HBA on the same fabric is selected, regardless of the scsi_hostN used.
If you are using virsh to create the storage pool, continue with verifying that the storage pool was created.
Examples
The following are examples of XML files for vHBA-based storage pools. The first example is of a storage pool that is the only storage pool on the HBA. The second example is of a storage pool that is one of several storage pools that use a single vHBA, and it uses the parent attribute to identify the SCSI host device.
<pool type='scsi'>
  <name>vhbapool_host3</name>
  <source>
    <adapter type='fc_host' wwnn='5001a4a93526d0a1' wwpn='5001a4ace3ee047d'/>
  </source>
  <target>
    <path>/dev/disk/by-path</path>
  </target>
</pool> 
<pool type='scsi'>
  <name>vhbapool_host3</name>
  <source>
    <adapter type='fc_host' parent='scsi_host3' wwnn='5001a4a93526d0a1' wwpn='5001a4ace3ee047d'/>
  </source>
  <target>
    <path>/dev/disk/by-path</path>
  </target>
</pool>  
The following is an example of a command for creating a vHBA-based storage pool:
# virsh pool-define-as vhbapool_host3 scsi --adapter-parent scsi_host3 --adapter-wwnn 5001a4a93526d0a1 --adapter-wwpn 5001a4ace3ee047d --target /dev/disk/by-path
Pool vhbapool_host3 defined

Note

The virsh command does not provide a way to define the parent_wwnn, parent_wwpn, or parent_fabric_wwn attributes.
Configuring a virtual machine to use a vHBA LUN
After a storage pool is created for a vHBA, the vHBA LUN must be added to the virtual machine configuration.
  1. Create a disk volume on the virtual machine in the virtual machine's XML.
  2. Specify the storage pool and the storage volume in the <source> parameter.
The following shows an example:
<disk type='volume' device='disk'>
  <driver name='qemu' type='raw'/>
  <source pool='vhbapool_host3' volume='unit:0:4:0'/>
  <target dev='hda' bus='ide'/>
</disk>    
To specify a lun device instead of a disk, see the following example:
<disk type='volume' device='lun' sgio='unfiltered'>
  <driver name='qemu' type='raw'/>
  <source pool='vhbapool_host3' volume='unit:0:4:0' mode='host'/>
  <target dev='sda' bus='scsi'/>
  <shareable />
</disk>
For XML configuration examples of adding SCSI LUN-based storage to a guest, see Section 13.3.6.3, “Adding SCSI LUN-based Storage to a Guest”.
Note that to ensure successful reconnection to a LUN in case of a hardware failure, it is recommended that you edit the fast_io_fail_tmo and dev_loss_tmo options. For more information, see Reconnecting to an exposed LUN after a hardware failure.
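To find the volume name to use in the volume attribute (unit:0:4:0 in the examples above), you can list the volumes that the vHBA-based storage pool exposes once it is started. This is a minimal sketch using the pool name from the earlier examples:
# virsh pool-start vhbapool_host3
# virsh vol-list vhbapool_host3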

13.2.4. Deleting Storage Pools

You can delete storage pools using virsh or the Virtual Machine Manager.

13.2.4.1. Prerequisites for deleting a storage pool

To avoid negatively affecting other guest virtual machines that use the storage pool you want to delete, it is recommended that you stop the storage pool and release any resources being used by it.

13.2.4.2. Deleting storage pools using virsh

  1. List the defined storage pools:
    # virsh pool-list --all
    Name                 State      Autostart
    -----------------------------------------
    default              active     yes
    guest_images_pool    active     yes
    
  2. Stop the storage pool you want to delete.
    # virsh pool-destroy guest_images_disk
  3. (Optional) For some types of storage pools, you can remove the directory where the storage pool resides:
    # virsh pool-delete guest_images_disk
  4. Remove the storage pool's definition.
    # virsh pool-undefine guest_images_disk
  5. Confirm the pool is undefined:
    # virsh pool-list --all
    Name                 State      Autostart
    -----------------------------------------
    default              active     yes
    
    

13.2.4.3. Deleting storage pools using Virtual Machine Manager

  1. Select the storage pool you want to delete in the storage pool list in the Storage tab of the Connection Details window.
  2. Click at the bottom of the Storage window. This stops the storage pool and releases any resources in use by it.
  3. Click .

    Note

    The icon is only enabled if the storage pool is stopped.
    The storage pool is deleted.

13.3. Using Storage Volumes

This section provides information about using storage volumes. It provides conceptual information, as well as detailed instructions on creating, configuring, and deleting storage volumes using virsh commands and the Virtual Machine Manager.

13.3.1. Storage Volume Concepts

Storage pools are divided into storage volumes. Storage volumes are abstractions of physical partitions, LVM logical volumes, file-based disk images, and other storage types handled by libvirt. Storage volumes are presented to guest virtual machines as local storage devices regardless of the underlying hardware.

Note

The sections below do not contain all of the possible commands and arguments that virsh provides for managing storage volumes. For more information, see Section 20.30, “Storage Volume Commands”.
On the host machine, a storage volume is referred to by its name and an identifier for the storage pool from which it derives. On the virsh command line, this takes the form --pool storage_pool volume_name.
For example, the following displays information about a volume named firstimage in the guest_images pool:
# virsh vol-info --pool guest_images firstimage
  Name:           firstimage
  Type:           block
  Capacity:       20.00 GB
  Allocation:     20.00 GB
For additional parameters and arguments, see Section 20.34, “Listing Volume Information”.

13.3.2. Creating Storage Volumes

This section provides general instructions for creating storage volumes from storage pools using virsh and the Virtual Machine Manager. After creating storage volumes, you can add storage devices to guests.

13.3.2.1. Creating Storage Volumes with virsh

Do one of the following:
  • Define the storage volume using an XML file.
    a. Create a temporary XML file containing the storage volume information required for the new device.
    The XML file must contain specific fields including the following:
    • name - The name of the storage volume.
    • allocation - The total storage allocation for the storage volume.
    • capacity - The logical capacity of the storage volume. If the volume is sparse, this value can differ from the allocation value.
    • target - The path to the storage volume on the host system and optionally its permissions and label.
    The following shows an example of a storage volume definition XML file. In this example, the file is saved as ~/guest_volume.xml.
      <volume>
        <name>volume1</name>
        <allocation>0</allocation>
        <capacity>20G</capacity>
        <target>
          <path>/var/lib/virt/images/sparse.img</path>
        </target>
      </volume> 
    b. Use the virsh vol-create command to create the storage volume based on the XML file.
    # virsh vol-create guest_images_dir ~/guest_volume.xml
      Vol volume1 created
    
    c. Delete the XML file created in step a.
  • Use the virsh vol-create-as command to create the storage volume.
    # virsh vol-create-as guest_images_dir volume1 20GB --allocation 0
  • Clone an existing storage volume using the virsh vol-clone command. The virsh vol-clone command must specify the storage pool that contains the storage volume to clone and the name of the newly created storage volume.
    # virsh vol-clone --pool guest_images_dir volume1 clone1

13.3.2.2. Creating storage volumes with Virtual Machine Manager

Procedure 13.11. Creating Storage Volumes with Virtual Machine Manager

  1. Open the storage settings

    1. In the Virtual Machine Manager, open the Edit menu and select Connection Details.
    2. Click the Storage tab in the Connection Details window.
      Storage tab

      Figure 13.11. Storage tab

      The pane on the left of the Connection Details window shows a list of storage pools.
  2. Select the storage pool in which you want to create a storage volume

    In the list of storage pools, click the storage pool in which you want to create the storage volume.
    Any storage volumes configured on the selected storage pool appear in the Volumes pane at the bottom of the window.
  3. Add a new storage volume

    Click the button above the Volumes list. The Add a Storage Volume dialog appears.
    Create storage volume

    Figure 13.12. Create storage volume

  4. Configure the storage volume

    Configure the storage volume with the following parameters:
    • Enter a name for the storage pool in the Name field.
    • Select a format for the storage volume from the Format list.
    • Enter the maximum size for the storage volume in the Max Capacity field.
  5. Finish the creation

    Click Finish. The Add a Storage Volume dialog closes, and the storage volume appears in the Volumes list.

13.3.3. Viewing Storage Volumes

You can create multiple storage volumes from a storage pool. You can also use the virsh vol-list command to list the storage volumes in a storage pool. In the following example, the guest_images_disk storage pool contains three volumes.
# virsh vol-create-as guest_images_disk volume1 8G
Vol volume1 created

# virsh vol-create-as guest_images_disk volume2 8G
Vol volume2 created

# virsh vol-create-as guest_images_disk volume3 8G
Vol volume3 created

# virsh vol-list guest_images_disk
Name                 Path
-----------------------------------------
volume1              /home/VirtualMachines/guest_images_dir/volume1
volume2              /home/VirtualMachines/guest_images_dir/volume2
volume3              /home/VirtualMachines/guest_images_dir/volume3

13.3.4. Managing Data

This section provides information about managing the data on storage volumes.

Note

Some types of storage volumes do not support all of the data management commands. For specific information, see the sections below.

13.3.4.1. Wiping Storage Volumes

To ensure that data on a storage volume cannot be accessed, a storage volume can be wiped using the virsh vol-wipe command.
Use the virsh vol-wipe command to wipe a storage volume:
# virsh vol-wipe new-vol vdisk
By default, the data is overwritten by zeroes. However, there are a number of different methods that can be specified for wiping the storage volume. For detailed information about all of the options for the virsh vol-wipe command, refer to Section 20.32, “Deleting a Storage Volume's Contents”.
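For instance, assuming the same new-vol volume and vdisk pool as above, a different wiping method can typically be selected with the --algorithm option; the nnsa algorithm shown here is only one illustrative choice:
# virsh vol-wipe --algorithm nnsa new-vol vdisk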

13.3.4.2. Uploading Data to a Storage Volume

You can upload data from a specified local file to a storage volume using the virsh vol-upload command.
# virsh vol-upload --pool pool-or-uuid --offset bytes --length bytes vol-name-or-key-or-path local-file
The following are the main virsh vol-upload options:
  • --pool pool-or-uuid - The name or UUID of the storage pool the volume is in.
  • vol-name-or-key-or-path - The name or key or path of the volume to upload.
  • --offset bytes - The position in the storage volume at which to start writing the data.
  • --length length - An upper limit for the amount of data to be uploaded.

    Note

    An error will occur if local-file is greater than the specified --length.

Example 13.1. Uploading data to a storage volume

# virsh vol-upload sde1 /tmp/data500m.empty --pool disk-pool
In this example sde1 is a volume in the disk-pool storage pool. The data in /tmp/data500m.empty is copied to sde1.
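The local file used in this example is assumed to already exist. As a sketch, a 500 MB file of zeroes such as /tmp/data500m.empty could, for example, be created before the upload with:
# dd if=/dev/zero of=/tmp/data500m.empty bs=1M count=500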

13.3.4.3. Downloading Data from a Storage Volume

You can download data from a storage volume to a specified local file using the virsh vol-download command.
# virsh vol-download --pool pool-or-uuid --offset bytes --length bytes vol-name-or-key-or-path local-file
The following are the main virsh vol-download options:
  • --pool pool-or-uuid - The name or UUID of the storage pool that the volume is in.
  • vol-name-or-key-or-path - The name, key, or path of the volume to download.
  • --offset - The position in the storage volume at which to start reading the data.
  • --length length - An upper limit for the amount of data to be downloaded.

Example 13.2. Downloading data from a storage volume

# virsh vol-download sde1 /tmp/data-sde1.tmp --pool disk-pool
In this example sde1 is a volume in the disk-pool storage pool. The data in sde1 is downloaded to /tmp/data-sde1.tmp.

13.3.4.4. Resizing Storage Volumes

You can resize the capacity of a specified storage volume using the vol-resize command.
# virsh vol-resize --pool pool-or-uuid vol-name-or-key-or-path capacity [--allocate] [--delta] [--shrink]
The capacity is expressed in bytes. The command requires --pool pool-or-uuid, which is the name or UUID of the storage pool the volume is in. This command also requires vol-name-or-key-or-path, the name, key, or path of the volume to resize.
The new capacity might be sparse unless --allocate is specified. Normally, capacity is the new size, but if --delta is present, then it is added to the existing size. Attempts to shrink the volume will fail unless --shrink is present.
Note that capacity cannot be negative unless --shrink is provided, and even then a negative sign is not necessary. capacity is a scaled integer which defaults to bytes if there is no suffix. In addition, note that this command is only safe for storage volumes not in use by an active guest. Refer to Section 20.13.3, “Changing the Size of a Guest Virtual Machine's Block Device” for live resizing.

Example 13.3. Resizing a storage volume

For example, if you created a 50M storage volume, you can resize it to 100M with the following command:
# virsh vol-resize --pool disk-pool sde1 100M
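Assuming the same disk-pool and sde1 names, the following sketches show how the --delta and --shrink options can typically be used; the values are illustrative only. The first command grows the volume by 10M, and the second shrinks it to 50M:
# virsh vol-resize --pool disk-pool sde1 10M --delta
# virsh vol-resize --pool disk-pool sde1 50M --shrink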

13.3.5. Deleting Storage Volumes

You can delete storage volumes using virsh or the Virtual Machine Manager.

Note

To avoid negatively affecting guest virtual machines that use the storage volume you want to delete, it is recommended that you release any resources using it.

13.3.5.1. Deleting storage volumes using virsh

Delete a storage volume using the virsh vol-delete command. The command must specify the name or path of the storage volume and the storage pool from which the storage volume is abstracted.
The following example deletes the volume_name storage volume from the guest_images_dir storage pool:
# virsh vol-delete volume_name --pool guest_images_dir
  vol volume_name deleted

13.3.5.2. Deleting storage volumes using Virtual Machine Manager

Procedure 13.12. Deleting Storage Volumes with Virtual Machine Manager

  1. Open the storage settings

    1. In the Virtual Machine Manager, open the Edit menu and select Connection Details.
    2. Click the Storage tab in the Connection Details window.
      Storage tab

      Figure 13.13. Storage tab

      The pane on the left of the Connection Details window shows a list of storage pools.
  2. Select the storage volume you want to delete

    1. In the list of storage pools, click the storage pool from which the storage volume is abstracted.
      A list of storage volumes configured on the selected storage pool appear in the Volumes pane at the bottom of the window.
    2. Select the storage volume you want to delete.
  3. Delete the storage volume

    1. Click the button (above the Volumes list). A confirmation dialog appears.
    2. Click Yes. The selected storage volume is deleted.

13.3.6. Adding Storage Devices to Guests

You can add storage devices to guest virtual machines using virsh or Virtual Machine Manager.

13.3.6.1. Adding Storage Devices to Guests Using virsh

To add storage devices to a guest virtual machine, use the attach-disk or attach-device commands. The arguments that contain information about the disk to add can be specified in an XML file or on the command line.
The following is a sample XML file with the definition of the storage.
<disk type='file' device='disk'>
  <driver name='qemu' type='raw' cache='none'/>
  <source file='/var/lib/libvirt/images/FileName.img'/>
  <target dev='vdb' bus='virtio'/>
</disk> 
The following command attaches the disk defined in an XML file called NewStorage.xml to Guest1.
# virsh attach-device --config Guest1 ~/NewStorage.xml
The following command attaches a disk to Guest1 without using an XML file.
# virsh attach-disk --config Guest1 --source /var/lib/libvirt/images/FileName.img --target vdb
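To confirm that the disk was attached, the guest's block devices can typically be listed with the virsh domblklist command. The output below is illustrative, and the vda entry is a hypothetical existing disk:
# virsh domblklist Guest1
Target     Source
------------------------------------------------
vda        /var/lib/libvirt/images/Guest1.img
vdb        /var/lib/libvirt/images/FileName.img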

13.3.6.2. Adding Storage Devices to Guests Using Virtual Machine Manager

You can add a storage volume to a guest virtual machine or create and add a default storage device to a guest virtual machine.
13.3.6.2.1. Adding a storage volume to a guest
To add a storage volume to a guest virtual machine:
  1. Open Virtual Machine Manager to the virtual machine hardware details window

    Open virt-manager by executing the virt-manager command as root or opening Applications → System Tools → Virtual Machine Manager.
    The Virtual Machine Manager window

    Figure 13.14. The Virtual Machine Manager window

    Select the guest virtual machine to which you want to add a storage volume.
    Click Open. The Virtual Machine window opens.
    Click . The hardware details window appears.
    The Hardware Details window

    Figure 13.15. The Hardware Details window

  2. Open the Add New Virtual Hardware window

    Click Add Hardware. The Add New Virtual Hardware window appears.
    Ensure that Storage is selected in the hardware type pane.
    The Add New Virtual Hardware window

    Figure 13.16. The Add New Virtual Hardware window

  3. View a list of storage volumes

    Select the Select or create custom storage option button.
    Click Manage. The Choose Storage Volume dialog appears.
    The Select Storage Volume window

    Figure 13.17. The Select Storage Volume window

  4. Select a storage volume

    Select a storage pool from the list on the left side of the Select Storage Volume window. A list of storage volumes in the selected storage pool appears in the Volumes list.

    Note

    You can create a storage pool from the Select Storage Volume window. For more information, see Section 13.2.2.2, “Creating storage pools with Virtual Machine Manager”.
    Select a storage volume from the Volumes list.

    Note

    You can create a storage volume from the Select Storage Volume window. For more information, see Section 13.3.2.2, “Creating storage volumes with Virtual Machine Manager”.
    Click Choose Volume. The Select Storage Volume window closes.
  5. Configure the storage volume

    Select a device type from the Device type list. Available types are: Disk device, CDROM device, Floppy device, LUN Passthrough
    Select a bus type from the Bus type list. The available bus types are dependent on the selected device type.
    Select a cache mode from the Cache mode list. Available cache modes are: Hypervisor default, none, writethrough, writeback, directsync, unsafe
    Click Finish. The Add New Virtual Hardware window closes.
13.3.6.2.2. Adding default storage to a guest
The default storage pool is a file-based image in the /var/lib/libvirt/images/ directory.
To add default storage to a guest virtual machine:
  1. Open Virtual Machine Manager to the virtual machine hardware details window

    Open virt-manager by executing the virt-manager command as root or opening Applications → System Tools → Virtual Machine Manager.
    The Virtual Machine Manager window

    Figure 13.18. The Virtual Machine Manager window

    Select the guest virtual machine to which you want to add a storage volume.
    Click Open. The Virtual Machine window opens.
    Click . The hardware details window appears.
    The Hardware Details window

    Figure 13.19. The Hardware Details window

  2. Open the Add New Virtual Hardware window

    Click Add Hardware. The Add New Virtual Hardware window appears.
    Ensure that Storage is selected in the hardware type pane.
    The Add New Virtual Hardware window

    Figure 13.20. The Add New Virtual Hardware window

  3. Create a disk for the guest

    Ensure that the Create a disk image for the virtual machine option is selected.
    Enter the size of the disk to create in the textbox below the Create a disk image for the virtual machine option button.
    Click Finish. The Add New Virtual Hardware window closes.

13.3.6.3. Adding SCSI LUN-based Storage to a Guest

There are multiple ways to expose a host SCSI LUN entirely to the guest. Exposing the SCSI LUN to the guest provides the capability to execute SCSI commands directly to the LUN on the guest. This is useful as a means to share a LUN between guests, as well as to share Fibre Channel storage between hosts.
For more information on SCSI LUN-based storage, see vHBA-based storage pools using SCSI devices.

Important

The optional sgio attribute controls whether unprivileged SCSI Generic I/O (SG_IO) commands are filtered for a device='lun' disk. The sgio attribute can be specified as 'filtered' or 'unfiltered', but must be set to 'unfiltered' to allow SG_IO ioctl commands to be passed through on the guest in a persistent reservation.
In addition to setting sgio='unfiltered', the <shareable> element must be set to share a LUN between guests. The sgio attribute defaults to 'filtered' if not specified.
The <disk> XML attribute device='lun' is valid for the following guest disk configurations:
  • type='block' for <source dev='/dev/disk/by-{path|id|uuid|label}'/>
    <disk type='block' device='lun' sgio='unfiltered'>
      <driver name='qemu' type='raw'/>
      <source dev='/dev/disk/by-path/pci-0000\:04\:00.1-fc-0x203400a0b85ad1d7-lun-0'/>
      <target dev='sda' bus='scsi'/>
      <shareable/>
    </disk>

    Note

    The backslashes prior to the colons in the <source> device name are required.
  • type='network' for <source protocol='iscsi'... />
    <disk type='network' device='lun' sgio='unfiltered'>
      <driver name='qemu' type='raw'/>
      <source protocol='iscsi' name='iqn.2013-07.com.example:iscsi-net-pool/1'>
        <host name='example.com' port='3260'/>
        <auth username='myuser'>
          <secret type='iscsi' usage='libvirtiscsi'/>
        </auth>
      </source>
      <target dev='sda' bus='scsi'/>
      <shareable/>
    </disk> 
  • type='volume' when using an iSCSI or a NPIV/vHBA source pool as the SCSI source pool.
    The following example XML shows a guest using an iSCSI source pool (named iscsi-net-pool) as the SCSI source pool:
    <disk type='volume' device='lun' sgio='unfiltered'>
      <driver name='qemu' type='raw'/>
      <source pool='iscsi-net-pool' volume='unit:0:0:1' mode='host'/>
      <target dev='sda' bus='scsi'/>
      <shareable/>
    </disk> 

    Note

    The mode= option within the <source> tag is optional, but if used, it must be set to 'host' and not 'direct'. When set to 'host', libvirt will find the path to the device on the local host. When set to 'direct', libvirt will generate the path to the device using the source pool's source host data.
    The iSCSI pool (iscsi-net-pool) in the example above will have a similar configuration to the following:
    # virsh pool-dumpxml iscsi-net-pool
    <pool type='iscsi'>
      <name>iscsi-net-pool</name>
      <capacity unit='bytes'>11274289152</capacity>
      <allocation unit='bytes'>11274289152</allocation>
      <available unit='bytes'>0</available>
      <source>
        <host name='192.168.122.1' port='3260'/>
        <device path='iqn.2013-12.com.example:iscsi-chap-netpool'/>
        <auth type='chap' username='redhat'>
          <secret usage='libvirtiscsi'/>
        </auth>
      </source>
      <target>
        <path>/dev/disk/by-path</path>
        <permissions>
          <mode>0755</mode>
        </permissions>
      </target>
    </pool> 
    To verify the details of the available LUNs in the iSCSI source pool, enter the following command:
    # virsh vol-list iscsi-net-pool
    Name                 Path
    ------------------------------------------------------------------------------
    unit:0:0:1           /dev/disk/by-path/ip-192.168.122.1:3260-iscsi-iqn.2013-12.com.example:iscsi-chap-netpool-lun-1
    unit:0:0:2           /dev/disk/by-path/ip-192.168.122.1:3260-iscsi-iqn.2013-12.com.example:iscsi-chap-netpool-lun-2
  • type='volume' when using a NPIV/vHBA source pool as the SCSI source pool.
    The following example XML shows a guest using a NPIV/vHBA source pool (named vhbapool_host3) as the SCSI source pool:
    <disk type='volume' device='lun' sgio='unfiltered'>
      <driver name='qemu' type='raw'/>
      <source pool='vhbapool_host3' volume='unit:0:1:0'/>
      <target dev='sda' bus='scsi'/>
      <shareable/>
    </disk>  
    The NPIV/vHBA pool (vhbapool_host3) in the example above will have a similar configuration to:
    # virsh pool-dumpxml vhbapool_host3
    <pool type='scsi'>
      <name>vhbapool_host3</name>
      <capacity unit='bytes'>0</capacity>
      <allocation unit='bytes'>0</allocation>
      <available unit='bytes'>0</available>
      <source>
        <adapter type='fc_host' parent='scsi_host3' managed='yes' wwnn='5001a4a93526d0a1' wwpn='5001a4ace3ee045d'/>
      </source>
      <target>
        <path>/dev/disk/by-path</path>
        <permissions>
          <mode>0700</mode>
          <owner>0</owner>
          <group>0</group>
        </permissions>
      </target>
    </pool>  
    To verify the details of the available LUNs on the vHBA, enter the following command:
    # virsh vol-list vhbapool_host3
    Name                 Path
    ------------------------------------------------------------------------------
    unit:0:0:0           /dev/disk/by-path/pci-0000:10:00.0-fc-0x5006016044602198-lun-0
    unit:0:1:0           /dev/disk/by-path/pci-0000:10:00.0-fc-0x5006016844602198-lun-0
    For more information on using a NPIV vHBA with SCSI devices, see Section 13.2.3.8, “vHBA-based storage pools using SCSI devices”.
The following procedure shows an example of adding a SCSI LUN-based storage device to a guest. Any of the above <disk device='lun'> guest disk configurations can be attached with this method. Substitute configurations according to your environment.

Procedure 13.13. Attaching SCSI LUN-based storage to a guest

  1. Create the device file by writing a <disk> element in a new file, and save this file with an XML extension (in this example, sda.xml):
    # cat sda.xml
    <disk type='volume' device='lun' sgio='unfiltered'>
      <driver name='qemu' type='raw'/>
      <source pool='vhbapool_host3' volume='unit:0:1:0'/>
      <target dev='sda' bus='scsi'/>
      <shareable/>
    </disk>  
  2. Associate the device created in sda.xml with your guest virtual machine (Guest1, for example):
    # virsh attach-device --config Guest1 ~/sda.xml

    Note

    Running the virsh attach-device command with the --config option requires a guest reboot to add the device permanently to the guest. Alternatively, the --persistent option can be used instead of --config, which can also be used to hotplug the device to a guest.
Alternatively, the SCSI LUN-based storage can be attached or configured on the guest using virt-manager. To configure this using virt-manager, click the Add Hardware button and add a virtual disk with the required parameters, or change the settings of an existing SCSI LUN device from this window. In Red Hat Enterprise Linux 7.2 and above, the SGIO value can also be configured in virt-manager:
Configuring SCSI LUN storage with virt-manager

Figure 13.21. Configuring SCSI LUN storage with virt-manager

Reconnecting to an exposed LUN after a hardware failure

If the connection to an exposed Fibre Channel (FC) LUN is lost due to a failure of hardware (such as the host bus adapter), the exposed LUNs on the guest may continue to appear as failed even after the hardware failure is fixed. To prevent this, edit the dev_loss_tmo and fast_io_fail_tmo kernel options:
  • dev_loss_tmo controls how long the SCSI layer waits after a SCSI device fails before marking it as failed. To prevent a timeout, it is recommended to set the option to the maximum value, which is 2147483647.
  • fast_io_fail_tmo controls how long the SCSI layer waits after a SCSI device fails before failing back to the I/O. To ensure that dev_loss_tmo is not ignored by the kernel, set this option's value to any number lower than the value of dev_loss_tmo.
To modify the values of dev_loss_tmo and fast_io_fail_tmo, do one of the following:
  • Edit the /etc/multipath.conf file, and set the values in the defaults section:
    defaults {
    ...
    fast_io_fail_tmo     20
    dev_loss_tmo    infinity
    }
    
  • Set dev_loss_tmo and fast_io_fail_tmo at the level of the FC host or remote port, for example as follows:
    # echo 20 > /sys/devices/pci0000:00/0000:00:06.0/0000:13:00.0/host1/rport-1:0-0/fc_remote_ports/rport-1:0-0/fast_io_fail_tmo
    # echo 2147483647 > /sys/devices/pci0000:00/0000:00:06.0/0000:13:00.0/host1/rport-1:0-0/fc_remote_ports/rport-1:0-0/dev_loss_tmo
To verify that the new values of dev_loss_tmo and fast_io_fail_tmo are active, use the following command:
# find /sys -name dev_loss_tmo -print -exec cat {} \;
If the parameters have been set correctly, the output will look similar to this, with the appropriate device or devices instead of pci0000:00/0000:00:06.0/0000:13:00.0/host1/rport-1:0-0/fc_remote_ports/rport-1:0-0:
# find /sys -name dev_loss_tmo -print -exec cat {} \;
...
/sys/devices/pci0000:00/0000:00:06.0/0000:13:00.0/host1/rport-1:0-0/fc_remote_ports/rport-1:0-0/dev_loss_tmo
2147483647
...

13.3.6.4. Managing Storage Controllers in a Guest Virtual Machine

Unlike virtio disks, SCSI devices require the presence of a controller in the guest virtual machine. This section details the necessary steps to create a virtual SCSI controller (also known as "Host Bus Adapter", or HBA), and to add SCSI storage to the guest virtual machine.

Procedure 13.14. Creating a virtual SCSI controller

  1. Display the configuration of the guest virtual machine (Guest1) and look for a pre-existing SCSI controller:
    # virsh dumpxml Guest1 | grep controller.*scsi
    If a device controller is present, the command will output one or more lines similar to the following:
    <controller type='scsi' model='virtio-scsi' index='0'/>
    
  2. If the previous step did not show a device controller, create the description for one in a new file and add it to the virtual machine, using the following steps:
    1. Create the device controller by writing a <controller> element in a new file and save this file with an XML extension. virtio-scsi-controller.xml, for example.
      <controller type='scsi' model='virtio-scsi'/>
      
    2. Associate the device controller you just created in virtio-scsi-controller.xml with your guest virtual machine (Guest1, for example):
      # virsh attach-device --config Guest1 ~/virtio-scsi-controller.xml
      In this example the --config option behaves the same as it does for disks. See Section 13.3.6, “Adding Storage Devices to Guests” for more information.
  3. Add a new SCSI disk or CD-ROM. The new disk can be added using the methods in Section 13.3.6, “Adding Storage Devices to Guests”. In order to create a SCSI disk, specify a target device name that starts with sd.

    Note

    The supported limit for each controller is 1024 virtio-scsi disks, but it is possible that other available resources in the host (such as file descriptors) are exhausted with fewer disks.
    For more information, see the following Red Hat Enterprise Linux 6 whitepaper: The next-generation storage interface for the Red Hat Enterprise Linux Kernel Virtual Machine: virtio-scsi.
    # virsh attach-disk Guest1 /var/lib/libvirt/images/FileName.img sdb --cache none
    Depending on the version of the driver in the guest virtual machine, the new disk may not be detected immediately by a running guest virtual machine. Follow the steps in the Red Hat Enterprise Linux Storage Administration Guide.

13.3.7. Removing Storage Devices from Guests

You can remove storage devices from guest virtual machines using virsh or Virtual Machine Manager.

13.3.7.1. Removing Storage from a Virtual Machine with virsh

The following example removes the vdb storage volume from the Guest1 virtual machine:
# virsh detach-disk Guest1 vdb
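For a running guest, the command above typically affects only the live configuration. To also remove the device from the guest's persistent configuration, the --config option can be added, as in this sketch:
# virsh detach-disk --config Guest1 vdb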

13.3.7.2. Removing Storage from a Virtual Machine with Virtual Machine Manager

Procedure 13.15. Removing storage from a virtual machine with Virtual Machine Manager

To remove storage from a guest virtual machine using Virtual Machine Manager:
  1. Open Virtual Machine Manager to the virtual machine hardware details window

    Open virt-manager by executing the virt-manager command as root or opening ApplicationsSystem ToolsVirtual Machine Manager.
    Select the guest virtual machine from which you want to remove a storage device.
    Click Open. The Virtual Machine window opens.
    Click . The hardware details window appears.
  2. Remove the storage from the guest virtual machine

    Select the storage device from the list of hardware on the left side of the hardware details pane.
    Click Remove. A confirmation dialog appears.
    Click Yes. The storage is removed from the guest virtual machine.

Chapter 14. Using qemu-img

The qemu-img command-line tool is used for formatting, modifying, and verifying various file systems used by KVM. qemu-img options and usages are highlighted in the sections that follow.

Warning

Never use qemu-img to modify images in use by a running virtual machine or any other process. This may destroy the image. Also, be aware that querying an image that is being modified by another process may encounter an inconsistent state.

14.1. Checking the Disk Image

To perform a consistency check on a disk image with the file name imgname, use the qemu-img check command:
# qemu-img check [-f format] imgname

Note

Only a selected group of formats support consistency checks. These include qcow2, vdi, vhdx, vmdk, and qed.
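For example, the following sketch checks a hypothetical qcow2 image; the file name is an assumption, and the output shown is what qemu-img check typically prints for a clean image:
# qemu-img check -f qcow2 /var/lib/libvirt/images/guest1.qcow2
No errors were found on the image.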

14.2. Committing Changes to an Image

Commit any changes recorded in the specified image file (imgname) to the file's base image with the qemu-img commit command. Optionally, specify the file's format type (fmt).
 # qemu-img commit [-f fmt] [-t cache] imgname
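For instance, assuming a hypothetical overlay image whose changes should be merged into its base image, the command might look as follows:
# qemu-img commit -f qcow2 /var/lib/libvirt/images/overlay.qcow2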

14.3. Comparing Images

Compare the contents of two specified image files (imgname1 and imgname2) with the qemu-img compare command. Optionally, specify the files' format types (fmt). The images can have different formats and settings.
By default, images with different sizes are considered identical if the larger image contains only unallocated or zeroed sectors in the area after the end of the other image. In addition, if any sector is not allocated in one image and contains only zero bytes in the other one, it is evaluated as equal. If you specify the -s option, the images are not considered identical if the image sizes differ or a sector is allocated in one image and is not allocated in the second one.
 # qemu-img compare [-f fmt] [-F fmt] [-p] [-s] [-q] imgname1 imgname2
The qemu-img compare command exits with one of the following exit codes:
  • 0 - The images are identical
  • 1 - The images are different
  • 2 - There was an error opening one of the images
  • 3 - There was an error checking a sector allocation
  • 4 - There was an error reading the data
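As a sketch, the following compares two hypothetical images and then inspects the exit code; the "Images are identical." message and exit code 0 indicate a match:
# qemu-img compare /var/lib/libvirt/images/base.qcow2 /var/lib/libvirt/images/copy.qcow2
Images are identical.
# echo $?
0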

14.4. Mapping an Image

Using the qemu-img map command, you can dump the metadata of the specified image file (imgname) and its backing file chain. The dump shows the allocation state of every sector in imgname, along with the topmost file that allocates it in the backing file chain. Optionally, specify the file's format type (fmt).
 # qemu-img map [-f fmt] [--output=fmt] imgname
There are two output formats, the human format and the json format:

14.4.1. The human Format

The default format (human) only dumps non-zero, allocated parts of the file. The output identifies a file from where data can be read and the offset in the file. Each line includes four fields. The following shows an example of an output:
Offset          Length          Mapped to       File
0               0x20000         0x50000         /tmp/overlay.qcow2
0x100000        0x10000         0x95380000      /tmp/backing.qcow2
The first line means that 0x20000 (131072) bytes starting at offset 0 in the image are available in /tmp/overlay.qcow2 (opened in raw format) starting at offset 0x50000 (327680). Data that is compressed, encrypted, or otherwise not available in raw format causes an error if human format is specified.

Note

File names can include newline characters. Therefore, it is not safe to parse output in human format in scripts.

14.4.2. The json Format

If the json option is specified, the output returns an array of dictionaries in JSON format. In addition to the information provided in the human format, the output includes the following information:
  • data - A Boolean field that shows whether or not the sectors contain data
  • zero - A Boolean field that shows whether or not the data is known to read as zero
  • depth - The depth of the backing file of filename

Note

When the json option is specified, the offset field is optional.
For more information about the qemu-img map command and additional options, see the relevant man page.
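As an illustration, the following sketch maps the same hypothetical overlay image used in the human format example; the JSON output shown is representative of the fields described above:
# qemu-img map --output=json /tmp/overlay.qcow2
[{ "start": 0, "length": 131072, "depth": 0, "zero": false, "data": true, "offset": 327680}]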

14.5. Amending an Image

Amend the image format-specific options for the image file. Optionally, specify the file's format type (fmt).
# qemu-img amend [-p] [-f fmt] [-t cache] -o options filename

Note

This operation is only supported for the qcow2 file format.
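For example, the qcow2 compat option can typically be amended to switch a qcow2 image between version 3 (compat=1.1) and version 2 (compat=0.10); the file name here is hypothetical:
# qemu-img amend -f qcow2 -o compat=0.10 /var/lib/libvirt/images/guest1.qcow2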

14.6. Converting an Existing Image to Another Format

The convert option is used to convert one recognized image format to another image format. For a list of accepted formats, see Section 14.12, “Supported qemu-img Formats”.
# qemu-img convert [-c] [-p] [-f fmt] [-t cache] [-O output_fmt] [-o options] [-S sparse_size] filename output_filename
The -p parameter shows the progress of the command (optional and not available for every command), and the -S flag allows for the creation of a sparse file, which is included within the disk image. Sparse files function like standard files in all respects, except that physical blocks containing only zeros (that is, nothing) are not allocated. When the operating system sees this file, it treats it as existing and taking up actual disk space, even though in reality it does not take any. This is particularly helpful when creating a disk for a guest virtual machine, as it gives the appearance that the disk has taken much more disk space than it actually has. For example, if you set -S to 50 GB on a disk image that is 10 GB, then your 10 GB of disk space will appear to be 60 GB in size, even though only 10 GB is actually being used.
Convert the disk image filename to disk image output_filename using format output_format. The disk image can be optionally compressed with the -c option, or encrypted with the -o option by setting -o encryption. Note that the options available with the -o parameter differ with the selected format.
Only the qcow2 format supports encryption or compression. qcow2 encryption uses the AES format with secure 128-bit keys. qcow2 compression is read-only, so if a compressed sector is converted from qcow2 format, it is written to the new format as uncompressed data.
Image conversion is also useful to get a smaller image when using a format which can grow, such as qcow or cow. The empty sectors are detected and suppressed from the destination image.
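As a sketch with hypothetical file names, the following converts a raw image to qcow2 while showing progress:
# qemu-img convert -p -f raw -O qcow2 /var/lib/libvirt/images/guest1.img /var/lib/libvirt/images/guest1.qcow2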

14.7. Creating and Formatting New Images or Devices

Create the new disk image filename of size size and format format.
# qemu-img create [-f format] [-o options] filename [size]
If a base image is specified with -o backing_file=filename, the image will only record differences between itself and the base image. The backing file will not be modified unless you use the commit command. No size needs to be specified in this case.
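For example, the following sketches create a hypothetical 20 GB qcow2 image, and an overlay image that records only the differences from that base image:
# qemu-img create -f qcow2 /var/lib/libvirt/images/guest2.qcow2 20G
# qemu-img create -f qcow2 -o backing_file=/var/lib/libvirt/images/guest2.qcow2 /var/lib/libvirt/images/guest2-overlay.qcow2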

14.8. Displaying Image Information

The info parameter displays information about a disk image filename. The format for the info option is as follows:
# qemu-img info [-f format] filename
This command is often used to discover the size reserved on disk, which can be different from the displayed size. If snapshots are stored in the disk image, they are also displayed. This command shows, for example, how much space is being taken by a qcow2 image on a block device. You can check that the image in use is the one that matches the output of the qemu-img info command by using the qemu-img check command.
# qemu-img info /dev/vg-90.100-sluo/lv-90-100-sluo
image: /dev/vg-90.100-sluo/lv-90-100-sluo
file format: qcow2
virtual size: 20G (21474836480 bytes)
disk size: 0
cluster_size: 65536

14.9. Rebasing a Backing File of an Image

The qemu-img rebase command changes the backing file of an image.
# qemu-img rebase [-f fmt] [-t cache] [-p] [-u] -b backing_file [-F backing_fmt] filename
The backing file is changed to backing_file and (if the format of filename supports the feature) the backing file format is changed to backing_fmt.

Note

Only the qcow2 format supports changing the backing file (rebase).
There are two different modes in which rebase can operate: safe and unsafe.
safe mode is used by default and performs a real rebase operation. The new backing file may differ from the old one, and the qemu-img rebase command will take care of keeping the guest virtual machine-visible content of filename unchanged. In order to achieve this, any clusters that differ between backing_file and the old backing file of filename are merged into filename before making any changes to the backing file.
Note that safe mode is an expensive operation, comparable to converting an image. The old backing file is required for it to complete successfully.
unsafe mode is used if the -u option is passed to qemu-img rebase. In this mode, only the backing file name and format of filename is changed, without any checks taking place on the file contents. Make sure the new backing file is specified correctly or the guest-visible content of the image will be corrupted.
This mode is useful for renaming or moving the backing file. It can be used without an accessible old backing file. For instance, it can be used to fix an image whose backing file has already been moved or renamed.
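As a sketch with hypothetical file names, the first command below performs a safe rebase onto a new backing file, and the second performs an unsafe rebase, for example after the backing file has been renamed:
# qemu-img rebase -b /var/lib/libvirt/images/new-base.qcow2 /var/lib/libvirt/images/overlay.qcow2
# qemu-img rebase -u -b /var/lib/libvirt/images/renamed-base.qcow2 /var/lib/libvirt/images/overlay.qcow2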

14.10. Re-sizing the Disk Image

Change the disk image filename as if it had been created with size size. Only images in raw format can be resized in both directions, whereas qcow2 version 2 or qcow2 version 3 images can be grown but cannot be shrunk.
Use the following to set the size of the disk image filename to size bytes:
# qemu-img resize filename size
You can also resize relative to the current size of the disk image. To give a size relative to the current size, prefix the number of bytes with + to grow, or - to reduce the size of the disk image by that number of bytes. Adding a unit suffix allows you to set the image size in kilobytes (K), megabytes (M), gigabytes (G) or terabytes (T).
# qemu-img resize filename [+|-]size[K|M|G|T]
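For example, assuming a hypothetical raw image, the following grows it by 10 gigabytes:
# qemu-img resize /var/lib/libvirt/images/guest1.img +10G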

Warning

Before using this command to shrink a disk image, you must use file system and partitioning tools inside the VM itself to reduce allocated file systems and partition sizes accordingly. Failure to do so will result in data loss.
After using this command to grow a disk image, you must use file system and partitioning tools inside the VM to actually begin using the new space on the device.

14.11. Listing, Creating, Applying, and Deleting a Snapshot

Using different parameters of the qemu-img snapshot command, you can list, apply, create, or delete an existing snapshot (snapshot) of a specified image (filename).
# qemu-img snapshot [ -l | -a snapshot | -c snapshot | -d snapshot ] filename
The accepted arguments are as follows:
  • -l lists all snapshots associated with the specified disk image.
  • The apply option, -a, reverts the disk image (filename) to the state of a previously saved snapshot.
  • -c creates a snapshot (snapshot) of an image (filename).
  • -d deletes the specified snapshot.
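As a sketch with a hypothetical image and snapshot name, the following creates a snapshot, lists the snapshots in the image, and then reverts the image to that snapshot:
# qemu-img snapshot -c clean-install /var/lib/libvirt/images/guest1.qcow2
# qemu-img snapshot -l /var/lib/libvirt/images/guest1.qcow2
# qemu-img snapshot -a clean-install /var/lib/libvirt/images/guest1.qcow2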

14.12. Supported qemu-img Formats

When a format is specified in any of the qemu-img commands, the following format types may be used:
  • raw - Raw disk image format (default). This can be the fastest file-based format. If your file system supports holes (for example, in ext2 or ext3), then only the written sectors will reserve space. Use qemu-img info to obtain the real size used by the image, or ls -ls on Unix/Linux. Although raw images give optimal performance, only very basic features are available with a raw image. For example, no snapshots are available.
  • qcow2 - QEMU image format, the most versatile format with the best feature set. Use it to have optional AES encryption, zlib-based compression, support of multiple VM snapshots, and smaller images, which are useful on file systems that do not support holes. Note that this expansive feature set comes at the cost of performance.
    Although only the formats above can be used to run on a guest virtual machine or host physical machine, qemu-img also recognizes and supports the following formats in order to convert from them into either the raw or qcow2 format. The format of an image is usually detected automatically. In addition to converting these formats into raw or qcow2, they can be converted back from raw or qcow2 to the original format. Note that the qcow2 version supplied with Red Hat Enterprise Linux 7 is 1.1. The format that is supplied with previous versions of Red Hat Enterprise Linux is 0.10. You can revert image files to previous versions of qcow2. To find out which version you are using, run the qemu-img info imagefilename.img command. To change the qcow version, see Section 23.20.2, “Setting Target Elements”.
  • bochs - Bochs disk image format.
  • cloop - Linux Compressed Loop image, useful only to reuse directly compressed CD-ROM images present for example in the Knoppix CD-ROMs.
  • cow - User Mode Linux Copy On Write image format. The cow format is included only for compatibility with previous versions.
  • dmg - Mac disk image format.
  • nbd - Network block device.
  • parallels - Parallels virtualization disk image format.
  • qcow - Old QEMU image format. Only included for compatibility with older versions.
  • qed - Old QEMU image format. Only included for compatibility with older versions.
  • vdi - Oracle VM VirtualBox hard disk image format.
  • vhdx - Microsoft Hyper-V virtual hard disk-X disk image format.
  • vmdk - VMware 3 and 4 compatible image format.
  • vvfat - Virtual VFAT disk image format.

Chapter 15. KVM Migration

This chapter covers the migration of guest virtual machines from one host physical machine that runs the KVM hypervisor to another. Migrating guests is possible because virtual machines run in a virtualized environment instead of directly on the hardware.

15.1. Migration Definition and Benefits

Migration works by sending the state of the guest virtual machine's memory and any virtualized devices to a destination host physical machine. It is recommended to use shared, networked storage to store the guest's images to be migrated. It is also recommended to use libvirt-managed storage pools for shared storage when migrating virtual machines.
Migrations can be performed both with live (running) and non-live (shut-down) guests.
In a live migration, the guest virtual machine continues to run on the source host machine, while the guest's memory pages are transferred to the destination host machine. During migration, KVM monitors the source for any changes in pages it has already transferred, and begins to transfer these changes when all of the initial pages have been transferred. KVM also estimates transfer speed during migration, so when the remaining amount of data can be transferred within a certain configurable period of time (10ms by default), KVM suspends the original guest virtual machine, transfers the remaining data, and resumes the same guest virtual machine on the destination host physical machine.
In contrast, a non-live migration (offline migration) suspends the guest virtual machine and then copies the guest's memory to the destination host machine. The guest is then resumed on the destination host machine and the memory the guest used on the source host machine is freed. The time it takes to complete such a migration only depends on network bandwidth and latency. If the network is experiencing heavy use or low bandwidth, the migration will take much longer. Note that if the original guest virtual machine modifies pages faster than KVM can transfer them to the destination host physical machine, offline migration must be used, as live migration would never complete.
Migration is useful for:
Load balancing
Guest virtual machines can be moved to host physical machines with lower usage if their host machine becomes overloaded, or if another host machine is under-utilized.
Hardware independence
When you need to upgrade, add, or remove hardware devices on the host physical machine, you can safely relocate guest virtual machines to other host physical machines. This means that guest virtual machines do not experience any downtime for hardware improvements.
Energy saving
Virtual machines can be redistributed to other host physical machines, and the unloaded host systems can thus be powered off to save energy and cut costs in low usage periods.
Geographic migration
Virtual machines can be moved to another location for lower latency or when required for other reasons.

15.2. Migration Requirements and Limitations

Before using KVM migration, make sure that your system fulfills the migration's requirements, and that you are aware of its limitations.

Migration requirements

  • A guest virtual machine installed on shared storage using one of the following protocols:
    • Fibre Channel-based LUNs
    • iSCSI
    • NFS
    • GFS2
    • SCSI RDMA protocol (SRP): the block export protocol used in InfiniBand and 10GbE iWARP adapters
  • Make sure that the libvirtd service is enabled and running.
    # systemctl enable libvirtd.service
    # systemctl restart libvirtd.service
  • The ability to migrate effectively is dependent on the parameter settings in the /etc/libvirt/libvirtd.conf file. To edit this file, use the following procedure:

    Procedure 15.1. Configuring libvirtd.conf

    1. Open the /etc/libvirt/libvirtd.conf file as root:
      # vim /etc/libvirt/libvirtd.conf
    2. Change the parameters as needed and save the file.
    3. Restart the libvirtd service:
      # systemctl restart libvirtd
  • The migration platforms and versions should be checked against Table 15.1, “Live Migration Compatibility”
  • Use a separate system exporting the shared storage medium. Storage should not reside on either of the two host physical machines used for the migration.
  • Shared storage must mount at the same location on source and destination systems. The mounted directory names must be identical. Although it is possible to keep the images using different paths, it is not recommended. Note that, if you intend to use virt-manager to perform the migration, the path names must be identical. If you intend to use virsh to perform the migration, different network configurations and mount directories can be used with the help of the --xml option or pre-hooks. For more information on pre-hooks, see the libvirt upstream documentation, and for more information on the XML option, see Chapter 23, Manipulating the Domain XML.
  • When migration is attempted on an existing guest virtual machine in a public bridge+tap network, the source and destination host machines must be located on the same network. Otherwise, the guest virtual machine network will not operate after migration.

Migration Limitations

  • Guest virtual machine migration has the following limitations when used on Red Hat Enterprise Linux with virtualization technology based on KVM:
    • Point-to-point migration – must be done manually to designate the destination hypervisor from the originating hypervisor
    • No validation or roll-back is available
    • Determination of target may only be done manually
    • Storage migration cannot be performed live on Red Hat Enterprise Linux 7, but you can migrate storage while the guest virtual machine is powered down. Live storage migration is available on Red Hat Virtualization. Call your service representative for details.

Note

If you are migrating a guest machine that has virtio devices on it, make sure to set the number of vectors on any virtio device on either platform to 32 or fewer. For detailed information, see Section 23.18, “Devices”.

15.3. Live Migration and Red Hat Enterprise Linux Version Compatibility

Live Migration is supported as shown in Table 15.1, “Live Migration Compatibility”:

Table 15.1. Live Migration Compatibility

Migration Method Release Type Example Live Migration Support Notes
Forward Major release 6.5+ → 7.x Fully supported Any issues should be reported
Backward Major release 7.x → 6.y Not supported
Forward Minor release 7.x → 7.y (7.0 → 7.1) Fully supported Any issues should be reported
Backward Minor release 7.y → 7.x (7.1 → 7.0) Fully supported Any issues should be reported

Troubleshooting problems with migration

  • Issues with the migration protocol — If backward migration ends with "unknown section error", repeating the migration process can repair the issue as it may be a transient error. If not, report the problem.
  • Issues with audio devices — When migrating from Red Hat Enterprise Linux 6.x to Red Hat Enterprise Linux 7.y, note that the es1370 audio card is no longer supported. Use the ac97 audio card instead.
  • Issues with network cards — When migrating from Red Hat Enterprise Linux 6.x to Red Hat Enterprise Linux 7.y, note that the pcnet and ne2k_pci network cards are no longer supported. Use the virtio-net network device instead.
Configuring Network Storage

Configure shared storage and install a guest virtual machine on the shared storage.

15.4. Shared Storage Example: NFS for a Simple Migration

Important

This example uses NFS to share guest virtual machine images with other KVM host physical machines. Although not practical for large installations, it is presented to demonstrate migration techniques only. Do not use this example for migrating or running more than a few guest virtual machines. In addition, the sync parameter must be enabled. This is required for proper export of the NFS storage.
iSCSI storage is a better choice for large deployments. For configuration details, see Section 13.2.3.5, “iSCSI-based storage pools”.
For detailed information on configuring NFS, opening IP tables, and configuring the firewall, see the Red Hat Enterprise Linux Storage Administration Guide.
Make sure that NFS file locking is not used as it is not supported in KVM.
  1. Export your libvirt image directory

    Migration requires storage to reside on a system that is separate from the migration source and destination systems. On this separate system, export the storage by adding the default image directory to the /etc/exports file:
    /var/lib/libvirt/images *.example.com(rw,no_root_squash,sync)
    Change the hostname parameter as required for your environment.
  2. Start NFS

    1. Install the NFS packages if they are not yet installed:
      # yum install nfs-utils
    2. Make sure that the ports for NFS in iptables (2049, for example) are opened and add NFS to the /etc/hosts.allow file.
    3. Start the NFS service:
      # systemctl start nfs-server
  3. Mount the shared storage on the source and the destination

    On the migration source and the destination systems, mount the /var/lib/libvirt/images directory:
    # mount storage_host:/var/lib/libvirt/images /var/lib/libvirt/images

    Warning

    Whichever directory is chosen for the source host physical machine must be exactly the same as that on the destination host physical machine. This applies to all types of shared storage. The directory must be the same or the migration with virt-manager will fail.

15.5. Live KVM Migration with virsh

A guest virtual machine can be migrated to another host physical machine with the virsh command. The migrate command accepts parameters in the following format:
# virsh migrate --live GuestName DestinationURL
Note that the --live option may be omitted when live migration is not required. Additional options are listed in Section 15.5.2, “Additional Options for the virsh migrate Command”.
The GuestName parameter represents the name of the guest virtual machine which you want to migrate.
The DestinationURL parameter is the connection URL of the destination host physical machine. The destination system must run the same version of Red Hat Enterprise Linux, be using the same hypervisor and have libvirt running.

Note

The DestinationURL parameter for normal migration and peer2peer migration has different semantics:
  • normal migration: the DestinationURL is the URL of the target host physical machine as seen from the source guest virtual machine.
  • peer2peer migration: DestinationURL is the URL of the target host physical machine as seen from the source host physical machine.
Once the command is entered, you will be prompted for the root password of the destination system.

Important

Name resolution must be working on both sides (source and destination) in order for migration to succeed. Each side must be able to find the other. Make sure that each side can ping the other to check that name resolution is working.
Example: live migration with virsh

This example migrates from host1.example.com to host2.example.com. Change the host physical machine names for your environment. This example migrates a virtual machine named guest1-rhel7-64.

This example assumes you have fully configured shared storage and meet all the prerequisites (listed here: Migration requirements).
  1. Verify the guest virtual machine is running

    From the source system, host1.example.com, verify guest1-rhel7-64 is running:
    [root@host1 ~]# virsh list
    Id Name                 State
    ----------------------------------
     10 guest1-rhel7-64     running
    
  2. Migrate the guest virtual machine

    Execute the following command to live migrate the guest virtual machine to the destination, host2.example.com. Append /system to the end of the destination URL to tell libvirt that you need full access.
    # virsh migrate --live guest1-rhel7-64 qemu+ssh://host2.example.com/system
    Once the command is entered you will be prompted for the root password of the destination system.
  3. Wait

    The migration may take some time depending on load and the size of the guest virtual machine. virsh only reports errors. The guest virtual machine continues to run on the source host physical machine until fully migrated.
  4. Verify the guest virtual machine has arrived at the destination host

    From the destination system, host2.example.com, verify guest1-rhel7-64 is running:
    [root@host2 ~]# virsh list
    Id Name                 State
    ----------------------------------
     10 guest1-rhel7-64     running
    
The live migration is now complete.

Note

libvirt supports a variety of networking methods including TLS/SSL, UNIX sockets, SSH, and unencrypted TCP. For more information on using other methods, see Chapter 18, Remote Management of Guests.

Note

Non-running guest virtual machines can be migrated using the following command:
# virsh migrate --offline --persistent GuestName DestinationURL

15.5.1. Additional Tips for Migration with virsh

It is possible to perform multiple, concurrent live migrations where each migration runs in a separate command shell. However, this should be done with caution and should involve careful calculations, as each migration instance uses one client connection (MAX_CLIENT) from each side (source and target). As the default setting is 20, this is enough to run 10 instances without changing the settings. Should you need to change the settings, see Procedure 15.1, “Configuring libvirtd.conf”.
  1. Open the libvirtd.conf file as described in Procedure 15.1, “Configuring libvirtd.conf”.
  2. Look for the Processing controls section.
    #################################################################
    #
    # Processing controls
    #
    
    # The maximum number of concurrent client connections to allow
    # over all sockets combined.
    #max_clients = 5000
    
    # The maximum length of queue of connections waiting to be
    # accepted by the daemon. Note, that some protocols supporting
    # retransmission may obey this so that a later reattempt at
    # connection succeeds.
    #max_queued_clients = 1000
    
    # The minimum limit sets the number of workers to start up
    # initially. If the number of active clients exceeds this,
    # then more threads are spawned, upto max_workers limit.
    # Typically you'd want max_workers to equal maximum number
    # of clients allowed
    #min_workers = 5
    #max_workers = 20
    
    
    # The number of priority workers. If all workers from above
    # pool will stuck, some calls marked as high priority
    # (notably domainDestroy) can be executed in this pool.
    #prio_workers = 5
    
    # Total global limit on concurrent RPC calls. Should be
    # at least as large as max_workers. Beyond this, RPC requests
    # will be read into memory and queued. This directly impact
    # memory usage, currently each request requires 256 KB of
    # memory. So by default upto 5 MB of memory is used
    #
    # XXX this isn't actually enforced yet, only the per-client
    # limit is used so far
    #max_requests = 20
    
    # Limit on concurrent requests from a single client
    # connection. To avoid one client monopolizing the server
    # this should be a small fraction of the global max_requests
    # and max_workers parameter
    #max_client_requests = 5
    
    #################################################################
    
  3. Change the max_clients and max_workers parameter settings. It is recommended that both parameters use the same value. Each migration uses 2 clients from max_clients (one per side), and from max_workers it uses 1 worker on the source and 0 workers on the destination during the perform phase, and 1 worker on the destination during the finish phase. (A minimal example of the edited values is shown after this procedure.)

    Important

    The max_clients and max_workers parameter settings are affected by all guest virtual machine connections to the libvirtd service. This means that any user that is using the same guest virtual machine and is performing a migration at the same time will also be subject to the limits set in the max_clients and max_workers parameter settings. This is why the maximum value needs to be considered carefully before performing a concurrent live migration.

    Important

    The max_clients parameter controls how many clients are allowed to connect to libvirt. When a large number of containers are started at once, this limit can be easily reached and exceeded. The value of the max_clients parameter could be increased to avoid this, but doing so can leave the system more vulnerable to denial of service (DoS) attacks against instances. To alleviate this problem, a new max_anonymous_clients setting has been introduced in Red Hat Enterprise Linux 7.0 that specifies a limit of connections which are accepted but not yet authenticated. You can implement a combination of max_clients and max_anonymous_clients to suit your workload.
  4. Save the file and restart the service.

    Note

    There may be cases where a migration connection drops because there are too many ssh sessions that have been started, but not yet authenticated. By default, sshd allows only 10 sessions to be in a "pre-authenticated state" at any time. This setting is controlled by the MaxStartups parameter in the sshd configuration file (located here: /etc/ssh/sshd_config), which may require some adjustment. Adjusting this parameter should be done with caution as the limitation is put in place to prevent DoS attacks (and over-use of resources in general). Setting this value too high will negate its purpose. To change this parameter, edit the file /etc/ssh/sshd_config, remove the # from the beginning of the MaxStartups line, and change the 10 (default value) to a higher number. Remember to save the file and restart the sshd service. For more information, see the sshd_config man page.
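As a minimal sketch, assuming you settle on a limit of 50 clients and 50 workers and raise the sshd pre-authentication limit to 50 (all of these values are illustrative, not recommendations), the edits and service restarts might look like the following. The libvirtd.conf path and restart commands shown are the standard ones on Red Hat Enterprise Linux 7:
# In /etc/libvirt/libvirtd.conf, uncomment and set (illustrative values):
max_clients = 50
max_workers = 50

# In /etc/ssh/sshd_config, uncomment and raise the pre-authentication limit:
MaxStartups 50

# Restart the affected services:
# systemctl restart libvirtd
# systemctl restart sshd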

15.5.2. Additional Options for the virsh migrate Command

In addition to --live, virsh migrate accepts the following options:
  • --direct - used for direct migration
  • --p2p - used for peer-to-peer migration
  • --tunneled - used for tunneled migration
  • --offline - migrates domain definition without starting the domain on destination and without stopping it on source host. Offline migration may be used with inactive domains and it must be used with the --persistent option.
  • --persistent - leaves the domain persistent on destination host physical machine
  • --undefinesource - undefines the domain on the source host physical machine
  • --suspend - leaves the domain paused on the destination host physical machine
  • --change-protection - enforces that no incompatible configuration changes will be made to the domain while the migration is underway; this flag is implicitly enabled when supported by the hypervisor, but can be explicitly used to reject the migration if the hypervisor lacks change protection support.
  • --unsafe - forces the migration to occur, ignoring all safety procedures.
  • --verbose - displays the progress of migration as it is occurring
  • --compressed - activates compression of memory pages that have to be transferred repeatedly during live migration.
  • --abort-on-error - cancels the migration if a soft error (for example I/O error) happens during the migration.
  • --domain [name] - sets the domain name, id or uuid.
  • --desturi [URI] - connection URI of the destination host as seen from the client (normal migration) or source (p2p migration).
  • --migrateuri [URI] - the migration URI, which can usually be omitted.
  • --graphicsuri [URI] - graphics URI to be used for seamless graphics migration.
  • --listen-address [address] - sets the listen address that the hypervisor on the destination side should bind to for incoming migration.
  • --timeout [seconds] - forces a guest virtual machine to suspend when the live migration counter exceeds the specified number of seconds. It can only be used with a live migration. Once the timeout is initiated, the migration continues on the suspended guest virtual machine.
  • --dname [newname] - used for renaming the domain during migration; this option can also usually be omitted
  • --xml [filename] - the filename indicated can be used to supply an alternative XML file for use on the destination to supply a larger set of changes to any host-specific portions of the domain XML, such as accounting for naming differences between source and destination in accessing underlying storage. This option is usually omitted.
  • --migrate-disks [disk_identifiers] - this option can be used to select which disks are copied during the migration. This allows for more efficient live migration when copying certain disks is undesirable, such as when they already exist on the destination, or when they are no longer useful. [disk_identifiers] should be replaced by a comma-separated list of disks to be migrated, identified by their arguments found in the <target dev= /> line of the Domain XML file.
In addition, the following commands may help as well:
  • virsh migrate-setmaxdowntime [domain] [downtime] - will set a maximum tolerable downtime for a domain which is being live-migrated to another host. The specified downtime is in milliseconds. The domain specified must be the same domain that is being migrated.
  • virsh migrate-compcache [domain] --size - sets or gets the size, in bytes, of the cache used for compressing repeatedly transferred memory pages during a live migration. When --size is not specified, the command displays the current size of the compression cache. When --size is specified in bytes, the hypervisor is asked to change the compression cache to the indicated size, and the current size is then displayed. The --size argument is intended to be used while the domain is being live migrated, as a reaction to the migration progress and an increasing number of compression cache misses reported by domjobinfo.
  • virsh migrate-setspeed [domain] [bandwidth] - sets the migration bandwidth in MiB/sec for the specified domain which is being migrated to another host.
  • virsh migrate-getspeed [domain] - gets the maximum migration bandwidth that is available in MiB/sec for the specified domain.
For more information, see Migration Limitations or the virsh man page.
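For example, the tuning commands above can be issued from a second shell while a migration of the guest from the earlier example is in progress (a minimal sketch; the downtime, cache size, and bandwidth values are illustrative only):
# virsh migrate-setmaxdowntime guest1-rhel7-64 100
# virsh migrate-compcache guest1-rhel7-64 --size 67108864
# virsh migrate-setspeed guest1-rhel7-64 500
# virsh migrate-getspeed guest1-rhel7-64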

15.6. Migrating with virt-manager

This section covers migrating a KVM guest virtual machine with virt-manager from one host physical machine to another.
  1. Connect to the target host physical machine

    In the virt-manager interface, connect to the target host physical machine by selecting the File menu, then click Add Connection.
  2. Add connection

    The Add Connection window appears.
    Adding a connection to the target host physical machine

    Figure 15.1. Adding a connection to the target host physical machine

    Enter the following details:
    • Hypervisor: Select QEMU/KVM.
    • Method: Select the connection method.
    • Username: Enter the user name for the remote host physical machine.
    • Hostname: Enter the host name for the remote host physical machine.

    Note

    For more information on the connection options, see Section 19.5, “Adding a Remote Connection”.
    Click Connect. An SSH connection is used in this example, so the specified user's password must be entered in the next step.
    Enter password

    Figure 15.2. Enter password

  3. Configure shared storage

    Ensure that both the source and the target host share storage, for example by using NFS (a minimal NFS sketch follows this procedure).
  4. Migrate guest virtual machines

    Right-click the guest that is to be migrated, and click Migrate.
    In the New Host field, use the drop-down list to select the host physical machine you wish to migrate the guest virtual machine to and click Migrate.
    Choosing the destination host physical machine and starting the migration process

    Figure 15.3. Choosing the destination host physical machine and starting the migration process

    A progress window appears.
    Progress window

    Figure 15.4. Progress window

    If the migration finishes without any problems, virt-manager displays the newly migrated guest virtual machine running in the destination host.
    Migrated guest virtual machine running in the destination host physical machine

    Figure 15.5. Migrated guest virtual machine running in the destination host physical machine
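As a minimal sketch of the shared storage referenced in step 3, assuming an NFS export of /var/lib/libvirt/images is served from a separate storage host (the host name nfs.example.com and the export path are illustrative assumptions), the configuration might look like the following:
# On the NFS server, in /etc/exports (illustrative entry):
/var/lib/libvirt/images host1.example.com(rw,no_root_squash) host2.example.com(rw,no_root_squash)
# exportfs -r

# On both host1.example.com and host2.example.com:
# mount nfs.example.com:/var/lib/libvirt/images /var/lib/libvirt/images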

Chapter 16. Guest Virtual Machine Device Configuration

Red Hat Enterprise Linux 7 supports three classes of devices for guest virtual machines:
  • Emulated devices are purely virtual devices that mimic real hardware, allowing unmodified guest operating systems to work with them using their standard in-box drivers.
  • Virtio devices (also known as paravirtualized) are purely virtual devices designed to work optimally in a virtual machine. Virtio devices are similar to emulated devices, but non-Linux virtual machines do not include the drivers they require by default. Virtualization management software like the Virtual Machine Manager (virt-manager) and the Red Hat Virtualization Hypervisor install these drivers automatically for supported non-Linux guest operating systems. Red Hat Enterprise Linux 7 supports up to 216 virtio devices. For more information, see Chapter 5, KVM Paravirtualized (virtio) Drivers.
  • Assigned devices are physical devices that are exposed to the virtual machine. This method is also known as passthrough. Device assignment allows virtual machines exclusive access to PCI devices for a range of tasks, and allows PCI devices to appear and behave as if they were physically attached to the guest operating system. Red Hat Enterprise Linux 7 supports up to 32 assigned devices per virtual machine.
    Device assignment is supported on PCIe devices, including select graphics devices. Parallel PCI devices may be supported as assigned devices, but they have severe limitations due to security and system configuration conflicts.
Red Hat Enterprise Linux 7 supports PCI hot plug of devices exposed as single-function slots to the virtual machine. Single-function host devices and individual functions of multi-function host devices may be configured to enable this. Configurations exposing devices as multi-function PCI slots to the virtual machine are recommended only for non-hotplug applications.
For more information on specific devices and related limitations, see Section 23.18, “Devices”.

Note

Platform support for interrupt remapping is required to fully isolate a guest with assigned devices from the host. Without such support, the host may be vulnerable to interrupt injection attacks from a malicious guest. In an environment where guests are trusted, the admin may opt-in to still allow PCI device assignment using the allow_unsafe_interrupts option to the vfio_iommu_type1 module. This may either be done persistently by adding a .conf file (for example local.conf) to /etc/modprobe.d containing the following:
options vfio_iommu_type1 allow_unsafe_interrupts=1
or dynamically using the sysfs entry to do the same:
# echo 1 > /sys/module/vfio_iommu_type1/parameters/allow_unsafe_interrupts

16.1. PCI Devices

PCI device assignment is only available on hardware platforms supporting either Intel VT-d or AMD IOMMU. These Intel VT-d or AMD IOMMU specifications must be enabled in the host BIOS for PCI device assignment to function.

Procedure 16.1. Preparing an Intel system for PCI device assignment

  1. Enable the Intel VT-d specifications

    The Intel VT-d specifications provide hardware support for directly assigning a physical device to a virtual machine. These specifications are required to use PCI device assignment with Red Hat Enterprise Linux.
    The Intel VT-d specifications must be enabled in the BIOS. Some system manufacturers disable these specifications by default. The terms used to see these specifications can differ between manufacturers; consult your system manufacturer's documentation for the appropriate terms.
  2. Activate Intel VT-d in the kernel

    Activate Intel VT-d in the kernel by adding the intel_iommu=on and iommu=pt parameters to the end of the GRUB_CMDLINE_LINUX line, within the quotes, in the /etc/sysconfig/grub file.
    The example below is a modified grub file with Intel VT-d activated.
    GRUB_CMDLINE_LINUX="rd.lvm.lv=vg_VolGroup00/LogVol01
    vconsole.font=latarcyrheb-sun16 rd.lvm.lv=vg_VolGroup_1/root
    vconsole.keymap=us $([ -x /usr/sbin/rhcrashkernel-param ] && /usr/sbin/
    rhcrashkernel-param || :) rhgb quiet intel_iommu=on iommu=pt"
  3. Regenerate config file

    Regenerate /etc/grub2.cfg by running:
    grub2-mkconfig -o /etc/grub2.cfg
    Note that if you are using a UEFI-based host, the target file should be /etc/grub2-efi.cfg.
  4. Ready to use

    Reboot the system to enable the changes. Your system is now capable of PCI device assignment.
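After the reboot, you can optionally confirm that the kernel parameters took effect and that the IOMMU was initialized (a minimal sketch; the exact DMAR messages vary by system):
# grep intel_iommu /proc/cmdline
# dmesg | grep -e DMAR -e IOMMU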

Procedure 16.2. Preparing an AMD system for PCI device assignment

  1. Enable the AMD IOMMU specifications

    The AMD IOMMU specifications are required to use PCI device assignment in Red Hat Enterprise Linux. These specifications must be enabled in the BIOS. Some system manufacturers disable these specifications by default.
  2. Enable IOMMU kernel support

    Append amd_iommu=pt to the end of the GRUB_CMDLINE_LINUX line, within the quotes, in /etc/sysconfig/grub so that AMD IOMMU specifications are enabled at boot.
  3. Regenerate config file

    Regenerate /etc/grub2.cfg by running:
    grub2-mkconfig -o /etc/grub2.cfg
    Note that if you are using a UEFI-based host, the target file should be /etc/grub2-efi.cfg.
  4. Ready to use

    Reboot the system to enable the changes. Your system is now capable of PCI device assignment.
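Similarly, after the reboot on an AMD system, you can optionally confirm that the IOMMU was initialized (a minimal sketch; message wording varies with the kernel version):
# grep amd_iommu /proc/cmdline
# dmesg | grep -i AMD-Vi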

Note

For further information on IOMMU, see Appendix E, Working with IOMMU Groups.

16.1.1. Assigning a PCI Device with virsh

These steps cover assigning a PCI device to a virtual machine on a KVM hypervisor.
This example uses a PCIe network controller with the PCI identifier code, pci_0000_01_00_0, and a fully virtualized guest machine named guest1-rhel7-64.

Procedure 16.3. Assigning a PCI device to a guest virtual machine with virsh

  1. Identify the device

    First, identify the PCI device designated for device assignment to the virtual machine. Use the lspci command to list the available PCI devices. You can refine the output of lspci with grep.
    This example uses the Ethernet controller highlighted in the following output:
    # lspci | grep Ethernet
    00:19.0 Ethernet controller: Intel Corporation 82567LM-2 Gigabit Network Connection
    01:00.0 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
    01:00.1 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
    This Ethernet controller is shown with the short identifier 00:19.0. We need to find out the full identifier used by virsh in order to assign this PCI device to a virtual machine.
    To do so, use the virsh nodedev-list command to list all devices of a particular type (pci) that are attached to the host machine. Then look at the output for the string that maps to the short identifier of the device you wish to use.
    This example shows the string that maps to the Ethernet controller with the short identifier 00:19.0. Note that the : and . characters are replaced with underscores in the full identifier.
    # virsh nodedev-list --cap pci
    pci_0000_00_00_0
    pci_0000_00_01_0
    pci_0000_00_03_0
    pci_0000_00_07_0
    pci_0000_00_10_0
    pci_0000_00_10_1
    pci_0000_00_14_0
    pci_0000_00_14_1
    pci_0000_00_14_2
    pci_0000_00_14_3
    pci_0000_00_19_0
    pci_0000_00_1a_0
    pci_0000_00_1a_1
    pci_0000_00_1a_2
    pci_0000_00_1a_7
    pci_0000_00_1b_0
    pci_0000_00_1c_0
    pci_0000_00_1c_1
    pci_0000_00_1c_4
    pci_0000_00_1d_0
    pci_0000_00_1d_1
    pci_0000_00_1d_2
    pci_0000_00_1d_7
    pci_0000_00_1e_0
    pci_0000_00_1f_0
    pci_0000_00_1f_2
    pci_0000_00_1f_3
    pci_0000_01_00_0
    pci_0000_01_00_1
    pci_0000_02_00_0
    pci_0000_02_00_1
    pci_0000_06_00_0
    pci_0000_07_02_0
    pci_0000_07_03_0
    Record the PCI device number that maps to the device you want to use; this is required in other steps.
  2. Review device information

    Information on the domain, bus, and function are available from output of the virsh nodedev-dumpxml command:
    
    # virsh nodedev-dumpxml pci_0000_00_19_0
    <device>
      <name>pci_0000_00_19_0</name>
      <parent>computer</parent>
      <driver>
        <name>e1000e</name>
      </driver>
      <capability type='pci'>
        <domain>0</domain>
        <bus>0</bus>
        <slot>25</slot>
        <function>0</function>
        <product id='0x1502'>82579LM Gigabit Network Connection</product>
        <vendor id='0x8086'>Intel Corporation</vendor>
        <iommuGroup number='7'>
          <address domain='0x0000' bus='0x00' slot='0x19' function='0x0'/>
        </iommuGroup>
      </capability>
    </device>
    

    Figure 16.1. Dump contents

    Note

    An IOMMU group is determined based on the visibility and isolation of devices from the perspective of the IOMMU. Each IOMMU group may contain one or more devices. When multiple devices are present, all endpoints within the IOMMU group must be claimed for any device within the group to be assigned to a guest. This can be accomplished either by also assigning the extra endpoints to the guest or by detaching them from the host driver using virsh nodedev-detach. Devices contained within a single group may not be split between multiple guests or split between host and guest. Non-endpoint devices such as PCIe root ports, switch ports, and bridges should not be detached from the host drivers and will not interfere with assignment of endpoints.
    Devices within an IOMMU group can be determined using the iommuGroup section of the virsh nodedev-dumpxml output. Each member of the group is provided via a separate "address" field. This information may also be found in sysfs using the following:
    $ ls /sys/bus/pci/devices/0000:01:00.0/iommu_group/devices/
    An example of the output from this would be:
    0000:01:00.0  0000:01:00.1
    To assign only 0000:01:00.0 to the guest, the unused endpoint should be detached from the host before starting the guest:
    $ virsh nodedev-detach pci_0000_01_00_1
  3. Determine required configuration details

    See the output from the virsh nodedev-dumpxml pci_0000_00_19_0 command for the values required for the configuration file.
    The example device has the following values: bus = 0, slot = 25 and function = 0. The decimal configuration uses those three values:
    bus='0'
    slot='25'
    function='0'
  4. Add configuration details

    Run virsh edit, specifying the virtual machine name, and add a device entry in the <source> section to assign the PCI device to the guest virtual machine.
    # virsh edit guest1-rhel7-64
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <source>
         <address domain='0' bus='0' slot='25' function='0'/>
      </source>
    </hostdev>
    

    Figure 16.2. Add PCI device

    Alternatively, run virsh attach-device, specifying the virtual machine name and an XML file that contains the device definition (a minimal sketch of such a file follows this procedure):
    virsh attach-device guest1-rhel7-64 file.xml
  5. Start the virtual machine

    # virsh start guest1-rhel7-64
The PCI device should now be successfully assigned to the virtual machine, and accessible to the guest operating system.
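As a minimal sketch of the alternative mentioned in step 4, the hostdev definition can be written to a standalone file (here /tmp/pci-device.xml, an illustrative name) and attached persistently; the --config option makes the device part of the persistent domain definition:
# cat > /tmp/pci-device.xml <<EOF
<hostdev mode='subsystem' type='pci' managed='yes'>
  <source>
    <address domain='0' bus='0' slot='25' function='0'/>
  </source>
</hostdev>
EOF
# virsh attach-device guest1-rhel7-64 /tmp/pci-device.xml --config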

16.1.2. Assigning a PCI Device with virt-manager

PCI devices can be added to guest virtual machines using the graphical virt-manager tool. The following procedure adds a Gigabit Ethernet controller to a guest virtual machine.

Procedure 16.4. Assigning a PCI device to a guest virtual machine using virt-manager

  1. Open the hardware settings

    Open the guest virtual machine and click the Add Hardware button to add a new device to the virtual machine.
    The virtual machine hardware window with the Information button selected on the top taskbar and Overview selected on the left menu pane.

    Figure 16.3. The virtual machine hardware information window

  2. Select a PCI device

    Select PCI Host Device from the Hardware list on the left.
    Select an unused PCI device. Note that selecting PCI devices presently in use by another guest causes errors. In this example, a spare audio controller is used. Click Finish to complete setup.
    The Add new virtual hardware wizard with PCI Host Device selected on the left menu pane, showing a list of host devices for selection in the right menu pane.

    Figure 16.4. The Add new virtual hardware wizard

  3. Add the new device

    The setup is complete and the guest virtual machine now has direct access to the PCI device.
    The virtual machine hardware window with the Information button selected on the top taskbar and Overview selected on the left menu pane, displaying the newly added PCI Device in the list of virtual machine devices in the left menu pane.

    Figure 16.5. The virtual machine hardware information window

Note

If device assignment fails, there may be other endpoints in the same IOMMU group that are still attached to the host. There is no way to retrieve group information using virt-manager, but virsh commands can be used to analyze the bounds of the IOMMU group and if necessary sequester devices.
See the Note in Section 16.1.1, “Assigning a PCI Device with virsh” for more information on IOMMU groups and how to detach endpoint devices using virsh.

16.1.3. PCI Device Assignment with virt-install

It is possible to assign a PCI device when installing a guest using the virt-install command. To do this, use the --host-device parameter.

Procedure 16.5. Assigning a PCI device to a virtual machine with virt-install

  1. Identify the device

    Identify the PCI device designated for device assignment to the guest virtual machine.
    # lspci | grep Ethernet
    00:19.0 Ethernet controller: Intel Corporation 82567LM-2 Gigabit Network Connection
    01:00.0 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
    01:00.1 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
    The virsh nodedev-list command lists all devices attached to the system, and identifies each PCI device with a string. To limit output to only PCI devices, enter the following command:
    # virsh nodedev-list --cap pci
    pci_0000_00_00_0
    pci_0000_00_01_0
    pci_0000_00_03_0
    pci_0000_00_07_0
    pci_0000_00_10_0
    pci_0000_00_10_1
    pci_0000_00_14_0
    pci_0000_00_14_1
    pci_0000_00_14_2
    pci_0000_00_14_3
    pci_0000_00_19_0
    pci_0000_00_1a_0
    pci_0000_00_1a_1
    pci_0000_00_1a_2
    pci_0000_00_1a_7
    pci_0000_00_1b_0
    pci_0000_00_1c_0
    pci_0000_00_1c_1
    pci_0000_00_1c_4
    pci_0000_00_1d_0
    pci_0000_00_1d_1
    pci_0000_00_1d_2
    pci_0000_00_1d_7
    pci_0000_00_1e_0
    pci_0000_00_1f_0
    pci_0000_00_1f_2
    pci_0000_00_1f_3
    pci_0000_01_00_0
    pci_0000_01_00_1
    pci_0000_02_00_0
    pci_0000_02_00_1
    pci_0000_06_00_0
    pci_0000_07_02_0
    pci_0000_07_03_0
    Record the PCI device number; the number is needed in other steps.
    Information on the domain, bus and function are available from output of the virsh nodedev-dumpxml command:
    # virsh nodedev-dumpxml pci_0000_01_00_0
    
    <device>
      <name>pci_0000_01_00_0</name>
      <parent>pci_0000_00_01_0</parent>
      <driver>
        <name>igb</name>
      </driver>
      <capability type='pci'>
        <domain>0</domain>
        <bus>1</bus>
        <slot>0</slot>
        <function>0</function>
        <product id='0x10c9'>82576 Gigabit Network Connection</product>
        <vendor id='0x8086'>Intel Corporation</vendor>
        <iommuGroup number='7'>
          <address domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
          <address domain='0x0000' bus='0x01' slot='0x00' function='0x1'/>
        </iommuGroup>
      </capability>
    </device>
    

    Figure 16.6. PCI device file contents

    Note

    If there are multiple endpoints in the IOMMU group and not all of them are assigned to the guest, you will need to manually detach the other endpoint(s) from the host by running the following command before you start the guest:
    $ virsh nodedev-detach pci_0000_01_00_1
    See the Note in Section 16.1.1, “Assigning a PCI Device with virsh” for more information on IOMMU groups.
  2. Add the device

    Use the PCI identifier output from the virsh nodedev command as the value for the --host-device parameter.
    virt-install \
    --name=guest1-rhel7-64 \
    --disk path=/var/lib/libvirt/images/guest1-rhel7-64.img,size=8 \
    --vcpus=2 --ram=2048 \
    --location=http://example1.com/installation_tree/RHEL7.0-Server-x86_64/os \
    --nonetworks \
    --os-type=linux \
    --os-variant=rhel7 \
    --host-device=pci_0000_01_00_0
  3. Complete the installation

    Complete the guest installation. The PCI device should be attached to the guest.

16.1.4. Detaching an Assigned PCI Device

When a host PCI device has been assigned to a guest machine, the host can no longer use the device. If the PCI device is in managed mode (configured using the managed='yes' parameter in the domain XML file), it is automatically detached from the host, attached to the guest machine, and re-attached to the host machine when the guest no longer needs it. If the PCI device is not in managed mode, you can detach the PCI device from the guest machine and re-attach it manually using virsh or virt-manager.

Procedure 16.6. Detaching a PCI device from a guest with virsh

  1. Detach the device

    Use the following command to detach the PCI device from the guest by removing it in the guest's XML file:
    # virsh detach-device name_of_guest file.xml
  2. Re-attach the device to the host (optional)

    If the device is in managed mode, skip this step. The device will be returned to the host automatically.
    If the device is not using managed mode, use the following command to re-attach the PCI device to the host machine:
    # virsh nodedev-reattach device
    For example, to re-attach the pci_0000_01_00_0 device to the host:
    # virsh nodedev-reattach pci_0000_01_00_0
    The device is now available for host use.

Procedure 16.7. Detaching a PCI Device from a guest with virt-manager

  1. Open the virtual hardware details screen

    In virt-manager, double-click the virtual machine that contains the device. Select the Show virtual hardware details button to display a list of virtual hardware.
    The Show virtual hardware details button.

    Figure 16.7. The virtual hardware details button

  2. Select and remove the device

    Select the PCI device to be detached from the list of virtual devices in the left panel.
    The PCI device details and the Remove button.

    Figure 16.8. Selecting the PCI device to be detached

    Click the Remove button to confirm. The device is now available for host use.

16.1.5. Creating PCI Bridges

Peripheral Component Interconnect (PCI) bridges are used to attach devices such as network cards, modems, and sound cards. Just like their physical counterparts, virtual devices can also be attached to a PCI bridge. In the past, only 31 PCI devices could be added to any guest virtual machine. Now, when a 31st PCI device is added, a PCI bridge is automatically placed in the 31st slot, moving the additional PCI device to the PCI bridge. Each PCI bridge has 31 slots for 31 additional devices, all of which can be bridges. In this manner, over 900 devices can be made available to guest virtual machines. Note that this action cannot be performed when the guest virtual machine is running. You must add the PCI device to a guest virtual machine that is shut down.
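As a minimal sketch, a PCI bridge appears in the domain XML as a pci-bridge controller. libvirt normally adds such an entry automatically when the root bus fills up, but it can also be declared explicitly; the index and address values below are illustrative assumptions:
  <controller type='pci' index='0' model='pci-root'/>
  <controller type='pci' index='1' model='pci-bridge'>
    <address type='pci' domain='0x0000' bus='0x00' slot='0x1e' function='0x0'/>
  </controller>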

16.1.5.1. PCI Bridge hot plug/hot unplug Support

PCI Bridge hot plug/hot unplug is supported on the following device types:
  • virtio-net-pci
  • virtio-scsi-pci
  • e1000
  • rtl8139
  • virtio-serial-pci
  • virtio-balloon-pci

16.1.6. PCI Device Assignment Restrictions

PCI device assignment (attaching PCI devices to virtual machines) requires host systems to have AMD IOMMU or Intel VT-d support to enable device assignment of PCIe devices.
Red Hat Enterprise Linux 7 has limited PCI configuration space access by guest device drivers. This limitation could cause drivers that are dependent on device capabilities or features present in the extended PCI configuration space, to fail configuration.
There is a limit of 32 total assigned devices per Red Hat Enterprise Linux 7 virtual machine. This translates to 32 total PCI functions, regardless of the number of PCI bridges present in the virtual machine or how those functions are combined to create multi-function slots.
Platform support for interrupt remapping is required to fully isolate a guest with assigned devices from the host. Without such support, the host may be vulnerable to interrupt injection attacks from a malicious guest. In an environment where guests are trusted, the administrator may opt-in to still allow PCI device assignment using the allow_unsafe_interrupts option to the vfio_iommu_type1 module. This may either be done persistently by adding a .conf file (for example local.conf) to /etc/modprobe.d containing the following:
options vfio_iommu_type1 allow_unsafe_interrupts=1
or dynamically using the sysfs entry to do the same:
# echo 1 > /sys/module/vfio_iommu_type1/parameters/allow_unsafe_interrupts

16.2. PCI Device Assignment with SR-IOV Devices

A PCI network device (specified in the domain XML by the <source> element) can be directly connected to the guest using direct device assignment (sometimes referred to as passthrough). Due to limitations in standard single-port PCI ethernet card driver design, only Single Root I/O Virtualization (SR-IOV) virtual function (VF) devices can be assigned in this manner; to assign a standard single-port PCI or PCIe Ethernet card to a guest, use the traditional <hostdev> device definition.

  <devices>
    <interface type='hostdev'>
      <driver name='vfio'/>
      <source>
        <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
      </source>
      <mac address='52:54:00:6d:90:02'/>
      <virtualport type='802.1Qbh'>
        <parameters profileid='finance'/>
      </virtualport>
    </interface>
  </devices>

Figure 16.9. XML example for PCI device assignment

Developed by the PCI-SIG (PCI Special Interest Group), the Single Root I/O Virtualization (SR-IOV) specification is a standard for a type of PCI device assignment that can share a single device to multiple virtual machines. SR-IOV improves device performance for virtual machines.
How SR-IOV works

Figure 16.10. How SR-IOV works

SR-IOV enables a Single Root Function (for example, a single Ethernet port), to appear as multiple, separate, physical devices. A physical device with SR-IOV capabilities can be configured to appear in the PCI configuration space as multiple functions. Each device has its own configuration space complete with Base Address Registers (BARs).
SR-IOV uses two PCI functions:
  • Physical Functions (PFs) are full PCIe devices that include the SR-IOV capabilities. Physical Functions are discovered, managed, and configured as normal PCI devices. Physical Functions configure and manage the SR-IOV functionality by assigning Virtual Functions.
  • Virtual Functions (VFs) are simple PCIe functions that only process I/O. Each Virtual Function is derived from a Physical Function. The number of Virtual Functions a device may have is limited by the device hardware. A single Ethernet port, the Physical Device, may map to many Virtual Functions that can be shared to virtual machines.
The hypervisor can assign one or more Virtual Functions to a virtual machine. The Virtual Function's configuration space is then assigned to the configuration space presented to the guest.
Each Virtual Function can only be assigned to a single guest at a time, as Virtual Functions require real hardware resources. A virtual machine can have multiple Virtual Functions. A Virtual Function appears as a network card in the same way as a normal network card would appear to an operating system.
The SR-IOV drivers are implemented in the kernel. The core implementation is contained in the PCI subsystem, but there must also be driver support for both the Physical Function (PF) and Virtual Function (VF) devices. An SR-IOV capable device can allocate VFs from a PF. The VFs appear as PCI devices which are backed on the physical PCI device by resources such as queues and register sets.

16.2.1. Advantages of SR-IOV

SR-IOV devices can share a single physical port with multiple virtual machines.
When an SR-IOV VF is assigned to a virtual machine, it can be configured to (transparently to the virtual machine) place all network traffic leaving the VF onto a particular VLAN. The virtual machine cannot detect that its traffic is being tagged for a VLAN, and will be unable to change or eliminate this tagging.
Virtual Functions have near-native performance and provide better performance than paravirtualized drivers and emulated access. Virtual Functions provide data protection between virtual machines on the same physical server as the data is managed and controlled by the hardware.
These features allow for increased virtual machine density on hosts within a data center.
SR-IOV is better able to utilize the bandwidth of devices with multiple guests.

16.2.2. Using SR-IOV

This section covers the use of PCI passthrough to assign a Virtual Function of an SR-IOV capable multiport network card to a virtual machine as a network device.
SR-IOV Virtual Functions (VFs) can be assigned to virtual machines by adding a device entry in <hostdev> with the virsh edit or virsh attach-device command. However, this can be problematic, because unlike a regular network device, an SR-IOV VF network device does not have a permanent unique MAC address, and is assigned a new MAC address each time the host is rebooted. Because of this, even if the guest is assigned the same VF after a host reboot, the guest sees a network adapter with a new MAC address. As a result, the guest believes there is new hardware connected each time, and will usually require re-configuration of the guest's network settings.
libvirt contains the <interface type='hostdev'> interface device. Using this interface device, libvirt will first perform any network-specific hardware/switch initialization indicated (such as setting the MAC address, VLAN tag, or 802.1Qbh virtualport parameters), then perform the PCI device assignment to the guest.
Using the <interface type='hostdev'> interface device requires:
  • an SR-IOV-capable network card,
  • host hardware that supports either the Intel VT-d or the AMD IOMMU extensions
  • the PCI address of the VF to be assigned.

Important

Assignment of an SR-IOV device to a virtual machine requires that the host hardware supports the Intel VT-d or the AMD IOMMU specification.
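One convenient way to check these requirements is the virt-host-validate command shipped with libvirt, which reports whether hardware virtualization and IOMMU support are enabled. The output below is abbreviated and illustrative; the exact checks printed vary with the libvirt version:
# virt-host-validate
  QEMU: Checking for hardware virtualization         : PASS
  QEMU: Checking if device /dev/kvm exists           : PASS
  QEMU: Checking for device assignment IOMMU support : PASS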
To attach an SR-IOV network device on an Intel or an AMD system, follow this procedure:

Procedure 16.8. Attach an SR-IOV network device on an Intel or AMD system

  1. Enable Intel VT-d or the AMD IOMMU specifications in the BIOS and kernel

    On an Intel system, enable Intel VT-d in the BIOS if it is not enabled already. See Procedure 16.1, “Preparing an Intel system for PCI device assignment” for procedural help on enabling Intel VT-d in the BIOS and kernel.
    Skip this step if Intel VT-d is already enabled and working.
    On an AMD system, enable the AMD IOMMU specifications in the BIOS if they are not enabled already. See Procedure 16.2, “Preparing an AMD system for PCI device assignment” for procedural help on enabling IOMMU in the BIOS.
  2. Verify support

    Verify if the PCI device with SR-IOV capabilities is detected. This example lists an Intel 82576 network interface card which supports SR-IOV. Use the lspci command to verify whether the device was detected.
    # lspci
    03:00.0 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
    03:00.1 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
    Note that the output has been modified to remove all other devices.
  3. Activate Virtual Functions

    Run the following command:
    # echo ${num_vfs} > /sys/class/net/enp14s0f0/device/sriov_numvfs
  4. Make the Virtual Functions persistent

    To make the Virtual Functions persistent across reboots, use the editor of your choice to create a udev rule similar to the following, where you specify the intended number of VFs (in this example, 2), up to the limit supported by the network interface card. In the following example, replace enp14s0f0 with the PF network device name(s) and adjust the value of ENV{ID_NET_DRIVER} to match the driver in use:
    # vim /etc/udev/rules.d/enp14s0f0.rules
    ACTION=="add", SUBSYSTEM=="net", ENV{ID_NET_DRIVER}=="ixgbe",
    ATTR{device/sriov_numvfs}="2"
    
    This will ensure the feature is enabled at boot-time.
  5. Inspect the new Virtual Functions

    Using the lspci command, list the newly added Virtual Functions attached to the Intel 82576 network device. (Alternatively, use grep to search for Virtual Function in the output, to find the devices that support Virtual Functions.)
    # lspci | grep 82576
    0b:00.0 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
    0b:00.1 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
    0b:10.0 Ethernet controller: Intel Corporation 82576 Virtual Function (rev 01)
    0b:10.1 Ethernet controller: Intel Corporation 82576 Virtual Function (rev 01)
    0b:10.2 Ethernet controller: Intel Corporation 82576 Virtual Function (rev 01)
    0b:10.3 Ethernet controller: Intel Corporation 82576 Virtual Function (rev 01)
    0b:10.4 Ethernet controller: Intel Corporation 82576 Virtual Function (rev 01)
    0b:10.5 Ethernet controller: Intel Corporation 82576 Virtual Function (rev 01)
    0b:10.6 Ethernet controller: Intel Corporation 82576 Virtual Function (rev 01)
    0b:10.7 Ethernet controller: Intel Corporation 82576 Virtual Function (rev 01)
    0b:11.0 Ethernet controller: Intel Corporation 82576 Virtual Function (rev 01)
    0b:11.1 Ethernet controller: Intel Corporation 82576 Virtual Function (rev 01)
    0b:11.2 Ethernet controller: Intel Corporation 82576 Virtual Function (rev 01)
    0b:11.3 Ethernet controller: Intel Corporation 82576 Virtual Function (rev 01)
    0b:11.4 Ethernet controller: Intel Corporation 82576 Virtual Function (rev 01)
    0b:11.5 Ethernet controller: Intel Corporation 82576 Virtual Function (rev 01)
    The identifier for the PCI device is found with the -n parameter of the lspci command. The Physical Functions correspond to 0b:00.0 and 0b:00.1. All Virtual Functions have Virtual Function in the description.
  6. Verify devices exist with virsh

    The libvirt service must recognize the device before adding a device to a virtual machine. libvirt uses a similar notation to the lspci output. All punctuation characters, : and ., in lspci output are changed to underscores (_).
    Use the virsh nodedev-list command and the grep command to filter the Intel 82576 network device from the list of available host devices. 0b is the filter for the Intel 82576 network devices in this example. This may vary for your system and may result in additional devices.
    # virsh nodedev-list | grep 0b
    pci_0000_0b_00_0
    pci_0000_0b_00_1
    pci_0000_0b_10_0
    pci_0000_0b_10_1
    pci_0000_0b_10_2
    pci_0000_0b_10_3
    pci_0000_0b_10_4
    pci_0000_0b_10_5
    pci_0000_0b_10_6
    pci_0000_0b_10_7
    pci_0000_0b_11_0
    pci_0000_0b_11_1
    pci_0000_0b_11_2
    pci_0000_0b_11_3
    pci_0000_0b_11_4
    pci_0000_0b_11_5
    The PCI addresses for the Virtual Functions and Physical Functions should be in the list.
  7. Get device details with virsh

    The device pci_0000_0b_00_0 is one of the Physical Functions and pci_0000_0b_10_0 is the first corresponding Virtual Function for that Physical Function. Use the virsh nodedev-dumpxml command to get device details for both devices. Note that the sample output below was captured on a system where the adapter appears on PCI bus 03 rather than 0b; substitute the identifiers reported on your own system.
    # virsh nodedev-dumpxml pci_0000_03_00_0
    <device>
      <name>pci_0000_03_00_0</name>
      <path>/sys/devices/pci0000:00/0000:00:01.0/0000:03:00.0</path>
      <parent>pci_0000_00_01_0</parent>
      <driver>
        <name>igb</name>
      </driver>
      <capability type='pci'>
        <domain>0</domain>
        <bus>3</bus>
        <slot>0</slot>
        <function>0</function>
        <product id='0x10c9'>82576 Gigabit Network Connection</product>
        <vendor id='0x8086'>Intel Corporation</vendor>
        <capability type='virt_functions'>
          <address domain='0x0000' bus='0x03' slot='0x10' function='0x0'/>
          <address domain='0x0000' bus='0x03' slot='0x10' function='0x2'/>
          <address domain='0x0000' bus='0x03' slot='0x10' function='0x4'/>
          <address domain='0x0000' bus='0x03' slot='0x10' function='0x6'/>
          <address domain='0x0000' bus='0x03' slot='0x11' function='0x0'/>
          <address domain='0x0000' bus='0x03' slot='0x11' function='0x2'/>
          <address domain='0x0000' bus='0x03' slot='0x11' function='0x4'/>
        </capability>
        <iommuGroup number='14'>
          <address domain='0x0000' bus='0x03' slot='0x00' function='0x0'/>
          <address domain='0x0000' bus='0x03' slot='0x00' function='0x1'/>
        </iommuGroup>
      </capability>
    </device>
    # virsh nodedev-dumpxml pci_0000_03_11_5
    <device>
      <name>pci_0000_03_11_5</name>
      <path>/sys/devices/pci0000:00/0000:00:01.0/0000:03:11.5</path>
      <parent>pci_0000_00_01_0</parent>
      <driver>
        <name>igbvf</name>
      </driver>
      <capability type='pci'>
        <domain>0</domain>
        <bus>3</bus>
        <slot>17</slot>
        <function>5</function>
        <product id='0x10ca'>82576 Virtual Function</product>
        <vendor id='0x8086'>Intel Corporation</vendor>
        <capability type='phys_function'>
          <address domain='0x0000' bus='0x03' slot='0x00' function='0x1'/>
        </capability>
        <iommuGroup number='35'>
          <address domain='0x0000' bus='0x03' slot='0x11' function='0x5'/>
        </iommuGroup>
      </capability>
    </device>
    This example adds the Virtual Function pci_0000_03_10_2 to the virtual machine in Step 8. Note the bus, slot and function parameters of the Virtual Function: these are required for adding the device.
    Copy these parameters into a temporary XML file, such as /tmp/new-interface.xml for example.
       <interface type='hostdev' managed='yes'>
         <source>
           <address type='pci' domain='0x0000' bus='0x03' slot='0x10' function='0x2'/>
         </source>
       </interface>

    Note

    When the virtual machine starts, it should see a network device of the type provided by the physical adapter, with the configured MAC address. This MAC address will remain unchanged across host and guest reboots.
    The following <interface> example shows the syntax for the optional <mac address>, <virtualport>, and <vlan> elements. In practice, use either the <vlan> or the <virtualport> element, not both at the same time; both are shown here only to illustrate the syntax:
    ...
     <devices>
       ...
       <interface type='hostdev' managed='yes'>
         <source>
           <address type='pci' domain='0' bus='11' slot='16' function='0'/>
         </source>
         <mac address='52:54:00:6d:90:02'/>
         <vlan>
            <tag id='42'/>
         </vlan>
         <virtualport type='802.1Qbh'>
           <parameters profileid='finance'/>
         </virtualport>
       </interface>
       ...
     </devices>
    If you do not specify a MAC address, one will be automatically generated. The <virtualport> element is only used when connecting to an 802.1Qbh hardware switch. The <vlan> element will transparently put the guest's device on the VLAN tagged 42.
  8. Add the Virtual Function to the virtual machine

    Add the Virtual Function to the virtual machine using the following command with the temporary file created in the previous step. This attaches the new device immediately and saves it for subsequent guest restarts.
    virsh attach-device MyGuest /tmp/new-interface.xml --live --config
    
    Specifying the --live option with virsh attach-device attaches the new device to the running guest. Using the --config option ensures the new device is available after future guest restarts.

    Note

    The --live option is only accepted when the guest is running. virsh will return an error if the --live option is used on a non-running guest.
The virtual machine detects a new network interface card. This new card is the Virtual Function of the SR-IOV device.
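To confirm the assignment, you can list the guest's interfaces from the host, or check inside the guest (a minimal sketch, reusing the MyGuest name from step 8; the hostdev interface appears with the MAC address configured above):
# virsh domiflist MyGuest
Then, from inside the guest:
# ip link show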

16.2.3. Configuring PCI Assignment with SR-IOV Devices

SR-IOV network cards provide multiple VFs that can each be individually assigned to a guest virtual machine using PCI device assignment. Once assigned, each behaves as a full physical network device. This permits many guest virtual machines to gain the performance advantage of direct PCI device assignment, while only using a single slot on the host physical machine.
These VFs can be assigned to guest virtual machines in the traditional manner using the <hostdev> element. However, SR-IOV VF network devices do not have permanent unique MAC addresses, which means the guest virtual machine's network settings would need to be re-configured each time the host physical machine is rebooted. To fix this, you would need to set the VF's MAC address prior to assigning it to the guest virtual machine after every reboot of the host physical machine. In order to assign this MAC address, as well as other options, see the following procedure:

Procedure 16.9. Configuring MAC addresses, vLAN, and virtual ports for assigning PCI devices on SR-IOV

The <hostdev> element cannot be used for function-specific items like MAC address assignment, vLAN tag ID assignment, or virtual port assignment, because the <mac>, <vlan>, and <virtualport> elements are not valid children for <hostdev>. Instead, these elements can be used with the hostdev interface type: <interface type='hostdev'>. This device type behaves as a hybrid of an <interface> and <hostdev>. Thus, before assigning the PCI device to the guest virtual machine, libvirt initializes the network-specific hardware/switch that is indicated (such as setting the MAC address, setting a vLAN tag, or associating with an 802.1Qbh switch) in the guest virtual machine's XML configuration file. For information on setting the vLAN tag, see Section 17.16, “Setting vLAN Tags”.
  1. Gather information

    In order to use <interface type='hostdev'>, you must have an SR-IOV-capable network card, host physical machine hardware that supports either the Intel VT-d or AMD IOMMU extensions, and you must know the PCI address of the VF that you wish to assign.
  2. Shut down the guest virtual machine

    Using the virsh shutdown command, shut down the guest virtual machine (here named guestVM).
    # virsh shutdown guestVM
  3. Open the XML file for editing

    Run the virsh save-image-edit command to open the XML file for editing (refer to Section 20.7.5, “Editing the Guest Virtual Machine Configuration” for more information) with the --running option. The name of the configuration file in this example is guestVM.xml.
     # virsh save-image-edit guestVM.xml --running 
    The guestVM.xml opens in your default editor.
  4. Edit the XML file

    Update the configuration file (guestVM.xml) to have a <devices> entry similar to the following:
    
     <devices>
       ...
       <interface type='hostdev' managed='yes'>
         <source>
           <address type='pci' domain='0x0' bus='0x00' slot='0x07' function='0x0'/> <!--these values can be decimal as well-->
         </source>
         <mac address='52:54:00:6d:90:02'/>                                         <!--sets the mac address-->
         <virtualport type='802.1Qbh'>                                              <!--sets the virtual port for the 802.1Qbh switch-->
           <parameters profileid='finance'/>
         </virtualport>
         <vlan>                                                                     <!--sets the vlan tag-->
          <tag id='42'/>
         </vlan>
       </interface>
       ...
     </devices>
    
    

    Figure 16.11. Sample domain XML for hostdev interface type

    Note that if you do not provide a MAC address, one will be automatically generated, just as with any other type of interface device. In addition, the <virtualport> element is only used if you are connecting to an 802.1Qbh hardware switch. 802.1Qbg (also known as "VEPA") switches are currently not supported.
  5. Restart the guest virtual machine

    Run the virsh start command to restart the guest virtual machine you shut down in step 2. See Section 20.6, “Starting, Resuming, and Restoring a Virtual Machine” for more information.
     # virsh start guestVM 
    When the guest virtual machine starts, it sees the network device provided to it by the physical host machine's adapter, with the configured MAC address. This MAC address remains unchanged across guest virtual machine and host physical machine reboots.

16.2.4. Setting PCI device assignment from a pool of SR-IOV virtual functions

Hard coding the PCI addresses of particular Virtual Functions (VFs) into a guest's configuration has two serious limitations:
  • The specified VF must be available any time the guest virtual machine is started. Therefore, the administrator must permanently assign each VF to a single guest virtual machine (or modify the configuration file for every guest virtual machine to specify a currently unused VF's PCI address each time every guest virtual machine is started).
  • If the guest virtual machine is moved to another host physical machine, that host physical machine must have exactly the same hardware in the same location on the PCI bus (or the guest virtual machine configuration must be modified prior to start).
It is possible to avoid both of these problems by creating a libvirt network with a device pool containing all the VFs of an SR-IOV device. Once that is done, configure the guest virtual machine to reference this network. Each time the guest is started, a single VF will be allocated from the pool and assigned to the guest virtual machine. When the guest virtual machine is stopped, the VF will be returned to the pool for use by another guest virtual machine.

Procedure 16.10. Creating a device pool

  1. Shut down the guest virtual machine

    Using the virsh shutdown command, shut down the guest virtual machine, here named guestVM.
    # virsh shutdown guestVM
  2. Create a configuration file

    Using your editor of choice, create an XML file (named passthrough.xml, for example) in the /tmp directory. Make sure to replace the value of pf dev with the netdev name of your own SR-IOV device's Physical Function (PF).
    The following is an example network definition that will make available a pool of all VFs for the SR-IOV adapter with its PF at eth3 on the host physical machine:
          
    <network>
       <name>passthrough</name> <!-- This is the name of the network -->
       <forward mode='hostdev' managed='yes'>
         <pf dev='eth3'/>  <!-- Use the netdev name of your SR-IOV device's PF here -->
       </forward>
    </network>
          
    
    

    Figure 16.12. Sample network definition domain XML

  3. Load the new XML file

    Enter the following command, replacing /tmp/passthrough.xml with the name and location of your XML file you created in the previous step:
    # virsh net-define /tmp/passthrough.xml
  4. Start the network

    Enter the following commands, replacing passthrough with the name of the network you created in the previous step, to start the network and set it to start automatically:
     # virsh net-autostart passthrough
     # virsh net-start passthrough
  5. Re-start the guest virtual machine

    Run the virsh start command to restart the guest virtual machine you shut down in the first step (the example uses guestVM as the guest virtual machine's domain name). See Section 20.6, “Starting, Resuming, and Restoring a Virtual Machine” for more information.
     # virsh start guestVM 
  6. Initiating passthrough for devices

    Although only a single device is shown, libvirt will automatically derive the list of all VFs associated with that PF the first time a guest virtual machine is started with an interface definition in its domain XML like the following:
             
    <interface type='network'>
       <source network='passthrough'/>
    </interface>
          
    
    

    Figure 16.13. Sample domain XML for interface network definition

  7. Verification

    You can verify this by running the virsh net-dumpxml passthrough command after starting the first guest that uses the network; you will get output similar to the following:
          
    <network connections='1'>
       <name>passthrough</name>
       <uuid>a6b49429-d353-d7ad-3185-4451cc786437</uuid>
       <forward mode='hostdev' managed='yes'>
         <pf dev='eth3'/>
         <address type='pci' domain='0x0000' bus='0x02' slot='0x10' function='0x1'/>
         <address type='pci' domain='0x0000' bus='0x02' slot='0x10' function='0x3'/>
         <address type='pci' domain='0x0000' bus='0x02' slot='0x10' function='0x5'/>
         <address type='pci' domain='0x0000' bus='0x02' slot='0x10' function='0x7'/>
         <address type='pci' domain='0x0000' bus='0x02' slot='0x11' function='0x1'/>
         <address type='pci' domain='0x0000' bus='0x02' slot='0x11' function='0x3'/>
         <address type='pci' domain='0x0000' bus='0x02' slot='0x11' function='0x5'/>
       </forward>
    </network>
          
    
    

    Figure 16.14. XML dump file passthrough contents

16.2.5. SR-IOV Restrictions

SR-IOV is only thoroughly tested with the following devices:
  • Intel® 82576NS Gigabit Ethernet Controller (igb driver)
  • Intel® 82576EB Gigabit Ethernet Controller (igb driver)
  • Intel® 82599ES 10 Gigabit Ethernet Controller (ixgbe driver)
  • Intel® 82599EB 10 Gigabit Ethernet Controller (ixgbe driver)
Other SR-IOV devices may work but have not been tested at the time of release.

16.3. USB Devices

This section gives the commands required for handling USB devices.

16.3.1. Assigning USB Devices to Guest Virtual Machines

Most devices, such as web cameras, card readers, disk drives, keyboards, and mice, are connected to a computer using a USB port and cable. There are two ways to pass such devices to a guest virtual machine:
  • Using USB passthrough - this requires the device to be physically connected to the host physical machine that is hosting the guest virtual machine. SPICE is not needed in this case. USB devices on the host can be passed to the guest using the command line or virt-manager. See Section 19.3.2, “Attaching USB Devices to a Guest Virtual Machine” for virt-manager directions. Note that the virt-manager directions are not suitable for hot plugging or hot unplugging devices. If you want to hot plug or hot unplug a USB device, see Procedure 20.4, “Hot plugging USB devices for use by the guest virtual machine”. A sketch of the XML used for USB passthrough is shown after this list.
  • Using USB re-direction - USB re-direction is best used in cases where the host physical machine is running in a data center and the user connects to the guest virtual machine from a local machine or thin client that runs a SPICE client. The user can attach any USB device to the thin client, and the SPICE client redirects the device to the host physical machine in the data center, so it can be used by the guest virtual machine running there. For instructions using virt-manager, see Section 19.3.3, “USB Redirection”.
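As an illustration of the passthrough approach (a sketch only; the vendor and product IDs are those of the example device used later in this section, while guestVM and usb_device.xml are placeholder names), a USB device can be described in an XML fragment and attached with the virsh attach-device command:
  <hostdev mode='subsystem' type='usb' managed='yes'>
    <source>
      <vendor id='0x0951'/>
      <product id='0x1625'/>
    </source>
  </hostdev>
# virsh attach-device guestVM usb_device.xml --live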

16.3.2. Setting a Limit on USB Device Redirection

To filter out certain devices from redirection, pass the filter property to -device usb-redir. The filter property takes a string consisting of filter rules; the format for a rule is:
<class>:<vendor>:<product>:<version>:<allow>
Use the value -1 to allow any value for a particular field. You may combine multiple rules on the same command line using | as a separator. Note that if a device matches none of the specified rules, redirecting it will not be allowed.
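For example (a sketch using the rule format above), the following filter value allows redirection of any USB mass storage device (class 0x08) and blocks all other devices:
0x08:-1:-1:-1:1|-1:-1:-1:-1:0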

Example 16.1. An example of limiting redirection with a guest virtual machine

  1. Prepare a guest virtual machine.
  2. Add the following code excerpt to the guest virtual machine's domain XML file:
        <redirdev bus='usb' type='spicevmc'>
          <alias name='redir0'/>
          <address type='usb' bus='0' port='3'/>
        </redirdev>
        <redirfilter>
          <usbdev class='0x08' vendor='0x1234' product='0xBEEF' version='2.0' allow='yes'/>
          <usbdev class='-1' vendor='-1' product='-1' version='-1' allow='no'/>
        </redirfilter>
    
  3. Start the guest virtual machine and confirm the setting changes by running the following:
    # ps -ef | grep $guest_name
    -device usb-redir,chardev=charredir0,id=redir0,\
    filter=0x08:0x1234:0xBEEF:0x0200:1|-1:-1:-1:-1:0,bus=usb.0,port=3
  4. Plug a USB device into a host physical machine, and use virt-manager to connect to the guest virtual machine.
  5. Click USB device selection in the menu, which will produce the following message: "Some USB devices are blocked by host policy". Click OK to confirm and continue.
    The filter takes effect.
  6. To make sure that the filter captures properly, check the USB device's vendor and product IDs, then make the following changes in the guest virtual machine's domain XML to allow for USB redirection.
       <redirfilter>
          <usbdev class='0x08' vendor='0x0951' product='0x1625' version='2.0' allow='yes'/>
          <usbdev allow='no'/>
        </redirfilter>
    
  7. Restart the guest virtual machine, then use virt-viewer to connect to the guest virtual machine. The USB device will now redirect traffic to the guest virtual machine.

16.4. Configuring Device Controllers

Depending on the guest virtual machine architecture, some device buses can appear more than once, with a group of virtual devices tied to a virtual controller. Normally, libvirt can automatically infer such controllers without requiring explicit XML markup, but in some cases it is better to explicitly set a virtual controller element.

  ...
  <devices>
    <controller type='ide' index='0'/>
    <controller type='virtio-serial' index='0' ports='16' vectors='4'/>
    <controller type='virtio-serial' index='1'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x0a' function='0x0'/>
    </controller>
    ...
  </devices>
  ...

Figure 16.15. Domain XML example for virtual controllers

Each controller has a mandatory attribute <controller type>, which must be one of:
  • ide
  • fdc
  • scsi
  • sata
  • usb
  • ccid
  • virtio-serial
  • pci
The <controller> element has a mandatory attribute <controller index> which is the decimal integer describing in which order the bus controller is encountered (for use in controller attributes of <address> elements). When <controller type ='virtio-serial'> there are two additional optional attributes (named ports and vectors), which control how many devices can be connected through the controller.
When <controller type ='scsi'>, there is an optional model attribute, which can have the following values:
  • auto
  • buslogic
  • ibmvscsi
  • lsilogic
  • lsisas1068
  • lsisas1078
  • virtio-scsi
  • vmpvscsi
When <controller type ='usb'>, there is an optional model attribute, which can have the following values:
  • piix3-uhci
  • piix4-uhci
  • ehci
  • ich9-ehci1
  • ich9-uhci1
  • ich9-uhci2
  • ich9-uhci3
  • vt82c686b-uhci
  • pci-ohci
  • nec-xhci
Note that if the USB bus needs to be explicitly disabled for the guest virtual machine, <model='none'> may be used.
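For example, the following controller definition (a minimal sketch) explicitly disables the USB bus for a guest:
  <controller type='usb' index='0' model='none'/>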
For controllers that are themselves devices on a PCI or USB bus, an optional sub-element <address> can specify the exact relationship of the controller to its master bus, with semantics as shown in Section 16.5, “Setting Addresses for Devices”.
An optional sub-element <driver> can specify driver-specific options. Currently, it only supports the queues attribute, which specifies the number of queues for the controller. For best performance, it is recommended to specify a value matching the number of vCPUs.
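For instance, the following fragment (a minimal sketch, assuming a guest with 4 vCPUs and a virtio-scsi controller) sets the number of queues to match the vCPU count:
  <devices>
    <controller type='scsi' index='0' model='virtio-scsi'>
      <driver queues='4'/>
    </controller>
  </devices>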
USB companion controllers have an optional sub-element <master> to specify the exact relationship of the companion to its master controller. A companion controller is on the same bus as its master, so the companion index value should be equal.
An example XML which can be used is as follows:
   
     ...
  <devices>
    <controller type='usb' index='0' model='ich9-ehci1'>
      <address type='pci' domain='0' bus='0' slot='4' function='7'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci1'>
      <master startport='0'/>
      <address type='pci' domain='0' bus='0' slot='4' function='0' multifunction='on'/>
    </controller>
    ...
  </devices>
  ...
   

Figure 16.16. Domain XML example for USB controllers

PCI controllers have an optional model attribute with the following possible values:
  • pci-root
  • pcie-root
  • pci-bridge
  • dmi-to-pci-bridge
For machine types which provide an implicit PCI bus, the pci-root controller with index='0' is auto-added and required to use PCI devices. pci-root has no address. PCI bridges are auto-added if there are too many devices to fit on the one bus provided by model='pci-root', or a PCI bus number greater than zero was specified. PCI bridges can also be specified manually, but their addresses should only refer to PCI buses provided by already specified PCI controllers. Leaving gaps in the PCI controller indexes might lead to an invalid configuration. The following XML example can be added to the <devices> section:

  ...
  <devices>
    <controller type='pci' index='0' model='pci-root'/>
    <controller type='pci' index='1' model='pci-bridge'>
      <address type='pci' domain='0' bus='0' slot='5' function='0' multifunction='off'/>
    </controller>
  </devices>
  ...

Figure 16.17. Domain XML example for PCI bridge

For machine types which provide an implicit PCI Express (PCIe) bus (for example, the machine types based on the Q35 chipset), the pcie-root controller with index='0' is auto-added to the domain's configuration. pcie-root also has no address, but provides 31 slots (numbered 1-31) and can only be used to attach PCIe devices. In order to connect standard PCI devices on a system which has a pcie-root controller, a pci controller with model='dmi-to-pci-bridge' is automatically added. A dmi-to-pci-bridge controller plugs into a PCIe slot (as provided by pcie-root), and itself provides 31 standard PCI slots (which are not hot-pluggable). In order to have hot-pluggable PCI slots in the guest system, a pci-bridge controller will also be automatically created and connected to one of the slots of the auto-created dmi-to-pci-bridge controller; all guest devices with PCI addresses that are auto-determined by libvirt will be placed on this pci-bridge device.
   
     ...
  <devices>
    <controller type='pci' index='0' model='pcie-root'/>
    <controller type='pci' index='1' model='dmi-to-pci-bridge'>
      <address type='pci' domain='0' bus='0' slot='0xe' function='0'/>
    </controller>
    <controller type='pci' index='2' model='pci-bridge'>
      <address type='pci' domain='0' bus='1' slot='1' function='0'/>
    </controller>
  </devices>
  ...
   

Figure 16.18. Domain XML example for PCIe (PCI express)

The following XML configuration is used for USB 3.0 / xHCI emulation:
   
     ...
  <devices>
    <controller type='usb' index='3' model='nec-xhci'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x0f' function='0x0'/>
    </controller>
  </devices>
    ...

Figure 16.19. Domain XML example for USB3/xHCI devices

16.5. Setting Addresses for Devices

Many devices have an optional <address> sub-element which is used to describe where the device is placed on the virtual bus presented to the guest virtual machine. If an address (or any optional attribute within an address) is omitted on input, libvirt will generate an appropriate address; but an explicit address is needed if more control over the layout is required. For domain XML device examples that include an <address> element, see Figure 16.9, “XML example for PCI device assignment”.
Every address has a mandatory attribute type that describes which bus the device is on. The choice of which address to use for a given device is constrained in part by the device and the architecture of the guest virtual machine. For example, a <disk> device uses type='drive', while a <console> device would use type='pci' on i686 or x86_64 guest virtual machine architectures. Each address type has further optional attributes that control where on the bus the device will be placed as described in the table:

Table 16.1. Supported device address types

Address type Description
type='pci' PCI addresses have the following additional attributes:
  • domain (a 2-byte hex integer, not currently used by qemu)
  • bus (a hex value between 0 and 0xff, inclusive)
  • slot (a hex value between 0x0 and 0x1f, inclusive)
  • function (a value between 0 and 7, inclusive)
  • multifunction controls turning on the multifunction bit for a particular slot/function in the PCI control register. By default, it is set to 'off', but should be set to 'on' for function 0 of a slot that will have multiple functions used.
type='drive' Drive addresses have the following additional attributes:
  • controller (a 2-digit controller number)
  • bus (a 2-digit bus number)
  • target (a 2-digit target number)
  • unit (a 2-digit unit number on the bus)
type='virtio-serial' Each virtio-serial address has the following additional attributes:
  • controller (a 2-digit controller number)
  • bus (a 2-digit bus number)
  • slot (a 2-digit slot within the bus)
type='ccid' A CCID address, for smart-cards, has the following additional attributes:
  • bus (a 2-digit bus number)
  • slot (a 2-digit slot within the bus)
type='usb' USB addresses have the following additional attributes:
  • bus (a hex value between 0 and 0xfff, inclusive)
  • port (a dotted notation of up to four octets, such as 1.2 or 2.1.3.1)
type='isa' ISA addresses have the following additional attributes:
  • iobase
  • irq
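As an illustration (a sketch only; the disk definition itself is hypothetical), a <disk> device can be given an explicit type='drive' address using the attributes listed above:
  <disk type='file' device='disk'>
    ...
    <target dev='sda' bus='scsi'/>
    <address type='drive' controller='0' bus='0' target='0' unit='0'/>
  </disk>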

16.6. Random Number Generator Device

Random number generators are very important for operating system security. For securing virtual operating systems, Red Hat Enterprise Linux 7 includes virtio-rng, a virtual hardware random number generator device that can provide the guest with fresh entropy on request.
On the host physical machine, the hardware RNG interface creates a chardev at /dev/hwrng, which can be opened and then read to fetch entropy from the host physical machine. In co-operation with the rngd daemon, the entropy from the host physical machine can be routed to the guest virtual machine's /dev/random, which is the primary source of randomness.
Using a random number generator is particularly useful when devices such as the keyboard, mouse, and other input devices are not enough to generate sufficient entropy on the guest virtual machine. The virtual random number generator device allows the host physical machine to pass through entropy to guest virtual machine operating systems. This procedure can be performed using either the command line or the virt-manager interface. For more information about virtio-rng, see Red Hat Enterprise Linux Virtual Machines: Access to Random Numbers Made Easy.

Procedure 16.11. Implementing virtio-rng using the Virtual Machine Manager

  1. Shut down the guest virtual machine.
  2. Select the guest virtual machine and from the Edit menu, select Virtual Machine Details, to open the Details window for the specified guest virtual machine.
  3. Click the Add Hardware button.
  4. In the Add New Virtual Hardware window, select RNG to open the Random Number Generator window.
    Random Number Generator window

    Figure 16.20. Random Number Generator window

    Enter the intended parameters and click Finish when done. The parameters are explained in virtio-rng elements.

Procedure 16.12. Implementing virtio-rng using command-line tools

  1. Shut down the guest virtual machine.
  2. Using the virsh edit domain-name command, open the XML file for the intended guest virtual machine.
  3. Edit the <devices> element to include the following:
    
      ...
      <devices>
        <rng model='virtio'>
          <rate period='2000' bytes='1234'/>
          <backend model='random'>/dev/random</backend>
          <!-- OR -->
          <backend model='egd' type='udp'>
            <source mode='bind' service='1234'/>
            <source mode='connect' host='1.2.3.4' service='1234'/>
          </backend>
        </rng>
      </devices>
      ...

    Figure 16.21. Random number generator device

    The random number generator device allows the following XML attributes and elements:

    virtio-rng elements

    • <model> - The required model attribute specifies what type of RNG device is provided.
    • <backend model> - The <backend> element specifies the source of entropy to be used for the guest. The source model is configured using the model attribute. Supported source models include 'random' and 'egd'.
      • <backend model='random'> - This <backend> type expects a non-blocking character device as input. Examples of such devices are /dev/random and /dev/urandom. The file name is specified as contents of the <backend> element. When no file name is specified the hypervisor default is used.
      • <backend model='egd'> - This back end connects to a source using the EGD protocol. The source is specified as a character device. See character device host physical machine interface for more information.
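After the guest is restarted with the new device, you can check from within the guest that the virtio RNG is available and that entropy is being gathered. The following commands are a sketch using standard Linux sysfs and procfs interfaces, not commands specific to this guide:
# cat /sys/class/misc/hw_random/rng_available   # the virtio RNG device should be listed
# cat /sys/class/misc/hw_random/rng_current     # shows the RNG source currently in use
# cat /proc/sys/kernel/random/entropy_avail     # the amount of entropy currently available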

16.7. Assigning GPU Devices

To assign a GPU to a guest, use one of the following methods:
  • GPU PCI Device Assignment - Using this method, it is possible to remove a GPU device from the host and assign it to a single guest.
  • NVIDIA vGPU Assignment - This method makes it possible to create multiple mediated devices from a physical GPU, and assign these devices as virtual GPUs to multiple guests. This is only supported on selected NVIDIA GPUs, and only one mediated device can be assigned to a single guest.

16.7.1. GPU PCI Device Assignment

Red Hat Enterprise Linux 7 supports PCI device assignment of the following PCIe-based GPU devices as non-VGA graphics devices:
  • NVIDIA Quadro K-Series, M-Series, and P-Series (models 2000 series or higher)
  • NVIDIA GRID K-Series
  • NVIDIA Tesla K-Series and M-Series
Currently, up to two GPUs may be attached to the virtual machine, in addition to one of the standard emulated VGA interfaces. The emulated VGA is used for pre-boot and installation and the NVIDIA GPU takes over when the NVIDIA graphics drivers are loaded.
To assign a GPU to a guest virtual machine, you must enable the I/O Memory Management Unit (IOMMU) on the host machine, identify the GPU device by using the lspci command, detach the device from host, attach it to the guest, and configure Xorg on the guest - as described in the following procedures:

Procedure 16.13. Enable IOMMU support in the host machine kernel

  1. Edit the kernel command line

    For an Intel VT-d system, IOMMU is activated by adding the intel_iommu=on and iommu=pt parameters to the kernel command line. For an AMD-Vi system, the option needed is amd_iommu=pt. To enable this option, edit or add the GRUB_CMDLINE_LINUX line to the /etc/sysconfig/grub configuration file as follows:
    GRUB_CMDLINE_LINUX="rd.lvm.lv=vg_VolGroup00/LogVol01
    vconsole.font=latarcyrheb-sun16 rd.lvm.lv=vg_VolGroup_1/root
    vconsole.keymap=us $([ -x /usr/sbin/rhcrashkernel-param ]  &&
    /usr/sbin/rhcrashkernel-param || :) rhgb quiet intel_iommu=on iommu=pt"
    

    Note

    For further information on IOMMU, see Appendix E, Working with IOMMU Groups.
  2. Regenerate the boot loader configuration

    For the changes to the kernel command line to apply, regenerate the boot loader configuration using the grub2-mkconfig command:
    # grub2-mkconfig -o /etc/grub2.cfg
    Note that if you are using a UEFI-based host, the target file should be /etc/grub2-efi.cfg.
  3. Reboot the host

    For the changes to take effect, reboot the host machine:
    # reboot

Procedure 16.14. Excluding the GPU device from binding to the host physical machine driver

For GPU assignment, it is recommended to exclude the device from binding to host drivers, as these drivers often do not support dynamic unbinding of the device.
  1. Identify the PCI bus address

    To identify the PCI bus address and IDs of the device, run the following lspci command. In this example, a VGA controller such as an NVIDIA Quadro or GRID card is used:
    # lspci -Dnn | grep VGA
    0000:02:00.0 VGA compatible controller [0300]: NVIDIA Corporation GK106GL [Quadro K4000] [10de:11fa] (rev a1)
    
    The resulting search reveals that the PCI bus address of this device is 0000:02:00.0 and the PCI IDs for the device are 10de:11fa.
  2. Prevent the native host machine driver from using the GPU device

    To prevent the native host machine driver from using the GPU device, you can use a PCI ID with the pci-stub driver. To do this, append the pci-stub.ids option, with the PCI IDs as its value, to the GRUB_CMDLINE_LINUX line located in the /etc/sysconfig/grub configuration file, for example as follows:
    GRUB_CMDLINE_LINUX="rd.lvm.lv=vg_VolGroup00/LogVol01
    vconsole.font=latarcyrheb-sun16 rd.lvm.lv=vg_VolGroup_1/root
    vconsole.keymap=us $([ -x /usr/sbin/rhcrashkernel-param ]  &&
    /usr/sbin/rhcrashkernel-param || :) rhgb quiet intel_iommu=on iommu=pt pci-stub.ids=10de:11fa"
    
    To add additional PCI IDs for pci-stub, separate them with a comma.
  3. Regenerate the boot loader configuration

    Regenerate the boot loader configuration using the grub2-mkconfig command to include this option:
    # grub2-mkconfig -o /etc/grub2.cfg
    Note that if you are using a UEFI-based host, the target file should be /etc/grub2-efi.cfg.
  4. Reboot the host machine

    In order for the changes to take effect, reboot the host machine:
    # reboot

Procedure 16.15. Optional: Editing the GPU IOMMU configuration

Prior to attaching the GPU device, editing its IOMMU configuration may be needed for the GPU to work properly on the guest.
  1. Display the XML information of the GPU

    To display the settings of the GPU in XML form, you first need to convert its PCI bus address to libvirt-compatible format by prefixing it with pci_ and converting delimiters to underscores. In this example, the GPU PCI device identified with the 0000:02:00.0 bus address (as obtained in the previous procedure) becomes pci_0000_02_00_0. Use the libvirt address of the device with the virsh nodedev-dumpxml command to display its XML configuration:
    # virsh nodedev-dumpxml pci_0000_02_00_0
    
    <device>
     <name>pci_0000_02_00_0</name>
     <path>/sys/devices/pci0000:00/0000:00:03.0/0000:02:00.0</path>
     <parent>pci_0000_00_03_0</parent>
     <driver>
      <name>pci-stub</name>
     </driver>
     <capability type='pci'>
      <domain>0</domain>
      <bus>2</bus>
      <slot>0</slot>
      <function>0</function>
      <product id='0x11fa'>GK106GL [Quadro K4000]</product>
      <vendor id='0x10de'>NVIDIA Corporation</vendor>
         <!-- pay attention to the following lines -->
      <iommuGroup number='13'>
       <address domain='0x0000' bus='0x02' slot='0x00' function='0x0'/>
       <address domain='0x0000' bus='0x02' slot='0x00' function='0x1'/>
      </iommuGroup>
      <pci-express>
       <link validity='cap' port='0' speed='8' width='16'/>
       <link validity='sta' speed='2.5' width='16'/>
      </pci-express>
     </capability>
    </device>
    Note the <iommuGroup> element of the XML. The iommuGroup indicates a set of devices that are considered isolated from other devices due to IOMMU capabilities and PCI bus topologies. All of the endpoint devices within the iommuGroup (meaning devices that are not PCIe root ports, bridges, or switch ports) need to be unbound from the native host drivers in order to be assigned to a guest. In the example above, the group is composed of the GPU device (0000:02:00.0) as well as the companion audio device (0000:02:00.1). For more information, see Appendix E, Working with IOMMU Groups.
  2. Adjust IOMMU settings

    In this example, assignment of NVIDIA audio functions is not supported due to hardware issues with legacy interrupt support. In addition, the GPU audio function is generally not useful without the GPU itself. Therefore, in order to assign the GPU to a guest, the audio function must first be detached from native host drivers. This can be done, for example, by detaching the audio function from the host as shown below, or by adding its PCI ID to the pci-stub.ids option described in Procedure 16.14.
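    One possible way to do this (a sketch; pci_0000_02_00_1 is the libvirt name of the audio function 0000:02:00.1 shown in the iommuGroup above) is to detach the audio function from the host with the virsh nodedev-detach command:
    # virsh nodedev-detach pci_0000_02_00_1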

Procedure 16.16. Attaching the GPU

The GPU can be attached to the guest using any of the following methods:
  1. Using the Virtual Machine Manager interface. For details, see Section 16.1.2, “Assigning a PCI Device with virt-manager”.
  2. Creating an XML configuration fragment for the GPU and attaching it with the virsh attach-device command:
    1. Create an XML for the device, similar to the following:
      
      <hostdev mode='subsystem' type='pci' managed='yes'>
       <driver name='vfio'/>
       <source>
        <address domain='0x0000' bus='0x02' slot='0x00' function='0x0'/>
       </source>
      </hostdev>
    2. Save this to a file and run virsh attach-device [domain] [file] --persistent to include the XML in the guest configuration (see the example command after this procedure). Note that the assigned GPU is added in addition to the existing emulated graphics device in the guest machine. The assigned GPU is handled as a secondary graphics device in the virtual machine. Assignment as a primary graphics device is not supported, and emulated graphics devices in the guest's XML should not be removed.
  3. Editing the guest XML configuration using the virsh edit command and adding the appropriate XML segment manually.
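For example (a sketch; guestVM is a placeholder domain name and /tmp/gpu.xml a placeholder for the file created in the previous step):
# virsh attach-device guestVM /tmp/gpu.xml --persistent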

Procedure 16.17. Modifying the Xorg configuration on the guest

The GPU's PCI bus address on the guest will be different than on the host. To enable the guest to use the GPU properly, configure the guest's Xorg display server to use the assigned GPU address:
  1. In the guest, use the lspci command to determine the PCI bus address of the GPU:
    # lspci | grep VGA
    00:02.0 VGA compatible controller: Device 1234:111
    00:09.0 VGA compatible controller: NVIDIA Corporation GK106GL [Quadro K4000] (rev a1)
    
    In this example, the bus address is 00:09.0.
  2. In the /etc/X11/xorg.conf file on the guest, add a BusID option with the detected address adjusted as follows:
    		Section "Device"
    		    Identifier     "Device0"
    		    Driver         "nvidia"
    		    VendorName     "NVIDIA Corporation"
    		    BusID          "PCI:0:9:0"
    		EndSection
    

    Important

    If the bus address detected in Step 1 is hexadecimal, you need to convert the values between delimiters to the decimal system. For example, 00:0a.0 should be converted into PCI:0:10:0.

Note

When using an assigned NVIDIA GPU in the guest, only the NVIDIA drivers are supported. Other drivers may not work and may generate errors. For a Red Hat Enterprise Linux 7 guest, the nouveau driver can be blacklisted using the option modprobe.blacklist=nouveau on the kernel command line during install. For information on other guest virtual machines, see the operating system's specific documentation.
Depending on the guest operating system, with the NVIDIA drivers loaded, the guest may support using both the emulated graphics and assigned graphics simultaneously or may disable the emulated graphics. Note that access to the assigned graphics framebuffer is not provided by applications such as virt-manager. If the assigned GPU is not connected to a physical display, guest-based remoting solutions may be necessary to access the GPU desktop. As with all PCI device assignment, migration of a guest with an assigned GPU is not supported and each GPU is owned exclusively by a single guest. Depending on the guest operating system, hot plug support of GPUs may be available.

16.7.2. NVIDIA vGPU Assignment

The NVIDIA vGPU feature makes it possible to divide a physical GPU device into multiple virtual devices referred to as mediated devices. These mediated devices can then be assigned to multiple guests as virtual GPUs. As a result, these guests share the performance of a single physical GPU.

Important

This feature is only available on a limited set of NVIDIA GPUs. For an up-to-date list of these devices, see the NVIDIA GPU Software Documentation.

NVIDIA vGPU Setup

To set up the vGPU feature, you first need to obtain NVIDIA vGPU drivers for your GPU device, then create mediated devices, and assign them to the intended guest machines:
  1. Obtain the NVIDIA vGPU drivers and install them on your system. For instructions, see the NVIDIA documentation.
  2. If the NVIDIA software installer did not create the /etc/modprobe.d/nvidia-installer-disable-nouveau.conf file, create a .conf file (of any name) in the /etc/modprobe.d/ directory. Add the following lines in the file:
    blacklist nouveau
    options nouveau modeset=0
    
  3. Regenerate the initial ramdisk for the current kernel, then reboot:
    # dracut --force
    # reboot
    If you need to use a prior supported kernel version with mediated devices, regenerate the initial ramdisk for all installed kernel versions:
    # dracut --regenerate-all --force
    # reboot
  4. Check that the nvidia_vgpu_vfio module has been loaded by the kernel and that the nvidia-vgpu-mgr.service service is running.
    # lsmod | grep nvidia_vgpu_vfio
    nvidia_vgpu_vfio 45011 0
    nvidia 14333621 10 nvidia_vgpu_vfio
    mdev 20414 2 vfio_mdev,nvidia_vgpu_vfio
    vfio 32695 3 vfio_mdev,nvidia_vgpu_vfio,vfio_iommu_type1
    # systemctl status nvidia-vgpu-mgr.service
    nvidia-vgpu-mgr.service - NVIDIA vGPU Manager Daemon
       Loaded: loaded (/usr/lib/systemd/system/nvidia-vgpu-mgr.service; enabled; vendor preset: disabled)
       Active: active (running) since Fri 2018-03-16 10:17:36 CET; 5h 8min ago
     Main PID: 1553 (nvidia-vgpu-mgr)
     [...]
    
  5. Write a device UUID to /sys/class/mdev_bus/pci_dev/mdev_supported_types/type-id/create, where pci_dev is the PCI address of the host GPU, and type-id is an ID of the host GPU type.
    The following example shows how to create a mediated device of nvidia-63 vGPU type on an NVIDIA Tesla P4 card:
    # uuidgen
    30820a6f-b1a5-4503-91ca-0c10ba58692a
    # echo "30820a6f-b1a5-4503-91ca-0c10ba58692a" > /sys/class/mdev_bus/0000:01:00.0/mdev_supported_types/nvidia-63/create
    For type-id values for specific devices, see section 1.3.1. Virtual GPU Types in Virtual GPU software documentation. Note that only Q-series NVIDIA vGPUs, such as GRID P4-2Q, are supported as mediated device GPU types on Linux guests.
  6. Add the following lines to the <devices/> sections in XML configurations of guests that you want to share the vGPU resources. Use the UUID value generated by the uuidgen command in the previous step. Each UUID can only be assigned to one guest at a time.
    
    <hostdev mode='subsystem' type='mdev' managed='no' model='vfio-pci'>
      <source>
        <address uuid='30820a6f-b1a5-4503-91ca-0c10ba58692a'/>
      </source>
    </hostdev>
    

    Important

    For the vGPU mediated devices to work properly on the assigned guests, NVIDIA vGPU guest software licensing needs to be set up for the guests. For further information and instructions, see the NVIDIA virtual GPU software documentation.

Removing NVIDIA vGPU Devices

To remove a mediated vGPU device, use the following command when the device is inactive, and replace uuid with the UUID of the device, for example 30820a6f-b1a5-4503-91ca-0c10ba58692a.
# echo 1 > /sys/bus/mdev/devices/uuid/remove
Note that attempting to remove a vGPU device that is currently in use by a guest triggers the following error:
echo: write error: Device or resource busy

Querying NVIDIA vGPU Capabilities

To obtain additional information about the mediated devices on your system, such as how many mediated devices of a given type can be created, use the virsh nodedev-list --cap mdev_types and virsh nodedev-dumpxml commands. For example, the following displays available vGPU types on a Tesla P4 card:

$ virsh nodedev-list --cap mdev_types
pci_0000_01_00_0
$ virsh nodedev-dumpxml pci_0000_01_00_0
<...>
  <capability type='mdev_types'>
    <type id='nvidia-70'>
      <name>GRID P4-8A</name>
      <deviceAPI>vfio-pci</deviceAPI>
      <availableInstances>1</availableInstances>
    </type>
    <type id='nvidia-69'>
      <name>GRID P4-4A</name>
      <deviceAPI>vfio-pci</deviceAPI>
      <availableInstances>2</availableInstances>
    </type>
    <type id='nvidia-67'>
      <name>GRID P4-1A</name>
      <deviceAPI>vfio-pci</deviceAPI>
      <availableInstances>8</availableInstances>
    </type>
    <type id='nvidia-65'>
      <name>GRID P4-4Q</name>
      <deviceAPI>vfio-pci</deviceAPI>
      <availableInstances>2</availableInstances>
    </type>
    <type id='nvidia-63'>
      <name>GRID P4-1Q</name>
      <deviceAPI>vfio-pci</deviceAPI>
      <availableInstances>8</availableInstances>
    </type>
    <type id='nvidia-71'>
      <name>GRID P4-1B</name>
      <deviceAPI>vfio-pci</deviceAPI>
      <availableInstances>8</availableInstances>
    </type>
    <type id='nvidia-68'>
      <name>GRID P4-2A</name>
      <deviceAPI>vfio-pci</deviceAPI>
      <availableInstances>4</availableInstances>
    </type>
    <type id='nvidia-66'>
      <name>GRID P4-8Q</name>
      <deviceAPI>vfio-pci</deviceAPI>
      <availableInstances>1</availableInstances>
    </type>
    <type id='nvidia-64'>
      <name>GRID P4-2Q</name>
      <deviceAPI>vfio-pci</deviceAPI>
      <availableInstances>4</availableInstances>
    </type>
  </capability>
</...>

Remote Desktop Streaming Services for NVIDIA vGPU

The following remote desktop streaming services have been successfully tested for use with the NVIDIA vGPU feature on Red Hat Enterprise Linux 7:
  • HP-RGS
  • Mechdyne TGX - It is currently not possible to use Mechdyne TGX with Windows Server 2016 guests.
  • NICE DCV - When using this streaming service, Red Hat recommends using fixed resolution settings, as using dynamic resolution in some cases results in a black screen.

Chapter 17. Virtual Networking

This chapter introduces the concepts needed to create, start, stop, remove, and modify virtual networks with libvirt.
Additional information can be found in the libvirt reference chapter.

17.1. Virtual Network Switches

Libvirt virtual networking uses the concept of a virtual network switch. A virtual network switch is a software construct that operates on a host physical machine server, to which virtual machines (guests) connect. The network traffic for a guest is directed through this switch:
Virtual network switch with two guests

Figure 17.1. Virtual network switch with two guests

Linux host physical machine servers represent a virtual network switch as a network interface. When the libvirt daemon (libvirtd) is first installed and started, the default network interface representing the virtual network switch is virbr0.
This virbr0 interface can be viewed with the ip command like any other interface:
 $ ip addr show virbr0
 3: virbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN
     link/ether 1b:c4:94:cf:fd:17 brd ff:ff:ff:ff:ff:ff
     inet 192.168.122.1/24 brd 192.168.122.255 scope global virbr0

17.2. Bridged Mode

When using Bridged mode, all of the guest virtual machines appear within the same subnet as the host physical machine. All other physical machines on the same physical network are aware of the virtual machines, and can access the virtual machines. Bridging operates on Layer 2 of the OSI networking model.
Virtual network switch in bridged mode

Figure 17.2. Virtual network switch in bridged mode

It is possible to use multiple physical interfaces on the hypervisor by joining them together with a bond. The bond is then added to a bridge and then guest virtual machines are added onto the bridge as well. However, the bonding driver has several modes of operation, and only a few of these modes work with a bridge where virtual guest machines are in use.

Warning

When using bridged mode, the only bonding modes that should be used with a guest virtual machine are Mode 1, Mode 2, and Mode 4. Using modes 0, 3, 5, or 6 is likely to cause the connection to fail. Also note that Media-Independent Interface (MII) monitoring should be used to monitor bonding modes, as Address Resolution Protocol (ARP) monitoring does not work.
For more information on bonding modes, see related Knowledgebase article, or the Red Hat Enterprise Linux 7 Networking Guide.
For a detailed explanation of bridge_opts parameters, used to configure bridged networking mode, see the Red Hat Virtualization Administration Guide.

17.3. Network Address Translation

By default, virtual network switches operate in NAT mode. They use IP masquerading rather than Source-NAT (SNAT) or Destination-NAT (DNAT). IP masquerading enables connected guests to use the host physical machine IP address for communication to any external network. By default, computers that are placed externally to the host physical machine cannot communicate to the guests inside when the virtual network switch is operating in NAT mode, as shown in the following diagram:
Virtual network switch using NAT with two guests

Figure 17.3. Virtual network switch using NAT with two guests

Warning

Virtual network switches use NAT configured by iptables rules. Editing these rules while the switch is running is not recommended, as incorrect rules may result in the switch being unable to communicate.
If the switch is not running, you can set the public IP range for forward mode NAT in order to create a port masquerading range by running:
# iptables -j SNAT --to-source [start]-[end]

17.4. DNS and DHCP

IP information can be assigned to guests via DHCP. A pool of addresses can be assigned to a virtual network switch for this purpose. Libvirt uses the dnsmasq program for this. An instance of dnsmasq is automatically configured and started by libvirt for each virtual network switch that needs it.
Virtual network switch running dnsmasq

Figure 17.4. Virtual network switch running dnsmasq
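For illustration, the following network definition (a minimal sketch that reuses the default 192.168.122.0/24 subnet shown earlier) assigns a DHCP address pool that libvirt hands to dnsmasq:
<network>
  <name>default</name>
  <forward mode='nat'/>
  <ip address='192.168.122.1' netmask='255.255.255.0'>
    <dhcp>
      <range start='192.168.122.2' end='192.168.122.254'/>
    </dhcp>
  </ip>
</network>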

17.5. Routed Mode

When using Routed mode, the virtual switch connects to the physical LAN connected to the host physical machine, passing traffic back and forth without the use of NAT. The virtual switch can examine all traffic and use the information contained within the network packets to make routing decisions. When using this mode, all of the virtual machines are in their own subnet, routed through a virtual switch. This situation is not always ideal as no other host physical machines on the physical network are aware of the virtual machines without manual physical router configuration, and cannot access the virtual machines. Routed mode operates at Layer 3 of the OSI networking model.
Virtual network switch in routed mode

Figure 17.5. Virtual network switch in routed mode

17.6. Isolated Mode

When using Isolated mode, guests connected to the virtual switch can communicate with each other, and with the host physical machine, but their traffic will not pass outside of the host physical machine, and they cannot receive traffic from outside the host physical machine. Using dnsmasq in this mode is required for basic functionality such as DHCP. However, even if this network is isolated from any physical network, DNS names are still resolved. Therefore, a situation can arise when DNS names resolve but ICMP echo request (ping) commands fail.
Virtual network switch in isolated mode

Figure 17.6. Virtual network switch in isolated mode

17.7. The Default Configuration

When the libvirt daemon (libvirtd) is first installed, it contains an initial virtual network switch configuration in NAT mode. This configuration is used so that installed guests can communicate with the external network through the host physical machine. The following image demonstrates this default configuration for libvirtd:
Default libvirt network configuration

Figure 17.7. Default libvirt network configuration

Note

A virtual network can be restricted to a specific physical interface. This may be useful on a physical system that has several interfaces (for example, eth0, eth1 and eth2). This is only useful in routed and NAT modes, and can be defined in the dev=<interface> option, or in virt-manager when creating a new virtual network.
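For example (a sketch; the network name, interface, and address range are placeholders), a NAT-mode virtual network restricted to the eth1 physical interface can be defined as follows:
<network>
  <name>nat-eth1</name>
  <forward mode='nat' dev='eth1'/>
  <ip address='192.168.100.1' netmask='255.255.255.0'/>
</network>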

17.8. Examples of Common Scenarios

This section demonstrates different virtual networking modes and provides some example scenarios.

17.8.1. Bridged Mode

Bridged mode operates on Layer 2 of the OSI model. When used, all of the guest virtual machines will appear on the same subnet as the host physical machine. The most common use cases for bridged mode include:
  • Deploying guest virtual machines in an existing network alongside host physical machines, making the difference between virtual and physical machines transparent to the end user.
  • Deploying guest virtual machines without making any changes to existing physical network configuration settings.
  • Deploying guest virtual machines which must be easily accessible to an existing physical network. Placing guest virtual machines on a physical network where they must access services within an existing broadcast domain, such as DHCP.
  • Connecting guest virtual machines to an existing network where VLANs are used.

17.8.2. Routed Mode

DMZ

Consider a network where one or more nodes are placed in a controlled sub-network for security reasons. The deployment of a special sub-network such as this is a common practice, and the sub-network is known as a DMZ. See the following diagram for more details on this layout:

Sample DMZ configuration

Figure 17.8. Sample DMZ configuration

Host physical machines in a DMZ typically provide services to WAN (external) host physical machines as well as LAN (internal) host physical machines. As this requires them to be accessible from multiple locations, and considering that these locations are controlled and operated in different ways based on their security and trust level, routed mode is the best configuration for this environment.
Virtual Server Hosting

Consider a virtual server hosting company that has several host physical machines, each with two physical network connections. One interface is used for management and accounting, the other is for the virtual machines to connect through. Each guest has its own public IP address, but the host physical machines use private IP addresses, as management of the guests can only be performed by internal administrators. See the following diagram to understand this scenario:

Virtual server hosting sample configuration

Figure 17.9. Virtual server hosting sample configuration

17.8.3. NAT Mode

NAT (Network Address Translation) mode is the default mode. It can be used for testing when there is no need for direct network visibility.

17.8.4. Isolated Mode

Isolated mode allows virtual machines to communicate with each other only. They are unable to interact with the physical network.

17.9. Managing a Virtual Network

To configure a virtual network on your system:
  1. From the Edit menu, select Connection Details.
  2. This will open the Connection Details menu. Click the Virtual Networks tab.
    Virtual network configuration

    Figure 17.10. Virtual network configuration

  3. All available virtual networks are listed on the left of the menu. You can edit the configuration of a virtual network by selecting it from this box and editing as you see fit.

17.10. Creating a Virtual Network

To create a virtual network on your system using the Virtual Machine Manager (virt-manager):
  1. Open the Virtual Networks tab from within the Connection Details menu. Click the Add Network button, identified by a plus sign (+) icon. For more information, see Section 17.9, “Managing a Virtual Network”.
    Virtual network configuration

    Figure 17.11. Virtual network configuration

    This will open the Create a new virtual network window. Click Forward to continue.
    Naming your new virtual network

    Figure 17.12. Naming your new virtual network

  2. Enter an appropriate name for your virtual network and click Forward.
    Choosing an IPv4 address space

    Figure 17.13. Choosing an IPv4 address space

  3. Check the Enable IPv4 network address space definition check box.
    Enter an IPv4 address space for your virtual network in the Network field.
    Check the Enable DHCPv4 check box.
    Define the DHCP range for your virtual network by specifying a Start and End range of IP addresses.
    Choosing an IPv4 address space

    Figure 17.14. Choosing an IPv4 address space

    Click Forward to continue.
  4. If you want to enable IPv6, check the Enable IPv6 network address space definition.
    Enabling IPv6

    Figure 17.15. Enabling IPv6

    Additional fields appear in the Create a new virtual network window.
    Configuring IPv6

    Figure 17.16. Configuring IPv6

    Enter an IPv6 address in the Network field.
  5. If you want to enable DHCPv6, check the Enable DHCPv6 check box.
    Additional fields appear in the Create a new virtual network window.
    Configuring DHCPv6

    Figure 17.17. Configuring DHCPv6

    (Optional) Edit the start and end of the DHCPv6 range.
  6. If you want to enable static route definitions, check the Enable Static Route Definition check box.
    Additional fields appear in the Create a new virtual network window.
    Defining static routes

    Figure 17.18. Defining static routes

    Enter a network address and the gateway that will be used for the route to the network in the appropriate fields.
    Click Forward.
  7. Select how the virtual network should connect to the physical network.
    Connecting to the physical network

    Figure 17.19. Connecting to the physical network

    If you want the virtual network to be isolated, ensure that the Isolated virtual network radio button is selected.
    If you want the virtual network to connect to a physical network, select Forwarding to physical network, and choose whether the Destination should be Any physical device or a specific physical device. Also select whether the Mode should be NAT or Routed.
    If you want to enable IPv6 routing within the virtual network, check the Enable IPv6 internal routing/networking check box.
    Enter a DNS domain name for the virtual network.
    Click Finish to create the virtual network.
  8. The new virtual network is now available in the Virtual Networks tab of the Connection Details window.

17.11. Attaching a Virtual Network to a Guest

To attach a virtual network to a guest:
  1. In the Virtual Machine Manager window, highlight the guest that will have the network assigned.
    Selecting a virtual machine to display

    Figure 17.20. Selecting a virtual machine to display

  2. From the Virtual Machine Manager Edit menu, select Virtual Machine Details.
  3. Click the Add Hardware button on the Virtual Machine Details window.
  4. In the Add new virtual hardware window, select Network from the left pane, and select your network name (network1 in this example) from the Network source menu. Modify the MAC address, if necessary, and select a Device model. Click Finish.
    Select your network from the Add new virtual hardware window

    Figure 17.21. Select your network from the Add new virtual hardware window

  5. The new network is now displayed as a virtual network interface that will be presented to the guest upon launch.
    New network shown in guest hardware list

    Figure 17.22. New network shown in guest hardware list

17.12. Attaching a Virtual NIC Directly to a Physical Interface

As an alternative to the default NAT connection, you can use the macvtap driver to attach the guest's NIC directly to a specified physical interface of the host machine. This is not to be confused with device assignment (also known as passthrough). A macvtap connection has the following modes, each with different benefits and use cases:

Physical interface delivery modes

VEPA
In virtual ethernet port aggregator (VEPA) mode, all packets from the guests are sent to the external switch. This enables the user to force guest traffic through the switch. For VEPA mode to work correctly, the external switch must also support hairpin mode, which ensures that packets whose destination is a guest on the same host machine as their source guest are sent back to the host by the external switch.
VEPA mode

Figure 17.23. VEPA mode

bridge
Packets whose destination is on the same host machine as their source guest are directly delivered to the target macvtap device. Both the source device and the destination device need to be in bridge mode for direct delivery to succeed. If either one of the devices is in VEPA mode, a hairpin-capable external switch is required.
Bridge mode

Figure 17.24. Bridge mode

private
All packets are sent to the external switch and are only delivered to a target guest on the same host machine if they are sent through an external router or gateway that sends them back to the host. Private mode can be used to prevent the individual guests on a single host from communicating with each other. This behavior applies if either the source or the destination device is in private mode.
Private mode

Figure 17.25. Private mode

passthrough
This feature attaches a physical interface device or an SR-IOV Virtual Function (VF) directly to a guest without losing the migration capability. All packets are sent directly to the designated network device. Note that a single network device can only be passed through to a single guest, as a network device cannot be shared between guests in passthrough mode.
Passthrough mode

Figure 17.26. Passthrough mode

Macvtap can be configured by changing the domain XML file or by using the virt-manager interface.

17.12.1. Configuring macvtap using domain XML

Open the domain XML file of the guest and modify the <devices> element as follows:
<devices>
	...
	<interface type='direct'>
		<source dev='eth0' mode='vepa'/>
	</interface>
</devices>
The network access of direct attached guest virtual machines can be managed by the hardware switch to which the physical interface of the host physical machine is connected.
The interface can have additional parameters as shown below, if the switch is conforming to the IEEE 802.1Qbg standard. The parameters of the virtualport element are documented in more detail in the IEEE 802.1Qbg standard. The values are network specific and should be provided by the network administrator. In 802.1Qbg terms, the Virtual Station Interface (VSI) represents the virtual interface of a virtual machine. Also note that IEEE 802.1Qbg requires a non-zero value for the VLAN ID.

Virtual Station Interface types

managerid
The VSI Manager ID identifies the database containing the VSI type and instance definitions. This is an integer value and the value 0 is reserved.
typeid
The VSI Type ID identifies a VSI type characterizing the network access. VSI types are typically managed by the network administrator. This is an integer value.
typeidversion
The VSI Type Version allows multiple versions of a VSI Type. This is an integer value.
instanceid
The VSI Instance ID is generated when a VSI instance (a virtual interface of a virtual machine) is created. This is a globally unique identifier.
profileid
The profile ID contains the name of the port profile that is to be applied onto this interface. This name is resolved by the port profile database into the network parameters from the port profile, and those network parameters will be applied to this interface.
Each of these parameters is configured by changing the domain XML file. Once this file is opened, change the mode setting as shown:
<devices>
 ...
 <interface type='direct'>
  <source dev='eth0.2' mode='vepa'/>
   <virtualport type="802.1Qbg">
    <parameters managerid="11" typeid="1193047" typeidversion="2" instanceid="09b11c53-8b5c-4eeb-8f00-d84eaa0aaa4f"/>
   </virtualport>
  </interface>
</devices>
The profile ID is shown here:
<devices>
 ...
 <interface type='direct'>
  <source dev='eth0' mode='private'/>
   <virtualport type='802.1Qbh'>
    <parameters profileid='finance'/>
   </virtualport>
 </interface>
</devices>
...

17.12.2. Configuring macvtap using virt-manager

Open the virtual hardware details window ⇒ select NIC in the menu ⇒ for Network source, select host device name: macvtap ⇒ select the intended Source mode.
The virtual station interface types can then be set up in the Virtual port submenu.
Configuring macvtap in virt-manager

Figure 17.27. Configuring macvtap in virt-manager

17.13. Dynamically Changing a Host Physical Machine or a Network Bridge that is Attached to a Virtual NIC

This section demonstrates how to move the vNIC of a guest virtual machine from one bridge to another while the guest virtual machine is running, without compromising the guest virtual machine:
  1. Prepare guest virtual machine with a configuration similar to the following:
    <interface type='bridge'>
          <mac address='52:54:00:4a:c9:5e'/>
          <source bridge='virbr0'/>
          <model type='virtio'/>
    </interface>
    
  2. Prepare an XML file for interface update:
    # cat br1.xml
    <interface type='bridge'>
          <mac address='52:54:00:4a:c9:5e'/>
          <source bridge='virbr1'/>
          <model type='virtio'/>
    </interface>
    
  3. Start the guest virtual machine, confirm the guest virtual machine's network functionality, and check that the guest virtual machine's vnetX is connected to the bridge you indicated.
    # brctl show
    bridge name     bridge id               STP enabled     interfaces
    virbr0          8000.5254007da9f2       yes             virbr0-nic
                                                            vnet0
    virbr1          8000.525400682996       yes             virbr1-nic
    
  4. Update the guest virtual machine's network with the new interface parameters with the following command:
    # virsh update-device test1 br1.xml 
    
    Device updated successfully
    
    
  5. On the guest virtual machine, run service network restart. The guest virtual machine gets a new IP address from virbr1. Check that the guest virtual machine's vnet0 is connected to the new bridge (virbr1):
    # brctl show
    bridge name     bridge id               STP enabled     interfaces
    virbr0          8000.5254007da9f2       yes             virbr0-nic
    virbr1          8000.525400682996       yes             virbr1-nic     vnet0
    

17.14. Applying Network Filtering

This section provides an introduction to libvirt's network filters, their goals, concepts and XML format.

17.14.1. Introduction

The goal of network filtering is to enable administrators of a virtualized system to configure and enforce network traffic filtering rules on virtual machines, and to manage the parameters of network traffic that virtual machines are allowed to send or receive. The network traffic filtering rules are applied on the host physical machine when a virtual machine is started. Since the filtering rules cannot be circumvented from within the virtual machine, they are mandatory from the point of view of a virtual machine user.
From the point of view of the guest virtual machine, the network filtering system allows each virtual machine's network traffic filtering rules to be configured individually on a per interface basis. These rules are applied on the host physical machine when the virtual machine is started and can be modified while the virtual machine is running. The latter can be achieved by modifying the XML description of a network filter.
Multiple virtual machines can make use of the same generic network filter. When such a filter is modified, the network traffic filtering rules of all running virtual machines that reference this filter are updated. The machines that are not running will update on start.
As previously mentioned, applying network traffic filtering rules can be done on individual network interfaces that are configured for certain types of network configurations. Supported network types include:
  • network
  • ethernet -- must be used in bridging mode
  • bridge

Example 17.1. An example of network filtering

The interface XML is used to reference a top-level filter. In the following example, the interface description references the filter clean-traffic.
   <devices>
    <interface type='bridge'>
      <mac address='00:16:3e:5d:c7:9e'/>
      <filterref filter='clean-traffic'/>
    </interface>
  </devices>
Network filters are written in XML and may contain references to other filters, rules for traffic filtering, or a combination of both. The above referenced filter clean-traffic is a filter that only contains references to other filters and no actual filtering rules. Since references to other filters can be used, a tree of filters can be built. The clean-traffic filter can be viewed using the command: # virsh nwfilter-dumpxml clean-traffic.
As previously mentioned, a single network filter can be referenced by multiple virtual machines. Since interfaces will typically have individual parameters associated with their respective traffic filtering rules, the rules described in a filter's XML can be generalized using variables. In this case, the variable name is used in the filter XML and the name and value are provided at the place where the filter is referenced.

Example 17.2. Description extended

In the following example, the interface description has been extended with the parameter IP and a dotted IP address as a value.
  <devices>
    <interface type='bridge'>
      <mac address='00:16:3e:5d:c7:9e'/>
      <filterref filter='clean-traffic'>
        <parameter name='IP' value='10.0.0.1'/>
      </filterref>
    </interface>
  </devices>
In this particular example, the clean-traffic network traffic filter will be represented with the IP address parameter 10.0.0.1, and the rule dictates that all traffic from this interface will always use 10.0.0.1 as the source IP address, which is one of the purposes of this particular filter.

17.14.2. Filtering Chains

Filtering rules are organized in filter chains. These chains can be thought of as having a tree structure with packet filtering rules as entries in individual chains (branches).
Packets start their filter evaluation in the root chain and can then continue their evaluation in other chains, return from those chains back into the root chain or be dropped or accepted by a filtering rule in one of the traversed chains.
Libvirt's network filtering system automatically creates individual root chains for every virtual machine's network interface on which the user chooses to activate traffic filtering. The user may write filtering rules that are either directly instantiated in the root chain or may create protocol-specific filtering chains for efficient evaluation of protocol-specific rules.
The following chains exist:
  • root
  • mac
  • stp (spanning tree protocol)
  • vlan
  • arp and rarp
  • ipv4
  • ipv6
Multiple chains evaluating the mac, stp, vlan, arp, rarp, ipv4, or ipv6 protocols can be created by using the protocol name as a prefix in the chain's name.

Example 17.3. ARP traffic filtering

This example allows chains with names arp-xyz or arp-test to be specified and have their ARP protocol packets evaluated in those chains.
The following filter XML shows an example of filtering ARP traffic in the arp chain.
<filter name='no-arp-spoofing' chain='arp' priority='-500'>
  <uuid>f88f1932-debf-4aa1-9fbe-f10d3aa4bc95</uuid>
  <rule action='drop' direction='out' priority='300'>
    <mac match='no' srcmacaddr='$MAC'/>
  </rule>
  <rule action='drop' direction='out' priority='350'>
    <arp match='no' arpsrcmacaddr='$MAC'/>
  </rule>
  <rule action='drop' direction='out' priority='400'>
    <arp match='no' arpsrcipaddr='$IP'/>
  </rule>
  <rule action='drop' direction='in' priority='450'>
    <arp opcode='Reply'/>
    <arp match='no' arpdstmacaddr='$MAC'/>
  </rule>
  <rule action='drop' direction='in' priority='500'>
    <arp match='no' arpdstipaddr='$IP'/>
  </rule>
  <rule action='accept' direction='inout' priority='600'>
    <arp opcode='Request'/>
  </rule>
  <rule action='accept' direction='inout' priority='650'>
    <arp opcode='Reply'/>
  </rule>
  <rule action='drop' direction='inout' priority='1000'/>
</filter>
The consequence of putting ARP-specific rules in the arp chain, rather than for example in the root chain, is that packets of protocols other than ARP do not need to be evaluated by ARP protocol-specific rules. This improves the efficiency of the traffic filtering. However, you must then take care to only put filtering rules for the given protocol into the chain, since other rules will not be evaluated. For example, an IPv4 rule will not be evaluated in the ARP chain since IPv4 protocol packets will not traverse the ARP chain.

17.14.3. Filtering Chain Priorities

As previously mentioned, when creating a filtering rule, all chains are connected to the root chain. The order in which those chains are accessed is influenced by the priority of the chain. The following table shows the chains that can be assigned a priority and their default priorities.

Table 17.1. Filtering chain default priorities values

Chain (prefix) Default priority
stp -810
mac -800
vlan -750
ipv4 -700
ipv6 -600
arp -500
rarp -400

Note

A chain with a lower priority value is accessed before one with a higher value.
The chains listed in Table 17.1, “Filtering chain default priorities values” can also be assigned custom priorities by writing a value in the range [-1000 to 1000] into the priority (XML) attribute of the filter node. The filter shown in Section 17.14.2, “Filtering Chains” uses the default priority of -500 for arp chains, for example.
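For example, a filter can place its rules in the arp chain at a custom priority of -450 instead of the default -500. The following minimal sketch uses an illustrative filter name:
<filter name='arp-check' chain='arp' priority='-450'>
  <!-- ARP filtering rules for this filter would be listed here -->
</filter>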

17.14.4. Usage of Variables in Filters

There are two variables that have been reserved for usage by the network traffic filtering subsystem: MAC and IP.
MAC is designated for the MAC address of the network interface. A filtering rule that references this variable will automatically be replaced with the MAC address of the interface. This works without the user having to explicitly provide the MAC parameter. Even though it is possible to specify the MAC parameter similarly to the IP parameter above, doing so is discouraged since libvirt knows what MAC address an interface will be using.
The parameter IP represents the IP address that the operating system inside the virtual machine is expected to use on the given interface. The IP parameter is special in that the libvirt daemon will try to determine the IP address (and thus the IP parameter's value) that is being used on an interface if the parameter is not explicitly provided but referenced. For current limitations on IP address detection, and for how to use this feature and what to expect when using it, see Section 17.14.12, “Limitations”. The XML file shown in Section 17.14.2, “Filtering Chains” contains the filter no-arp-spoofing, which is an example of using a network filter XML to reference the MAC and IP variables.
Note that referenced variables are always prefixed with the character $. The format of the value of a variable must be of the type expected by the filter attribute identified in the XML. In the above example, the IP parameter must hold a legal IP address in standard format. Failure to provide the correct structure will result in the filter variable not being replaced with a value, and will prevent a virtual machine from starting or will prevent an interface from attaching when hot plugging is being used. Some of the types that are expected for each XML attribute are shown in Example 17.4, “Sample variable types”.

Example 17.4. Sample variable types

As variables can contain lists of elements (for example, the variable IP can contain multiple IP addresses that are valid on a particular interface), the notation for providing multiple elements for the IP variable is:
  <devices>
    <interface type='bridge'>
      <mac address='00:16:3e:5d:c7:9e'/>
      <filterref filter='clean-traffic'>
        <parameter name='IP' value='10.0.0.1'/>
        <parameter name='IP' value='10.0.0.2'/>
        <parameter name='IP' value='10.0.0.3'/>
      </filterref>
    </interface>
  </devices>
This XML file creates filters to enable multiple IP addresses per interface. Each of the IP addresses will result in a separate filtering rule. Therefore, using the XML above and the following rule, three individual filtering rules (one for each IP address) will be created:
  <rule action='accept' direction='in' priority='500'>
    <tcp srcipaddr='$IP'/>
  </rule>
As it is possible to access individual elements of a variable holding a list of elements, a filtering rule like the following accesses the 2nd element of the variable DSTPORTS.
  <rule action='accept' direction='in' priority='500'>
    <udp dstportstart='$DSTPORTS[1]'/>
  </rule>

Example 17.5. Using a variety of variables

It is possible to create filtering rules that represent all of the permissible combinations of elements from different lists using the notation $VARIABLE[@<iterator id="x">]. The following rule allows a virtual machine to receive traffic on a set of ports, which are specified in DSTPORTS, from the set of source IP addresses specified in SRCIPADDRESSES. The rule generates all combinations of elements of the variable DSTPORTS with those of SRCIPADDRESSES by using two independent iterators to access their elements.
  <rule action='accept' direction='in' priority='500'>
    <ip srcipaddr='$SRCIPADDRESSES[@1]' dstportstart='$DSTPORTS[@2]'/>
  </rule>
Assign concrete values to SRCIPADDRESSES and DSTPORTS as shown:
  SRCIPADDRESSES = [ 10.0.0.1, 11.1.2.3 ]
  DSTPORTS = [ 80, 8080 ]
Assigning values to the variables using $SRCIPADDRESSES[@1] and $DSTPORTS[@2] would then result in all variants of addresses and ports being created as shown:
  • 10.0.0.1, 80
  • 10.0.0.1, 8080
  • 11.1.2.3, 80
  • 11.1.2.3, 8080
Accessing the same variables using a single iterator, for example by using the notation $SRCIPADDRESSES[@1] and $DSTPORTS[@1], would result in parallel access to both lists and result in the following combination:
  • 10.0.0.1, 80
  • 11.1.2.3, 8080
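The list values for such variables are provided in the same way as the IP parameter shown earlier, by repeating <parameter> elements inside the interface's filter reference. The following sketch assumes a hypothetical filter named test-ports that contains the rule shown above:
  <devices>
    <interface type='bridge'>
      <mac address='00:16:3e:5d:c7:9e'/>
      <filterref filter='test-ports'>
        <parameter name='SRCIPADDRESSES' value='10.0.0.1'/>
        <parameter name='SRCIPADDRESSES' value='11.1.2.3'/>
        <parameter name='DSTPORTS' value='80'/>
        <parameter name='DSTPORTS' value='8080'/>
      </filterref>
    </interface>
  </devices>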

Note

$VARIABLE is short-hand for $VARIABLE[@0]. The former notation always assumes an iterator with iterator id="0", as shown in the opening paragraph at the top of this section.

17.14.5. Automatic IP Address Detection and DHCP Snooping

This section provides information about automatic IP address detection and DHCP snooping.

17.14.5.1. Introduction

The detection of IP addresses used on a virtual machine's interface is automatically activated if the variable IP is referenced but no value has been assigned to it. The variable CTRL_IP_LEARNING can be used to specify the IP address learning method to use. Valid values include: any, dhcp, or none.
The value any instructs libvirt to use any packet to determine the address in use by a virtual machine, which is the default setting if the variable CTRL_IP_LEARNING is not set. This method will only detect a single IP address per interface. Once a guest virtual machine's IP address has been detected, its IP network traffic will be locked to that address, if for example, IP address spoofing is prevented by one of its filters. In that case, the user of the VM will not be able to change the IP address on the interface inside the guest virtual machine, which would be considered IP address spoofing. When a guest virtual machine is migrated to another host physical machine or resumed after a suspend operation, the first packet sent by the guest virtual machine will again determine the IP address that the guest virtual machine can use on a particular interface.
The value dhcp instructs libvirt to only honor DHCP server-assigned addresses with valid leases. This method supports the detection and usage of multiple IP addresses per interface. When a guest virtual machine resumes after a suspend operation, any valid IP address leases are applied to its filters. Otherwise the guest virtual machine is expected to use DHCP to obtain a new IP address. When a guest virtual machine migrates to another host physical machine, the guest virtual machine is required to re-run the DHCP protocol.
If CTRL_IP_LEARNING is set to none, libvirt does not do IP address learning and referencing IP without assigning it an explicit value is an error.

17.14.5.2. DHCP Snooping

CTRL_IP_LEARNING=dhcp (DHCP snooping) provides additional anti-spoofing security, especially when combined with a filter allowing only trusted DHCP servers to assign IP addresses. To enable this, set the variable DHCPSERVER to the IP address of a valid DHCP server and provide filters that use this variable to filter incoming DHCP responses.
When DHCP snooping is enabled and the DHCP lease expires, the guest virtual machine will no longer be able to use the IP address until it acquires a new, valid lease from a DHCP server. If the guest virtual machine is migrated, it must get a new valid DHCP lease to use an IP address (for example by bringing the VM interface down and up again).
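A minimal sketch of such a configuration, assuming a hypothetical top-level filter named clean-traffic-dhcp that references a filter using the DHCPSERVER variable (such as allow-dhcp-server), and a trusted DHCP server at the illustrative address 10.0.0.254, could look as follows:
    <interface type='bridge'>
      <source bridge='virbr0'/>
      <filterref filter='clean-traffic-dhcp'>
        <parameter name='CTRL_IP_LEARNING' value='dhcp'/>
        <parameter name='DHCPSERVER' value='10.0.0.254'/>
      </filterref>
    </interface>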

Note

Automatic DHCP detection listens to the DHCP traffic the guest virtual machine exchanges with the DHCP server of the infrastructure. To avoid denial-of-service attacks on libvirt, the evaluation of those packets is rate-limited, meaning that a guest virtual machine sending an excessive number of DHCP packets per second on an interface will not have all of those packets evaluated and thus filters may not get adapted. Normal DHCP client behavior is assumed to send a low number of DHCP packets per second. Further, it is important to set up appropriate filters on all guest virtual machines in the infrastructure to prevent them from sending DHCP packets themselves. Therefore, guest virtual machines must either be prevented from sending UDP and TCP traffic from port 67 to port 68, or the DHCPSERVER variable should be used on all guest virtual machines to restrict DHCP server messages to only be allowed to originate from trusted DHCP servers. At the same time anti-spoofing prevention must be enabled on all guest virtual machines in the subnet.

Example 17.6. Activating IPs for DHCP snooping

The following XML provides an example for the activation of IP address learning using the DHCP snooping method:
    <interface type='bridge'>
      <source bridge='virbr0'/>
      <filterref filter='clean-traffic'>
        <parameter name='CTRL_IP_LEARNING' value='dhcp'/>
      </filterref>
    </interface>

17.14.6. Reserved Variables

Table 17.2, “Reserved variables” shows the variables that are considered reserved and are used by libvirt:

Table 17.2. Reserved variables

Variable Name Definition
MAC The MAC address of the interface
IP The list of IP addresses in use by an interface
IPV6 Not currently implemented: the list of IPV6 addresses in use by an interface
DHCPSERVER The list of IP addresses of trusted DHCP servers
DHCPSERVERV6 Not currently implemented: The list of IPv6 addresses of trusted DHCP servers
CTRL_IP_LEARNING The choice of the IP address detection mode

17.14.7. Element and Attribute Overview

The root element required for all network filters is named <filter> with two possible attributes. The name attribute provides a unique name of the given filter. The chain attribute is optional but allows certain filters to be better organized for more efficient processing by the firewall subsystem of the underlying host physical machine. Currently, the system only supports the following chains: root, ipv4, ipv6, arp and rarp.
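A minimal filter skeleton therefore looks as follows; the filter name is only an illustration:
<filter name='sample-filter' chain='ipv4'>
  <!-- filtering rules or references to other filters go here -->
</filter>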

17.14.8. References to Other Filters

Any filter may hold references to other filters. Individual filters may be referenced multiple times in a filter tree but references between filters must not introduce loops.

Example 17.7. An Example of a clean traffic filter

The following shows the XML of the clean-traffic network filter referencing several other filters.
<filter name='clean-traffic'>
  <uuid>6ef53069-ba34-94a0-d33d-17751b9b8cb1</uuid>
  <filterref filter='no-mac-spoofing'/>
  <filterref filter='no-ip-spoofing'/>
  <filterref filter='allow-incoming-ipv4'/>
  <filterref filter='no-arp-spoofing'/>
  <filterref filter='no-other-l2-traffic'/>
  <filterref filter='qemu-announce-self'/>
</filter>
To reference another filter, the XML node <filterref> needs to be provided inside a filter node. This node must have the attribute filter whose value contains the name of the filter to be referenced.
New network filters can be defined at any time and may contain references to network filters that are not yet known to libvirt. However, once a virtual machine is started or a network interface referencing a filter is to be hot-plugged, all network filters in the filter tree must be available. Otherwise the virtual machine will not start or the network interface cannot be attached.
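For example, a filter definition saved in an XML file (the file name here is hypothetical) can be defined or updated with the nwfilter-define command described in Section 17.14.11.3, “Command-line tools”:
# virsh nwfilter-define my-filter.xml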

17.14.9. Filter Rules

The following XML shows a simple example of a network traffic filter implementing a rule to drop traffic if the IP address (provided through the value of the variable IP) in an outgoing IP packet is not the expected one, thus preventing IP address spoofing by the VM.

Example 17.8. Example of network traffic filtering

<filter name='no-ip-spoofing' chain='ipv4'>
  <uuid>fce8ae33-e69e-83bf-262e-30786c1f8072</uuid>
  <rule action='drop' direction='out' priority='500'>
    <ip match='no' srcipaddr='$IP'/>
  </rule>
</filter>
The traffic filtering rule starts with the rule node. This node may contain up to three of the following attributes:
  • action is mandatory and can have the following values:
    • drop (matching the rule silently discards the packet with no further analysis)
    • reject (matching the rule generates an ICMP reject message with no further analysis)
    • accept (matching the rule accepts the packet with no further analysis)
    • return (matching the rule passes this filter, but returns control to the calling filter for further analysis)
    • continue (matching the rule goes on to the next rule for further analysis)
  • direction is mandatory and can have the following values:
    • in for incoming traffic
    • out for outgoing traffic
    • inout for incoming and outgoing traffic
  • priority is optional. The priority of the rule controls the order in which the rule will be instantiated relative to other rules. Rules with lower values will be instantiated before rules with higher values. Valid values are in the range of -1000 to 1000. If this attribute is not provided, priority 500 will be assigned by default. Note that filtering rules in the root chain are sorted with filters connected to the root chain following their priorities. This makes it possible to interleave filtering rules with access to filter chains. See Section 17.14.3, “Filtering Chain Priorities” for more information.
  • statematch is optional. Possible values are '0' or 'false' to turn the underlying connection state matching off. The default setting is 'true' or '1'.
The above example Example 17.8, “Example of network traffic filtering” indicates that the traffic of type ip will be associated with the chain ipv4 and the rule will have priority=500. If, for example, another filter is referenced whose traffic of type ip is also associated with the chain ipv4, then that filter's rules will be ordered relative to the priority=500 of the shown rule.
A rule may contain a single protocol node for filtering traffic. The above example shows that traffic of type ip is to be filtered.

17.14.10. Supported Protocols

The following sections list and give some details about the protocols that are supported by the network filtering subsystem. This type of traffic rule is provided in the rule node as a nested node. Depending on the traffic type a rule is filtering, the attributes are different. The above example showed the single attribute srcipaddr that is valid inside the ip traffic filtering node. The following sections show what attributes are valid and what type of data they are expecting. The following datatypes are available:
  • UINT8 : 8 bit integer; range 0-255
  • UINT16: 16 bit integer; range 0-65535
  • MAC_ADDR: MAC address in colon-separated hexadecimal format, for example 00:11:22:33:44:55
  • MAC_MASK: MAC address mask in MAC address format, for instance, FF:FF:FF:FC:00:00
  • IP_ADDR: IP address in dotted decimal format, for example 10.1.2.3
  • IP_MASK: IP address mask in either dotted decimal format (255.255.248.0) or CIDR mask (0-32)
  • IPV6_ADDR: IPv6 address in numbers format, for example FFFF::1
  • IPV6_MASK: IPv6 mask in numbers format (FFFF:FFFF:FC00::) or CIDR mask (0-128)
  • STRING: A string
  • BOOLEAN: 'true', 'yes', '1' or 'false', 'no', '0'
  • IPSETFLAGS: The source and destination flags of the ipset described by up to 6 'src' or 'dst' elements selecting features from either the source or destination part of the packet header; example: src,src,dst. The number of 'selectors' to provide here depends on the type of ipset that is referenced
Every attribute except for those of type IP_MASK or IPV6_MASK can be negated using the match attribute with value no. Multiple negated attributes may be grouped together. The following XML fragment shows such an example using abstract attributes.
[...]
  <rule action='drop' direction='in'>
    <protocol match='no' attribute1='value1' attribute2='value2'/>
    <protocol attribute3='value3'/>
  </rule>
[...]
Rules are evaluated logically within the boundaries of the given protocol attributes. Thus, if a single attribute's value does not match the one given in the rule, the whole rule will be skipped during the evaluation process. Therefore, in the above example incoming traffic will only be dropped if the protocol property attribute1 does not match value1, the protocol property attribute2 does not match value2, and the protocol property attribute3 matches value3.
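As a concrete sketch of the abstract pattern above, using purely illustrative address and port values, the following rule drops incoming TCP packets only if the source IP address is not 10.0.0.1 and the destination port is 80:
  <rule action='drop' direction='in' priority='500'>
    <tcp match='no' srcipaddr='10.0.0.1'/>
    <tcp dstportstart='80'/>
  </rule>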

17.14.10.1. MAC (Ethernet)

Protocol ID: mac
Rules of this type should go into the root chain.

Table 17.3. MAC protocol types

Attribute Name Datatype Definition
srcmacaddr MAC_ADDR MAC address of sender
srcmacmask MAC_MASK Mask applied to MAC address of sender
dstmacaddr MAC_ADDR MAC address of destination
dstmacmask MAC_MASK Mask applied to MAC address of destination
protocolid UINT16 (0x600-0xffff), STRING Layer 3 protocol ID. Valid strings include [arp, rarp, ipv4, ipv6]
comment STRING text string up to 256 characters
The filter can be written as such:
[...]
<mac match='no' srcmacaddr='$MAC'/>
[...]

17.14.10.2. VLAN (802.1Q)

Protocol ID: vlan
Rules of this type should go either into the root or vlan chain.

Table 17.4. VLAN protocol types

Attribute Name Datatype Definition
srcmacaddr MAC_ADDR MAC address of sender
srcmacmask MAC_MASK Mask applied to MAC address of sender
dstmacaddr MAC_ADDR MAC address of destination
dstmacmask MAC_MASK Mask applied to MAC address of destination
vlan-id UINT16 (0x0-0xfff, 0 - 4095) VLAN ID
encap-protocol UINT16 (0x03c-0xfff), String Encapsulated layer 3 protocol ID, valid strings are arp, ipv4, ipv6
comment STRING text string up to 256 characters

17.14.10.3. STP (Spanning Tree Protocol)

Protocol ID: stp
Rules of this type should go either into the root or stp chain.

Table 17.5. STP protocol types

Attribute Name Datatype Definition
srcmacaddr MAC_ADDR MAC address of sender
srcmacmask MAC_MASK Mask applied to MAC address of sender
type UINT8 Bridge Protocol Data Unit (BPDU) type
flags UINT8 BPDU flag
root-priority UINT16 Root priority range start
root-priority-hi UINT16 (0x0-0xfff, 0 - 4095) Root priority range end
root-address MAC_ADDRESS root MAC Address
root-address-mask MAC_MASK root MAC Address mask
root-cost UINT32 Root path cost (range start)
root-cost-hi UINT32 Root path cost range end
sender-priority-hi UINT16 Sender priority range end
sender-address MAC_ADDRESS BPDU sender MAC address
sender-address-mask MAC_MASK BPDU sender MAC address mask
port UINT16 Port identifier (range start)
port_hi UINT16 Port identifier range end
msg-age UINT16 Message age timer (range start)
msg-age-hi UINT16 Message age timer range end
max-age-hi UINT16 Maximum age time range end
hello-time UINT16 Hello time timer (range start)
hello-time-hi UINT16 Hello time timer range end
forward-delay UINT16 Forward delay (range start)
forward-delay-hi UINT16 Forward delay range end
comment STRING text string up to 256 characters

17.14.10.4. ARP/RARP

Protocol ID: arp or rarp
Rules of this type should either go into the root or arp/rarp chain.

Table 17.6. ARP and RARP protocol types

Attribute Name Datatype Definition
srcmacaddr MAC_ADDR MAC address of sender
srcmacmask MAC_MASK Mask applied to MAC address of sender
dstmacaddr MAC_ADDR MAC address of destination
dstmacmask MAC_MASK Mask applied to MAC address of destination
hwtype UINT16 Hardware type
protocoltype UINT16 Protocol type
opcode UINT16, STRING Opcode; valid strings are: Request, Reply, Request_Reverse, Reply_Reverse, DRARP_Request, DRARP_Reply, DRARP_Error, InARP_Request, ARP_NAK
arpsrcmacaddr MAC_ADDR Source MAC address in ARP/RARP packet
arpdstmacaddr MAC_ADDR Destination MAC address in ARP/RARP packet
arpsrcipaddr IP_ADDR Source IP address in ARP/RARP packet
arpdstipaddr IP_ADDR Destination IP address in ARP/RARP packet
gratuitous BOOLEAN Boolean indicating whether to check for a gratuitous ARP packet
comment STRING text string up to 256 characters

17.14.10.5. IPv4

Protocol ID: ip
Rules of this type should either go into the root or ipv4 chain.

Table 17.7. IPv4 protocol types

Attribute Name Datatype Definition
srcmacaddr MAC_ADDR MAC address of sender
srcmacmask MAC_MASK Mask applied to MAC address of sender
dstmacaddr MAC_ADDR MAC address of destination
dstmacmask MAC_MASK Mask applied to MAC address of destination
srcipaddr IP_ADDR Source IP address
srcipmask IP_MASK Mask applied to source IP address
dstipaddr IP_ADDR Destination IP address
dstipmask IP_MASK Mask applied to destination IP address
protocol UINT8, STRING Layer 4 protocol identifier. Valid strings for protocol are: tcp, udp, udplite, esp, ah, icmp, igmp, sctp
srcportstart UINT16 Start of range of valid source ports; requires protocol
srcportend UINT16 End of range of valid source ports; requires protocol
dstportstart UINT16 Start of range of valid destination ports; requires protocol
dstportend UINT16 End of range of valid destination ports; requires protocol
comment STRING text string up to 256 characters
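For example, a rule in the ipv4 chain could accept outgoing DNS queries by combining the protocol attribute with a destination port; the values below are only illustrative:
  <rule action='accept' direction='out' priority='500'>
    <ip protocol='udp' dstportstart='53'/>
  </rule>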

17.14.10.6. IPv6

Protocol ID: ipv6
Rules of this type should either go into the root or ipv6 chain.

Table 17.8. IPv6 protocol types

Attribute Name Datatype Definition
srcmacaddr MAC_ADDR MAC address of sender
srcmacmask MAC_MASK Mask applied to MAC address of sender
dstmacaddr MAC_ADDR MAC address of destination
dstmacmask MAC_MASK Mask applied to MAC address of destination
srcipaddr IP_ADDR Source IP address
srcipmask IP_MASK Mask applied to source IP address
dstipaddr IP_ADDR Destination IP address
dstipmask IP_MASK Mask applied to destination IP address
protocol UINT8, STRING Layer 4 protocol identifier. Valid strings for protocol are: tcp, udp, udplite, esp, ah, icmpv6, sctp
srcportstart UINT16 Start of range of valid source ports; requires protocol
srcportend UINT16 End of range of valid source ports; requires protocol
dstportstart UINT16 Start of range of valid destination ports; requires protocol
dstportend UINT16 End of range of valid destination ports; requires protocol
comment STRING text string up to 256 characters

17.14.10.7. TCP/UDP/SCTP

Protocol ID: tcp, udp, sctp
The chain parameter is ignored for this type of traffic and should either be omitted or set to root.

Table 17.9. TCP/UDP/SCTP protocol types

Attribute Name Datatype Definition
srcmacaddr MAC_ADDR MAC address of sender
srcipaddr IP_ADDR Source IP address
srcipmask IP_MASK Mask applied to source IP address
dstipaddr IP_ADDR Destination IP address
dstipmask IP_MASK Mask applied to destination IP address
srcipfrom IP_ADDR Start of range of source IP address
srcipto IP_ADDR End of range of source IP address
dstipfrom IP_ADDR Start of range of destination IP address
dstipto IP_ADDR End of range of destination IP address
srcportstart UINT16 Start of range of valid source ports; requires protocol
srcportend UINT16 End of range of valid source ports; requires protocol
dstportstart UINT16 Start of range of valid destination ports; requires protocol
dstportend UINT16 End of range of valid destination ports; requires protocol
comment STRING text string up to 256 characters
state STRING comma separated list of NEW,ESTABLISHED,RELATED,INVALID or NONE
flags STRING TCP-only: format of mask/flags with mask and flags each being a comma separated list of SYN,ACK,URG,PSH,FIN,RST or NONE or ALL
ipset STRING The name of an IPSet managed outside of libvirt
ipsetflags IPSETFLAGS flags for the IPSet; requires ipset attribute
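For example, a rule of this type could accept new incoming connections to an illustrative destination port range, leaving established traffic to a separate rule:
  <rule action='accept' direction='in' priority='500'>
    <tcp dstportstart='8080' dstportend='8090' state='NEW'/>
  </rule>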

17.14.10.8. ICMP

Protocol ID: icmp
Note: The chain parameter is ignored for this type of traffic and should either be omitted or set to root.

Table 17.10. ICMP protocol types

Attribute Name Datatype Definition
srcmacaddr MAC_ADDR MAC address of sender
srcmacmask MAC_MASK Mask applied to the MAC address of the sender
dstmacaddr MAC_ADDR MAC address of the destination
dstmacmask MAC_MASK Mask applied to the MAC address of the destination
srcipaddr IP_ADDR Source IP address
srcipmask IP_MASK Mask applied to source IP address
dstipaddr IP_ADDR Destination IP address
dstipmask IP_MASK Mask applied to destination IP address
srcipfrom IP_ADDR Start of range of source IP address
srcipto IP_ADDR End of range of source IP address
dstipfrom IP_ADDR Start of range of destination IP address
dstipto IP_ADDR End of range of destination IP address
type UINT16 ICMP type
code UINT16 ICMP code
comment STRING text string up to 256 characters
state STRING comma separated list of NEW,ESTABLISHED,RELATED,INVALID or NONE
ipset STRING The name of an IPSet managed outside of libvirt
ipsetflags IPSETFLAGS flags for the IPSet; requires ipset attribute

17.14.10.9. IGMP, ESP, AH, UDPLITE, 'ALL'

Protocol ID: igmp, esp, ah, udplite, all
The chain parameter is ignored for this type of traffic and should either be omitted or set to root.

Table 17.11. IGMP, ESP, AH, UDPLITE, 'ALL'

Attribute Name Datatype Definition
srcmacaddr MAC_ADDR MAC address of sender
srcmacmask MAC_MASK Mask applied to the MAC address of the sender
dstmacaddr MAC_ADDR MAC address of the destination
dstmacmask MAC_MASK Mask applied to the MAC address of the destination
srcipaddr IP_ADDR Source IP address
srcipmask IP_MASK Mask applied to source IP address
dstipaddr IP_ADDR Destination IP address
dstipmask IP_MASK Mask applied to destination IP address
srcipfrom IP_ADDR Start of range of source IP address
srcipto IP_ADDR End of range of source IP address
dstipfrom IP_ADDR Start of range of destination IP address
dstipto IP_ADDR End of range of destination IP address
comment STRING text string up to 256 characters
state STRING comma separated list of NEW,ESTABLISHED,RELATED,INVALID or NONE
ipset STRING The name of an IPSet managed outside of libvirt
ipsetflags IPSETFLAGS flags for the IPSet; requires ipset attribute

17.14.10.10. TCP/UDP/SCTP over IPV6

Protocol ID: tcp-ipv6, udp-ipv6, sctp-ipv6
The chain parameter is ignored for this type of traffic and should either be omitted or set to root.

Table 17.12. TCP, UDP, SCTP over IPv6 protocol types

Attribute Name Datatype Definition
srcmacaddr MAC_ADDR MAC address of sender
srcipaddr IP_ADDR Source IP address
srcipmask IP_MASK Mask applied to source IP address
dstipaddr IP_ADDR Destination IP address
dstipmask IP_MASK Mask applied to destination IP address
srcipfrom IP_ADDR Start of range of source IP address
srcipto IP_ADDR End of range of source IP address
dstipfrom IP_ADDR Start of range of destination IP address
dstipto IP_ADDR End of range of destination IP address
srcportstart UINT16 Start of range of valid source ports
srcportend UINT16 End of range of valid source ports
dstportstart UINT16 Start of range of valid destination ports
dstportend UINT16 End of range of valid destination ports
comment STRING text string up to 256 characters
state STRING comma separated list of NEW,ESTABLISHED,RELATED,INVALID or NONE
ipset STRING The name of an IPSet managed outside of libvirt
ipsetflags IPSETFLAGS flags for the IPSet; requires ipset attribute

17.14.10.11. ICMPv6

Protocol ID: icmpv6
The chain parameter is ignored for this type of traffic and should either be omitted or set to root.

Table 17.13. ICMPv6 protocol types

Attribute Name Datatype Definition
srcmacaddr MAC_ADDR MAC address of sender
srcipaddr IP_ADDR Source IP address
srcipmask IP_MASK Mask applied to source IP address
dstipaddr IP_ADDR Destination IP address
dstipmask IP_MASK Mask applied to destination IP address
srcipfrom IP_ADDR Start of range of source IP address
srcipto IP_ADDR End of range of source IP address
dstipfrom IP_ADDR Start of range of destination IP address
dstipto IP_ADDR End of range of destination IP address
type UINT16 ICMPv6 type
code UINT16 ICMPv6 code
comment STRING text string up to 256 characters
state STRING comma separated list of NEW,ESTABLISHED,RELATED,INVALID or NONE
ipset STRING The name of an IPSet managed outside of libvirt
ipsetflags IPSETFLAGS flags for the IPSet; requires ipset attribute

17.14.10.12. IGMP, ESP, AH, UDPLITE, 'ALL' over IPv6

Protocol ID: igmp-ipv6, esp-ipv6, ah-ipv6, udplite-ipv6, all-ipv6
The chain parameter is ignored for this type of traffic and should either be omitted or set to root.

Table 17.14. IGMP, ESP, AH, UDPLITE, 'ALL' over IPv6 protocol types

Attribute Name Datatype Definition
srcmacaddr MAC_ADDR MAC address of sender
srcipaddr IP_ADDR Source IP address
srcipmask IP_MASK Mask applied to source IP address
dstipaddr IP_ADDR Destination IP address
dstipmask IP_MASK Mask applied to destination IP address
srcipfrom IP_ADDR Start of range of source IP address
srcipto IP_ADDR End of range of source IP address
dstipfrom IP_ADDR Start of range of destination IP address
dstipto IP_ADDR End of range of destination IP address
comment STRING text string up to 256 characters
state STRING comma separated list of NEW,ESTABLISHED,RELATED,INVALID or NONE
ipset STRING The name of an IPSet managed outside of libvirt
ipsetflags IPSETFLAGS flags for the IPSet; requires ipset attribute

17.14.11. Advanced Filter Configuration Topics

The following sections discuss advanced filter configuration topics.

17.14.11.1. Connection tracking

The network filtering subsystem (on Linux) makes use of the connection tracking support of iptables. This helps in enforcing the direction of the network traffic (state match) as well as counting and limiting the number of simultaneous connections towards a guest virtual machine. As an example, if a guest virtual machine has TCP port 8080 open as a server, clients may connect to the guest virtual machine on port 8080. Connection tracking and enforcement of the direction then prevents the guest virtual machine from initiating a connection from (TCP client) port 8080 back to a remote host. More importantly, tracking helps to prevent remote attackers from establishing a connection back to a guest virtual machine. For example, if the user inside the guest virtual machine established a connection to port 80 on an attacker site, the attacker will not be able to initiate a connection from TCP port 80 back towards the guest virtual machine. By default, the connection state match that enables connection tracking and then enforcement of the direction of traffic is turned on.

Example 17.9. XML example for turning off connections to the TCP port

The following shows an example XML fragment where this feature has been turned off for incoming connections to TCP port 12345.
   [...]
    <rule direction='in' action='accept' statematch='false'>
      <tcp dstportstart='12345'/>
    </rule>
   [...]
This now allows incoming traffic to TCP port 12345, but would also enable the initiation from (client) TCP port 12345 within the VM, which may or may not be desirable.

17.14.11.2. Limiting number of connections

To limit the number of connections a guest virtual machine may establish, a rule must be provided that sets a limit of connections for a given type of traffic. For example, a VM may be allowed to ping only one other IP address at a time and to have only one active incoming ssh connection at a time.

Example 17.10. XML sample file that sets limits to connections

The following XML fragment can be used to limit connections
  [...]
  <rule action='drop' direction='in' priority='400'>
    <tcp connlimit-above='1'/>
  </rule>
  <rule action='accept' direction='in' priority='500'>
    <tcp dstportstart='22'/>
  </rule>
  <rule action='drop' direction='out' priority='400'>
    <icmp connlimit-above='1'/>
  </rule>
  <rule action='accept' direction='out' priority='500'>
    <icmp/>
  </rule>
  <rule action='accept' direction='out' priority='500'>
    <udp dstportstart='53'/>
  </rule>
  <rule action='drop' direction='inout' priority='1000'>
    <all/>
  </rule>
  [...]

Note

Limitation rules must be listed in the XML prior to the rules for accepting traffic. According to the XML file in Example 17.10, “XML sample file that sets limits to connections”, an additional rule for allowing DNS traffic sent to port 53 to go out of the guest virtual machine has been added to avoid ssh sessions not getting established due to DNS lookup failures by the ssh daemon. Leaving this rule out may result in the ssh client hanging unexpectedly as it tries to connect. Additional caution should be used with regard to handling timeouts related to the tracking of traffic. An ICMP ping that the user may have terminated inside the guest virtual machine may have a long timeout in the host physical machine's connection tracking system and will therefore not allow another ICMP ping to go through.
The best solution is to tune the timeout on the host physical machine with the following command: # echo 3 > /proc/sys/net/netfilter/nf_conntrack_icmp_timeout. This command sets the ICMP connection tracking timeout to 3 seconds. The effect of this is that once one ping is terminated, another one can start after 3 seconds.
If for any reason the guest virtual machine has not properly closed its TCP connection, the connection will be held open for a longer period of time, especially if the TCP timeout value was set to a large amount of time on the host physical machine. In addition, any idle connection may result in a timeout in the connection tracking system, which can be re-activated once packets are exchanged.
However, if the limit is set too low, newly initiated connections may force an idle connection into TCP backoff. Therefore, the limit of connections should be set rather high so that fluctuations in new TCP connections do not cause odd traffic behavior in relation to idle connections.

17.14.11.3. Command-line tools

virsh has been extended with life-cycle support for network filters. All commands related to the network filtering subsystem start with the prefix nwfilter. The following commands are available:
  • nwfilter-list : lists UUIDs and names of all network filters
  • nwfilter-define : defines a new network filter or updates an existing one (must supply a name)
  • nwfilter-undefine : deletes a specified network filter (must supply a name). Do not delete a network filter currently in use.
  • nwfilter-dumpxml : displays a specified network filter (must supply a name)
  • nwfilter-edit : edits a specified network filter (must supply a name)
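For example, the following commands list the defined filters, display the clean-traffic filter, and open it in an editor:
# virsh nwfilter-list
# virsh nwfilter-dumpxml clean-traffic
# virsh nwfilter-edit clean-traffic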

17.14.11.4. Pre-existing network filters

The following is a list of example network filters that are automatically installed with libvirt:

Table 17.15. Pre-existing network filters

Protocol Name Description
allow-arp Accepts all incoming and outgoing Address Resolution Protocol (ARP) traffic to a guest virtual machine.
no-arp-spoofing, no-arp-mac-spoofing, and no-arp-ip-spoofing These filters prevent a guest virtual machine from spoofing ARP traffic. In addition, they only allow ARP request and reply messages, and enforce that those packets contain:
  • no-arp-spoofing - the MAC and IP addresses of the guest
  • no-arp-mac-spoofing - the MAC address of the guest
  • no-arp-ip-spoofing - the IP address of the guest
allow-dhcp Allows a guest virtual machine to request an IP address via DHCP (from any DHCP server).
allow-dhcp-server Allows a guest virtual machine to request an IP address from a specified DHCP server. The dotted decimal IP address of the DHCP server must be provided in a reference to this filter. The name of the variable must be DHCPSERVER.
allow-ipv4 Accepts all incoming and outgoing IPv4 traffic to a virtual machine.
allow-incoming-ipv4 Accepts only incoming IPv4 traffic to a virtual machine. This filter is a part of the clean-traffic filter.
no-ip-spoofing Prevents a guest virtual machine from sending IP packets with a source IP address different from the one inside the packet. This filter is a part of the clean-traffic filter.
no-ip-multicast Prevents a guest virtual machine from sending IP multicast packets.
no-mac-broadcast Prevents outgoing IPv4 traffic to a specified MAC address. This filter is a part of the clean-traffic filter.
no-other-l2-traffic Prevents all layer 2 networking traffic except traffic specified by other filters used by the network. This filter is a part of the clean-traffic filter.
no-other-rarp-traffic, qemu-announce-self, qemu-announce-self-rarp These filters allow QEMU's self-announce Reverse Address Resolution Protocol (RARP) packets, but prevent all other RARP traffic. All of them are also included in the clean-traffic filter.
clean-traffic Prevents MAC, IP and ARP spoofing. This filter references several other filters as building blocks.
These filters are only building blocks and require a combination with other filters to provide useful network traffic filtering. The most used one in the above list is the clean-traffic filter. This filter itself can for example be combined with the no-ip-multicast filter to prevent virtual machines from sending IP multicast traffic on top of the prevention of packet spoofing.
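Such a combination can be expressed as a small custom filter that simply references both building blocks; the filter name below is only an example:
<filter name='clean-traffic-no-multicast'>
  <filterref filter='clean-traffic'/>
  <filterref filter='no-ip-multicast'/>
</filter>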

17.14.11.5. Writing your own filters

Since libvirt only provides a couple of example networking filters, you may consider writing your own. When planning to do so, there are a couple of things you need to know about the network filtering subsystem and how it works internally. You also have to know and understand the protocols that you want to filter on very well, so that no traffic beyond what you want can pass and that the traffic you do want to allow does in fact pass.
The network filtering subsystem is currently only available on Linux host physical machines and only works for QEMU and KVM types of virtual machines. On Linux, it builds upon the support for ebtables, iptables and ip6tables and makes use of their features. Considering the list found in Section 17.14.10, “Supported Protocols”, the following protocols can be implemented using ebtables:
  • mac
  • stp (spanning tree protocol)
  • vlan (802.1Q)
  • arp, rarp
  • ipv4
  • ipv6
Any protocol that runs over IPv4 is supported using iptables; those over IPv6 are implemented using ip6tables.
On a Linux host physical machine, all traffic filtering rules created by libvirt's network filtering subsystem first pass through the filtering support implemented by ebtables and only afterwards through iptables or ip6tables filters. If a filter tree has rules with the protocols mac, stp, vlan, arp, rarp, ipv4, or ipv6, the ebtables rules and values listed will automatically be used first.
Multiple chains for the same protocol can be created. The name of the chain must have a prefix of one of the previously enumerated protocols. To create an additional chain for handling ARP traffic, a chain with the name arp-test can, for example, be specified.
As an example, it is possible to filter on UDP traffic by source and destination ports using the ip protocol filter and specifying attributes for the protocol, source and destination IP addresses and ports of UDP packets that are to be accepted. This allows early filtering of UDP traffic with ebtables. However, once an IP or IPv6 packet, such as a UDP packet, has passed the ebtables layer and there is at least one rule in the filter tree that instantiates iptables or ip6tables rules, a rule to let the UDP packet pass must also be provided for those filtering layers. This can be achieved with a rule containing an appropriate udp or udp-ipv6 traffic filtering node.
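A minimal sketch of this pattern, with purely illustrative addresses and ports, combines an ip rule for the ebtables layer with a matching udp rule for the iptables layer:
  <!-- early filtering of UDP traffic with ebtables -->
  <rule action='accept' direction='in' priority='400'>
    <ip protocol='udp' srcipaddr='10.1.2.3' dstportstart='514'/>
  </rule>
  <!-- matching rule so the same packets also pass the iptables layer -->
  <rule action='accept' direction='in' priority='500'>
    <udp srcipaddr='10.1.2.3' dstportstart='514'/>
  </rule>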

Example 17.11. Creating a custom filter

Suppose a filter is needed to fulfill the following list of requirements:
  • prevents a VM's interface from MAC, IP and ARP spoofing
  • opens only TCP ports 22 and 80 of a VM's interface
  • allows the VM to send ping traffic from an interface but not let the VM be pinged on the interface
  • allows the VM to do DNS lookups (UDP towards port 53)
The requirement to prevent spoofing is fulfilled by the existing clean-traffic network filter, thus the way to do this is to reference it from a custom filter.
To enable traffic for TCP ports 22 and 80, two rules are added to enable this type of traffic. To allow the guest virtual machine to send ping traffic, a rule is added for ICMP traffic. For simplicity reasons, general ICMP traffic will be allowed to be initiated from the guest virtual machine, and will not be restricted to ICMP echo request and response messages. All other traffic will be prevented from reaching or being initiated by the guest virtual machine. To do this, a rule will be added that drops all other traffic. Assuming the guest virtual machine is called test and the interface to associate our filter with is called eth0, a filter is created named test-eth0.
The result of these considerations is the following network filter XML:
<filter name='test-eth0'>
  <!-- This rule references the clean traffic filter to prevent MAC, IP and ARP spoofing. By not providing an IP address parameter, libvirt will detect the IP address the guest virtual machine is using. -->
  <filterref filter='clean-traffic'/>

  <!-- This rule enables TCP ports 22 (ssh) and 80 (http) to be reachable -->
  <rule action='accept' direction='in'>
    <tcp dstportstart='22'/>
  </rule>

  <rule action='accept' direction='in'>
    <tcp dstportstart='80'/>
  </rule>

  <!-- This rule enables general ICMP traffic to be initiated by the guest virtual machine including ping traffic -->
  <rule action='accept' direction='out'>
    <icmp/>
  </rule>

  <!-- This rule enables outgoing DNS lookups using UDP -->
  <rule action='accept' direction='out'>
    <udp dstportstart='53'/>
  </rule>

  <!-- This rule drops all other traffic -->
  <rule action='drop' direction='inout'>
    <all/>
  </rule>

</filter>

17.14.11.6. Sample custom filter

Although one of the rules in the above XML contains the IP address of the guest virtual machine as either a source or a destination address, the filtering of the traffic works correctly. The reason is that the evaluation of the rules occurs internally on a per-interface basis: the rules are evaluated based on which (tap) interface has sent or will receive the packet, rather than on their source or destination IP address.

Example 17.12. Sample XML for network interface descriptions

An XML fragment for a possible network interface description inside the domain XML of the test guest virtual machine could then look like this:
   [...]
    <interface type='bridge'>
      <source bridge='mybridge'/>
      <filterref filter='test-eth0'/>
    </interface>
   [...]
To more strictly control the ICMP traffic and enforce that only ICMP echo requests can be sent from the guest virtual machine and only ICMP echo responses be received by the guest virtual machine, the above ICMP rule can be replaced with the following two rules:
  <!-- enable outgoing ICMP echo requests -->
  <rule action='accept' direction='out'>
    <icmp type='8'/>
  </rule>
  <!-- enable incoming ICMP echo replies -->
  <rule action='accept' direction='in'>
    <icmp type='0'/>
  </rule>

Example 17.13. Second example custom filter

This example demonstrates how to build a filter similar to the one in the example above, but extends the list of requirements with an FTP server located inside the guest virtual machine. The requirements for this filter are:
  • prevents a guest virtual machine's interface from MAC, IP, and ARP spoofing
  • opens only TCP ports 22 and 80 in a guest virtual machine's interface
  • allows the guest virtual machine to send ping traffic from an interface but does not allow the guest virtual machine to be pinged on the interface
  • allows the guest virtual machine to do DNS lookups (UDP towards port 53)
  • enables the ftp server (in active mode) so it can run inside the guest virtual machine
The additional requirement of allowing an FTP server to be run inside the guest virtual machine maps into the requirement of allowing port 21 to be reachable for FTP control traffic, as well as enabling the guest virtual machine to establish an outgoing TCP connection originating from the guest virtual machine's TCP port 20 back to the FTP client (FTP active mode). There are several ways in which this filter can be written; two possible solutions are included in this example.
The first solution makes use of the state attribute of the TCP protocol that provides a hook into the connection tracking framework of the Linux host physical machine. For the guest virtual machine-initiated FTP data connection (FTP active mode) the RELATED state is used to enable detection that the guest virtual machine-initiated FTP data connection is a consequence of ( or 'has a relationship with' ) an existing FTP control connection, thereby allowing it to pass packets through the firewall. The RELATED state, however, is only valid for the very first packet of the outgoing TCP connection for the FTP data path. Afterwards, the state is ESTABLISHED, which then applies equally to the incoming and outgoing direction. All this is related to the FTP data traffic originating from TCP port 20 of the guest virtual machine. This then leads to the following solution:
<filter name='test-eth0'>
  <!-- This filter (eth0) references the clean traffic filter to prevent MAC, IP, and ARP spoofing. By not providing an IP address parameter, libvirt will detect the IP address the guest virtual machine is using. -->
  <filterref filter='clean-traffic'/>

  <!-- This rule enables TCP port 21 (FTP-control) to be reachable -->
  <rule action='accept' direction='in'>
    <tcp dstportstart='21'/>
  </rule>

  <!-- This rule enables TCP port 20 for guest virtual machine-initiated FTP data connection related to an existing FTP control connection -->
  <rule action='accept' direction='out'>
    <tcp srcportstart='20' state='RELATED,ESTABLISHED'/>
  </rule>

  <!-- This rule accepts all packets from a client on the FTP data connection -->
  <rule action='accept' direction='in'>
    <tcp dstportstart='20' state='ESTABLISHED'/>
  </rule>

  <!-- This rule enables TCP port 22 (SSH) to be reachable -->
  <rule action='accept' direction='in'>
    <tcp dstportstart='22'/>
  </rule>

  <!-- This rule enables TCP port 80 (HTTP) to be reachable -->
  <rule action='accept' direction='in'>
    <tcp dstportstart='80'/>
  </rule>

  <!-- This rule enables general ICMP traffic to be initiated by the guest virtual machine, including ping traffic -->
  <rule action='accept' direction='out'>
    <icmp/>
  </rule>

  <!-- This rule enables outgoing DNS lookups using UDP -->
  <rule action='accept' direction='out'>
    <udp dstportstart='53'/>
  </rule>

  <!-- This rule drops all other traffic -->
  <rule action='drop' direction='inout'>
    <all/>
  </rule>

</filter>
Before trying out a filter using the RELATED state, you have to make sure that the appropriate connection tracking module has been loaded into the host physical machine's kernel. Depending on the version of the kernel, you must run either one of the following two commands before the FTP connection with the guest virtual machine is established:
  • # modprobe nf_conntrack_ftp - where available OR
  • # modprobe ip_conntrack_ftp if above is not available
If protocols other than FTP are used in conjunction with the RELATED state, their corresponding module must be loaded. Modules are available for the protocols: ftp, tftp, irc, sip, sctp, and amanda.
The second solution makes use of the state flags of connections more than the previous solution did. This solution takes advantage of the fact that the NEW state of a connection is valid when the very first packet of a traffic flow is detected. Subsequently, if the very first packet of a flow is accepted, the flow becomes a connection and thus enters the ESTABLISHED state. Therefore, a general rule can be written for allowing packets of ESTABLISHED connections to reach the guest virtual machine or be sent by the guest virtual machine. This is done by writing specific rules for the very first packets, identified by the NEW state, which dictate the ports on which data is acceptable. All packets meant for ports that are not explicitly accepted are dropped, thus not reaching the ESTABLISHED state. Any subsequent packets sent from such a port are dropped as well.
<filter name='test-eth0'>
  <!-- This filter references the clean traffic filter to prevent MAC, IP and ARP spoofing. By not providing an IP address parameter, libvirt will detect the IP address the VM is using. -->
  <filterref filter='clean-traffic'/>

  <!-- This rule allows the packets of all previously accepted connections to reach the guest virtual machine -->
  <rule action='accept' direction='in'>
    <all state='ESTABLISHED'/>
  </rule>

  <!-- This rule allows the packets of all previously accepted and related connections to be sent from the guest virtual machine -->
  <rule action='accept' direction='out'>
    <all state='ESTABLISHED,RELATED'/>
  </rule>

  <!-- This rule enables traffic towards port 21 (FTP) and port 22 (SSH) -->
  <rule action='accept' direction='in'>
    <tcp dstportstart='21' dstportend='22' state='NEW'/>
  </rule>

  <!-- This rule enables traffic towards port 80 (HTTP) -->
  <rule action='accept' direction='in'>
    <tcp dstportstart='80' state='NEW'/>
  </rule>

  <!-- This rule enables general ICMP traffic to be initiated by the guest virtual machine, including ping traffic -->
  <rule action='accept' direction='out'>
    <icmp state='NEW'/>
  </rule>

  <!-- This rule enables outgoing DNS lookups using UDP -->
  <rule action='accept' direction='out'>
    <udp dstportstart='53' state='NEW'/>
  </rule>

  <!-- This rule drops all other traffic -->
  <rule action='drop' direction='inout'>
    <all/>
  </rule>

</filter>

17.14.12. Limitations

The following is a list of the currently known limitations of the network filtering subsystem.
  • VM migration is only supported if the whole filter tree that is referenced by a guest virtual machine's top-level filter is also available on the target host physical machine. The network filter clean-traffic, for example, should be available on all libvirt installations and thus enable migration of guest virtual machines that reference this filter. To ensure version compatibility is not a problem, make sure you are using the most current version of libvirt by updating the package regularly.
  • Migration must occur between libvirt installations of version 0.8.1 or later in order not to lose the network traffic filters associated with an interface.
  • VLAN (802.1Q) packets, if sent by a guest virtual machine, cannot be filtered with rules for protocol IDs arp, rarp, ipv4 and ipv6. They can only be filtered with the protocol IDs mac and vlan. Therefore, the example filter clean-traffic shown in Example 17.1, “An example of network filtering” will not work as expected.

17.15. Creating Tunnels

This section will demonstrate how to implement different tunneling scenarios.

17.15.1. Creating Multicast Tunnels

A multicast group is set up to represent a virtual network. Any guest virtual machines whose network devices are in the same multicast group can talk to each other, even across host physical machines. This mode is also available to unprivileged users. There is no default DNS or DHCP support and no outgoing network access. To provide outgoing network access, one of the guest virtual machines should have a second NIC which is connected to one of the first four network types, thus providing appropriate routing. The multicast protocol is compatible with the guest virtual machine user mode. Note that the source address that you provide must be from the multicast address block.
To create a multicast tunnel place the following XML details into the <devices> element:
      
  ...
  <devices>
    <interface type='mcast'>
      <mac address='52:54:00:6d:90:01'/>
      <source address='230.0.0.1' port='5558'/>
    </interface>
  </devices>
  ...


Figure 17.28. Multicast tunnel domain XML example

17.15.2. Creating TCP Tunnels

A TCP client-server architecture provides a virtual network. In this configuration, one guest virtual machine provides the server end of the network while all other guest virtual machines are configured as clients. All network traffic is routed between the guest virtual machine clients via the guest virtual machine server. This mode is also available for unprivileged users. Note that this mode does not provide default DNS or DHCP support and it does not provide outgoing network access. To provide outgoing network access, one of the guest virtual machines should have a second NIC which is connected to one of the first four network types thus providing appropriate routing.
To create a TCP tunnel place the following XML details into the <devices> element:
      
       ...
  <devices>
    <interface type='server'>
      <mac address='52:54:00:22:c9:42'/>
      <source address='192.168.0.1' port='5558'/>
    </interface>
    ...
    <interface type='client'>
      <mac address='52:54:00:8b:c9:51'/>
      <source address='192.168.0.1' port='5558'/>
    </interface>
  </devices>
  ...
      

Figure 17.29. TCP tunnel domain XML example

17.16. Setting vLAN Tags

Virtual local area network (vLAN) tags are added using the virsh net-edit command. vLAN tags can also be used with PCI device assignment of SR-IOV devices. For more information, see Section 16.2.3, “Configuring PCI Assignment with SR-IOV Devices”.
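For example, to add or change the tags shown in the following figure for the network named ovs-net, edit its definition directly:
# virsh net-edit ovs-net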

<network>
  <name>ovs-net</name>
  <forward mode='bridge'/>
  <bridge name='ovsbr0'/>
  <virtualport type='openvswitch'>
    <parameters interfaceid='09b11c53-8b5c-4eeb-8f00-d84eaa0aaa4f'/>
  </virtualport>
  <vlan trunk='yes'>
    <tag id='42' nativeMode='untagged'/>
    <tag id='47'/>
  </vlan>
  <portgroup name='dontpanic'>
    <vlan>
      <tag id='42'/>
    </vlan>
  </portgroup>
</network>

Figure 17.30. Setting VLAN tag (on supported network types only)

If (and only if) the network type supports VLAN tagging transparent to the guest, an optional <vlan> element can specify one or more VLAN tags to apply to the traffic of all guests using this network. (openvswitch and type='hostdev' SR-IOV networks do support transparent VLAN tagging of guest traffic; everything else, including standard Linux bridges and libvirt's own virtual networks, does not support it. 802.1Qbh (vn-link) and 802.1Qbg (VEPA) switches provide their own way, outside of libvirt, to tag guest traffic onto specific VLANs.) As expected, the tag attribute specifies which VLAN tag to use. If a network's <vlan> element has more than one <tag> subelement defined, it is assumed that the user wants to do VLAN trunking using all the specified tags. If VLAN trunking with a single tag is required, the optional attribute trunk='yes' can be added to the <vlan> element.
For network connections using openvswitch, it is possible to configure the 'native-tagged' and 'native-untagged' VLAN modes. This uses the optional nativeMode attribute on the <tag> element: nativeMode may be set to 'tagged' or 'untagged'. The id attribute of that <tag> element sets the native VLAN.
<vlan> elements can also be specified in a <portgroup> element, as well as directly in a domain's <interface> element. If a VLAN tag is specified in multiple locations, the setting in <interface> takes precedence, followed by the setting in the <portgroup> selected by the interface configuration. The <vlan> in <network> is used only if none is given in <portgroup> or <interface>.
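As a brief sketch of how this fits together with virsh: assuming the network definition from Figure 17.30 is saved in a file named ovs-net.xml (the file name is only an example), the network can be defined, started, and inspected as follows; virsh net-edit ovs-net can then be used to adjust the tags later.
# virsh net-define ovs-net.xml
# virsh net-start ovs-net
# virsh net-dumpxml ovs-net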

17.17. Applying QoS to Your Virtual Network

Quality of Service (QoS) refers to the resource control systems that guarantee an optimal experience for all users on a network by minimizing delay, jitter, and packet loss. QoS can be application-specific or user/group-specific. See Section 23.18.9.14, “Quality of service (QoS)” for more information.

Chapter 18. Remote Management of Guests

This section explains how to remotely manage your guests.

18.1. Transport Modes

For remote management, libvirt supports the following transport modes:
Transport Layer Security (TLS)

Transport Layer Security TLS 1.0 (SSL 3.1) authenticated and encrypted TCP/IP socket, usually listening on a public port number. To use this, you will need to generate client and server certificates. The standard port is 16514. For detailed instructions, see Section 18.3, “Remote Management over TLS and SSL”.

SSH

Transported over a Secure Shell protocol (SSH) connection. The libvirt daemon (libvirtd) must be running on the remote machine. Port 22 must be open for SSH access. You should use some sort of SSH key management (for example, the ssh-agent utility) or you will be prompted for a password. For detailed instructions, see Section 18.2, “Remote Management with SSH”.

UNIX Sockets

UNIX domain sockets are only accessible on the local machine. Sockets are not encrypted, and use UNIX permissions or SELinux for authentication. The standard socket names are /var/run/libvirt/libvirt-sock and /var/run/libvirt/libvirt-sock-ro (for read-only connections).

ext

The ext parameter is used for any external program which can make a connection to the remote machine by means outside the scope of libvirt. This parameter is unsupported.

TCP

Unencrypted TCP/IP socket. Not recommended for production use, this is normally disabled, but an administrator can enable it for testing or use over a trusted network. The default port is 16509.

The default transport, if no other is specified, is TLS.
Remote URIs

A Uniform Resource Identifier (URI) is used by virsh and libvirt to connect to a remote host. URIs can also be used with the --connect parameter for the virsh command to execute single commands or migrations on remote hosts. Remote URIs are formed by taking ordinary local URIs and adding a host name or a transport name, or both. As a special case, using a URI scheme of 'remote' will tell the remote libvirtd server to probe for the optimal hypervisor driver. This is equivalent to passing a NULL URI for a local connection.

libvirt URIs take the general form (content in square brackets, "[]", represents optional functions):
driver[+transport]://[username@][hostname][:port]/path[?extraparameters]
Note that if the hypervisor (driver) is QEMU, the path is mandatory.
The following are examples of valid remote URIs:
  • qemu://hostname/
The transport method or the host name must be provided to target an external location. For more information, see the libvirt upstream documentation.

Examples of remote management parameters

  • Connect to a remote KVM host named host2, using SSH transport and the SSH user name virtuser. The connect command for each is connect [URI] [--readonly]. For more information about the virsh connect command, see Section 20.4, “Connecting to the Hypervisor with virsh Connect”
    qemu+ssh://virtuser@host2/
  • Connect to a remote KVM hypervisor on the host named host2 using TLS.
    qemu://host2/

Testing examples

  • Connect to the local KVM hypervisor with a non-standard UNIX socket. The full path to the UNIX socket is supplied explicitly in this case.
    qemu+unix:///system?socket=/opt/libvirt/run/libvirt/libvirt-sock
  • Connect to the libvirt daemon with an unencrypted TCP/IP connection to the server with the IP address 10.1.1.10 on port 5000. This uses the test driver with default settings.
    test+tcp://10.1.1.10:5000/default
Extra URI Parameters

Extra parameters can be appended to remote URIs. The table below covers the recognized parameters. All other parameters are ignored. Note that parameter values must be URI-escaped (that is, a question mark (?) is appended before the parameter and special characters are converted into the URI format).

Table 18.1. Extra URI parameters

  • name (all transport modes) - The name passed to the remote virConnectOpen function. The name is normally formed by removing the transport, host name, port number, user name, and extra parameters from the remote URI, but in certain very complex cases it may be better to supply the name explicitly. Example usage: name=qemu:///system
  • command (ssh and ext transports) - The external command. For ext transport this is required. For ssh the default is ssh. The PATH is searched for the command. Example usage: command=/opt/openssh/bin/ssh
  • socket (unix and ssh transports) - The path to the UNIX domain socket, which overrides the default. For ssh transport, this is passed to the remote netcat command (see netcat). Example usage: socket=/opt/libvirt/run/libvirt/libvirt-sock
  • no_verify (tls transport) - If set to a non-zero value, this disables client checks of the server's certificate. Note that to disable server checks of the client's certificate or IP address you must change the libvirtd configuration. Example usage: no_verify=1
  • no_tty (ssh transport) - If set to a non-zero value, this stops ssh from asking for a password if it cannot log in to the remote machine automatically. Use this when you do not have access to a terminal. Example usage: no_tty=1
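For example, several of these parameters can be combined in a single remote URI; the host name, user name, and socket path below are only placeholders. Note that the first parameter is introduced with a question mark and any subsequent parameters are joined with ampersands:
# virsh -c 'qemu+ssh://virtuser@host2/system?socket=/opt/libvirt/run/libvirt/libvirt-sock&no_tty=1' list --all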

18.2. Remote Management with SSH

The ssh package provides an encrypted network protocol that can securely send management functions to remote virtualization servers. The method described below uses the libvirt management connection, securely tunneled over an SSH connection, to manage the remote machines. All the authentication is done using SSH public key cryptography and passwords or passphrases gathered by your local SSH agent. In addition, the VNC console for each guest is tunneled over SSH.
When using SSH for remotely managing your virtual machines, be aware of the following problems:
  • You require root login access to the remote machine for managing virtual machines.
  • The initial connection setup process may be slow.
  • There is no standard or trivial way to revoke a user's key on all hosts or guests.
  • SSH does not scale well with larger numbers of remote machines.

Note

Red Hat Virtualization enables remote management of large numbers of virtual machines. For further details, see the Red Hat Virtualization documentation.
The following packages are required for SSH access:
  • openssh
  • openssh-askpass
  • openssh-clients
  • openssh-server
Configuring Password-less or Password-managed SSH Access for virt-manager

The following instructions assume you are starting from scratch and do not already have SSH keys set up. If you have SSH keys set up and copied to the other systems, you can skip this procedure.

Important

SSH keys are user-dependent and may only be used by their owners. A key's owner is the user who generated it. Keys may not be shared across different users.
virt-manager must be run by the user who owns the keys to connect to the remote host. That means, if the remote systems are managed by a non-root user, virt-manager must be run in unprivileged mode. If the remote systems are managed by the local root user, then the SSH keys must be owned and created by root.
You cannot manage the local host as an unprivileged user with virt-manager.
  1. Optional: Changing user

    Change user, if required. This example uses the local root user for remotely managing the other hosts and the local host.
    $ su -
  2. Generating the SSH key pair

    Generate an SSH key pair on the machine where virt-manager is used. This example uses the default key location, in the ~/.ssh/ directory.
    # ssh-keygen -t rsa
  3. Copying the keys to the remote hosts

    Remote login without a password, or with a pass-phrase, requires an SSH key to be distributed to the systems being managed. Use the ssh-copy-id command to copy the key to the root user at the system address provided (in the example, root@host2.example.com).
    # ssh-copy-id -i ~/.ssh/id_rsa.pub root@host2.example.com
    root@host2.example.com's password:
    
    Afterwards, try logging into the machine and check the .ssh/authorized_keys file to make sure unexpected keys have not been added:
    ssh root@host2.example.com
    Repeat for other systems, as required.
  4. Optional: Add the passphrase to the ssh-agent

    Add the pass-phrase for the SSH key to the ssh-agent, if required. On the local host, use the following command to add the pass-phrase (if there was one) to enable password-less login.
    # ssh-add ~/.ssh/id_rsa
    This command will fail to run if the ssh-agent is not running; a minimal way to start the agent is sketched below. To avoid errors or conflicts, make sure that your SSH parameters are set correctly. See the Red Hat Enterprise System Administration Guide for more information.
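The following is a minimal sketch of starting an SSH agent in the current shell and adding the key generated earlier, assuming the default key location:
# eval "$(ssh-agent -s)"
# ssh-add ~/.ssh/id_rsa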
The libvirt daemon (libvirtd)

The libvirt daemon provides an interface for managing virtual machines. You must have the libvirtd daemon installed and running on every remote host that you intend to manage this way.

$ ssh root@somehost
# systemctl enable libvirtd.service
# systemctl start libvirtd.service
After libvirtd and SSH are configured, you should be able to remotely access and manage your virtual machines. You should also be able to access your guests with VNC at this point.
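As a quick check, you can, for example, list the guests on a remote host over the SSH transport (the host name somehost is a placeholder):
# virsh -c qemu+ssh://root@somehost/system list --all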
Accessing Remote Hosts with virt-manager

Remote hosts can be managed with the virt-manager GUI tool. SSH keys must belong to the user executing virt-manager for password-less login to work.

  1. Start virt-manager.
  2. Open the File menu and select Add Connection.
    Add connection menu

    Figure 18.1. Add connection menu

  3. Use the drop-down menu to select the hypervisor type, and select the Connect to remote host check box to open the Connection Method options (in this case, Remote tunnel over SSH). Enter the User name and Hostname, then click Connect.

18.3. Remote Management over TLS and SSL

You can manage virtual machines using the TLS and SSL protocols. TLS and SSL provide greater scalability, but are more complicated than SSH (see Section 18.2, “Remote Management with SSH”). TLS and SSL are the same technology used by web browsers for secure connections. The libvirt management connection opens a TCP port for incoming connections, which is securely encrypted and authenticated based on x509 certificates. The following procedures provide instructions on creating and deploying authentication certificates for TLS and SSL management.

Procedure 18.1. Creating a certificate authority (CA) key for TLS management

  1. Before you begin, confirm that gnutls-utils is installed. If not, install it:
    # yum install gnutls-utils 
  2. Generate a private key, using the following command:
    # certtool --generate-privkey > cakey.pem
  3. After the key is generated, create a signature file so the key can be self-signed. To do this, create a file with signature details and name it ca.info. This file should contain the following:
    cn = Name of your organization
    ca
    cert_signing_key
    
  4. Generate the self-signed key with the following command:
    # certtool --generate-self-signed --load-privkey cakey.pem --template ca.info --outfile cacert.pem
    After the file is generated, the ca.info file can be deleted using the rm command. The file that results from the generation process is named cacert.pem. This file is the public key (certificate). The loaded file cakey.pem is the private key. For security purposes, this file should be kept private, and not reside in a shared space.
  5. Install the cacert.pem CA certificate file on all clients and servers as /etc/pki/CA/cacert.pem to let them know that the certificate issued by your CA can be trusted. To view the contents of this file, run:
    # certtool -i --infile cacert.pem
    This is all that is required to set up your CA. Keep the CA's private key safe, as you will need it in order to issue certificates for your clients and servers.

Procedure 18.2. Issuing a server certificate

This procedure demonstrates how to issue a certificate with the X.509 Common Name (CN) field set to the host name of the server. The CN must match the host name which clients will be using to connect to the server. In this example, clients will be connecting to the server using the URI qemu://mycommonname/system, so the CN field should be set to the same value, in this example "mycommonname".
  1. Create a private key for the server.
    # certtool --generate-privkey > serverkey.pem
  2. Generate a signature for the CA's private key by first creating a template file called server.info. Make sure that the CN is set to be the same as the server's host name:
    organization = Name of your organization
    cn = mycommonname
    tls_www_server
    encryption_key
    signing_key
    
  3. Create the certificate:
    # certtool --generate-certificate --load-privkey serverkey.pem --load-ca-certificate cacert.pem --load-ca-privkey cakey.pem \
      --template server.info --outfile servercert.pem
    This results in two files being generated:
    • serverkey.pem - The server's private key
    • servercert.pem - The server's public key
  4. Make sure to keep the location of the private key secret. To view the contents of the file, use the following command:
    # certtool -i --infile servercert.pem
    When opening this file, the CN= parameter should be the same as the CN that you set earlier. For example, mycommonname.
  5. Install the two files in the following locations:
    • serverkey.pem - the server's private key. Place this file in the following location: /etc/pki/libvirt/private/serverkey.pem
    • servercert.pem - the server's certificate. Install it in the following location on the server: /etc/pki/libvirt/servercert.pem

Procedure 18.3. Issuing a client certificate

  1. For every client (that is to say any program linked with libvirt, such as virt-manager), you need to issue a certificate with the X.509 Distinguished Name (DN) field set to a suitable name. This needs to be decided on a corporate level.
    For example purposes, the following information will be used:
    C=USA,ST=North Carolina,L=Raleigh,O=Red Hat,CN=name_of_client
  2. Create a private key:
    # certtool --generate-privkey > clientkey.pem
  3. Generate a signature for the CA's private key by first creating a template file called client.info. The file should contain the following (fields should be customized to reflect your region/location):
    country = USA
    state = North Carolina
    locality = Raleigh
    organization = Red Hat
    cn = client1
    tls_www_client
    encryption_key
    signing_key
    
  4. Sign the certificate with the following command:
    # certtool --generate-certificate --load-privkey clientkey.pem --load-ca-certificate cacert.pem \
      --load-ca-privkey cakey.pem --template client.info --outfile clientcert.pem
  5. Install the certificates on the client machine:
    # cp clientkey.pem /etc/pki/libvirt/private/clientkey.pem
    # cp clientcert.pem /etc/pki/libvirt/clientcert.pem
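With the CA, server, and client certificates installed in the locations above, you can, for example, verify the TLS connection from the client machine; mycommonname is the server host name used in the earlier procedure:
# virsh -c qemu://mycommonname/system hostname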

18.4. Configuring a VNC Server

To set up graphical desktop sharing between the host and the guest machine using Virtual Network Computing (VNC), a VNC server has to be configured on the guest you wish to connect to. To do this, VNC has to be specified as a graphics type in the devices element of the guest's XML file. For further information, see Section 23.18.12, “Graphical Framebuffers”.
To connect to a VNC server, use the virt-viewer utility or the virt-manager interface.
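For example, assuming a guest named guest1 with a VNC graphics device configured, its console can be opened locally or over SSH with virt-viewer (the guest name and host name are placeholders):
# virt-viewer guest1
# virt-viewer --connect qemu+ssh://root@somehost/system guest1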

18.5. Enhancing Remote Management of Virtual Machines with NSS

In Red Hat Enterprise Linux 7.3 and later, you can use the libvirt Name Service Switch (NSS) module to make it easier to connect to guests with SSH, TLS, SSL, as well as other remote login services. In addition, the module also benefits utilities that use host name translation, such as ping.
To be able to use this functionality, install the libvirt-nss package:
# yum install libvirt-nss

Note

If installing libvirt-nss fails, make sure that the Optional repository for Red Hat Enterprise Linux is enabled. For instructions, see the System Administrator's Guide.
Finally, enable the module by adding libvirt to the hosts line of the /etc/nsswitch.conf file, for example as follows:
passwd:      compat
shadow:      compat
group:       compat
hosts:       files libvirt dns
...
The order in which modules are listed on the hosts line determines the order in which these modules are consulted to find the specified remote guest. As a result, libvirt's NSS module is added to the modules that translate host domain names to IP addresses. For example, this makes it possible to connect to a remote guest in NAT mode using only the guest's hostname value, without setting a static IP address:
# ssh root@guest-hostname
root@guest-hostname's password:
Last login: Thu Aug 10 09:12:31 2017 from 192.168.122.1
[root@guest1-rhel7 ~]#

Note

The guest's hostname may differ from the guest name displayed, for example, by virsh list. To display or configure the hostname on the guest, use the hostnamectl command.
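For example, to check and set the hostname from inside the guest (the value shown is a placeholder):
# hostnamectl status
# hostnamectl set-hostname guest-hostname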

Chapter 19. Managing Guests with the Virtual Machine Manager (virt-manager)

This chapter describes the Virtual Machine Manager (virt-manager) windows, dialog boxes, and various GUI controls.
virt-manager provides a graphical view of hypervisors and guests on your host system and on remote host systems. virt-manager can perform virtualization management tasks, including:
  • defining and creating guests,
  • assigning memory,
  • assigning virtual CPUs,
  • monitoring operational performance,
  • saving and restoring, pausing and resuming, and shutting down and starting guests,
  • links to the textual and graphical consoles, and
  • live and offline migrations.

Important

It is important to note which user you are using. If you create a guest virtual machine with one user, you will not be able to retrieve information about it using another user. This is especially important when you create a virtual machine in virt-manager. The default user is root in that case unless otherwise specified. Should you have a case where you cannot list the virtual machine using the virsh list --all command, it is most likely due to you running the command using a different user than you used to create the virtual machine.

19.1. Starting virt-manager

To start a virt-manager session, open the Applications menu, then the System Tools menu, and select Virtual Machine Manager (virt-manager).
The virt-manager main window appears.
Starting virt-manager

Figure 19.1. Starting virt-manager

Alternatively, virt-manager can be started remotely using ssh as demonstrated in the following command:
# ssh -X host's address
[remotehost]# virt-manager
Using ssh to manage virtual machines and hosts is discussed further in Section 18.2, “Remote Management with SSH”.

19.2. The Virtual Machine Manager Main Window

This main window displays all the running guests and resources used by guests. Select a guest by double-clicking the guest's name.
Virtual Machine Manager main window

Figure 19.2. Virtual Machine Manager main window

19.3. The Virtual Hardware Details Window

The virtual hardware details window displays information about the virtual hardware configured for the guest. Virtual hardware resources can be added, removed and modified in this window. To access the virtual hardware details window, click the icon in the toolbar.
The virtual hardware details icon

Figure 19.3. The virtual hardware details icon

Clicking the icon displays the virtual hardware details window.
The virtual hardware details window

Figure 19.4. The virtual hardware details window

19.3.1. Applying Boot Options to Guest Virtual Machines

Using virt-manager, you can select how the guest virtual machine will act on boot. The boot options do not take effect until the guest virtual machine reboots. You can either power down the virtual machine before making any changes, or reboot the machine afterwards. If you do neither, the changes are applied the next time the guest reboots.

Procedure 19.1. Configuring boot options

  1. From the Virtual Machine Manager Edit menu, select Virtual Machine Details.
  2. From the side panel, select Boot Options and then complete any or all of the following optional steps:
    1. To indicate that this guest virtual machine should start each time the host physical machine boots, select the Autostart check box.
    2. To specify the order in which the guest virtual machine's devices are used for booting, select the Enable boot menu check box. After this is checked, you can select the devices you want to boot from and, using the arrow keys, change the order that the guest virtual machine will use when booting.
    3. If you want to boot directly from the Linux kernel, expand the Direct kernel boot menu. Fill in the Kernel path, Initrd path, and the Kernel arguments that you want to use.
  3. Click Apply.
    Configuring boot options

    Figure 19.5. Configuring boot options

19.3.2. Attaching USB Devices to a Guest Virtual Machine

Note

In order to attach the USB device to the guest virtual machine, you first must attach it to the host physical machine and confirm that the device is working. If the guest is running, you need to shut it down before proceeding.

Procedure 19.2. Attaching USB devices using Virt-Manager

  1. Open the guest virtual machine's Virtual Machine Details screen.
  2. Click Add Hardware
  3. In the Add New Virtual Hardware popup, select USB Host Device, select the device you want to attach from the list, and click Finish.
    Add USB Device

    Figure 19.6. Add USB Device

  4. To use the USB device in the guest virtual machine, start the guest virtual machine.

19.3.3. USB Redirection

USB redirection is best used in cases where there is a host physical machine that is running in a data center. The user connects to their guest virtual machine from a local machine or thin client that runs a SPICE client. The user can attach any USB device to the thin client, and the SPICE client redirects the device to the host physical machine in the data center, so it can be used by the VM that the user accesses from the thin client.

Procedure 19.3. Redirecting USB devices

  1. Open the guest virtual machine's Virtual Machine Details screen.
  2. Click Add Hardware
  3. In the Add New Virtual Hardware popup, select USB Redirection. Make sure to select Spice channel from the Type drop-down menu and click Finish.
    Add New Virtual Hardware window

    Figure 19.7. Add New Virtual Hardware window

  4. Open the Virtual Machine menu and select Redirect USB device. A pop-up window opens with a list of USB devices.
    Select a USB device

    Figure 19.8. Select a USB device

  5. Select a USB device for redirection by checking its check box and click OK.

19.4. Virtual Machine Graphical Console

This window displays a guest's graphical console. Guests can use several different protocols to export their graphical frame buffers: virt-manager supports VNC and SPICE. If your virtual machine is set to require authentication, the Virtual Machine graphical console prompts you for a password before the display appears.
Graphical console window

Figure 19.9. Graphical console window

Note

VNC is considered insecure by many security experts; however, several changes have been made to enable the secure usage of VNC for virtualization on Red Hat Enterprise Linux. The guest machines only listen to the local host's loopback address (127.0.0.1). This ensures only those with shell privileges on the host can access virt-manager and the virtual machine through VNC. Although virt-manager can be configured to listen on other public network interfaces and alternative methods can be configured, it is not recommended.
Remote administration can be performed by tunneling over SSH, which encrypts the traffic; a minimal tunnel sketch follows this note. Although VNC can be configured for remote access without tunneling over SSH, for security reasons it is not recommended. To remotely administer the guest, follow the instructions in Chapter 18, Remote Management of Guests. TLS can provide enterprise level security for managing guest and host systems.
Your local desktop can intercept key combinations (for example, Ctrl+Alt+F1) to prevent them from being sent to the guest machine. You can use the Send key menu option to send these sequences. From the guest machine window, click the Send key menu and select the key sequence to send. In addition, from this menu you can also capture the screen output.
SPICE is an alternative to VNC available for Red Hat Enterprise Linux.
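The following is a minimal sketch of such an SSH tunnel, assuming the guest uses the first VNC display on the host (TCP port 5900) and that host.example.com is a placeholder for the host's address; any VNC client (vncviewer is used here as an example) then connects to the local end of the tunnel:
$ ssh -L 5900:127.0.0.1:5900 root@host.example.com
$ vncviewer 127.0.0.1:5900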

19.5. Adding a Remote Connection

This procedure covers how to set up a connection to a remote system using virt-manager.
  1. To create a new connection open the File menu and select the Add Connection menu item.
  2. The Add Connection wizard appears. Select the hypervisor. For Red Hat Enterprise Linux 7 systems, select QEMU/KVM. Select Local for the local system or one of the remote connection options and click Connect. This example uses Remote tunnel over SSH, which works on default installations. For more information on configuring remote connections, see Chapter 18, Remote Management of Guests.
    Add Connection

    Figure 19.10. Add Connection

  3. Enter the root password for the selected host when prompted.
A remote host is now connected and appears in the main virt-manager window.
Remote host in the main virt-manager window

Figure 19.11. Remote host in the main virt-manager window

19.6. Displaying Guest Details

You can use the Virtual Machine Monitor to view activity information for any virtual machines on your system.
To view a virtual system's details:
  1. In the Virtual Machine Manager main window, highlight the virtual machine that you want to view.
    Selecting a virtual machine to display

    Figure 19.12. Selecting a virtual machine to display

  2. From the Virtual Machine Manager Edit menu, select Virtual Machine Details.
    When the Virtual Machine details window opens, there may be a console displayed. Should this happen, click View and then select Details. The Overview window opens first by default. To go back to this window, select Overview from the navigation pane on the left-hand side.
    The Overview view shows a summary of configuration details for the guest.
    Displaying guest details overview

    Figure 19.13. Displaying guest details overview

  3. Select CPUs from the navigation pane on the left-hand side. The CPUs view allows you to view or change the current processor allocation.
    It is also possible to increase the number of virtual CPUs (vCPUs) while the virtual machine is running, which is referred to as hot plugging.

    Important

    Hot unplugging vCPUs is not supported in Red Hat Enterprise Linux 7.
    Processor allocation panel

    Figure 19.14. Processor allocation panel

  4. Select Memory from the navigation pane on the left-hand side. The Memory view allows you to view or change the current memory allocation.
    Displaying memory allocation

    Figure 19.15. Displaying memory allocation

  5. Select Boot Options from the navigation pane on the left-hand side. The Boot Options view allows you to view or change the boot options including whether or not the virtual machine starts when the host boots and the boot device order for the virtual machine.
    Displaying boot options

    Figure 19.16. Displaying boot options

  6. Each virtual disk attached to the virtual machine is displayed in the navigation pane. Click a virtual disk to modify or remove it.
    Displaying disk configuration

    Figure 19.17. Displaying disk configuration

  7. Each virtual network interface attached to the virtual machine is displayed in the navigation pane. Click a virtual network interface to modify or remove it.
    Displaying network configuration

    Figure 19.18. Displaying network configuration

19.7. Managing Snapshots

Using virt-manager, it is possible to create, run, and delete guest snapshots. A snapshot is a saved image of the guest's hard disk, memory, and device state at a single point in time. After a snapshot is created, the guest can be returned to the snapshot's configuration at any time.

Important

Red Hat recommends the use of external snapshots, as they are more flexible and reliable when handled by other virtualization tools. However, it is currently not possible to create external snapshots in virt-manager.
To create external snapshots, use the virsh snapshot-create-as command with the --diskspec vda,snapshot=external option. For more information, see Section A.13, “Workaround for Creating External Snapshots with libvirt”.
  • To manage snapshots in virt-manager, open the snapshot management interface by clicking the snapshot management button on the guest console toolbar.
  • To create a new snapshot, click the add (+) button under the snapshot list. In the snapshot creation interface, input the name of the snapshot and, optionally, a description, and click Finish.
  • To revert the guest to a snapshot's configuration, select the snapshot and click the run button.
  • To remove the selected snapshot, click the delete button.

Warning

Creating and loading snapshots while the virtual machine is running (also referred to as live snapshots) is only supported with qcow2 disk images.
For more in-depth snapshot management, use the virsh snapshot-create command. See Section 20.39, “Managing Snapshots” for details about managing snapshots with virsh.
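For example, the following is a sketch of creating an external, disk-only snapshot of a running guest with virsh; guest1 and snapshot1 are placeholder names, and vda is the disk target as reported by virsh domblklist:
virsh snapshot-create-as guest1 snapshot1 --diskspec vda,snapshot=external --disk-only --atomic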

Chapter 20. Managing Guest Virtual Machines with virsh

virsh is a command-line interface tool for managing guest virtual machines, and works as the primary means of controlling virtualization on Red Hat Enterprise Linux 7. The virsh command-line tool is built on the libvirt management API, and can be used to create, deploy, and manage guest virtual machines. The virsh utility is ideal for creating virtualization administration scripts, and users without root privileges can use it in read-only mode. The virsh utility is installed with yum as part of the libvirt-client package.
For installation instructions, see Section 2.2.1, “Installing Virtualization Packages Manually”. For a general introduction to virsh, including a practical demonstration, see the Virtualization Getting Started Guide. The remaining sections of this chapter cover the virsh command set in a logical order based on usage.

Note

Note that when using the help or when reading the man pages, the term 'domain' will be used instead of the term guest virtual machine. This is the term used by libvirt. In cases where the screen output is displayed and the word 'domain' is used, it will not be switched to guest or guest virtual machine. In all examples, the guest virtual machine 'guest1' will be used. You should replace this with the name of your guest virtual machine in all cases. In virsh commands, a guest virtual machine can be referred to by its short numeric ID (0, 1, 2, ...), by an easy-to-remember text string name, or by its full UUID.

Important

It is important to note which user you are using. If you create a guest virtual machine using one user, you will not be able to retrieve information about it using another user. This is especially important when you create a virtual machine in virt-manager. The default user is root in that case unless otherwise specified. Should you have a case where you cannot list the virtual machine using the virsh list --all command, it is most likely due to you running the command using a different user than you used to create the virtual machine. See Important for more information.

20.1. Guest Virtual Machine States and Types

Several virsh commands are affected by the state of the guest virtual machine:
  • Transient - A transient guest does not survive reboot.
  • Persistent - A persistent guest virtual machine survives reboot and lasts until it is deleted.
During the life cycle of a virtual machine, libvirt will classify the guest as any of the following states:
  • Undefined - This is a guest virtual machine that has not been defined or created. As such, libvirt is unaware of any guest in this state and will not report about guest virtual machines in this state.
  • Shut off - This is a guest virtual machine which is defined, but is not running. Only persistent guests can be considered shut off. As such, when a transient guest virtual machine is put into this state, it ceases to exist.
  • Running - The guest virtual machine in this state has been defined and is currently working. This state can be used with both persistent and transient guest virtual machines.
  • Paused - The guest virtual machine's execution on the hypervisor has been suspended, or its state has been temporarily stored until it is resumed. Guest virtual machines in this state are not aware they have been suspended and do not notice that time has passed when they are resumed.
  • Saved - This state is similar to the paused state, however the guest virtual machine's configuration is saved to persistent storage. Any guest virtual machine in this state is not aware it is paused and does not notice that time has passed once it has been restored.

20.2. Displaying the virsh Version

The virsh version command displays the current libvirt version and displays information about the local virsh client. For example:
virsh version
Compiled against library: libvirt 1.2.8
Using library: libvirt 1.2.8
Using API: QEMU 1.2.8
Running hypervisor: QEMU 1.5.3
The virsh version --daemon command is useful for getting information about the libvirtd version and package information, including information about the libvirt daemon that is running on the host.
virsh version --daemon
Compiled against library: libvirt 1.2.8
Using library: libvirt 1.2.8
Using API: QEMU 1.2.8
Running hypervisor: QEMU 1.5.3
Running against daemon: 1.2.8

20.3. Sending Commands with echo

The virsh echo [--shell] [--xml] arguments command displays the specified argument in the specified format. The formats you can use are --shell and --xml. Each argument queried is displayed separated by a space. The --shell option generates output that is formatted in single quotes where needed, so it is suitable for copying and pasting into the bash shell as a command. If the --xml argument is used, the output is formatted for use in an XML file, which can then be saved or used for a guest's configuration.
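For example (the exact quoting in the output may vary):
virsh echo --shell hello "two words"
hello 'two words'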

20.4. Connecting to the Hypervisor with virsh Connect

The virsh connect [hostname-or-URI] [--readonly] command begins a local hypervisor session using virsh. After the first time you run this command it will run automatically each time the virsh shell runs. The hypervisor connection URI specifies how to connect to the hypervisor. The most commonly used URIs are:
  • qemu:///system - connects locally as the root user to the daemon supervising guest virtual machines on the KVM hypervisor.
  • qemu:///session - connects locally as a user to the user's set of guest local machines using the KVM hypervisor.
  • lxc:/// - connects to a local Linux container.
The command can be run as follows, with the target specified either by the machine name (hostname) or by the URI of the hypervisor (the output of the virsh uri command), as shown:
$ virsh uri
qemu:///session
For example, to establish a session to connect to your set of guest virtual machines, with you as the local user:
$ virsh connect qemu:///session
To initiate a read-only connection, append the above command with --readonly. For more information on URIs, see Remote URIs. If you are unsure of the URI, the virsh uri command will display it.

20.5. Displaying Information about a Guest Virtual Machine and the Hypervisor

The virsh list command will list guest virtual machines connected to your hypervisor that fit the search parameter requested. The output of the command has three columns in a table. Each guest virtual machine is listed with its ID, name, and state.
A wide variety of search parameters is available for virsh list. These options are available on the man page, by running man virsh or by running the virsh list --help command.

Note

Note that this command only displays guest virtual machines created by the root user. If it does not display a virtual machine you know you have created, it is probable you did not create the virtual machine as root.
Guests created using the virt-manager interface are by default created by root.

Example 20.1. How to list all locally connected virtual machines

The following example lists all the virtual machines your hypervisor is connected to. Note that this command lists both persistent and transient virtual machines.
virsh list --all

 Id    Name                           State
----------------------------------------------------
 8     guest1                         running
 22    guest2                         paused
 35    guest3                         shut off
 38    guest4                         shut off

Example 20.2. How to list the inactive guest virtual machines

The following example lists guests that are currently inactive, or not running. Note that the list only contains persistent virtual machines.
virsh list --inactive

 Id    Name                           State
----------------------------------------------------
 35    guest3                         shut off
 38    guest4                         shut off
In addition, the following commands can also be used to display basic information about the hypervisor:
  • virsh hostname - displays the hypervisor's host name, for example:
    # virsh hostname
    dhcp-2-157.eus.myhost.com
    
  • virsh sysinfo - displays the XML representation of the hypervisor's system information, if available, for example:
    # virsh sysinfo
    <sysinfo type='smbios'>
      <bios>
        <entry name='vendor'>LENOVO</entry>
        <entry name='version'>GJET71WW (2.21 )</entry>
    [...]
    

20.6. Starting, Resuming, and Restoring a Virtual Machine

20.6.1. Starting a Guest Virtual Machine

The virsh start domain [--console] [--paused] [--autodestroy] [--bypass-cache] [--force-boot] command starts an inactive virtual machine that was already defined, either from its last managed save state or from a fresh boot. By default, if the domain was saved by the virsh managedsave command, the domain will be restored to its previous state. Otherwise, it will be freshly booted. The command can take the following arguments; the name of the virtual machine is required.
  • --console - will attach the terminal running virsh to the domain's console device. This is runlevel 3.
  • --paused - if this is supported by the driver, it will start the guest virtual machine in a paused state
  • --autodestroy - the guest virtual machine is automatically destroyed when virsh disconnects
  • --bypass-cache - used if the guest virtual machine is in the managedsave state; it avoids the file system cache when loading the saved image
  • --force-boot - discards any managedsave options and causes a fresh boot to occur

Example 20.3. How to start a virtual machine

The following example starts the guest1 virtual machine that you already created and is currently in the inactive state. In addition, the command attaches the guest's console to the terminal running virsh:
virsh start guest1 --console
Domain guest1 started
Connected to domain guest1
Escape character is ^]

20.6.2. Configuring a Virtual Machine to be Started Automatically at Boot

The virsh autostart [--disable] domain command will automatically start the guest virtual machine when the host machine boots. Adding the --disable argument to this command disables autostart. The guest in this case will not start automatically when the host physical machine boots.

Example 20.4. How to make a virtual machine start automatically when the host physical machine starts

The following example sets the guest1 virtual machine which you already created to autostart when the host boots:
virsh autostart guest1

20.6.3. Rebooting a Guest Virtual Machine

Reboot a guest virtual machine using the virsh reboot domain [--mode modename] command. Remember that this action will only return once it has executed the reboot, so there may be a time lapse from that point until the guest virtual machine actually reboots. You can control the behavior of the rebooting guest virtual machine by modifying the on_reboot element in the guest virtual machine's XML configuration file. By default, the hypervisor attempts to select a suitable shutdown method automatically. To specify an alternative method, the --mode argument can specify a comma separated list which includes initctl, acpi, agent, signal. The order in which drivers will try each mode is undefined, and not related to the order specified in virsh. For strict control over ordering, use a single mode at a time and repeat the command.

Example 20.5. How to reboot a guest virtual machine

The following example reboots a guest virtual machine named guest1. In this example, the reboot uses the initctl method, but you can choose any mode that suits your needs.
virsh reboot guest1 --mode initctl

20.6.4. Restoring a Guest Virtual Machine

The virsh restore <file> [--bypass-cache] [--xml /path/to/file] [--running] [--paused] command restores a guest virtual machine previously saved with the virsh save command. See Section 20.7.1, “Saving a Guest Virtual Machine's Configuration” for information on the virsh save command. The restore action restarts the saved guest virtual machine, which may take some time. The guest virtual machine's name and UUID are preserved, but the ID will not necessarily match the ID that the virtual machine had when it was saved.
The virsh restore command can take the following arguments:
  • --bypass-cache - causes the restore to avoid the file system cache but note that using this flag may slow down the restore operation.
  • --xml - this argument must be used with an XML file name. Although this argument is usually omitted, it can be used to supply an alternative XML file for use on a restored guest virtual machine with changes only in the host-specific portions of the domain XML. For example, it can be used to account for the file naming differences in underlying storage due to disk snapshots taken after the guest was saved.
  • --running - overrides the state recorded in the save image to start the guest virtual machine as running.
  • --paused - overrides the state recorded in the save image to start the guest virtual machine as paused.

Example 20.6. How to restore a guest virtual machine

The following example restores the guest virtual machine from the guest1-config.xml save file and starts it in the running state:
virsh restore guest1-config.xml --running

20.6.5. Resuming a Guest Virtual Machine

The virsh resume domain command restarts the CPUs of a domain that was suspended. This operation is immediate. The guest virtual machine resumes execution from the point it was suspended. Note that this action will not resume a guest virtual machine that has been undefined. This action will not resume transient virtual machines and will only work on persistent virtual machines.

Example 20.7. How to restore a suspended guest virtual machine

The following example restores the guest1 virtual machine:
virsh resume guest1

20.7. Managing a Virtual Machine Configuration

This section provides information about managing a virtual machine configuration.

20.7.1. Saving a Guest Virtual Machine's Configuration

The virsh save [--bypass-cache] domain file [--xml string] [--running] [--paused] [--verbose] command stops the specified domain, saving the current state of the guest virtual machine's system memory to a specified file. This may take a considerable amount of time, depending on the amount of memory in use by the guest virtual machine. You can restore the state of the guest virtual machine with the virsh restore (Section 20.6.4, “Restoring a Guest Virtual Machine”) command.
The difference between the virsh save command and the virsh suspend command, is that the virsh suspend stops the domain CPUs, but leaves the domain's qemu process running and its memory image resident in the host system. This memory image will be lost if the host system is rebooted.
The virsh save command stores the state of the domain on the hard disk of the host system and terminates the qemu process. This enables restarting the domain from the saved state.
You can monitor the process of virsh save with the virsh domjobinfo command and cancel it with the virsh domjobabort command.
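For example, while a save of the guest1 virtual machine is in progress, you can monitor or cancel it from another terminal (guest1 is a placeholder name):
virsh domjobinfo guest1
virsh domjobabort guest1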
The virsh save command can take the following arguments:
  • --bypass-cache - causes the save to avoid the file system cache but note that using this flag may slow down the save operation.
  • --xml - this argument must be used with an XML file name. Although this argument is usually omitted, it can be used to supply an alternative XML file for use on a restored guest virtual machine with changes only in the host-specific portions of the domain XML. For example, it can be used to account for the file naming differences in underlying storage due to disk snapshots taken after the guest was saved.
  • --running - overrides the state recorded in the save image to start the guest virtual machine as running.
  • --paused - overrides the state recorded in the save image to start the guest virtual machine as paused.
  • --verbose - displays the progress of the save.

Example 20.8. How to save a guest virtual machine running configuration

The following example saves the guest1 virtual machine's running configuration to the guest1-config.xml file:
virsh save guest1 guest1-config.xml --running

20.7.2. Defining a Guest Virtual Machine with an XML File

The virsh define filename command defines a guest virtual machine from an XML file. The guest virtual machine definition in this case is registered but not started. If the guest virtual machine is already running, the changes will take effect once the domain is shut down and started again.

Example 20.9. How to create a guest virtual machine from an XML file

The following example creates a virtual machine from the pre-existing guest1-config.xml XML file, which contains the configuration for the virtual machine:
virsh define guest1-config.xml

20.7.3. Updating the XML File That will be Used for Restoring a Guest Virtual Machine

Note

This command should only be used to recover from a situation where the guest virtual machine does not run properly. It is not meant for general use.
The virsh save-image-define filename [--xml /path/to/file] [--running] [--paused] command updates the guest virtual machine's XML file that will be used when the virtual machine is restored with the virsh restore command. The --xml argument must be an XML file name containing the alternative XML elements for the guest virtual machine's XML. For example, it can be used to account for the file naming differences resulting from creating disk snapshots of underlying storage after the guest was saved. The save image records whether the guest virtual machine should be restored to a running or paused state. Using the arguments --running or --paused dictates the state that is to be used.

Example 20.10. How to save the guest virtual machine's running configuration

The following example updates the guest1-config.xml configuration file with the state of the corresponding running guest:
virsh save-image-define guest1-config.xml --running

20.7.4. Extracting the Guest Virtual Machine XML File

Note

This command should only be used to recover from a situation where the guest virtual machine does not run properly. It is not meant for general use.
The virsh save-image-dumpxml file --security-info command will extract the guest virtual machine XML file that was in effect at the time the saved state file (used in the virsh save command) was referenced. Using the --security-info argument includes security sensitive information in the file.

Example 20.11. How to pull the XML configuration from the last save

The following example dumps the guest virtual machine's configuration that was in effect the last time the guest was saved. In this example, the save file is named guest1-config.xml:
virsh save-image-dumpxml guest1-config.xml

20.7.5. Editing the Guest Virtual Machine Configuration

Note

This command should only be used to recover from a situation where the guest virtual machine does not run properly. It is not meant for general use.
The virsh save-image-edit <file> [--running] [--paused] command edits the XML configuration file that was created by the virsh save command. See Section 20.7.1, “Saving a Guest Virtual Machine's Configuration” for information on the virsh save command.
When the guest virtual machine is saved, the resulting image file will indicate if the virtual machine should be restored to a --running or --paused state. Without using these arguments in the save-image-edit command, the state is determined by the image file itself. By selecting --running (to select the running state) or --paused (to select the paused state) you can overwrite the state that virsh restore should use.

Example 20.12. How to edit a guest virtual machine's configuration and restore the machine to running state

The following example opens a guest virtual machine's configuration file, named guest1-config.xml, for editing in your default editor. When the edits are saved, the virtual machine boots with the new settings:
virsh save-image-edit guest1-config.xml --running

20.8. Shutting off, Shutting down, Rebooting, and Forcing a Shutdown of a Guest Virtual Machine

20.8.1. Shutting down a Guest Virtual Machine

The virsh shutdown domain [--mode modename] command shuts down a guest virtual machine. You can control the behavior of how the guest virtual machine shuts down by modifying the on_shutdown parameter in the guest virtual machine's configuration file. Any change to the on_shutdown parameter will only take effect after the domain has been shut down and restarted.
The virsh shutdown command can take the following optional argument:
  • --mode chooses the shutdown mode. This can be either acpi, agent, initctl, signal, or paravirt.

Example 20.13. How to shutdown a guest virtual machine

The following example shuts down the guest1 virtual machine using the acpi mode:
virsh shutdown guest1 --mode acpi
Domain guest1 is being shutdown

20.8.2. Suspending a Guest Virtual Machine

The virsh suspend domain command suspends a guest virtual machine.
When a guest virtual machine is in a suspended state, it consumes system RAM but not processor resources. Disk and network I/O does not occur while the guest virtual machine is suspended. This operation is immediate and the guest virtual machine can only be restarted with the virsh resume command. Running this command on a transient virtual machine will delete it.

Example 20.14. How to suspend a guest virtual machine

The following example suspends the guest1 virtual machine:
virsh suspend guest1

20.8.3. Resetting a Virtual Machine

The virsh reset domain command resets the guest virtual machine immediately without any guest shutdown. A reset emulates the reset button on a machine, where all guest hardware sees the RST line and re-initializes the internal state. Note that without any guest virtual machine OS shutdown, there are risks for data loss.

Note

Resetting a virtual machine does not apply any pending domain configuration changes. Changes to the domain's configuration only take effect after a complete shutdown and restart of the domain.

Example 20.15. How to reset a guest virtual machine

The following example resets the guest1 virtual machine:
virsh reset guest1

20.8.4. Stopping a Running Guest Virtual Machine in Order to Restart It Later

The virsh managedsave domain [--bypass-cache] [--running | --paused] [--verbose] command saves and destroys (stops) a running guest virtual machine so that it can be restarted from the same state at a later time. When used with the virsh start command, the guest virtual machine is automatically started from this save point. If the command is used with the --bypass-cache argument, the save avoids the file system cache; note that this option may slow down the save process. The --verbose option displays the progress of the save process. Under normal conditions, the managed save decides between using the running or paused state, as determined by the state the guest virtual machine is in when the save is done. However, this can be overridden by using the --running option, to indicate that it must be left in a running state, or the --paused option, which indicates it is to be left in a paused state. To remove the managed save state, use the virsh managedsave-remove command, which will force the guest virtual machine to do a full boot the next time it is started (see the example after Example 20.16). Note that the entire managed save process can be monitored using the domjobinfo command and can also be canceled using the domjobabort command.

Example 20.16. How to stop a running guest and save its configuration

The following example stops the guest1 virtual machine and saves its running configuration setting so that you can restart it:
virsh managedsave guest1 --running
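To later discard the managed save state so that the guest does a full boot on its next start, as described above:
virsh managedsave-remove guest1
virsh start guest1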

20.9. Removing and Deleting a Virtual Machine

20.9.1. Undefining a Virtual Machine

The virsh undefine domain [--managed-save] [--storage] [--remove-all-storage] [--wipe-storage] [--snapshots-metadata] [--nvram] command undefines a domain. If the domain is inactive, the configuration is removed completely. If the domain is active (running), it is converted to a transient domain. When the guest virtual machine becomes inactive, the configuration is removed completely.
This command can take the following arguments:
  • --managed-save - this argument guarantees that any managed save image is also cleaned up. Without using this argument, attempts to undefine a guest virtual machine with a managed save will fail.
  • --snapshots-metadata - this argument guarantees that any snapshots (as shown with snapshot-list) are also cleaned up when undefining an inactive guest virtual machine. Note that any attempts to undefine an inactive guest virtual machine with snapshot metadata will fail. If this argument is used and the guest virtual machine is active, it is ignored.
  • --storage - using this argument requires a comma separated list of volume target names or source paths of storage volumes to be removed along with the undefined domain. This action will undefine the storage volume before it is removed. Note that this can only be done with inactive guest virtual machines and that this will only work with storage volumes that are managed by libvirt.
  • --remove-all-storage - in addition to undefining the guest virtual machine, all associated storage volumes are deleted. If you want to delete the virtual machine, choose this option only if there are no other virtual machines using the same associated storage. An alternative way is with the virsh vol-delete. See Section 20.31, “Deleting Storage Volumes” for more information.
  • --wipe-storage - in addition to deleting the storage volume, the contents are wiped.

Example 20.17. How to delete a guest virtual machine and delete its storage volumes

The following example undefines the guest1 virtual machine and removes all associated storage volumes. An undefined guest becomes transient and thus is deleted after it shuts down:
virsh undefine guest1 --remove-all-storage

20.9.2. Forcing a Guest Virtual Machine to Stop

Note

This command should only be used when you cannot shut down the virtual guest machine by any other method.
The virsh destroy command initiates an immediate ungraceful shutdown and stops the specified guest virtual machine. Using virsh destroy can corrupt guest virtual machine file systems. Use the virsh destroy command only when the guest virtual machine is unresponsive. The virsh destroy command with the --graceful option attempts to flush the cache for the disk image file before powering off the virtual machine.

Example 20.18. How to immediately shutdown a guest virtual machine with a hard shutdown

The following example immediately shuts down the guest1 virtual machine, probably because it is unresponsive:
virsh destroy guest1
You may want to follow this with the virsh undefine command. See Example 20.17, “How to delete a guest virtual machine and delete its storage volumes”

20.10. Connecting the Serial Console for the Guest Virtual Machine

The virsh console domain [--devname devicename] [--force] [--safe] command connects the virtual serial console for the guest virtual machine. This is very useful, for example, for guests that do not provide VNC or SPICE protocols (and thus do not offer video display for GUI tools) and that do not have a network connection (and thus cannot be interacted with using SSH).
The optional --devname parameter refers to the device alias of an alternate console, serial, or parallel device configured for the guest virtual machine. If this parameter is omitted, the primary console is opened. If the --safe option is specified, the connection is only attempted if the driver supports safe console handling. This option specifies that the server has to ensure exclusive access to console devices. Optionally, the --force option may be specified, which requests to disconnect any existing sessions, such as in the case of a broken connection.

Example 20.19. How to start a guest virtual machine in console mode

The following example starts a previously created guest1 virtual machine so that it connects to the serial console using safe console handling:
virsh console guest1 --safe
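To connect to a specific console device instead of the primary console, or to take over a broken session, the --devname and --force options described above can be combined. The alias serial0 below is only an illustrative device alias; substitute the alias configured for your guest:
virsh console guest1 --devname serial0 --force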

20.11. Injecting Non-maskable Interrupts

The virsh inject-nmi domain injects a non-maskable interrupt (NMI) message to the guest virtual machine. This is used when response time is critical, such as during non-recoverable hardware errors. In addition, virsh inject-nmi is useful for triggering a crashdump in Windows guests.

Example 20.20. How to inject an NMI to the guest virtual machine

The following example sends an NMI to the guest1 virtual machine:
virsh inject-nmi guest1

20.12. Retrieving Information about Your Virtual Machine

20.12.1. Displaying Device Block Statistics

By default, the virsh domblkstat command displays the block statistics for the first block device defined for the domain. To view statistics of other block devices, use the virsh domblklist domain command to list all block devices, and then select a specific block device and display it by specifying either the Target or Source name from the virsh domblklist command output after the domain name. Note that not every hypervisor can display every field. To make sure that the output is presented in its most legible form, use the --human argument.

Example 20.21. How to display block statistics for a guest virtual machine

The following example displays the block devices that are defined for the guest1 virtual machine, and then lists the block statistics for the vda device.
# virsh domblklist guest1

Target     Source
------------------------------------------------
vda        /VirtualMachines/guest1.img
hdc        -

# virsh domblkstat guest1 vda --human
Device: vda
 number of read operations:      174670
 number of bytes read:           3219440128
 number of write operations:     23897
 number of bytes written:        164849664
 number of flush operations:     11577
 total duration of reads (ns):   1005410244506
 total duration of writes (ns):  1085306686457
 total duration of flushes (ns): 340645193294
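As noted above, a device can also be identified by its Source path instead of its Target name. Using the path shown in the domblklist output of this example, the same statistics can be displayed with:
# virsh domblkstat guest1 /VirtualMachines/guest1.img --human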

20.12.2. Retrieving Network Interface Statistics

The virsh domifstat domain interface-device command displays the network interface statistics for the specified device running on a given guest virtual machine.
To determine which interface devices are defined for the domain, use the virsh domiflist command and use the output in the Interface column.

Example 20.22. How to display networking statistics for a guest virtual machine

The following example obtains the networking interface defined for the guest1 virtual machine, and then displays the networking statistics on the obtained interface (macvtap0):
# virsh domiflist guest1
Interface  Type       Source     Model       MAC
-------------------------------------------------------
macvtap0   direct     em1        rtl8139     12:34:00:0f:8a:4a
# virsh domifstat guest1 macvtap0
macvtap0 rx_bytes 51120
macvtap0 rx_packets 440
macvtap0 rx_errs 0
macvtap0 rx_drop 0
macvtap0 tx_bytes 231666
macvtap0 tx_packets 520
macvtap0 tx_errs 0
macvtap0 tx_drop 0

20.12.5. Setting Network Interface Bandwidth Parameters

The virsh domiftune domain interface-device command either retrieves or sets the specified domain's interface bandwidth parameters. To determine which interface devices are defined for the domain, use the virsh domiflist command and use either the Interface or MAC column as the interface device option. The following format should be used:
# virsh domiftune domain interface [--inbound] [--outbound] [--config] [--live] [--current]
The --config, --live, and --current options are described in Section 20.43, “Setting Schedule Parameters”. If neither the --inbound nor the --outbound option is specified, virsh domiftune queries the specified network interface and displays the bandwidth settings. By specifying --inbound or --outbound, or both, together with the average, peak, and burst values, virsh domiftune sets the bandwidth settings. At minimum, the average value is required. To clear the bandwidth settings, provide 0 (zero). For a description of the average, peak, and burst values, see Section 20.27.6.2, “Attaching interface devices”.

Example 20.25. How to set the guest virtual machine network interface parameters

The following example limits the outbound bandwidth on the eth0 interface of the running guest virtual machine named guest1. The value 1000 is only an illustrative average rate; at minimum, the average value must be supplied, optionally followed by peak and burst values:
virsh domiftune guest1 eth0 --outbound 1000 --live
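To view the current bandwidth settings of the same interface, run the command without bandwidth options, and to clear them again, provide 0 (zero), as described above. For example:
# virsh domiftune guest1 eth0
# virsh domiftune guest1 eth0 --outbound 0 --live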

20.12.6. Retrieving Memory Statistics

The virsh dommemstat domain [--period seconds] [--config] [--live] [--current] command displays the memory statistics for a running guest virtual machine. Using the optional --period option requires a time period in seconds. Setting this option to a value larger than 0 allows the balloon driver to return additional statistics, which are displayed by subsequent dommemstat commands. Setting the --period option to 0 stops the balloon driver collection but does not clear the statistics already in the balloon driver. You cannot use the --live, --config, or --current options without also setting the --period option. If the --live option is specified, only the guest's running statistics are collected. If the --config option is used, the statistics are collected for a persistent guest, but only after the next boot. If the --current option is used, the current statistics are collected.
Both the --live and --config options may be used together, but --current is exclusive. If no option is specified, the guest's state dictates the behavior of the statistics collection (running or not).

Example 20.26. How to collect memory statistics for a running guest virtual machine

The following example displays the memory statistics for the rhel7 domain:
# virsh dommemstat rhel7
actual 1048576
swap_in 0
swap_out 0
major_fault 2974
minor_fault 1272454
unused 246020
available 1011248
rss 865172
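To also enable periodic collection by the balloon driver, the --period option described above can be supplied. The value 10 below is only an illustrative interval in seconds:
# virsh dommemstat rhel7 --period 10 --live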

20.12.7. Displaying Errors on Block Devices

The virsh domblkerror domain command lists all the block devices in the error state and the error detected on each of them. This command is best used after a