Chapter 6. Deploying SR-IOV technologies

Single root I/O virtualization (SR-IOV) delivers near bare-metal performance by giving OpenStack instances direct access to a shared PCIe resource through virtual resources.

6.1. Prerequisites


Do not manually edit values in /etc/tuned/cpu-partitioning-variables.conf that are modified by Director heat templates.

6.2. Configuring SR-IOV


The CPU assignments, memory allocation and NIC configurations of the following examples may differ from your topology and use case.

  1. Generate the built-in ComputeSriov role to define the nodes in the OpenStack cluster that run NeutronSriovAgent, NeutronSriovHostConfig, and the default compute services.

    # openstack overcloud roles generate \
    -o /home/stack/templates/roles_data.yaml \
    Controller ComputeSriov
  2. Include the neutron-sriov.yaml and roles_data.yaml files when generating overcloud_images.yaml so that SR-IOV containers are prepared.

    openstack overcloud container image prepare \
    --push-destination= \
    --prefix=openstack- \
    --tag-from-label {version}-{release} \
    -e ${SERVICES}/neutron-sriov.yaml \
    --roles-file /home/stack/templates/roles_data.yaml \
    --output-env-file=/home/stack/templates/overcloud_images.yaml \

    The push-destination IP address is the address that you previously set with the local_ip parameter in the undercloud.conf configuration file.

    For more information on container image preparation, see Director Installation and Usage.

  3. To apply the KernelArgs and TunedProfile parameters, include the host-config-and-reboot.yaml file from /usr/share/openstack-tripleo-heat-templates/environments in your deployment script.

    openstack overcloud deploy --templates \
    … \
    -e /usr/share/openstack-tripleo-heat-templates/environments/host-config-and-reboot.yaml \
  4. Configure the parameters for the SR-IOV nodes under parameter_defaults in accordance with the needs of your cluster, and the configuration of your hardware. These settings are typically added to the network-environment.yaml file.

    parameter_defaults:
      NeutronNetworkType: 'vlan'
      NeutronNetworkVLANRanges:
        - tenant:22:22
        - tenant:25:25
      NeutronTunnelTypes: ''
  5. In the same file, configure role specific parameters for SR-IOV compute nodes.


    The NeutronSriovNumVFs parameter will soon be deprecated in favor of the numvfs attribute in the network configuration templates. Red Hat does not support modification of the NeutronSriovNumVFs parameter, nor the numvfs parameter, after deployment. Changing either parameter within a running environment is known to cause a permanent outage for all running instances which have an SR-IOV port on that PF. Unless you hard reboot these instances, the SR-IOV PCI device will not be visible to the instance.

      ComputeSriovParameters:
        IsolCpusList: "1-19,21-39"
        KernelArgs: "default_hugepagesz=1GB hugepagesz=1G hugepages=32 iommu=pt intel_iommu=on isolcpus=1-19,21-39"
        TunedProfileName: "cpu-partitioning"
        NeutronBridgeMappings:
          - tenant:br-link0
        NeutronPhysicalDevMappings:
          - tenant:p7p1
          - tenant:p7p2
        NeutronSriovNumVFs:
          - p7p1:5
          - p7p2:5
        NovaPCIPassthrough:
          - devname: "p7p1"
            physical_network: "tenant"
          - devname: "p7p2"
            physical_network: "tenant"
        NovaVcpuPinSet: '1-19,21-39'
        NovaReservedHostMemory: 4096
  6. Configure the SR-IOV enabled interfaces in the compute.yaml network configuration template. Ensure the interfaces are configured as standalone NICs for the purposes of creating SR-IOV virtual functions (VFs):

      - type: interface
        name: p7p3
        mtu: 9000
        use_dhcp: false
        defroute: false
        nm_controlled: true
        hotplug: true
      - type: interface
        name: p7p4
        mtu: 9000
        use_dhcp: false
        defroute: false
        nm_controlled: true
        hotplug: true
  7. Ensure that the list of default filters includes the value AggregateInstanceExtraSpecsFilter.

    NovaSchedulerDefaultFilters: ['AvailabilityZoneFilter','RamFilter','ComputeFilter','ComputeCapabilitiesFilter','ImagePropertiesFilter','ServerGroupAntiAffinityFilter','ServerGroupAffinityFilter','PciPassthroughFilter','AggregateInstanceExtraSpecsFilter']
  8. Deploy the overcloud.

openstack overcloud deploy --templates \
  -r ${CUSTOM_TEMPLATES}/roles_data.yaml \
  -e ${TEMPLATES_HOME}/environments/host-config-and-reboot.yaml \
  -e ${TEMPLATES_HOME}/environments/services/neutron-sriov.yaml \
  -e ${CUSTOM_TEMPLATES}/network-environment.yaml
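
The CPU-related values in step 5 are entered in three places and can drift apart. The following is a minimal sketch of a pre-deployment check, assuming that you want IsolCpusList, the isolcpus argument in KernelArgs, and NovaVcpuPinSet to name the same CPUs, as they do in the example. The function names expand_cpu_list, same_cpu_set, and kernel_isolcpus are hypothetical helpers and are not part of director.

```shell
#!/bin/sh
# Hypothetical pre-deployment check; not part of director or TripleO.

# Expand a CPU list such as "1-3,5" into one CPU number per line, sorted.
# The function body runs in a subshell so the IFS change stays local.
expand_cpu_list() (
    IFS=','
    for part in $1; do
        case "$part" in
            *-*) seq "${part%-*}" "${part#*-}" ;;
            *)   echo "$part" ;;
        esac
    done | sort -n -u
)

# Succeed only if both lists describe the same set of CPUs.
same_cpu_set() {
    [ "$(expand_cpu_list "$1")" = "$(expand_cpu_list "$2")" ]
}

# Extract the isolcpus= value from a kernel argument string.
kernel_isolcpus() {
    printf '%s\n' "$1" | tr ' ' '\n' | sed -n 's/^isolcpus=//p'
}

# Values copied from the example in step 5; replace them with your own.
isol_cpus_list="1-19,21-39"
kernel_args="default_hugepagesz=1GB hugepagesz=1G hugepages=32 iommu=pt intel_iommu=on isolcpus=1-19,21-39"
nova_vcpu_pin_set="1-19,21-39"

if same_cpu_set "$isol_cpus_list" "$(kernel_isolcpus "$kernel_args")" &&
   same_cpu_set "$isol_cpus_list" "$nova_vcpu_pin_set"; then
    echo "CPU lists are consistent"
else
    echo "CPU lists differ; review the SR-IOV role parameters" >&2
fi
```

If your design intentionally pins instances to a subset of the isolated CPUs, adapt the second comparison accordingly.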

6.3. Configuring Hardware Offload (Technology Preview)

Open vSwitch (OVS) hardware offload is a technology preview and not recommended for production deployments. For more information about technology preview features, see Scope of Coverage Details.

OVS hardware offload takes advantage of single root I/O virtualization (SR-IOV), so some of the same configuration steps apply.

6.3.1. Open vSwitch hardware offload

To enable OVS hardware offload, complete the following steps.


  1. Generate the ComputeSriov role:

    openstack overcloud roles generate -o roles_data.yaml Controller ComputeSriov
  2. Add the OvsHwOffload parameter under role specific parameters with a value of true.
  3. Configure the physical_network parameter to match your environment.

    • For VLAN, set the physical_network parameter to the name of the network you create in neutron after deployment. This value should also be in NeutronBridgeMappings.
    • For VXLAN, set the physical_network parameter to the string value null.


        ComputeSriovParameters:
          IsolCpusList: 2-9,21-29,11-19,31-39
          KernelArgs: "default_hugepagesz=1GB hugepagesz=1G hugepages=128 intel_iommu=on iommu=pt"
          OvsHwOffload: true
          TunedProfileName: "cpu-partitioning"
          NeutronBridgeMappings:
            - tenant:br-tenant
          NeutronPhysicalDevMappings:
            - tenant:p7p1
            - tenant:p7p2
          NovaPCIPassthrough:
            - devname: "p7p1"
              physical_network: "null"
            - devname: "p7p2"
              physical_network: "null"
          NovaReservedHostMemory: 4096
          NovaVcpuPinSet: 1-9,21-29,11-19,31-39
  4. Ensure that the list of default filters includes the value NUMATopologyFilter:

      NovaSchedulerDefaultFilters: ['RetryFilter','AvailabilityZoneFilter','ComputeFilter','ComputeCapabilitiesFilter','ImagePropertiesFilter','ServerGroupAntiAffinityFilter','ServerGroupAffinityFilter','PciPassthroughFilter','NUMATopologyFilter']
  5. Configure one or more network interfaces intended for hardware offload in the compute-sriov.yaml configuration file:


    Do not use the NeutronSriovNumVFs parameter when configuring Open vSwitch hardware offload. The number of virtual functions will be specified using the numvfs parameter in a network configuration file used by os-net-config. Red Hat does not support modifying the numvfs setting during update or redeployment.

      - type: ovs_bridge
        name: br-tenant
        mtu: 9000
        members:
        - type: sriov_pf
          name: p7p1
          numvfs: 5
          mtu: 9000
          primary: true
          promisc: true
          use_dhcp: false
          link_mode: switchdev

    Do not configure Mellanox network interfaces as a nic-config interface type ovs-vlan because this prevents tunnel endpoints such as VXLAN from passing traffic due to driver limitations.

  6. Include the following files during the deployment of the overcloud:

    • ovs-hw-offload.yaml
    • host-config-and-reboot.yaml

      openstack overcloud deploy --templates \
        -r ${CUSTOM_TEMPLATES}/roles_data.yaml \
        -e ${TEMPLATES_HOME}/environments/ovs-hw-offload.yaml \
        -e ${TEMPLATES_HOME}/environments/host-config-and-reboot.yaml \
        -e ${CUSTOM_TEMPLATES}/network-environment.yaml
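
The physical_network rule in step 3 depends only on the tenant network type. As an illustration, the following sketch prints one devname and physical_network pair for each case. It assumes that these pairs belong to the NovaPCIPassthrough parameter, and the function name pci_passthrough_entry is a hypothetical helper, not a director command.

```shell
#!/bin/sh
# Hypothetical helper that prints one passthrough list entry.
# $1: device name
# $2: tenant network type (vlan or vxlan)
# $3: neutron physical network name (required for vlan only)
pci_passthrough_entry() (
    case "$2" in
        vlan)
            # VLAN: use the name of the network created in neutron.
            [ -n "$3" ] || { echo "vlan requires a physical network name" >&2; exit 1; }
            physnet=$3
            ;;
        vxlan)
            # VXLAN: use the string value null.
            physnet=null
            ;;
        *)
            echo "unsupported network type: $2" >&2
            exit 1
            ;;
    esac
    echo "- devname: \"$1\""
    echo "  physical_network: \"$physnet\""
)

# Entries for a VXLAN device and a VLAN device on the "tenant" network.
pci_passthrough_entry p7p1 vxlan
pci_passthrough_entry p7p2 vlan tenant
```

For VLAN, the physical network name that you pass should also appear in NeutronBridgeMappings, as noted in step 3.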

6.3.2. Verification

  1. Confirm that a PCI device has its mode configured as switchdev:

    # devlink dev eswitch show pci/0000:03:00.0
    pci/0000:03:00.0: mode switchdev inline-mode none encap enable
  2. Confirm offload is enabled in OVS:

    # ovs-vsctl get Open_vSwitch . other_config:hw-offload
  3. Confirm hardware offload is enabled on the NIC:

    # ethtool -k $NIC | grep tc-offload
    hw-tc-offload: on
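
If you run these checks on several Compute nodes, it can help to script them. The following sketch wraps each check in a small function that inspects command output passed as text. The function names are hypothetical, and the OVS check assumes that ovs-vsctl prints the value true, possibly wrapped in double quotes.

```shell
#!/bin/sh
# Hypothetical helpers that inspect the output of the verification commands.

# Succeed if `devlink dev eswitch show` output reports switchdev mode.
is_switchdev() {
    case "$1" in
        *"mode switchdev"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Succeed if the OVS hw-offload value is true, with or without quotes.
# Assumption: the value is printed as true or "true".
ovs_offload_enabled() {
    [ "$(printf '%s' "$1" | tr -d '"')" = "true" ]
}

# Succeed if `ethtool -k` output reports hw-tc-offload as on.
tc_offload_on() {
    printf '%s\n' "$1" | grep -q '^ *hw-tc-offload: on'
}

# Sample values taken from the verification steps above.
is_switchdev "pci/0000:03:00.0: mode switchdev inline-mode none encap enable" \
    && echo "eswitch: switchdev"
tc_offload_on "hw-tc-offload: on" && echo "NIC: hw-tc-offload on"

# On a Compute node you might pass live output instead, for example:
#   is_switchdev "$(devlink dev eswitch show pci/0000:03:00.0)"
#   ovs_offload_enabled "$(ovs-vsctl get Open_vSwitch . other_config:hw-offload)"
#   tc_offload_on "$(ethtool -k "$NIC")"
```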

6.4. Deploying an instance for SR-IOV

Use host aggregates to separate high-performance compute hosts. For information on creating host aggregates and associated flavors for scheduling, see Creating host aggregates.


You should use host aggregates to separate CPU pinned instances from unpinned instances. Instances that do not use CPU pinning do not respect the resourcing requirements of instances that use CPU pinning.

Deploy an instance for single root I/O virtualization (SR-IOV) by performing the following steps:

  1. Create a flavor.

    # openstack flavor create <flavor> --ram <MB> --disk <GB> --vcpus <#>
  2. Create the network.

    openstack network create net1 --provider-physical-network tenant --provider-network-type vlan --provider-segment <VLAN-ID>
  3. Create the port.

    • Use vnic-type direct to create an SR-IOV virtual function (VF) port.

      # openstack port create --network net1 --vnic-type direct sriov_port
    • Use the following to create a virtual function with hardware offload.

      openstack port create --network net1 --vnic-type direct --binding-profile '{"capabilities": ["switchdev"]}' sriov_hwoffload_port
    • Use vnic-type direct-physical to create an SR-IOV PF port.

      # openstack port create --network net1 --vnic-type direct-physical sriov_port
  4. Deploy an instance.

    # openstack server create --flavor <flavor> --image <image> --nic port-id=<id> <instance name>
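
Steps 3 and 4 can be chained so that the ID of the new port is passed directly to the instance. The following is a sketch under the assumption that overcloud credentials are already loaded in the shell. The function name boot_sriov_instance and its argument order are hypothetical. The -f value -c id options are the standard openstack client output options that print only the port ID.

```shell
#!/bin/sh
# Hypothetical wrapper around steps 3 and 4; the name and argument
# order are illustrative only.
# $1: flavor, $2: image, $3: network, $4: port name, $5: instance name
boot_sriov_instance() (
    # Create an SR-IOV VF port and capture only its ID.
    port_id=$(openstack port create --network "$3" \
        --vnic-type direct -f value -c id "$4") || exit 1

    # Boot the instance with the new port attached.
    openstack server create --flavor "$1" --image "$2" \
        --nic port-id="$port_id" "$5"
)

# Example invocation:
#   boot_sriov_instance <flavor> <image> net1 sriov_port <instance name>
```

To create a PF port or a hardware offload port instead, change the port creation options as described in step 3.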

6.5. Creating host aggregates

For increased performance, deploy guests that use CPU pinning and huge pages. You can schedule high-performance instances on a subset of hosts by matching aggregate metadata with flavor metadata.

6.5.1. Procedure

  1. Ensure that the AggregateInstanceExtraSpecsFilter value is included in the scheduler_default_filters parameter in the nova.conf configuration file. This configuration can be set through the heat parameter NovaSchedulerDefaultFilters under role-specific parameters before deployment.

        NovaSchedulerDefaultFilters: ['AggregateInstanceExtraSpecsFilter', 'RetryFilter','AvailabilityZoneFilter','RamFilter','ComputeFilter','ComputeCapabilitiesFilter','ImagePropertiesFilter','ServerGroupAntiAffinityFilter','ServerGroupAffinityFilter','PciPassthroughFilter','NUMATopologyFilter']

    To apply this setting to an existing cluster, add the parameter to your heat templates and re-run the original deployment script.

  2. Create an aggregate group for single root I/O virtualization (SR-IOV), and add relevant hosts. Define metadata, for example, sriov=true, that matches defined flavor metadata.

    # openstack aggregate create sriov_group
    # openstack aggregate add host sriov_group compute-sriov-0.localdomain
    # openstack aggregate set --property sriov=true sriov_group
  3. Create a flavor.

    # openstack flavor create <flavor> --ram <MB> --disk <GB> --vcpus <#>
  4. Set additional flavor properties. Note that the defined metadata, sriov=true, matches the defined metadata on the SR-IOV aggregate.

    openstack flavor set --property sriov=true --property hw:cpu_policy=dedicated --property hw:mem_page_size=1GB <flavor>
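
When an aggregate contains several hosts, the procedure above can be wrapped in a loop. The following sketch assumes that overcloud credentials are already loaded and that the flavor already exists. The function name setup_sriov_aggregate is hypothetical, and the sketch sets the aggregate metadata with the --property option of openstack aggregate set.

```shell
#!/bin/sh
# Hypothetical helper: create an aggregate, add hosts to it, and tag
# both the aggregate and an existing flavor with sriov=true.
# $1: aggregate name, $2: flavor name, remaining arguments: host names
setup_sriov_aggregate() (
    aggregate=$1
    flavor=$2
    shift 2

    openstack aggregate create "$aggregate" || exit 1
    for host in "$@"; do
        openstack aggregate add host "$aggregate" "$host" || exit 1
    done

    # The same key=value pair on the aggregate and on the flavor is what
    # AggregateInstanceExtraSpecsFilter compares when scheduling.
    openstack aggregate set --property sriov=true "$aggregate" || exit 1
    openstack flavor set --property sriov=true "$flavor"
)

# Example invocation:
#   setup_sriov_aggregate sriov_group <flavor> compute-sriov-0.localdomain
```

The CPU pinning and huge page properties from step 4 still need to be set on the flavor separately.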