Chapter 6. Deploying SR-IOV technologies
Single root I/O virtualization (SR-IOV) allows near bare-metal performance by giving OpenStack instances direct access to a shared PCIe resource through virtual functions.
6.1. Prerequisites
- Install and configure the undercloud before deploying the overcloud. See the Director Installation and Usage Guide for details.
Do not manually edit values in /etc/tuned/cpu-partitioning-variables.conf that are modified by Director heat templates.
6.2. Configuring SR-IOV
The CPU assignments, memory allocation, and NIC configurations in the following examples may differ from your topology and use case.
Generate the built-in ComputeSriov role to define nodes in the OpenStack cluster that run the NeutronSriovAgent and NeutronSriovHostConfig services, in addition to the default compute services.

# openstack overcloud roles generate \
-o /home/stack/templates/roles_data.yaml \
Controller ComputeSriov
Include the neutron-sriov.yaml and roles_data.yaml files when generating the overcloud_images.yaml file so that SR-IOV containers are prepared.

SERVICES=/usr/share/openstack-tripleo-heat-templates/environments/services

openstack overcloud container image prepare \
--namespace=registry.redhat.io/rhosp13 \
--push-destination=192.168.24.1:8787 \
--prefix=openstack- \
--tag-from-label {version}-{release} \
-e ${SERVICES}/neutron-sriov.yaml \
--roles-file /home/stack/templates/roles_data.yaml \
--output-env-file=/home/stack/templates/overcloud_images.yaml \
--output-images-file=/home/stack/local_registry_images.yaml

Note: The push-destination IP address is the address that you previously set with the local_ip parameter in the undercloud.conf configuration file. For more information on container image preparation, see Director Installation and Usage.
To apply the KernelArgs and TunedProfile parameters, include the host-config-and-reboot.yaml file from /usr/share/openstack-tripleo-heat-templates/environments in your deployment script.

openstack overcloud deploy --templates \
… \
-e /usr/share/openstack-tripleo-heat-templates/environments/host-config-and-reboot.yaml \
...
Configure the parameters for the SR-IOV nodes under parameter_defaults in accordance with the needs of your cluster and the configuration of your hardware. These settings are typically added to the network-environment.yaml file.

NeutronNetworkType: 'vlan'
NeutronNetworkVLANRanges:
  - tenant:22:22
  - tenant:25:25
NeutronTunnelTypes: ''
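Each NeutronNetworkVLANRanges entry has the form physical_network:min_vlan:max_vlan. As a rough illustration of how such entries break down (this helper is hypothetical and not part of the deployment tooling or neutron's actual parser):

```python
def parse_vlan_range(entry):
    """Split a '<physnet>:<min>:<max>' range entry into its parts.

    Illustrative sketch only; it mirrors the format neutron expects
    for provider VLAN ranges, not neutron's own validation code.
    """
    physnet, vlan_min, vlan_max = entry.split(":")
    vlan_min, vlan_max = int(vlan_min), int(vlan_max)
    # Valid 802.1Q VLAN IDs fall within 1-4094.
    if not (1 <= vlan_min <= vlan_max <= 4094):
        raise ValueError(f"invalid VLAN range: {entry}")
    return physnet, vlan_min, vlan_max

# The two ranges in the example above each cover a single VLAN ID.
print(parse_vlan_range("tenant:22:22"))  # ('tenant', 22, 22)
print(parse_vlan_range("tenant:25:25"))  # ('tenant', 25, 25)
```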
In the same file, configure role-specific parameters for SR-IOV compute nodes.

Note: The NeutronSriovNumVFs parameter will soon be deprecated in favor of the numvfs attribute in the network configuration templates. Red Hat does not support modification of the NeutronSriovNumVFs parameter or the numvfs parameter after deployment. Changing either parameter in a running environment is known to cause a permanent outage for all running instances that have an SR-IOV port on that PF. Unless you hard reboot these instances, the SR-IOV PCI device will not be visible to the instance.

ComputeSriovParameters:
  IsolCpusList: "1-19,21-39"
  KernelArgs: "default_hugepagesz=1GB hugepagesz=1G hugepages=32 iommu=pt intel_iommu=on isolcpus=1-19,21-39"
  TunedProfileName: "cpu-partitioning"
  NeutronBridgeMappings:
    - tenant:br-link0
  NeutronPhysicalDevMappings:
    - tenant:p7p1
    - tenant:p7p2
  NeutronSriovNumVFs:
    - p7p1:5
    - p7p2:5
  NovaPCIPassthrough:
    - vendor_id: "8086"
      product_id: "1528"
      address: "0000:06:00.0"
      physical_network: "tenant"
    - vendor_id: "8086"
      product_id: "1528"
      address: "0000:06:00.1"
      physical_network: "tenant"
  NovaVcpuPinSet: '1-19,21-39'
  NovaReservedHostMemory: 4096
Note: Do not use the devname parameter when configuring PCI passthrough, because the device name of a NIC can change. Instead, use vendor_id and product_id because they are more stable, or use the address of the NIC. For more information about how to configure NovaPCIPassthrough, see Guidelines for configuring NovaPCIPassthrough.

Configure the SR-IOV enabled interfaces in the compute.yaml network configuration template. Ensure that the interfaces are configured as standalone NICs for the purposes of creating SR-IOV virtual functions (VFs):

- type: interface
  name: p7p3
  mtu: 9000
  use_dhcp: false
  defroute: false
  nm_controlled: true
  hotplug: true

- type: interface
  name: p7p4
  mtu: 9000
  use_dhcp: false
  defroute: false
  nm_controlled: true
  hotplug: true
Ensure that the list of default filters includes the value AggregateInstanceExtraSpecsFilter.

NovaSchedulerDefaultFilters: ['AvailabilityZoneFilter','RamFilter','ComputeFilter','ComputeCapabilitiesFilter','ImagePropertiesFilter','ServerGroupAntiAffinityFilter','ServerGroupAffinityFilter','PciPassthroughFilter','AggregateInstanceExtraSpecsFilter']
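As an illustrative sanity check (hypothetical, not part of director), you can confirm programmatically that a configured filter list contains the entries SR-IOV scheduling depends on before deploying:

```python
def missing_filters(configured, required):
    """Return the required scheduler filters absent from the configured list."""
    return [f for f in required if f not in configured]

configured = ['AvailabilityZoneFilter', 'RamFilter', 'ComputeFilter',
              'ComputeCapabilitiesFilter', 'ImagePropertiesFilter',
              'ServerGroupAntiAffinityFilter', 'ServerGroupAffinityFilter',
              'PciPassthroughFilter', 'AggregateInstanceExtraSpecsFilter']

# SR-IOV scheduling in this chapter relies on both of these being present.
print(missing_filters(configured,
                      ['PciPassthroughFilter',
                       'AggregateInstanceExtraSpecsFilter']))  # []
```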
- Deploy the overcloud.

TEMPLATES_HOME="/usr/share/openstack-tripleo-heat-templates"
CUSTOM_TEMPLATES="/home/stack/templates"

openstack overcloud deploy --templates \
  -r ${CUSTOM_TEMPLATES}/roles_data.yaml \
  -e ${TEMPLATES_HOME}/environments/host-config-and-reboot.yaml \
  -e ${TEMPLATES_HOME}/environments/services/neutron-sriov.yaml \
  -e ${CUSTOM_TEMPLATES}/network-environment.yaml
6.3. Configuring Hardware Offload (Technology Preview)
Open vSwitch (OVS) hardware offload is a technology preview and not recommended for production deployments. For more information about technology preview features, see Scope of Coverage Details.
The procedure for OVS hardware offload configuration shares many of the same steps as configuring SR-IOV.
Procedure
Generate the ComputeSriov role:

openstack overcloud roles generate -o roles_data.yaml Controller ComputeSriov
- Add the OvsHwOffload parameter under role-specific parameters with a value of true.
- To configure neutron to use the iptables/hybrid firewall driver implementation, include the line NeutronOVSFirewallDriver: iptables_hybrid. For more information about NeutronOVSFirewallDriver, see Using the Open vSwitch Firewall in the Advanced Overcloud Customization Guide.
- Configure the physical_network parameter to match your environment.
  - For VLAN, set the physical_network parameter to the name of the network you create in neutron after deployment. This value should also be in NeutronBridgeMappings.
  - For VXLAN, set the physical_network parameter to null.

Example:

parameter_defaults:
  NeutronOVSFirewallDriver: iptables_hybrid
  ComputeSriovParameters:
    IsolCpusList: 2-9,21-29,11-19,31-39
    KernelArgs: "default_hugepagesz=1GB hugepagesz=1G hugepages=128 intel_iommu=on iommu=pt"
    OvsHwOffload: true
    TunedProfileName: "cpu-partitioning"
    NeutronBridgeMappings:
      - tenant:br-tenant
    NovaPCIPassthrough:
      - vendor_id: <vendor-id>
        product_id: <product-id>
        address: <address>
        physical_network: "tenant"
      - vendor_id: <vendor-id>
        product_id: <product-id>
        address: <address>
        physical_network: "null"
    NovaReservedHostMemory: 4096
    NovaComputeCpuDedicatedSet: 1-9,21-29,11-19,31-39
- Replace <vendor-id> with the vendor ID of the physical NIC.
- Replace <product-id> with the product ID of the NIC VF.
- Replace <address> with the address of the physical NIC.

For more information about how to configure NovaPCIPassthrough, see Guidelines for configuring NovaPCIPassthrough.
Ensure that the list of default filters includes NUMATopologyFilter:

NovaSchedulerDefaultFilters: ['RetryFilter','AvailabilityZoneFilter','ComputeFilter','ComputeCapabilitiesFilter','ImagePropertiesFilter','ServerGroupAntiAffinityFilter','ServerGroupAffinityFilter','PciPassthroughFilter','NUMATopologyFilter']
Configure one or more network interfaces intended for hardware offload in the compute-sriov.yaml configuration file:

- type: ovs_bridge
  name: br-tenant
  mtu: 9000
  members:
    - type: sriov_pf
      name: p7p1
      numvfs: 5
      mtu: 9000
      primary: true
      promisc: true
      use_dhcp: false
      link_mode: switchdev
Note:
- Do not use the NeutronSriovNumVFs parameter when configuring Open vSwitch hardware offload. The number of virtual functions is specified using the numvfs parameter in a network configuration file used by os-net-config. Red Hat does not support modifying the numvfs setting during update or redeployment.
- Do not configure Mellanox network interfaces with the nic-config interface type ovs-vlan, because this prevents tunnel endpoints such as VXLAN from passing traffic due to driver limitations.
Include the ovs-hw-offload.yaml file in the overcloud deploy command:

TEMPLATES_HOME="/usr/share/openstack-tripleo-heat-templates"
CUSTOM_TEMPLATES="/home/stack/templates"

openstack overcloud deploy --templates \
  -r ${CUSTOM_TEMPLATES}/roles_data.yaml \
  -e ${TEMPLATES_HOME}/environments/ovs-hw-offload.yaml \
  -e ${CUSTOM_TEMPLATES}/network-environment.yaml \
  -e ${CUSTOM_TEMPLATES}/neutron-ovs.yaml
6.3.1. Verifying OVS hardware offload
Confirm that a PCI device is in switchdev mode:

# devlink dev eswitch show pci/0000:03:00.0
pci/0000:03:00.0: mode switchdev inline-mode none encap enable
Verify that offload is enabled in OVS:

# ovs-vsctl get Open_vSwitch . other_config:hw-offload
"true"
Confirm hardware offload is enabled on the NIC:

# ethtool -k $NIC | grep tc-offload
hw-tc-offload: on
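The three checks above can be scripted. The sketch below is illustrative: it evaluates sample output captured from the commands rather than running them, and the offload_ready helper is a hypothetical name, not part of any shipped tooling.

```python
# Sample outputs, as produced by the three verification commands above.
devlink_out = "pci/0000:03:00.0: mode switchdev inline-mode none encap enable"
ovs_out = '"true"'
ethtool_out = "hw-tc-offload: on"

def offload_ready(devlink, ovs, ethtool):
    """Return True when all three hardware-offload checks pass."""
    return ("mode switchdev" in devlink              # eswitch in switchdev mode
            and ovs.strip().strip('"') == "true"     # OVS hw-offload enabled
            and ethtool.split(":")[1].strip() == "on")  # NIC tc offload on

print(offload_ready(devlink_out, ovs_out, ethtool_out))  # True
```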
6.4. Deploying an instance for SR-IOV
It is recommended to use host aggregates to separate high performance compute hosts. For information on creating host aggregates and associated flavors for scheduling, see Creating host aggregates.
You should use host aggregates to separate CPU pinned instances from unpinned instances. Instances that do not use CPU pinning do not respect the resourcing requirements of instances that use CPU pinning.
Deploy an instance for single root I/O virtualization (SR-IOV) by performing the following steps:
Create a flavor.
# openstack flavor create <flavor> --ram <MB> --disk <GB> --vcpus <#>
Create the network.

# openstack network create net1 --provider-physical-network tenant --provider-network-type vlan --provider-segment <VLAN-ID>
# openstack subnet create subnet1 --network net1 --subnet-range 192.0.2.0/24 --dhcp
Create the port.
Use vnic-type
direct
to create an SR-IOV virtual function (VF) port.# openstack port create --network net1 --vnic-type direct sriov_port
Use the following command to create a virtual function with hardware offload:

# openstack port create --network net1 --vnic-type direct --binding-profile '{"capabilities": ["switchdev"]}' sriov_hwoffload_port
Use vnic-type direct-physical to create an SR-IOV physical function (PF) port:

# openstack port create --network net1 --vnic-type direct-physical sriov_port
Deploy an instance.

# openstack server create --flavor <flavor> --image <image> --nic port-id=<id> <instance name>
6.5. Creating host aggregates
Deploy guests using CPU pinning and huge pages for increased performance. You can schedule high performance instances on a subset of hosts by matching aggregate metadata with flavor metadata.
Procedure
You can configure the AggregateInstanceExtraSpecsFilter value, and other necessary filters, through the NovaSchedulerDefaultFilters heat parameter under parameter_defaults in a heat environment file before deployment.

parameter_defaults:
  NovaSchedulerDefaultFilters: ['AggregateInstanceExtraSpecsFilter','RetryFilter','AvailabilityZoneFilter','RamFilter','ComputeFilter','ComputeCapabilitiesFilter','ImagePropertiesFilter','ServerGroupAntiAffinityFilter','ServerGroupAffinityFilter','PciPassthroughFilter','NUMATopologyFilter']
Note: To add the AggregateInstanceExtraSpecsFilter configuration to an existing cluster, you can add this parameter to the heat templates and run the original deployment script again.

Create an aggregate group for single root I/O virtualization (SR-IOV), and add relevant hosts. Define metadata, for example sriov=true, that matches the defined flavor metadata.

# openstack aggregate create sriov_group
# openstack aggregate add host sriov_group compute-sriov-0.localdomain
# openstack aggregate set --property sriov=true sriov_group
Create a flavor.
# openstack flavor create <flavor> --ram <MB> --disk <GB> --vcpus <#>
Set additional flavor properties. Note that the defined metadata, sriov=true, matches the defined metadata on the SR-IOV aggregate.

# openstack flavor set --property sriov=true --property hw:cpu_policy=dedicated --property hw:mem_page_size=1GB <flavor>
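Scheduling then works by matching flavor extra specs against aggregate metadata. The simplified sketch below is not the actual AggregateInstanceExtraSpecsFilter implementation; it only illustrates the matching rule for plain key/value specs such as sriov=true, under the assumption that scoped specs (hw:cpu_policy, hw:mem_page_size) are handled elsewhere.

```python
def host_matches(aggregate_metadata, flavor_extra_specs):
    """Simplified matching rule: a host in the aggregate passes when every
    unscoped flavor extra spec has an equal value in the aggregate metadata.

    Illustrative only; the real filter also understands scoped keys and
    operator syntax in metadata values.
    """
    for key, value in flavor_extra_specs.items():
        if ":" in key:   # scoped specs such as hw:cpu_policy are consumed
            continue     # by other filters and the compute driver
        if aggregate_metadata.get(key) != value:
            return False
    return True

aggregate = {"sriov": "true"}
flavor = {"sriov": "true",
          "hw:cpu_policy": "dedicated",
          "hw:mem_page_size": "1GB"}
print(host_matches(aggregate, flavor))  # True
```

With this rule, hosts outside the sriov_group aggregate lack the sriov=true metadata and are filtered out for the SR-IOV flavor.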