Chapter 5. Configuring SR-IOV and DPDK interfaces on the same compute node
This section describes how to deploy SR-IOV and DPDK interfaces on the same Compute node.
This guide provides examples of CPU assignments, memory allocation, and NIC configurations that might differ from your topology and use case. See the Network Functions Virtualization Product Guide and the Network Functions Virtualization Planning Guide to understand the hardware and configuration options.
The process to create and deploy SR-IOV and DPDK interfaces on the same Compute node includes:
- Set the parameters for the SR-IOV role and OVS-DPDK in the network-environment.yaml file.
- Configure the compute.yaml file with an SR-IOV interface and a DPDK interface.
- Deploy the overcloud with this updated set of roles.
- Create the appropriate OpenStack flavor, networks, and ports to support these interface types.
We recommend the following network settings:
- Use floating IP addresses for the guest instances.
- Create a router and attach it to the DPDK VXLAN network (the management network).
- Use SR-IOV for the provider network.
- Boot the guest instance with two ports attached. We recommend that you use cloud-init for the guest instance to set the default route for the management network, as shown in the sketch after this list.
- Add the floating IP address to the booted guest instance.
If needed, use SR-IOV bonding for the guest instance and ensure both SR-IOV interfaces exist on the same NUMA node for optimum performance.
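The following is a minimal cloud-init user-data sketch that sets the default route through the management (DPDK VXLAN) network. The interface name (eth0) and gateway address (192.168.10.1) are placeholders that depend on your subnet layout and on the order in which the ports are attached; pass the file to the instance with the --user-data option of openstack server create.

#cloud-config
# Minimal sketch: force the default route out of the management (DPDK) port.
# Replace eth0 and 192.168.10.1 with the interface and gateway for your deployment.
runcmd:
  - ip route replace default via 192.168.10.1 dev eth0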
You must install and configure the undercloud before you can deploy the compute node in the overcloud. See the Director Installation and Usage Guide for details.
Ensure that you create an OpenStack flavor that matches this custom role.
5.1. Modifying the first-boot.yaml file
Modify the first-boot.yaml file to set up OVS and DPDK parameters and to configure tuned for CPU affinity.
Add additional resources.
resources:
  userdata:
    type: OS::Heat::MultipartMime
    properties:
      parts:
      - config: {get_resource: set_ovs_config}
      - config: {get_resource: set_dpdk_params}
      - config: {get_resource: install_tuned}
      - config: {get_resource: compute_kernel_args}

Set the OVS configuration.
set_ovs_config:
  type: OS::Heat::SoftwareConfig
  properties:
    config:
      str_replace:
        template: |
          #!/bin/bash
          FORMAT=$COMPUTE_HOSTNAME_FORMAT
          if [[ -z $FORMAT ]] ; then
            FORMAT="compute" ;
          else
            # Assumption: only %index% and %stackname% are the variables in Host name format
            FORMAT=$(echo $FORMAT | sed 's/\%index\%//g' | sed 's/\%stackname\%//g') ;
          fi
          if [[ $(hostname) == *$FORMAT* ]] ; then
            if [ -f /usr/lib/systemd/system/openvswitch-nonetwork.service ]; then
              ovs_service_path="/usr/lib/systemd/system/openvswitch-nonetwork.service"
            elif [ -f /usr/lib/systemd/system/ovs-vswitchd.service ]; then
              ovs_service_path="/usr/lib/systemd/system/ovs-vswitchd.service"
            fi
            grep -q "RuntimeDirectoryMode=.*" $ovs_service_path
            if [ "$?" -eq 0 ]; then
              sed -i 's/RuntimeDirectoryMode=.*/RuntimeDirectoryMode=0775/' $ovs_service_path
            else
              echo "RuntimeDirectoryMode=0775" >> $ovs_service_path
            fi
            grep -Fxq "Group=qemu" $ovs_service_path
            if [ ! "$?" -eq 0 ]; then
              echo "Group=qemu" >> $ovs_service_path
            fi
            grep -Fxq "UMask=0002" $ovs_service_path
            if [ ! "$?" -eq 0 ]; then
              echo "UMask=0002" >> $ovs_service_path
            fi
            ovs_ctl_path='/usr/share/openvswitch/scripts/ovs-ctl'
            grep -q "umask 0002 \&\& start_daemon \"\$OVS_VSWITCHD_PRIORITY\"" $ovs_ctl_path
            if [ ! "$?" -eq 0 ]; then
              sed -i 's/start_daemon \"\$OVS_VSWITCHD_PRIORITY.*/umask 0002 \&\& start_daemon \"$OVS_VSWITCHD_PRIORITY\" \"$OVS_VSWITCHD_WRAPPER\" \"$@\"/' $ovs_ctl_path
            fi
          fi
        params:
          $COMPUTE_HOSTNAME_FORMAT: {get_param: ComputeHostnameFormat}

Set the DPDK parameters.
set_dpdk_params:
  type: OS::Heat::SoftwareConfig
  properties:
    config:
      str_replace:
        template: |
          #!/bin/bash
          set -x
          get_mask()
          {
            local list=$1
            local mask=0
            declare -a bm
            max_idx=0
            for core in $(echo $list | sed 's/,/ /g')
            do
              index=$(($core/32))
              bm[$index]=0
              if [ $max_idx -lt $index ]; then
                max_idx=$(($index))
              fi
            done
            for ((i=$max_idx;i>=0;i--));
            do
              bm[$i]=0
            done
            for core in $(echo $list | sed 's/,/ /g')
            do
              index=$(($core/32))
              temp=$((1<<$(($core % 32))))
              bm[$index]=$((${bm[$index]} | $temp))
            done
            printf -v mask "%x" "${bm[$max_idx]}"
            for ((i=$max_idx-1;i>=0;i--));
            do
              printf -v hex "%08x" "${bm[$i]}"
              mask+=$hex
            done
            printf "%s" "$mask"
          }

          FORMAT=$COMPUTE_HOSTNAME_FORMAT
          if [[ -z $FORMAT ]] ; then
            FORMAT="compute" ;
          else
            # Assumption: only %index% and %stackname% are the variables in Host name format
            FORMAT=$(echo $FORMAT | sed 's/\%index\%//g' | sed 's/\%stackname\%//g') ;
          fi
          if [[ $(hostname) == *$FORMAT* ]] ; then
            pmd_cpu_mask=$( get_mask $PMD_CORES )
            host_cpu_mask=$( get_mask $LCORE_LIST )
            socket_mem=$(echo $SOCKET_MEMORY | sed s/\'//g )
            ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-init=true
            ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-socket-mem=$socket_mem
            ovs-vsctl --no-wait set Open_vSwitch . other_config:pmd-cpu-mask=$pmd_cpu_mask
            ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-lcore-mask=$host_cpu_mask
          fi
        params:
          $COMPUTE_HOSTNAME_FORMAT: {get_param: ComputeHostnameFormat}
          $LCORE_LIST: {get_param: HostCpusList}
          $PMD_CORES: {get_param: NeutronDpdkCoreList}
          $SOCKET_MEMORY: {get_param: NeutronDpdkSocketMemory}
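For reference, this is how the get_mask helper above converts a core list into the hexadecimal mask that is written to OVS. With the sample PMD core list used later in this guide (1,17,9,25), bits 1, 9, 17, and 25 are set, so pmd-cpu-mask becomes 2020202; the same helper computes dpdk-lcore-mask from HostCpusList. This is only an illustration of the helper, invoked outside of Heat:

# Illustration only: invoke the get_mask helper with the sample PMD core list.
$ get_mask 1,17,9,25
2020202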
Set the tuned configuration to provide CPU affinity.

install_tuned:
  type: OS::Heat::SoftwareConfig
  properties:
    config:
      str_replace:
        template: |
          #!/bin/bash
          FORMAT=$COMPUTE_HOSTNAME_FORMAT
          if [[ -z $FORMAT ]] ; then
            FORMAT="compute" ;
          else
            # Assumption: only %index% and %stackname% are the variables in Host name format
            FORMAT=$(echo $FORMAT | sed 's/\%index\%//g' | sed 's/\%stackname\%//g') ;
          fi
          if [[ $(hostname) == *$FORMAT* ]] ; then
            # Install the tuned package
            yum install -y tuned-profiles-cpu-partitioning
            tuned_conf_path="/etc/tuned/cpu-partitioning-variables.conf"
            if [ -n "$TUNED_CORES" ]; then
              grep -q "^isolated_cores" $tuned_conf_path
              if [ "$?" -eq 0 ]; then
                sed -i 's/^isolated_cores=.*/isolated_cores=$TUNED_CORES/' $tuned_conf_path
              else
                echo "isolated_cores=$TUNED_CORES" >> $tuned_conf_path
              fi
              tuned-adm profile cpu-partitioning
            fi
          fi
        params:
          $COMPUTE_HOSTNAME_FORMAT: {get_param: ComputeHostnameFormat}
          $TUNED_CORES: {get_param: HostIsolatedCoreList}

Set the kernel arguments.
compute_kernel_args:
  type: OS::Heat::SoftwareConfig
  properties:
    config:
      str_replace:
        template: |
          #!/bin/bash
          FORMAT=$COMPUTE_HOSTNAME_FORMAT
          if [[ -z $FORMAT ]] ; then
            FORMAT="compute" ;
          else
            # Assumption: only %index% and %stackname% are the variables in Host name format
            FORMAT=$(echo $FORMAT | sed 's/\%index\%//g' | sed 's/\%stackname\%//g') ;
          fi
          if [[ $(hostname) == *$FORMAT* ]] ; then
            sed 's/^\(GRUB_CMDLINE_LINUX=".*\)"/\1 $KERNEL_ARGS isolcpus=$TUNED_CORES"/g' -i /etc/default/grub ;
            grub2-mkconfig -o /etc/grub2.cfg
            reboot
          fi
        params:
          $KERNEL_ARGS: {get_param: ComputeKernelArgs}
          $COMPUTE_HOSTNAME_FORMAT: {get_param: ComputeHostnameFormat}
          $TUNED_CORES: {get_param: HostIsolatedCoreList}
5.2. Configuring tuned for CPU affinity
This example uses the sample post-install.yaml file.
Set the tuned configuration to enable CPU affinity.

resources:
  ExtraDeployments:
    type: OS::Heat::StructuredDeployments
    properties:
      servers: {get_param: servers}
      config: {get_resource: ExtraConfig}
      # Do this on CREATE/UPDATE (which is actually the default)
      actions: ['CREATE', 'UPDATE']

  ExtraConfig:
    type: OS::Heat::SoftwareConfig
    properties:
      group: script
      config:
        str_replace:
          template: |
            #!/bin/bash
            set -x
            FORMAT=$COMPUTE_HOSTNAME_FORMAT
            if [[ -z $FORMAT ]] ; then
              FORMAT="compute" ;
            else
              # Assumption: only %index% and %stackname% are the variables in Host name format
              FORMAT=$(echo $FORMAT | sed 's/\%index\%//g' | sed 's/\%stackname\%//g') ;
            fi
            if [[ $(hostname) == *$FORMAT* ]] ; then
              tuned_service=/usr/lib/systemd/system/tuned.service
              grep -q "network.target" $tuned_service
              if [ "$?" -eq 0 ]; then
                sed -i '/After=.*/s/network.target//g' $tuned_service
              fi
              grep -q "Before=.*network.target" $tuned_service
              if [ ! "$?" -eq 0 ]; then
                grep -q "Before=.*" $tuned_service
                if [ "$?" -eq 0 ]; then
                  sed -i 's/^\(Before=.*\)/\1 network.target openvswitch.service/g' $tuned_service
                else
                  sed -i '/After/i Before=network.target openvswitch.service' $tuned_service
                fi
              fi
              systemctl daemon-reload
            fi
          params:
            $COMPUTE_HOSTNAME_FORMAT: {get_param: ComputeHostnameFormat}
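After the node reboots with the new profile, you can confirm that tuned and the CPU isolation took effect. These are standard verification commands run on the Compute node, not part of the post-install.yaml template:

# Confirm the active tuned profile.
$ tuned-adm active
Current active profile: cpu-partitioning
# Confirm that the isolated cores were appended to the kernel command line.
$ grep -o 'isolcpus=[^ ]*' /proc/cmdline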
5.3. Defining the SR-IOV and OVS-DPDK parameters
Modify the network-environment.yaml file to configure SR-IOV and OVS-DPDK role-specific parameters:
Add the resource mapping for the OVS-DPDK and SR-IOV services to the network-environment.yaml file, along with the network configuration for these nodes:

resource_registry:
  # Specify the relative/absolute path to the config files you want to use to override the defaults.
  OS::TripleO::Compute::Net::SoftwareConfig: nic-configs/compute.yaml
  OS::TripleO::Controller::Net::SoftwareConfig: nic-configs/controller.yaml
  OS::TripleO::NodeUserData: first-boot.yaml
  OS::TripleO::NodeExtraConfigPost: post-install.yaml

Define the flavors:
OvercloudControlFlavor: controller
OvercloudComputeFlavor: compute
Define the tunnel type:
# The tunnel type for the tenant network (vxlan or gre). Set to '' to disable tunneling.
NeutronTunnelTypes: 'vxlan'
# The tenant network type for Neutron (vlan or vxlan).
NeutronNetworkType: 'vlan'
Configure the parameters for SR-IOV:
NeutronSupportedPCIVendorDevs: ['8086:154d', '8086:10ed']
NovaPCIPassthrough:
  - devname: "ens2f1"
    physical_network: "tenant"
NeutronPhysicalDevMappings: "tenant:ens2f1"
NeutronSriovNumVFs: "ens2f1:5"
NeutronEnableIsolatedMetadata: true
NeutronEnableForceMetadata: true
# Global MTU.
NeutronGlobalPhysnetMtu: 9000
# Configure the classname of the firewall driver to use for implementing security groups.
NeutronOVSFirewallDriver: openvswitch

Configure the parameters for OVS-DPDK:
########################
# OVS DPDK configuration
## NeutronDpdkCoreList and NeutronDpdkMemoryChannels are REQUIRED settings.
## Attempting to deploy DPDK without appropriate values will cause deployment to fail or lead to unstable deployments.
# List of cores to be used for the DPDK Poll Mode Driver
NeutronDpdkCoreList: "'1,17,9,25'"
# Number of memory channels to be used for DPDK
NeutronDpdkMemoryChannels: "4"
# NeutronDpdkSocketMemory
NeutronDpdkSocketMemory: "'1024,1024'"
# NeutronDpdkDriverType
NeutronDpdkDriverType: "vfio-pci"
# The vhost-user socket directory for OVS
NeutronVhostuserSocketDir: "/var/run/openvswitch"

########################
# Additional settings
########################
# Reserved RAM for host processes
NovaReservedHostMemory: 2048
# A list or range of physical CPU cores to reserve for virtual machine processes.
# Example: NovaVcpuPinSet: ['4-12','^8'] reserves cores 4-12, excluding 8.
NovaVcpuPinSet: "2,3,4,5,6,7,18,19,20,21,22,23,10,11,12,13,14,15,26,27,28,29,30,31"
# An array of filters used by Nova to filter a node. These filters are applied in the order they are listed,
# so place your most restrictive filters first to make the filtering process more efficient.
NovaSchedulerDefaultFilters: "RamFilter,ComputeFilter,AvailabilityZoneFilter,ComputeCapabilitiesFilter,ImagePropertiesFilter,PciPassthroughFilter,NUMATopologyFilter"
# Kernel arguments for the Compute node
ComputeKernelArgs: "default_hugepagesz=1GB hugepagesz=1G hugepages=32 iommu=pt intel_iommu=on"
# A list or range of physical CPU cores to be tuned.
# The given arguments are appended to the tuned cpu-partitioning profile.
HostIsolatedCoreList: "1,2,3,4,5,6,7,9,10,17,18,19,20,21,22,23,11,12,13,14,15,25,26,27,28,29,30,31"
# List of logical cores to be used by the ovs-dpdk processes (dpdk-lcore-mask)
HostCpusList: "'0,16,8,24'"
Note: You must assign at least one CPU (with its sibling thread) on each NUMA node to the DPDK PMD, whether or not a DPDK NIC is present on that node, to avoid failures when creating guest instances.
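To choose core lists that satisfy this requirement, check which cores and sibling threads belong to each NUMA node on the Compute node. These are standard Linux commands; the sample output assumes the dual-socket, hyper-threaded host used in the examples above:

# Show the CPU ranges that belong to each NUMA node.
$ lscpu | grep "NUMA node"
NUMA node(s):          2
NUMA node0 CPU(s):     0-7,16-23
NUMA node1 CPU(s):     8-15,24-31
# Show the sibling thread of core 1 (paired with core 17 in NeutronDpdkCoreList).
$ cat /sys/devices/system/cpu/cpu1/topology/thread_siblings_list
1,17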
Configure the remainder of the network-environment.yaml file to override the default parameters from the neutron-ovs-dpdk-agent.yaml and neutron-sriov-agent.yaml files, as needed for your OpenStack deployment.
See the Network Functions Virtualization Planning Guide for details on how to determine the best values for the OVS-DPDK parameters that you set in the network-environment.yaml file to optimize your OpenStack network for OVS-DPDK.
5.4. Configuring the Compute node for SR-IOV and DPDK interfaces
This example uses the sample compute.yaml file to support SR-IOV and DPDK interfaces.
Create the control plane Linux bond for an isolated network:
- type: linux_bond
  name: bond_api
  bonding_options: "mode=active-backup"
  use_dhcp: false
  dns_servers: {get_param: DnsServers}
  members:
  - type: interface
    name: nic3
    primary: true
  - type: interface
    name: nic4

Assign VLANs to this Linux bond:
- type: vlan
  vlan_id: {get_param: InternalApiNetworkVlanID}
  device: bond_api
  addresses:
  - ip_netmask: {get_param: InternalApiIpSubnet}

Set a bridge with a DPDK port to link to the controller:
- type: ovs_user_bridge
  name: br-link0
  ovs_extra:
  - str_replace:
      template: set port br-link0 tag=_VLAN_TAG_
      params:
        _VLAN_TAG_: {get_param: TenantNetworkVlanID}
  addresses:
  - ip_netmask: {get_param: TenantIpSubnet}
  use_dhcp: false
  members:
  - type: ovs_dpdk_port
    name: dpdk0
    mtu: 9000
    ovs_extra:
    - set interface $DEVICE mtu_request=$MTU
    members:
    - type: interface
      name: nic5
      primary: true

Note: To include multiple DPDK devices, repeat the type code section for each DPDK device that you want to add, as in the sketch that follows this note.

Note: When using OVS-DPDK, all bridges on the same Compute node should be of type ovs_user_bridge. The director may accept the configuration, but Red Hat OpenStack Platform does not support mixing ovs_bridge and ovs_user_bridge on the same node.
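For example, a second DPDK device could be added as an additional user bridge. This is a minimal sketch only; the bridge name (br-link1), port name (dpdk1), and NIC (nic6) are placeholders that depend on your topology:

- type: ovs_user_bridge
  name: br-link1
  use_dhcp: false
  members:
  - type: ovs_dpdk_port
    name: dpdk1
    mtu: 9000
    ovs_extra:
    - set interface $DEVICE mtu_request=$MTU
    members:
    - type: interface
      name: nic6
      primary: true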
Create the SR-IOV interface to the Controller:

- type: interface
  name: ens2f1
  mtu: 9000
  use_dhcp: false
  defroute: false
  nm_controlled: true
  hotplug: true
5.5. Deploying the overcloud
The following example defines the overcloud_deploy.sh Bash script that deploys both OVS-DPDK and SR-IOV:
#!/bin/bash

openstack overcloud deploy \
  --templates \
  -e /usr/share/openstack-tripleo-heat-templates/environments/neutron-ovs-dpdk.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/neutron-sriov.yaml \
  -e /home/stack/ospd-10-vxlan-vlan-dpdk-sriov-ctlplane-bonding/network-environment.yaml
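After the deployment completes, you can log in to the Compute node and confirm that OVS picked up the DPDK settings and that the expected number of virtual functions exists on the SR-IOV NIC. These optional checks assume the device name (ens2f1) and VF count (5) from the example parameters above:

# Confirm that OVS received the DPDK settings.
$ sudo ovs-vsctl get Open_vSwitch . other_config
# Confirm that the virtual functions were created.
$ cat /sys/class/net/ens2f1/device/sriov_numvfs
5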
5.6. Creating a flavor and deploying an instance with SR-IOV and DPDK interfaces
After you have configured SR-IOV and DPDK interfaces on the same Compute node, create a flavor and deploy an instance by performing the following steps:
Create a flavor:
# openstack flavor create --vcpus 6 --ram 4096 --disk 40 compute
Where:
- compute is the flavor name.
- 4096 is the memory size in MB.
- 40 is the disk size in GB (default 0 GB).
- 6 is the number of vCPUs.
Set the flavor for large pages:
# openstack flavor set compute --property hw:mem_page_size=1GB
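Because the network-environment.yaml example reserves host cores for instances with NovaVcpuPinSet, you may also want to pin the instance vCPUs to dedicated cores. This optional property is not part of the original steps:

# openstack flavor set compute --property hw:cpu_policy=dedicated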
Create the external network:
# openstack network create --external external
Create the networks for SR-IOV and DPDK:
# openstack network create --name net-dpdk
# openstack network create --name net-sriov
# openstack subnet create --subnet-range <cidr/prefix> --network net-dpdk net-dpdk-subnet
# openstack subnet create --subnet-range <cidr/prefix> --network net-sriov net-sriov-subnet
Create the SR-IOV port.
Use vnic-type direct to create an SR-IOV VF port:

# openstack port create --network net-sriov --vnic-type direct sriov_port
Use vnic-type direct-physical to create an SR-IOV PF port:

# openstack port create --network net-sriov --vnic-type direct-physical sriov_port
Create a router and attach it to the DPDK VXLAN network:
# openstack router create router1
# openstack router add subnet router1 net-dpdk-subnet
Create a floating IP address and associate it with the guest instance port:
# openstack floating ip create --floating-ip-address FLOATING-IP external

Deploy an instance:
# openstack server create --flavor compute --image rhel_7.3 --nic port-id=sriov_port --nic net-id=NET_DPDK_ID vm1
Where:
- compute is the flavor name or ID.
- rhel_7.3 is the image (name or ID) used to create the instance.
- sriov_port is the name of the port created in the previous step.
- NET_DPDK_ID is the DPDK network ID.
- vm1 is the name of the instance.
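With the instance running, you can associate the floating IP address created earlier with it; FLOATING-IP is the same placeholder used above:

# openstack server add floating ip vm1 FLOATING-IP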
You have now deployed an instance that uses an SR-IOV interface and a DPDK interface on the same Compute node.
For instances with more interfaces, you can use cloud-init. See Table 3.1 in Create an Instance for details.
