Chapter 6. Planning Your OVS-DPDK Deployment

To optimize your OVS-DPDK deployment for NFV, you should understand how OVS-DPDK uses the Compute node hardware (CPU, NUMA nodes, memory, NICs) and the considerations for determining the individual OVS-DPDK parameters based on your Compute node.

See NFV Performance Considerations for a high-level introduction to CPUs and NUMA topology.

6.1. How OVS-DPDK Uses CPU Partitioning and NUMA Topology

OVS-DPDK partitions the hardware resources for host, guests, and OVS-DPDK itself. The OVS-DPDK Poll Mode Drivers (PMDs) run DPDK active loops, which require dedicated cores. This means that a list of CPUs and huge pages must be dedicated to OVS-DPDK.

A sample partitioning includes 16 cores per NUMA node on dual-socket Compute nodes. Additional NICs are required because NICs cannot be shared between the host and OVS-DPDK.

Note

DPDK PMD threads must be reserved on both NUMA nodes even if a NUMA node does not have an associated DPDK NIC.

OVS-DPDK performance also depends on reserving a block of memory local to the NUMA node. Use NICs associated with the same NUMA node that you use for memory and CPU pinning. Also ensure both interfaces in a bond are from NICs on the same NUMA node.

6.2. Understanding OVS-DPDK Parameters

This section describes how OVS-DPDK uses parameters within the director network_environment.yaml HEAT templates to configure the CPU and memory for optimum performance. Use this information to evaluate the hardware support on your Compute nodes and how best to partition that hardware to optimize your OVS-DPDK deployment.
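
For orientation, the following minimal sketch shows how the CPU and memory parameters described in this section might be grouped together, assuming they are set in the parameter_defaults section of the director environment file. The values shown are illustrative placeholders taken from the two NUMA node example in Section 6.3; derive the actual values from your own hardware as described in the rest of this section.

parameter_defaults:
  # CPU partitioning parameters (Section 6.2.1)
  NeutronDpdkCoreList: "'2,3,10,11'"
  NovaVcpuPinSet: "4,5,6,7,12,13,14,15"
  HostIsolatedCoreList: "2,3,4,5,6,7,10,11,12,13,14,15"
  HostCpusList: "'0,1,8,9'"
  # Memory parameters (Section 6.2.2)
  NovaReservedHostMemory: 4096
  NeutronDpdkSocketMemory: "1024,1024"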

Note

Assign sibling threads together when allocating logical CPUs to a given task.

See Discovering Your NUMA Node Topology to determine the CPU and NUMA nodes on your Compute nodes. You use this information to map CPU and other parameters to support the host, guest instance, and OVS-DPDK process needs.

6.2.1. CPU Parameters

OVS-DPDK uses the following CPU partitioning parameters:

NeutronDpdkCoreList

Provides the CPU cores that are used for the DPDK poll mode drivers (PMD). Choose CPU cores that are associated with the local NUMA nodes of the DPDK interfaces. NeutronDpdkCoreList is used for the pmd-cpu-mask value in Open vSwitch.

  • Pair the sibling threads together.
  • Exclude all cores listed in the HostCpusList parameter.
  • Avoid allocating the logical CPUs (both sibling threads) of the first physical core on either NUMA node, because these should be used for the HostCpusList parameter.
  • Performance depends on the number of physical cores allocated to this PMD core list. Allocate the required number of cores on the NUMA node that is associated with the DPDK NIC.
  • For NUMA nodes with a DPDK NIC:

    • Determine the number of physical cores required based on the performance requirement, and include all the sibling threads (logical CPUs) for each physical core.
  • For NUMA nodes without a DPDK NIC:

    • Allocate the sibling threads (logical CPUs) of one physical core, excluding the first physical core of the NUMA node. A minimal DPDK poll mode driver is required on NUMA nodes without DPDK NICs to avoid failures when creating guest instances.
Note

DPDK PMD threads must be reserved on both NUMA nodes even if a NUMA node does not have an associated DPDK NIC.

NovaVcpuPinSet

Sets cores for CPU pinning. The Compute node uses these cores for guest instances. NovaVcpuPinSet is used as the vcpu_pin_set value in the nova.conf file.

  • Exclude all cores from the NeutronDpdkCoreList and the HostCpusList.
  • Include all remaining cores.
  • Pair the sibling threads together.
HostIsolatedCoreList

A set of CPU cores isolated from the host processes. This parameter is used as the isolated_cores value in the cpu-partitioning-variable.conf file for the tuned-profiles-cpu-partitioning component.

  • Match the list of cores in NeutronDpdkCoreList and NovaVcpuPinSet.
  • Pair the sibling threads together.
HostCpusList

Provides CPU cores for non-datapath OVS-DPDK processes, such as handler and revalidator threads. This parameter has no impact on overall data path performance on multi-NUMA node hardware. This parameter is used for the dpdk-lcore-mask value in Open vSwitch and the cores are shared with the host OS.

  • Allocate the first physical core (and sibling thread) from each NUMA node (even if the NUMA node has no associated DPDK NIC).
  • These cores must be mutually exclusive from the list of cores in NeutronDpdkCoreList and NovaVcpuPinSet.

6.2.2. Memory Parameters

OVS-DPDK uses the following memory parameters:

NovaReservedHostMemory

Reserves memory in MB for tasks on the host. This value is used by the Compute node as the reserved_host_memory_mb value in nova.conf.

  • Use the static recommended value of 4096 MB.
NeutronDpdkSocketMemory

Specifies the amount of memory in MB to pre-allocate from the hugepage pool, per NUMA node, for DPDK NICs. This value is used by Open vSwitch as the other_config:dpdk-socket-mem value.

  • Provide as a comma-separated list. The NeutronDpdkSocketMemory value is calculated from the MTU value of each DPDK NIC on the NUMA node.
  • Round up each MTU value to the nearest multiple of 1024 bytes (ROUNDUP_PER_MTU).
  • For a NUMA node without a DPDK NIC, use the static recommendation of 1024 MB (1 GB).
  • The following equation approximates the value for NeutronDpdkSocketMemory:

    • MEMORY_REQD_PER_MTU = (ROUNDUP_PER_MTU + 800) * (4096 * 64) Bytes

      • 800 is the overhead value
      • 4096 * 64 is the number of packets in the mempool
  • Add the MEMORY_REQD_PER_MTU for each of the MTU values set on the NUMA node, add another 512 MB as a buffer, and round the result up to a multiple of 1024 MB.

Sample Calculation - MTU 2000 and MTU 9000

DPDK NICs dpdk0 and dpdk1 are on the same NUMA node 0 and configured with MTUs 9000 and 2000 respectively. The sample calculation to derive the memory required is as follows:

  1. Round up the MTU values to the nearest multiple of 1024 bytes.

    The MTU value of 9000 becomes 9216 bytes.
    The MTU value of 2000 becomes 2048 bytes.
  2. Calculate the required memory for each MTU value based on these rounded byte values.

    Memory required for 9000 MTU = (9216 + 800) * (4096*64) = 2625634304
    Memory required for 2000 MTU = (2048 + 800) * (4096*64) = 746586112
  3. Calculate the combined total memory required, in bytes.

    2625634304 + 746586112 + 536870912 = 3909091328 bytes.

    This calculation represents (Memory required for MTU of 9000) + (Memory required for MTU of 2000) + (512 MB buffer).

  4. Convert the total memory required into MB.

    3909091328 / (1024*1024) = 3728 MB.
  5. Round this value up to the nearest multiple of 1024 MB.

    3728 MB rounds up to 4096 MB.
  6. Use this value to set NeutronDpdkSocketMemory.

        NeutronDpdkSocketMemory: "4096,1024"

Sample Calculation - MTU 2000

DPDK NICs dpdk0 and dpdk1 are on the same NUMA node 0 and configured with MTUs 2000 and 2000 respectively. The sample calculation to derive the memory required is as follows:

  1. Round up the MTU value to the nearest multiple of 1024 bytes.

    The MTU value of 2000 becomes 2048 bytes.
  2. Calculate the required memory for each MTU value based on these rounded byte values.

    Memory required for 2000 MTU = (2048 + 800) * (4096*64) = 746586112
  3. Calculate the combined total memory required, in bytes.

    746586112 + 536870912 = 1283457024 bytes.

    This calculation represents (Memory required for MTU of 2000) + (512 MB buffer).

  4. Convert the total memory required into MB.

    1283457024 / (1024*1024) = 1224 MB.
  5. Round this value up to the nearest multiple of 1024 MB.

    1224 MB rounds up to 2048 MB.
  6. Use this value to set NeutronDpdkSocketMemory.

        NeutronDpdkSocketMemory: "2048,1024"
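
As a further illustration of how the per-NUMA-node values combine, consider a hypothetical variation in which dpdk0 (MTU 9000) is on NUMA node 0 and dpdk1 (MTU 2000) is on NUMA node 1. Applying the same calculation to each node separately gives:

# NUMA node 0: (9216 + 800) * (4096 * 64) + 512 MB buffer = 3016 MB, rounded up to 4096 MB
# NUMA node 1: (2048 + 800) * (4096 * 64) + 512 MB buffer = 1224 MB, rounded up to 2048 MB
NeutronDpdkSocketMemory: "4096,2048"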

6.2.3. Networking Parameters

NeutronDpdkDriverType
Sets the driver type used by DPDK. Use the default of vfio-pci.
NeutronDatapathType
Datapath type for OVS bridges. DPDK uses the default value of netdev.
NeutronVhostuserSocketDir
Sets the vhost-user socket directory for OVS. Use /var/run/openvswitch for vhost server mode.
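
Collected together, and assuming the defaults described above, these networking settings might appear in the environment file as:

NeutronDpdkDriverType: "vfio-pci"
NeutronDatapathType: "netdev"
NeutronVhostuserSocketDir: "/var/run/openvswitch"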

6.2.4. Other Parameters

NovaSchedulerDefaultFilters
Provides an ordered list of filters that the Compute scheduler uses to find a matching Compute node for a requested guest instance.
ComputeKernelArgs

Provides multiple kernel arguments to /etc/default/grub for the Compute node at boot time. Add the following based on your configuration; see the illustrative example after this list:

  • hugepagesz: Sets the size of the hugepages on a CPU. This value can vary depending on the CPU hardware. Set to 1G for OVS-DPDK deployments (default_hugepagesz=1GB hugepagesz=1G). Check for the pdpe1gb CPU flag to ensure your CPU supports 1G.

    lshw -class processor | grep pdpe1gb
  • hugepages count: Sets the number of hugepages available. This value depends on the amount of host memory available. Use most of your available memory (excluding NovaReservedHostMemory). You must also configure the hugepages count value within the OpenStack flavor associated with your Compute nodes.
  • iommu: For Intel CPUs, add "intel_iommu=on iommu=pt".
  • isolcpus: Sets the CPU cores to be tuned. This value matches HostIsolatedCoreList.
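
For illustration only, a combined ComputeKernelArgs value might look like the following. The hugepages count (64 here) and the isolcpus list are assumptions; replace them with values derived from your available host memory and your HostIsolatedCoreList.

ComputeKernelArgs: "default_hugepagesz=1GB hugepagesz=1G hugepages=64 iommu=pt intel_iommu=on isolcpus=2,3,4,5,6,7,10,11,12,13,14,15"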

6.3. Two NUMA Node Example OVS-DPDK Deployment

This sample Compute node includes two NUMA nodes as follows:

  • NUMA 0 has cores 0-7. The sibling thread pairs are (0,1), (2,3), (4,5), and (6,7).
  • NUMA 1 has cores 8-15. The sibling thread pairs are (8,9), (10,11), (12,13), and (14,15).
  • Each NUMA node connects to a physical NIC (NIC1 on NUMA 0 and NIC2 on NUMA 1).
Two NUMA node example OVS-DPDK deployment
Note

Reserve the first physical core (both sibling threads) on each NUMA node (0,1 and 8,9) for non-datapath DPDK processes (HostCpusList).

This example also assumes a 1500 MTU configuration, so the NeutronDpdkSocketMemory value is the same for all use cases:

NeutronDpdkSocketMemory: "1024,1024"

NIC 1 for DPDK, with one physical core for PMD

In this use case, we allocate one physical core on NUMA 0 for PMD. We must also allocate one physical core on NUMA 1, even though DPDK is not enabled on the NIC associated with that NUMA node. The remaining cores (not reserved for HostCpusList) are allocated for guest instances. The resulting parameter settings are:

NeutronDpdkCoreList: "'2,3,10,11'"
NovaVcpuPinSet: "4,5,6,7,12,13,14,15"
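
For completeness, a sketch of the remaining CPU parameters implied by this use case, assuming the first physical core on each NUMA node (0,1 and 8,9) is reserved for HostCpusList as noted above and HostIsolatedCoreList matches the combined NeutronDpdkCoreList and NovaVcpuPinSet values, would be:

HostCpusList: "'0,1,8,9'"
HostIsolatedCoreList: "2,3,4,5,6,7,10,11,12,13,14,15"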

NIC 1 for DPDK, with two physical cores for PMD

In this use case, we allocate two physical cores on NUMA 0 for PMD. We must also allocate one physical core on NUMA 1, even though DPDK is not enabled on the NIC associated with that NUMA node. The remaining cores (not reserved for HostCpusList) are allocated for guest instances. The resulting parameter settings are:

NeutronDpdkCoreList: "'2,3,4,5,10,11'"
NovaVcpuPinSet: "6,7,12,13,14,15"

NIC 2 for DPDK, with one physical core for PMD

In this use case, we allocate one physical core on NUMA 1 for PMD. We must also allocate one physical core on NUMA 0, even though DPDK is not enabled on the NIC associated with that NUMA node. The remaining cores (not reserved for HostCpusList) are allocated for guest instances. The resulting parameter settings are:

NeutronDpdkCoreList: "'2,3,10,11'"
NovaVcpuPinSet: "4,5,6,7,12,13,14,15"

NIC 2 for DPDK, with two physical cores for PMD

In this use case, we allocate two physical cores on NUMA 1 for PMD. We must also allocate one physical core on NUMA 0, even though DPDK is not enabled on the NIC associated with that NUMA node. The remaining cores (not reserved for HostCpusList) are allocated for guest instances. The resulting parameter settings are:

NeutronDpdkCoreList: "'2,3,10,11,12,13'"
NovaVcpuPinSet: "4,5,6,7,14,15"

NIC 1 and NIC 2 for DPDK, with two physical cores for PMD

In this use case, we allocate two physical cores on each NUMA node for PMD. The remaining cores (not reserved for HostCpusList) are allocated for guest instances. The resulting parameter settings are:

NeutronDpdkCoreList: "'2,3,4,5,10,11,12,13'"
NovaVcpuPinSet: "6,7,14,15"
Note

Red Hat recommends using 1 physical core per NUMA node for the DPDK PMD.

6.4. Topology of an NFV OVS-DPDK Deployment

This sample OVS-DPDK deployment consists of two VNFs, each with two interfaces: the management interface, represented by mgt, and the data plane interface. In the OVS-DPDK deployment, the VNFs run with built-in DPDK that supports the physical interface, and OVS-DPDK handles bonding at the vSwitch level. In an OVS-DPDK deployment, do not mix kernel and OVS-DPDK NICs, because this can lead to performance degradation. To separate the management (mgt) network, which is connected to the Base provider network for the virtual machine, you need additional NICs. The Compute node has two regular NICs for OpenStack API management; these can be reused by the Ceph API but cannot be shared with any OpenStack tenant.

NFV OVS-DPDK deployment

NFV OVS-DPDK Topology

The following image shows the topology for OVS-DPDK for the NFV use case. It consists of Compute and Controller nodes with 1 or 10 Gbps NICs, and the Director node.

NFV OVS-DPDK Topology