Chapter 7. Configuring Compute nodes for performance

You can configure the scheduling and placement of instances for optimal performance by creating customized flavors to target specialized workloads, including NFV and High Performance Computing (HPC).

Use the following features to tune your instances for optimal performance:

  • CPU pinning: Pin virtual CPUs to physical CPUs.
  • Emulator threads: Pin emulator threads associated with the instance to physical CPUs.
  • Huge pages: Tune instance memory allocation policies both for normal memory (4k pages) and huge pages (2 MB or 1 GB pages).

Note

Configuring any of these features creates an implicit NUMA topology on the instance if there is no NUMA topology already present.

7.1. Configuring CPU pinning on the Compute node

You can configure instances to run on dedicated host CPUs. Enabling CPU pinning implicitly configures a guest NUMA topology. Each NUMA node of this NUMA topology maps to a separate host NUMA node. For more information about NUMA, see CPUs and NUMA nodes in the Network Functions Virtualization Product Guide.

Configure CPU pinning on your Compute node based on the NUMA topology of your host system. For efficiency, reserve some CPU cores across all the NUMA nodes for host processes. Assign the remaining CPU cores to your instances.

The following example illustrates eight CPU cores spread across two NUMA nodes.

Table 7.1. Example of NUMA Topology

NUMA Node 0           NUMA Node 1
Core 0     Core 1     Core 2     Core 3
Core 4     Core 5     Core 6     Core 7

You can schedule dedicated (pinned) and shared (unpinned) instances on the same Compute node. The following procedure reserves cores 0 and 4 for host processes, cores 1, 3, 5 and 7 for instances that require CPU pinning, and cores 2 and 6 for floating instances that do not require CPU pinning.

Note

If the host supports simultaneous multithreading (SMT), group thread siblings together in either the dedicated or the shared set. Thread siblings share some common hardware, which means that a process running on one thread sibling can impact the performance of the other thread sibling.

For example, the host identifies four CPUs in a dual core CPU with SMT: 0, 1, 2, and 3. Of these four, there are two pairs of thread siblings:

  • Thread sibling 1: CPUs 0 and 2
  • Thread sibling 2: CPUs 1 and 3

In this scenario, you should not assign CPUs 0 and 1 as dedicated and 2 and 3 as shared. Instead, you should assign 0 and 2 as dedicated and 1 and 3 as shared.
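
To see which CPUs are thread siblings on a particular host, you can read the topology files that Linux exposes under sysfs. The following is a minimal sketch rather than part of the product procedure; the function name list_thread_siblings and its optional directory argument are illustrative assumptions.

```shell
#!/usr/bin/env bash
# Sketch: print each distinct group of SMT thread siblings, one group per line.
# The optional argument (a hypothetical convenience) points the function at a
# different directory; by default it reads the live sysfs tree.
list_thread_siblings() {
    local cpu_dir="${1:-/sys/devices/system/cpu}"
    # Each cpuN/topology/thread_siblings_list names the CPUs that share a core
    # with cpuN, so siblings report identical lines; sort -u collapses them.
    cat "$cpu_dir"/cpu[0-9]*/topology/thread_siblings_list 2>/dev/null | sort -u
}

list_thread_siblings
```

On the dual core host described in this note, the output would be expected to contain one line for CPUs 0 and 2 and one line for CPUs 1 and 3. Keep every CPU that appears on the same line in the same set.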

Prerequisite

  • You know the NUMA topology of your Compute node. For more information, see Discovering your NUMA node topology in the Network Functions Virtualization Planning and Configuration Guide.
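
If you only need a quick look at which CPUs belong to which NUMA node, the kernel publishes that mapping under sysfs. The sketch below is an illustration under that assumption and does not replace the referenced guide; the function name list_numa_cpus and its optional directory argument are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: print the CPUs that belong to each NUMA node, one node per line.
# The optional argument (hypothetical) overrides the sysfs directory to read.
list_numa_cpus() {
    local node_dir="${1:-/sys/devices/system/node}"
    local node
    for node in "$node_dir"/node[0-9]*; do
        # Skip the unexpanded glob on hosts that expose no NUMA nodes here.
        [ -r "$node/cpulist" ] || continue
        printf '%s: %s\n' "${node##*/}" "$(cat "$node/cpulist")"
    done
}

list_numa_cpus
```

The lscpu command reports the same mapping in its "NUMA node" lines.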

Procedure

  1. Reserve physical CPU cores for the dedicated instances by setting the NovaComputeCpuDedicatedSet parameter in the Compute environment file for each Compute node:

    NovaComputeCpuDedicatedSet: 1,3,5,7
  2. Reserve physical CPU cores for the shared instances by setting the NovaComputeCpuSharedSet parameter in the Compute environment file for each Compute node:

    NovaComputeCpuSharedSet: 2,6
  3. Set the NovaReservedHostMemory option in the same files to the amount of RAM to reserve for host processes. For example, if you want to reserve 512 MB, use:

    NovaReservedHostMemory: 512
  4. To ensure that host processes do not run on the CPU cores reserved for instances, set the IsolCpusList parameter in each Compute environment file to the CPU cores that you reserved for instances. Specify the value as a list, or ranges, of CPU indices separated by whitespace:

    IsolCpusList: 1 2 3 5 6 7
  5. To filter out hosts based on their NUMA topology, add NUMATopologyFilter to the NovaSchedulerDefaultFilters parameter in each Compute environment file.
  6. To apply this configuration, add the environment file(s) to your deployment command and deploy the overcloud:

    (undercloud) $ openstack overcloud deploy --templates \
      -e [your environment files] \
      -e /home/stack/templates/<compute_environment_file>.yaml
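
The settings from steps 1 to 4 can be collected in a single environment file. The following sketch writes such a file for the eight-core example. It assumes the parameters sit under parameter_defaults, as in the huge pages procedure later in this chapter; the ENV_FILE variable and the file name compute_cpu_pinning.yaml are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: write steps 1 to 4 of the procedure into one environment file.
# ENV_FILE and its default name are hypothetical; use your own file.
ENV_FILE="${ENV_FILE:-./compute_cpu_pinning.yaml}"

cat > "$ENV_FILE" <<'EOF'
parameter_defaults:
  # Cores for instances that require CPU pinning.
  NovaComputeCpuDedicatedSet: "1,3,5,7"
  # Cores for floating instances.
  NovaComputeCpuSharedSet: "2,6"
  # RAM, in MB, reserved for host processes.
  NovaReservedHostMemory: 512
  # Keep host processes off every core reserved for instances.
  IsolCpusList: "1 2 3 5 6 7"
  # Step 5 is not shown: append NUMATopologyFilter to the
  # NovaSchedulerDefaultFilters list that your deployment already uses.
EOF

echo "Wrote $ENV_FILE"
```

The CPU lists are quoted here so that YAML treats them as strings. Cores 0 and 4 appear in none of the lists, which leaves them for host processes.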

7.1.1. Upgrading CPU pinning configuration

In Red Hat OpenStack Platform (RHOSP) 16 and later, you do not need host aggregates to ensure that dedicated (pinned) and shared (unpinned) instance types run on separate hosts. The [DEFAULT] reserved_host_cpus configuration option is also no longer necessary and can be unset.

To upgrade your CPU pinning configuration from earlier versions of RHOSP:

  • Migrate the value of NovaVcpuPinSet to NovaComputeCpuDedicatedSet for hosts that were previously used for pinned instances.
  • Migrate the value of NovaVcpuPinSet to NovaComputeCpuSharedSet for hosts that were previously used for unpinned instances.
  • If there is no value set for NovaVcpuPinSet, then all host cores should be assigned to either NovaComputeCpuDedicatedSet or NovaComputeCpuSharedSet, depending on the type of instance running there.

After the upgrade is complete, you can set both options on the same host. Before you do this, migrate all instances off the host, because the Compute service cannot start when cores for an unpinned instance are not listed in NovaComputeCpuSharedSet, or when cores for a pinned instance are not listed in NovaComputeCpuDedicatedSet.
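
For the first two cases in the list above, the migration amounts to renaming one key in the environment file while keeping its value. The following is a minimal sketch of that rename, assuming the parameter appears on a line of its own; the function name migrate_vcpu_pin_set and its arguments are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: rename NovaVcpuPinSet in an environment file to the parameter that
# matches how the host was used. Function name and arguments are hypothetical.
#   $1: path to the environment file
#   $2: "dedicated" (host ran pinned instances) or "shared" (unpinned instances)
migrate_vcpu_pin_set() {
    local env_file="$1" kind="$2" new_key tmp_file
    case "$kind" in
        dedicated) new_key="NovaComputeCpuDedicatedSet" ;;
        shared)    new_key="NovaComputeCpuSharedSet" ;;
        *) echo "usage: migrate_vcpu_pin_set <file> dedicated|shared" >&2
           return 1 ;;
    esac
    tmp_file="$(mktemp)" || return 1
    # Replace only the key; the CPU list value is carried over unchanged.
    sed "s/^\([[:space:]]*\)NovaVcpuPinSet:/\1${new_key}:/" "$env_file" > "$tmp_file" &&
        cat "$tmp_file" > "$env_file"
    rm -f "$tmp_file"
}
```

Review the resulting file before you redeploy. The third case, where NovaVcpuPinSet was never set, still requires you to list the host cores by hand.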

7.1.2. Launching an instance with CPU pinning

You can launch an instance that uses CPU pinning by specifying a flavor with a dedicated CPU policy.

Prerequisites

Procedure

  1. Create a flavor for instances that require CPU pinning:

    (overcloud) $ openstack flavor create --ram <size-mb> --disk <size-gb> --vcpus <no_reserved_vcpus> pinned_cpus
  2. To request pinned CPUs, set the hw:cpu_policy property of the flavor to dedicated:

    (overcloud) $ openstack flavor set --property hw:cpu_policy=dedicated pinned_cpus
  3. To place each vCPU on thread siblings, set the hw:cpu_thread_policy property of the flavor to require:

    (overcloud) $ openstack flavor set --property hw:cpu_thread_policy=require pinned_cpus
    Note
    • If the host does not have an SMT architecture or enough CPU cores with available thread siblings, scheduling fails. To prevent this, set hw:cpu_thread_policy to prefer instead of require. The (default) prefer policy ensures that thread siblings are used when available.
    • If you use hw:cpu_thread_policy=isolate, you must have SMT disabled or use a platform that does not support SMT.
  4. Create an instance using the new flavor:

    (overcloud) $ openstack server create --flavor pinned_cpus --image <image> pinned_cpu_instance
  5. To verify correct placement of the new instance, run the following command and check for OS-EXT-SRV-ATTR:hypervisor_hostname in the output:

    (overcloud) $ openstack server show pinned_cpu_instance
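
Steps 1 to 3 can be wrapped in one shell function so that the flavor is created and tagged in a single call. This is a sketch only: the function name create_pinned_flavor, its argument order, and the sizes in the usage line are illustrative assumptions, not recommended values.

```shell
#!/usr/bin/env bash
# Sketch: create a flavor and request pinned CPUs for it (steps 1 to 3).
# Function name, argument order, and example sizes are hypothetical.
#   $1: flavor name   $2: RAM in MB   $3: disk in GB   $4: number of vCPUs
#   $5: thread policy (optional, defaults to "prefer")
create_pinned_flavor() {
    local name="$1" ram_mb="$2" disk_gb="$3" vcpus="$4"
    local thread_policy="${5:-prefer}"
    openstack flavor create --ram "$ram_mb" --disk "$disk_gb" \
        --vcpus "$vcpus" "$name" &&
    openstack flavor set \
        --property hw:cpu_policy=dedicated \
        --property hw:cpu_thread_policy="$thread_policy" \
        "$name"
}

# Usage, with the overcloud credentials sourced:
#   create_pinned_flavor pinned_cpus 4096 20 4 require
```

Defaulting the thread policy to prefer follows the note in step 3: the request still succeeds on hosts without available thread siblings.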

7.1.3. Launching a floating instance

You can launch an instance that is placed on a floating CPU by specifying a flavor with a shared CPU policy.

Prerequisites

Procedure

  1. Create a flavor for instances that do not require CPU pinning:

    (overcloud) $ openstack flavor create --ram <size-mb> --disk <size-gb> --vcpus <no_reserved_vcpus> floating_cpus
  2. To request floating CPUs, set the hw:cpu_policy property of the flavor to shared:

    (overcloud) $ openstack flavor set --property hw:cpu_policy=shared floating_cpus
  3. Create an instance using the new flavor:

    (overcloud) $ openstack server create --flavor floating_cpus --image <image> floating_cpu_instance
  4. To verify correct placement of the new instance, run the following command and check for OS-EXT-SRV-ATTR:hypervisor_hostname in the output:

    (overcloud) $ openstack server show floating_cpu_instance
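
The verification step prints the full instance record. To print only the host name, you can restrict the output with the CLI's column and format options. The sketch below assumes your credentials are allowed to see the OS-EXT-SRV-ATTR:hypervisor_hostname field; the function name instance_host is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: print only the Compute host that an instance landed on.
# The function name is hypothetical; -c selects one column and -f value
# prints the bare value without table formatting.
instance_host() {
    openstack server show "$1" \
        -c OS-EXT-SRV-ATTR:hypervisor_hostname -f value
}

# Usage, with the overcloud credentials sourced:
#   instance_host floating_cpu_instance
```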

7.2. Configuring huge pages on the Compute node

Configure the Compute node to enable instances to request huge pages.

Procedure

  1. Configure the amount of huge page memory to reserve on each NUMA node for processes that are not instances:

    parameter_defaults:
      NovaReservedHugePages: ["node:0,size:2048,count:64","node:1,size:1GB,count:1"]

    Where:

    Attribute   Description
    size        The size of the allocated huge page. Valid values: 2048 (for 2 MB) and 1GB.
    count       The number of huge pages used by OVS per NUMA node. For example, for 4096 of socket memory used by Open vSwitch, set this to 2.

  2. (Optional) To allow instances to allocate 1GB huge pages, configure the CPU feature flags, cpu_model_extra_flags, to include "pdpe1gb":

    parameter_defaults:
       ComputeExtraConfig:
         nova::compute::libvirt::libvirt_cpu_mode: 'custom'
         nova::compute::libvirt::libvirt_cpu_model: 'Haswell-noTSX'
         nova::compute::libvirt::libvirt_cpu_model_extra_flags: 'vmx, pdpe1gb'
    Note
    • CPU feature flags do not need to be configured to allow instances to only request 2 MB huge pages.
    • You can only allocate 1 GB huge pages to an instance if the host supports 1 GB huge page allocation.
    • You only need to set cpu_model_extra_flags to pdpe1gb when cpu_mode is set to host-model or custom.
    • If the host supports pdpe1gb, and host-passthrough is used as the cpu_mode, then you do not need to set pdpe1gb as a cpu_model_extra_flags. The pdpe1gb flag is only included in the Opteron_G4 and Opteron_G5 CPU models; it is not included in any of the Intel CPU models supported by QEMU.
    • To mitigate CPU hardware issues, such as Microarchitectural Data Sampling (MDS), you might need to configure other CPU flags. For more information, see RHOS Mitigation for MDS ("Microarchitectural Data Sampling") Security Flaws.
  3. To avoid loss of performance after applying Meltdown protection, configure the CPU feature flags, cpu_model_extra_flags, to include "+pcid":

    parameter_defaults:
       ComputeExtraConfig:
         nova::compute::libvirt::libvirt_cpu_mode: 'custom'
         nova::compute::libvirt::libvirt_cpu_model: 'Haswell-noTSX'
         nova::compute::libvirt::libvirt_cpu_model_extra_flags: 'vmx, pdpe1gb, +pcid'
  4. Add NUMATopologyFilter to the NovaSchedulerDefaultFilters parameter in each Compute environment file, if not already present.
  5. Apply this huge page configuration by adding the environment file(s) to your deployment command and deploying the overcloud:

    (undercloud) $ openstack overcloud deploy --templates \
      -e [your environment files] \
      -e /home/stack/templates/<compute_environment_file>.yaml
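
Before you enable 1 GB pages, you can check on the Compute host whether its CPU advertises the pdpe1gb flag. The following sketch reads the flag from /proc/cpuinfo and then prints the kernel's host-wide huge page counters. The function name host_has_pdpe1gb and its optional file argument are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: report whether the host CPU advertises the pdpe1gb flag.
# The optional argument (hypothetical) names an alternative cpuinfo file.
host_has_pdpe1gb() {
    grep -qw pdpe1gb "${1:-/proc/cpuinfo}" 2>/dev/null
}

if host_has_pdpe1gb; then
    echo "pdpe1gb present: the host CPU supports 1 GB pages"
else
    echo "pdpe1gb absent: 1 GB pages are not available on this CPU"
fi

# Huge page counters that the kernel currently reports for the whole host.
grep -E '^(HugePages_|Hugepagesize)' /proc/meminfo ||
    echo "no huge page counters found"
```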

7.2.1. Allocating huge pages to instances

Create a flavor with the hw:mem_page_size extra specification key to specify that the instance should use huge pages.

Prerequisites

Procedure

  1. Create a flavor for instances that require huge pages:

    $ openstack flavor create --ram <size-mb> --disk <size-gb> --vcpus <no_reserved_vcpus> huge_pages
  2. To request huge pages, set the hw:mem_page_size extra specification of the flavor:

    $ openstack flavor set huge_pages --property hw:mem_page_size=1GB

    Valid values for hw:mem_page_size:

    • large - Selects the largest page size supported on the host, which may be 2 MB or 1 GB on x86_64 systems.
    • small - (Default) Selects the smallest page size supported on the host. On x86_64 systems this is 4 kB (normal pages).
    • any - Selects the largest available huge page size, as determined by the libvirt driver.
    • <pagesize> - (String) Set an explicit page size if the workload has specific requirements. Use an integer value for the page size in KB, or any standard suffix. For example: 4KB, 2MB, 2048, 1GB.
  3. Create an instance using the new flavor:

    $ openstack server create --flavor huge_pages --image <image> huge_pages_instance

Validation

The scheduler identifies a host with enough free huge pages of the required size to back the memory of the instance. If the scheduler is unable to find a host and NUMA node with enough pages, then the request will fail with a NoValidHost error.
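
To estimate whether a host can satisfy a request, you can compare the number of pages that the flavor needs with the free pages that each NUMA node reports. The sketch below shows only the arithmetic and the sysfs counter to compare it with; the scheduler's own accounting is more involved. The function names pages_needed and free_pages_on_node are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: number of huge pages needed to back a flavor's RAM.
#   $1: flavor RAM in MB   $2: page size in KB (2048 for 2 MB, 1048576 for 1 GB)
pages_needed() {
    local ram_kb=$(( $1 * 1024 )) page_kb="$2"
    # Round up so that a partial page still counts as one page.
    echo $(( (ram_kb + page_kb - 1) / page_kb ))
}

# Free pages of one size on one NUMA node, as the kernel reports them.
#   $1: NUMA node number   $2: page size in KB
free_pages_on_node() {
    cat "/sys/devices/system/node/node${1}/hugepages/hugepages-${2}kB/free_hugepages"
}

pages_needed 4096 1048576   # 4096 MB flavor backed by 1 GB pages
pages_needed 4096 2048      # 4096 MB flavor backed by 2 MB pages
```

A flavor with 4096 MB of RAM needs 4 pages of 1 GB or 2048 pages of 2 MB. Run free_pages_on_node on the Compute host, for example free_pages_on_node 0 1048576, to see whether a node has that many pages free.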