Chapter 3. Configuring Compute nodes for performance

As a cloud administrator, you can configure the scheduling and placement of instances for optimal performance by creating customized flavors to target specialized workloads, including NFV and High Performance Computing (HPC).

Use the following features to tune your instances for optimal performance:

  • CPU pinning: Pin virtual CPUs to physical CPUs.
  • Emulator threads: Pin emulator threads associated with the instance to physical CPUs.
  • Huge pages: Tune instance memory allocation policies both for normal memory (4k pages) and huge pages (2 MB or 1 GB pages).
Note

Configuring any of these features creates an implicit NUMA topology on the instance if there is no NUMA topology already present.

3.1. Configuring CPU pinning on Compute nodes

You can configure each instance CPU process to run on a dedicated host CPU by enabling CPU pinning on the Compute nodes. When an instance uses CPU pinning, each instance vCPU process is allocated its own host pCPU that no other instance vCPU process can use. Instances that run on Compute nodes with CPU pinning enabled have a NUMA topology. Each NUMA node of the instance NUMA topology maps to a NUMA node on the host Compute node.

You can configure the Compute scheduler to schedule instances with dedicated (pinned) CPUs and instances with shared (floating) CPUs on the same Compute node. To configure CPU pinning on Compute nodes that have a NUMA topology, you must complete the following:

  1. Designate Compute nodes for CPU pinning.
  2. Configure the Compute nodes to reserve host cores for pinned instance vCPU processes, floating instance vCPU processes, and host processes.
  3. Deploy the overcloud.
  4. Create a flavor for launching instances that require CPU pinning.
  5. Create a flavor for launching instances that use shared, or floating, CPUs.

3.1.1. Prerequisites

  • You know the NUMA topology of your Compute node.

3.1.2. Designating Compute nodes for CPU pinning

To designate Compute nodes for instances with pinned CPUs, you must create a new role file to configure the CPU pinning role, and configure a new overcloud flavor and CPU pinning resource class to use to tag the Compute nodes for CPU pinning.

Procedure

  1. Log in to the undercloud as the stack user.
  2. Source the stackrc file:

    [stack@director ~]$ source ~/stackrc
  3. Generate a new roles data file named roles_data_cpu_pinning.yaml that includes the Controller, Compute, and ComputeCPUPinning roles:

    (undercloud)$ openstack overcloud roles \
     generate -o /home/stack/templates/roles_data_cpu_pinning.yaml \
     Compute:ComputeCPUPinning Compute Controller
  4. Open roles_data_cpu_pinning.yaml and edit or add the following parameters and sections:

    Section/ParameterCurrent valueNew value

    Role comment

    Role: Compute

    Role: ComputeCPUPinning

    Role name

    name: Compute

    name: ComputeCPUPinning

    description

    Basic Compute Node role

    CPU Pinning Compute Node role

    HostnameFormatDefault

    %stackname%-novacompute-%index%

    %stackname%-novacomputepinning-%index%

    deprecated_nic_config_name

    compute.yaml

    compute-cpu-pinning.yaml

  5. Register the CPU pinning Compute nodes for the overcloud by adding them to your node definition template, node.json or node.yaml. For more information, see Registering nodes for the overcloud in the Director Installation and Usage guide.
  6. Inspect the node hardware:

    (undercloud)$ openstack overcloud node introspect \
     --all-manageable --provide

    For more information, see the relevant section in the Director Installation and Usage guide:

  7. Create the compute-cpu-pinning overcloud flavor for CPU pinning Compute nodes:

    (undercloud)$ openstack flavor create --id auto \
     --ram <ram_size_mb> --disk <disk_size_gb> \
     --vcpus <no_vcpus> compute-cpu-pinning
    • Replace <ram_size_mb> with the RAM of the bare metal node, in MB.
    • Replace <disk_size_gb> with the size of the disk on the bare metal node, in GB.
    • Replace <no_vcpus> with the number of CPUs on the bare metal node.

      Note

      These properties are not used for scheduling instances. However, the Compute scheduler does use the disk size to determine the root partition size.

  8. Tag each bare metal node that you want to designate for CPU pinning with a custom CPU pinning resource class:

    (undercloud)$ openstack baremetal node set \
     --resource-class baremetal.CPU-PINNING <node>

    Replace <node> with the ID of the bare metal node.

  9. Associate the compute-cpu-pinning flavor with the custom CPU pinning resource class:

    (undercloud)$ openstack flavor set \
     --property resources:CUSTOM_BAREMETAL_CPU_PINNING=1 \
     compute-cpu-pinning

    To determine the name of a custom resource class that corresponds to a resource class of a Bare Metal service node, convert the resource class to uppercase, replace each punctuation mark with an underscore, and prefix with CUSTOM_.

    Note

    A flavor can request only one instance of a bare metal resource class.

  10. Set the following flavor properties to prevent the Compute scheduler from using the bare metal flavor properties to schedule instances:

    (undercloud)$ openstack flavor set \
     --property resources:VCPU=0 \
     --property resources:MEMORY_MB=0 \
     --property resources:DISK_GB=0 compute-cpu-pinning
  11. Optional: If the network topology of the ComputeCPUPinning role is different from the network topology of your Compute role, then create a custom network interface template. For more information, see Custom network interface templates in the Advanced Overcloud Customization guide.

    If the network topology of the ComputeCPUPinning role is the same as the Compute role, then you can use the default network topology defined in compute.yaml.

  12. Register the Net::SoftwareConfig of the ComputeCPUPinning role in your network-environment.yaml file:

    resource_registry:
      OS::TripleO::Compute::Net::SoftwareConfig: /home/stack/templates/nic-configs/compute.yaml
      OS::TripleO::ComputeCPUPinning::Net::SoftwareConfig: /home/stack/templates/nic-configs/<cpu_pinning_net_top>.yaml
      OS::TripleO::Controller::Net::SoftwareConfig: /home/stack/templates/nic-configs/controller.yaml

    Replace <cpu_pinning_net_top> with the name of the file that contains the network topology of the ComputeCPUPinning role, for example, compute.yaml to use the default network topology.

  13. Add the following parameters to the node-info.yaml file to specify the number of CPU pinning Compute nodes, and the flavor to use for the CPU pinning designated Compute nodes:

    parameter_defaults:
      OvercloudComputeCPUPinningFlavor: compute-cpu-pinning
      ComputeCPUPinningCount: 3
  14. To verify that the role was created, enter the following command:

    (undercloud)$ openstack baremetal node list --long -c "UUID" \
     -c "Instance UUID" -c "Resource Class" -c "Provisioning State" \
     -c "Power State" -c "Last Error" -c "Fault" -c "Name" -f json

    Example output:

    [
      {
        "Fault": null,
        "Instance UUID": "e8e60d37-d7c7-4210-acf7-f04b245582ea",
        "Last Error": null,
        "Name": "compute-0",
        "Power State": "power on",
        "Provisioning State": "active",
        "Resource Class": "baremetal.CPU-PINNING",
        "UUID": "b5a9ac58-63a7-49ba-b4ad-33d84000ccb4"
      },
      {
        "Fault": null,
        "Instance UUID": "3ec34c0b-c4f5-4535-9bd3-8a1649d2e1bd",
        "Last Error": null,
        "Name": "compute-1",
        "Power State": "power on",
        "Provisioning State": "active",
        "Resource Class": "compute",
        "UUID": "432e7f86-8da2-44a6-9b14-dfacdf611366"
      },
      {
        "Fault": null,
        "Instance UUID": "4992c2da-adde-41b3-bef1-3a5b8e356fc0",
        "Last Error": null,
        "Name": "controller-0",
        "Power State": "power on",
        "Provisioning State": "active",
        "Resource Class": "controller",
        "UUID": "474c2fc8-b884-4377-b6d7-781082a3a9c0"
      }
    ]

3.1.3. Configuring Compute nodes for CPU pinning

Configure CPU pinning on your Compute nodes based on the NUMA topology of the nodes. Reserve some CPU cores across all the NUMA nodes for the host processes for efficiency. Assign the remaining CPU cores to managing your instances.

This procedure uses the following NUMA topology, with eight CPU cores spread across two NUMA nodes, to illustrate how to configure CPU pinning:

Table 3.1. Example of NUMA Topology

NUMA Node 0

NUMA Node 1

Core 0

Core 1

Core 2

Core 3

Core 4

Core 5

Core 6

Core 7

The procedure reserves cores 0 and 4 for host processes, cores 1, 3, 5 and 7 for instances that require CPU pinning, and cores 2 and 6 for floating instances that do not require CPU pinning.

Procedure

  1. Create an environment file to configure Compute nodes to reserve cores for pinned instances, floating instances, and host processes, for example, cpu_pinning.yaml.
  2. To schedule instances with a NUMA topology on NUMA-capable Compute nodes, add NUMATopologyFilter to the NovaSchedulerDefaultFilters parameter in your Compute environment file, if not already present:

    parameter_defaults:
      NovaSchedulerDefaultFilters: ['AvailabilityZoneFilter','ComputeFilter','ComputeCapabilitiesFilter','ImagePropertiesFilter','ServerGroupAntiAffinityFilter','ServerGroupAffinityFilter','PciPassthroughFilter','NUMATopologyFilter']

    For more information on NUMATopologyFilter, see Compute scheduler filters .

  3. To reserve physical CPU cores for the dedicated instances, add the following configuration to cpu_pinning.yaml:

    parameter_defaults:
      ComputeCPUPinningParameters:
        NovaComputeCpuDedicatedSet: 1,3,5,7
  4. To reserve physical CPU cores for the shared instances, add the following configuration to cpu_pinning.yaml:

    parameter_defaults:
      ComputeCPUPinningParameters:
        ...
        NovaComputeCpuSharedSet: 2,6
  5. To specify the amount of RAM to reserve for host processes, add the following configuration to cpu_pinning.yaml:

    parameter_defaults:
      ComputeCPUPinningParameters:
        ...
        NovaReservedHostMemory: <ram>

    Replace <ram> with the amount of RAM to reserve in MB.

  6. To ensure that host processes do not run on the CPU cores reserved for instances, set the parameter IsolCpusList to the CPU cores you have reserved for instances:

    parameter_defaults:
      ComputeCPUPinningParameters:
        ...
        IsolCpusList: 1-3,5-7

    Specify the value of the IsolCpusList parameter using a list, or ranges, of CPU indices separated by a comma.

  7. Add your new role and environment files to the stack with your other environment files and deploy the overcloud:

    (undercloud)$ openstack overcloud deploy --templates \
      -e [your environment files] \
      -r /home/stack/templates/roles_data_cpu_pinning.yaml \
      -e /home/stack/templates/network-environment.yaml \
      -e /home/stack/templates/cpu_pinning.yaml \
      -e /home/stack/templates/node-info.yaml

3.1.4. Creating a dedicated CPU flavor for instances

To enable your cloud users to create instances that have dedicated CPUs, you can create a flavor with a dedicated CPU policy for launching instances.

Prerequisites

Procedure

  1. Source the overcloudrc file:

    (undercloud)$ source ~/overcloudrc
  2. Create a flavor for instances that require CPU pinning:

    (overcloud)$ openstack flavor create --ram <size_mb> \
     --disk <size_gb> --vcpus <no_reserved_vcpus> pinned_cpus
  3. To request pinned CPUs, set the hw:cpu_policy property of the flavor to dedicated:

    (overcloud)$ openstack flavor set \
     --property hw:cpu_policy=dedicated pinned_cpus
  4. To place each vCPU on thread siblings, set the hw:cpu_thread_policy property of the flavor to require:

    (overcloud)$ openstack flavor set \
     --property hw:cpu_thread_policy=require pinned_cpus
    Note
    • If the host does not have an SMT architecture or enough CPU cores with available thread siblings, scheduling fails. To prevent this, set hw:cpu_thread_policy to prefer instead of require. The prefer policy is the default policy that ensures that thread siblings are used when available.
    • If you use hw:cpu_thread_policy=isolate, you must have SMT disabled or use a platform that does not support SMT.

Verification

  1. To verify the flavor creates an instance with dedicated CPUs, use your new flavor to launch an instance:

    (overcloud)$ openstack server create --flavor pinned_cpus \
     --image <image> pinned_cpu_instance
  2. To verify correct placement of the new instance, enter the following command and check for OS-EXT-SRV-ATTR:hypervisor_hostname in the output:

    (overcloud)$ openstack server show pinned_cpu_instance

3.1.5. Creating a shared CPU flavor for instances

To enable your cloud users to create instances that use shared, or floating, CPUs, you can create a flavor with a shared CPU policy for launching instances.

Prerequisites

Procedure

  1. Source the overcloudrc file:

    (undercloud)$ source ~/overcloudrc
  2. Create a flavor for instances that do not require CPU pinning:

    (overcloud)$ openstack flavor create --ram <size_mb> \
     --disk <size_gb> --vcpus <no_reserved_vcpus> floating_cpus
  3. To request floating CPUs, set the hw:cpu_policy property of the flavor to shared:

    (overcloud)$ openstack flavor set \
     --property hw:cpu_policy=shared floating_cpus

Verification

  1. To verify the flavor creates an instance that uses the shared CPUs, use your new flavor to launch an instance:

    (overcloud)$ openstack server create --flavor floating_cpus \
     --image <image> floating_cpu_instance
  2. To verify correct placement of the new instance, enter the following command and check for OS-EXT-SRV-ATTR:hypervisor_hostname in the output:

    (overcloud)$ openstack server show floating_cpu_instance

3.1.6. Configuring CPU pinning on Compute nodes with simultaneous multithreading (SMT)

If a Compute node supports simultaneous multithreading (SMT), group thread siblings together in either the dedicated or the shared set. Thread siblings share some common hardware which means it is possible for a process running on one thread sibling to impact the performance of the other thread sibling.

For example, the host identifies four logical CPU cores in a dual core CPU with SMT: 0, 1, 2, and 3. Of these four, there are two pairs of thread siblings:

  • Thread sibling 1: logical CPU cores 0 and 2
  • Thread sibling 2: logical CPU cores 1 and 3

In this scenario, do not assign logical CPU cores 0 and 1 as dedicated and 2 and 3 as shared. Instead, assign 0 and 2 as dedicated and 1 and 3 as shared.

The files /sys/devices/system/cpu/cpuN/topology/thread_siblings_list, where N is the logical CPU number, contain the thread pairs. You can use the following command to identify which logical CPU cores are thread siblings:

# grep -H . /sys/devices/system/cpu/cpu*/topology/thread_siblings_list | sort -n -t ':' -k 2 -u

The following output indicates that logical CPU core 0 and logical CPU core 2 are threads on the same core:

/sys/devices/system/cpu/cpu0/topology/thread_siblings_list:0,2
/sys/devices/system/cpu/cpu2/topology/thread_siblings_list:1,3

3.1.7. Additional resources

3.2. Configuring emulator threads

Compute nodes have overhead tasks associated with the hypervisor for each instance, known as emulator threads. By default, emulator threads run on the same CPUs as the instance, which impacts the performance of the instance.

You can configure the emulator thread policy to run emulator threads on separate CPUs to those the instance uses.

Note

To avoid packet loss, you must never preempt the vCPUs in an NFV deployment.

Procedure

  1. Log in to the undercloud as the stack user.
  2. Open your Compute environment file.
  3. To reserve physical CPU cores for instances that require CPU pinning, configure the NovaComputeCpuDedicatedSet parameter in the Compute environment file. For example, the following configuration sets the dedicated CPUs on a Compute node with a 32-core CPU:

    parameter_defaults:
      ...
      NovaComputeCpuDedicatedSet: 2-15,18-31
      ...

    For more information, see Configuring CPU pinning on the Compute nodes.

  4. To reserve physical CPU cores for the emulator threads, configure the NovaComputeCpuSharedSet parameter in the Compute environment file. For example, the following configuration sets the shared CPUs on a Compute node with a 32-core CPU:

    parameter_defaults:
      ...
      NovaComputeCpuSharedSet: 0,1,16,17
      ...
    Note

    The Compute scheduler also uses the CPUs in the shared set for instances that run on shared, or floating, CPUs. For more information, see Configuring CPU pinning on Compute nodes

  5. Add the Compute scheduler filter NUMATopologyFilter to the NovaSchedulerDefaultFilters parameter, if not already present.
  6. Add your Compute environment file to the stack with your other environment files and deploy the overcloud:

    (undercloud)$ openstack overcloud deploy --templates \
     -e [your environment files] \
     -e /home/stack/templates/<compute_environment_file>.yaml
  7. Configure a flavor that runs emulator threads for the instance on a dedicated CPU, which is selected from the shared CPUs configured using NovaComputeCpuSharedSet:

    (overcloud)$ openstack flavor set --property hw:cpu_policy=dedicated \
     --property hw:emulator_threads_policy=share \
     dedicated_emulator_threads

    For more information about configuration options for hw:emulator_threads_policy, see Emulator threads policy in Flavor metadata.

3.3. Configuring huge pages on Compute nodes

As a cloud administrator, you can configure Compute nodes to enable instances to request huge pages.

Procedure

  1. Open your Compute environment file.
  2. Configure the amount of huge page memory to reserve on each NUMA node for processes that are not instances:

    parameter_defaults:
      NovaReservedHugePages: ["node:0,size:2048,count:64","node:1,size:1GB,count:1"]
    • Replace the size value for each node with the size of the allocated huge page. Set to one of the following valid values:

      • 2048 (for 2MB)
      • 1GB
    • Replace the count value for each node with the number of huge pages used by OVS per NUMA node. For example, for 4096 of socket memory used by Open vSwitch, set this to 2.
  3. Optional: To allow instances to allocate 1GB huge pages, configure the CPU feature flags, NovaLibvirtCPUModelExtraFlags, to include pdpe1gb:

    parameter_defaults:
      ComputeParameters:
        NovaLibvirtCPUMode: 'custom'
        NovaLibvirtCPUModels: 'Haswell-noTSX'
        NovaLibvirtCPUModelExtraFlags: 'vmx, pdpe1gb'
    Note
    • CPU feature flags do not need to be configured to allow instances to only request 2 MB huge pages.
    • You can only allocate 1G huge pages to an instance if the host supports 1G huge page allocation.
    • You only need to set NovaLibvirtCPUModelExtraFlags to pdpe1gb when NovaLibvirtCPUMode is set to host-model or custom.
    • If the host supports pdpe1gb, and host-passthrough is used as the NovaLibvirtCPUMode, then you do not need to set pdpe1gb as a NovaLibvirtCPUModelExtraFlags. The pdpe1gb flag is only included in Opteron_G4 and Opteron_G5 CPU models, it is not included in any of the Intel CPU models supported by QEMU.
    • To mitigate for CPU hardware issues, such as Microarchitectural Data Sampling (MDS), you might need to configure other CPU flags. For more information, see RHOS Mitigation for MDS ("Microarchitectural Data Sampling") Security Flaws.
  4. To avoid loss of performance after applying Meltdown protection, configure the CPU feature flags, NovaLibvirtCPUModelExtraFlags, to include +pcid:

    parameter_defaults:
      ComputeParameters:
        NovaLibvirtCPUMode: 'custom'
        NovaLibvirtCPUModels: 'Haswell-noTSX'
        NovaLibvirtCPUModelExtraFlags: 'vmx, pdpe1gb, +pcid'
  5. Add NUMATopologyFilter to the NovaSchedulerDefaultFilters parameter, if not already present.
  6. Add your Compute environment file to the stack with your other environment files and deploy the overcloud:

    (undercloud)$ openstack overcloud deploy --templates \
      -e [your environment files]  \
      -e /home/stack/templates/<compute_environment_file>.yaml

3.3.1. Creating a huge pages flavor for instances

To enable your cloud users to create instances that use huge pages, you can create a flavor with the hw:mem_page_size extra spec key for launching instances.

Prerequisites

Procedure

  1. Create a flavor for instances that require huge pages:

    $ openstack flavor create --ram <size_mb> --disk <size_gb> \
     --vcpus <no_reserved_vcpus> huge_pages
  2. To request huge pages, set the hw:mem_page_size property of the flavor to the required size:

    $ openstack flavor set huge_pages --property hw:mem_page_size=1GB

    Set hw:mem_page_size to one of the following valid values:

    • large - Selects the largest page size supported on the host, which may be 2 MB or 1 GB on x86_64 systems.
    • small - (Default) Selects the smallest page size supported on the host. On x86_64 systems this is 4 kB (normal pages).
    • any - Selects the largest available huge page size, as determined by the libvirt driver.
    • <pagesize>: (String) Set an explicit page size if the workload has specific requirements. Use an integer value for the page size in KB, or any standard suffix. For example: 4KB, 2MB, 2048, 1GB.
  3. To verify the flavor creates an instance with huge pages, use your new flavor to launch an instance:

    $ openstack server create --flavor huge_pages \
     --image <image> huge_pages_instance

    The Compute scheduler identifies a host with enough free huge pages of the required size to back the memory of the instance. If the scheduler is unable to find a host and NUMA node with enough pages, then the request will fail with a NoValidHost error.

3.4. Configuring Compute nodes to use file-backed memory for instances

You can use file-backed memory to expand your Compute node memory capacity, by allocating files within the libvirt memory backing directory as instance memory. You can configure the amount of host disk that is available for instance memory, and the location on the disk of the instance memory files.

The Compute service reports the capacity configured for file-backed memory to the Placement service as the total system memory capacity. This allows the Compute node to host more instances than would normally fit within the system memory.

To use file-backed memory for instances, you must enable file-backed memory on the Compute node.

Limitations

  • You cannot live migrate instances between Compute nodes that have file-backed memory enabled and Compute nodes that do not have file-backed memory enabled.
  • File-backed memory is not compatible with huge pages. Instances that use huge pages cannot start on a Compute node with file-backed memory enabled. Use host aggregates to ensure that instances that use huge pages are not placed on Compute nodes with file-backed memory enabled.
  • File-backed memory is not compatible with memory overcommit.
  • You cannot reserve memory for host processes using NovaReservedHostMemory. When file-backed memory is in use, reserved memory corresponds to disk space not set aside for file-backed memory. File-backed memory is reported to the Placement service as the total system memory, with RAM used as cache memory.

Prerequisites

  • NovaRAMAllocationRatio must be set to "1.0" on the node and any host aggregate the node is added to.
  • NovaReservedHostMemory must be set to "0".

Procedure

  1. Open your Compute environment file.
  2. Configure the amount of host disk space, in MiB, to make available for instance RAM, by adding the following parameter to your Compute environment file:

    parameter_defaults:
      NovaLibvirtFileBackedMemory: 102400
  3. Optional: To configure the directory to store the memory backing files, set the QemuMemoryBackingDir parameter in your Compute environment file. If not set, the memory backing directory defaults to /var/lib/libvirt/qemu/ram/.

    Note

    You must locate your backing store in a directory at or above the default directory location, /var/lib/libvirt/qemu/ram/.

    You can also change the host disk for the backing store. For more information, see Changing the memory backing directory host disk.

  4. Save the updates to your Compute environment file.
  5. Add your Compute environment file to the stack with your other environment files and deploy the overcloud:

    (undercloud)$ openstack overcloud deploy --templates \
      -e [your environment files] \
      -e /home/stack/templates/<compute_environment_file>.yaml

3.4.1. Changing the memory backing directory host disk

You can move the memory backing directory from the default primary disk location to an alternative disk.

Procedure

  1. Create a file system on the alternative backing device. For example, enter the following command to create an ext4 filesystem on /dev/sdb:

    # mkfs.ext4 /dev/sdb
  2. Mount the backing device. For example, enter the following command to mount /dev/sdb on the default libvirt memory backing directory:

    # mount /dev/sdb /var/lib/libvirt/qemu/ram
    Note

    The mount point must match the value of the QemuMemoryBackingDir parameter.