Configuring the Compute Service for Instance Creation
A guide to configuring and managing the Red Hat OpenStack Platform Compute (nova) service for creating instances
Making open source more inclusive
Red Hat is committed to replacing problematic language in our code, documentation, and web properties. We are beginning with these four terms: master, slave, blacklist, and whitelist. Because of the enormity of this endeavor, these changes will be implemented gradually over several upcoming releases. For more details, see our CTO Chris Wright’s message.
Part I. Compute (nova) service functionality
You use the Compute (nova) service to create, provision, and manage virtual machine instances and bare metal servers in a Red Hat OpenStack Platform (RHOSP) environment. The Compute service abstracts the underlying hardware that it runs on, rather than exposing specifics about the underlying host platforms. For example, rather than exposing the types and topologies of CPUs running on hosts, the Compute service exposes a number of virtual CPUs (vCPUs) and allows for overcommitting of these vCPUs.
To create and provision instances, the Compute service interacts with the following RHOSP services:
- Identity (keystone) service for authentication.
- Placement service for resource inventory tracking and selection.
- Image Service (glance) for disk and instance images.
- Networking (neutron) service for provisioning the virtual or physical networks that instances connect to on boot.
The Compute service consists of daemon processes and services, named nova-*. The following are the core Compute services:
- Compute service (nova-compute) - This service creates, manages, and terminates instances by using the libvirt APIs for KVM or QEMU hypervisors, and updates the database with instance states.
- Compute conductor (nova-conductor) - This service mediates interactions between the Compute service and the database, which insulates Compute nodes from direct database access. Do not deploy this service on nodes where the nova-compute service runs.
- Compute scheduler (nova-scheduler) - This service takes an instance request from the queue and determines on which Compute node to host the instance.
- Compute API (nova-api) - This service provides the external REST API to users.
- API database - This database tracks instance location information, and provides a temporary location for instances that are built but not scheduled. In multi-cell deployments, this database also contains cell mappings that specify the database connection for each cell.
- Cell database - This database contains most of the information about instances. It is used by the API, conductor, and Compute services.
- Message queue - This messaging service is used by all services to communicate with each other within the cell and with the global services.
- Compute metadata - This service stores data specific to instances. Instances access the metadata service at http://169.254.169.254 or over IPv6 at the link-local address fe80::a9fe:a9fe. The Networking (neutron) service is responsible for forwarding requests to the metadata API server. You must use the NeutronMetadataProxySharedSecret parameter to set a secret keyword in the configuration of both the Networking service and the Compute service to allow the services to communicate. The Compute metadata service can be run globally, as part of the Compute API, or in each cell.
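For example, a minimal environment file sketch for this shared secret might look like the following; the value is a placeholder that you replace with your own secret string:
parameter_defaults:
  NeutronMetadataProxySharedSecret: <shared_secret>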
You can deploy more than one Compute node. The hypervisor that operates instances is run on each Compute node. Each Compute node requires a minimum of two network interfaces. The Compute node also runs a Networking service agent that connects instances to virtual networks and provides firewalling services to instances through security groups.
By default, director installs the overcloud with a single cell for all Compute nodes. This cell contains all the Compute services and databases that control and manage the virtual machine instances, and all the instances and instance metadata. For larger deployments, you can deploy the overcloud with multiple cells to accommodate a larger number of Compute nodes. You can add cells to your environment when you install a new overcloud or at any time afterwards. For more information, see Scaling deployments with Compute cells.
Chapter 1. Configuring the Compute (nova) service
As a cloud administrator, you use environment files to customize the Compute (nova) service. Puppet generates and stores this configuration in the /var/lib/config-data/puppet-generated/<nova_container>/etc/nova/nova.conf file. Use the following configuration methods to customize the Compute service configuration, in the following order of precedence:
Heat parameters - as detailed in the Compute (nova) Parameters section in the Overcloud Parameters guide. The following example uses heat parameters to set the default scheduler filters, and configure an NFS backend for the Compute service:
parameter_defaults:
  NovaSchedulerDefaultFilters: AggregateInstanceExtraSpecsFilter,RetryFilter,ComputeFilter,ComputeCapabilitiesFilter,ImagePropertiesFilter
  NovaNfsEnabled: true
  NovaNfsShare: '192.0.2.254:/export/nova'
  NovaNfsOptions: 'context=system_u:object_r:nfs_t:s0'
  NovaNfsVersion: '4.2'
Puppet parameters - as defined in /etc/puppet/modules/nova/manifests/*:
parameter_defaults:
  ComputeExtraConfig:
    nova::compute::force_raw_images: True
Note: Only use this method if an equivalent heat parameter does not exist.
Manual hieradata overrides - for customizing parameters when no heat or Puppet parameter exists. For example, the following sets the timeout_nbd in the [DEFAULT] section on the Compute role:
parameter_defaults:
  ComputeExtraConfig:
    nova::config::nova_config:
      DEFAULT/timeout_nbd:
        value: '20'
If a heat parameter exists, use it instead of the Puppet parameter. If a Puppet parameter exists, but not a heat parameter, use the Puppet parameter instead of the manual override method. Use the manual override method only if there is no equivalent heat or Puppet parameter.
Follow the guidance in Identifying Parameters to Modify to determine if a heat or Puppet parameter is available for customizing a particular configuration.
For more information about how to configure overcloud services, see Parameters in the Advanced Overcloud Customization guide.
1.1. Configuring memory for overallocation
When you use memory overcommit (NovaRAMAllocationRatio >= 1.0), you need to deploy your overcloud with enough swap space to support the allocation ratio.
If your NovaRAMAllocationRatio parameter is set to < 1, follow the RHEL recommendations for swap size. For more information, see Recommended system swap space in the RHEL Managing Storage Devices guide.
Prerequisites
- You have calculated the swap size your node requires. For more information, see Calculating swap size.
Procedure
Copy the /usr/share/openstack-tripleo-heat-templates/environments/enable-swap.yaml file to your environment file directory:
$ cp /usr/share/openstack-tripleo-heat-templates/environments/enable-swap.yaml /home/stack/templates/enable-swap.yaml
Configure the swap size by adding the following parameters to your enable-swap.yaml file:
parameter_defaults:
  swap_size_megabytes: <swap size in MB>
  swap_path: <full path to location of swap, default: /swap>
Add the enable-swap.yaml environment file to the stack with your other environment files and deploy the overcloud:
(undercloud)$ openstack overcloud deploy --templates \
  -e [your environment files] \
  -e /home/stack/templates/enable-swap.yaml
1.2. Calculating reserved host memory on Compute nodes
To determine the total amount of RAM to reserve for host processes, you need to allocate enough memory for each of the following:
- The resources that run on the host, for example, a Ceph OSD consumes 3 GB of memory.
- The emulator overhead required to host instances.
- The hypervisor for each instance.
After you calculate the additional demands on memory, use the following formula to help you determine the amount of memory to reserve for host processes on each node:
NovaReservedHostMemory = total_RAM - ( (vm_no * (avg_instance_size + overhead)) + (resource1 * resource_ram) + ... + (resourcen * resource_ram) )
- Replace vm_no with the number of instances.
- Replace avg_instance_size with the average amount of memory each instance can use.
- Replace overhead with the hypervisor overhead required for each instance.
- Replace resource1 and all resources up to <resourcen> with the number of that resource type on the node.
- Replace resource_ram with the amount of RAM each resource of this type requires.
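For example, assuming a node with 65536 MB of total RAM that hosts ten instances averaging 4096 MB each, with 512 MB of hypervisor overhead per instance and one Ceph OSD that consumes 3072 MB (all values are illustrative):
NovaReservedHostMemory = 65536 - ( (10 * (4096 + 512)) + (1 * 3072) ) = 65536 - 49152 = 16384
In this sketch you would set NovaReservedHostMemory=16384 in your Compute environment file.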
1.3. Calculating swap size
The allocated swap size must be large enough to handle any memory overcommit. You can use the following formulas to calculate the swap size your node requires:
- overcommit_ratio = NovaRAMAllocationRatio - 1
- Minimum swap size (MB) = (total_RAM * overcommit_ratio) + RHEL_min_swap
- Recommended (maximum) swap size (MB) = total_RAM * (overcommit_ratio + percentage_of_RAM_to_use_for_swap)
The percentage_of_RAM_to_use_for_swap variable creates a buffer to account for QEMU overhead and any other resources consumed by the operating system or host services.
For instance, to use 25% of the available RAM for swap, with 64 GB total RAM, and NovaRAMAllocationRatio set to 1:
- Recommended (maximum) swap size = 64000 MB * (0 + 0.25) = 16000 MB
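Continuing this example, you could then set the calculated value in the enable-swap.yaml file described in Configuring memory for overallocation; the swap path shown is the default:
parameter_defaults:
  swap_size_megabytes: 16000
  swap_path: /swap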
For information about how to calculate the NovaReservedHostMemory value, see Calculating reserved host memory on Compute nodes.
For information about how to determine the RHEL_min_swap value, see Recommended system swap space in the RHEL Managing Storage Devices guide.
Chapter 2. Configuring Compute nodes for performance
As a cloud administrator, you can configure the scheduling and placement of instances for optimal performance by creating customized flavors to target specialized workloads, including NFV and High Performance Computing (HPC).
Use the following features to tune your instances for optimal performance:
- CPU pinning: Pin virtual CPUs to physical CPUs.
- Emulator threads: Pin emulator threads associated with the instance to physical CPUs.
- Huge pages: Tune instance memory allocation policies both for normal memory (4k pages) and huge pages (2 MB or 1 GB pages).
Configuring any of these features creates an implicit NUMA topology on the instance if there is no NUMA topology already present.
2.1. Configuring CPU pinning on the Compute nodes
You can configure instances to run on dedicated host CPUs. Enabling CPU pinning implicitly configures a NUMA topology on the instances. Each NUMA node of this NUMA topology maps to a separate host NUMA node. For more information about NUMA, see CPUs and NUMA nodes in the Network Functions Virtualization Product Guide.
Configure CPU pinning on your Compute node based on the NUMA topology of your host system. Reserve some CPU cores across all the NUMA nodes for the host processes for efficiency. Assign the remaining CPU cores to managing your instances.
The following example illustrates eight CPU cores spread across two NUMA nodes.
Table 2.1. Example of NUMA Topology
| NUMA Node 0 | | NUMA Node 1 | |
| Core 0 | Core 1 | Core 2 | Core 3 |
| Core 4 | Core 5 | Core 6 | Core 7 |
You can schedule instances that have dedicated CPUs (pinned) and instances that have shared, or floating, CPUs (unpinned) on the same Compute node. The following procedure reserves cores 0 and 4 for host processes, cores 1, 3, 5 and 7 for instances that require CPU pinning, and cores 2 and 6 for floating instances that do not require CPU pinning.
If the host supports simultaneous multithreading (SMT), group thread siblings together in either the dedicated or the shared set. Thread siblings share some common hardware which means it is possible for a process running on one thread sibling to impact the performance of the other thread sibling.
For example, the host identifies four CPUs in a dual core CPU with SMT: 0, 1, 2, and 3. Of these four, there are two pairs of thread siblings:
- Thread sibling 1: CPUs 0 and 2
- Thread sibling 2: CPUs 1 and 3
In this scenario, you should not assign CPUs 0 and 1 as dedicated and 2 and 3 as shared. Instead, you should assign 0 and 2 as dedicated and 1 and 3 as shared.
Prerequisite
- You know the NUMA topology of your Compute node. For more information, see Discovering your NUMA node topology in the Network Functions Virtualization Planning and Configuration Guide.
Procedure
Reserve physical CPU cores for the dedicated instances by setting the NovaComputeCpuDedicatedSet configuration in the Compute environment file for each Compute node:
NovaComputeCpuDedicatedSet=1,3,5,7
Reserve physical CPU cores for the shared instances by setting the NovaComputeCpuSharedSet configuration in the Compute environment file for each Compute node:
NovaComputeCpuSharedSet=2,6
Set the NovaReservedHostMemory option in the same files to the amount of RAM to reserve for host processes. For example, if you want to reserve 512 MB, use:
NovaReservedHostMemory=512
To ensure that host processes do not run on the CPU cores reserved for instances, set the IsolCpusList parameter in each Compute environment file to the CPU cores you have reserved for instances. Specify the value of the IsolCpusList parameter as a list, or ranges, of CPU indices separated by whitespace:
IsolCpusList=1 2 3 5 6 7
- To filter out hosts based on their NUMA topology, add NUMATopologyFilter to the NovaSchedulerDefaultFilters parameter in each Compute environment file.
- Add your Compute environment file(s) to the stack with your other environment files and deploy the overcloud:
(undercloud)$ openstack overcloud deploy --templates \ -e [your environment files] \ -e /home/stack/templates/<compute_environment_file>.yaml
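For reference, the following sketch collects the example values from this procedure into a single Compute environment file. The full NovaSchedulerDefaultFilters list is an assumption based on the default filters described in Compute scheduler filters; adjust it, and the CPU lists, to match your deployment:
parameter_defaults:
  NovaComputeCpuDedicatedSet: "1,3,5,7"
  NovaComputeCpuSharedSet: "2,6"
  NovaReservedHostMemory: 512
  IsolCpusList: "1 2 3 5 6 7"
  NovaSchedulerDefaultFilters:
    - AvailabilityZoneFilter
    - ComputeFilter
    - ComputeCapabilitiesFilter
    - ImagePropertiesFilter
    - ServerGroupAntiAffinityFilter
    - ServerGroupAffinityFilter
    - NUMATopologyFilter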
2.1.1. Creating a dedicated CPU flavor for instances
To enable your cloud users to create instances that have dedicated CPUs, you can create a flavor with a dedicated CPU policy for launching instances.
Prerequisites
- Simultaneous multithreading (SMT) is enabled on the host.
- The Compute node is configured to allow CPU pinning. For more information, see Configuring CPU pinning on the Compute nodes.
Procedure
Create a flavor for instances that require CPU pinning:
(overcloud)$ openstack flavor create --ram <size-mb> \ --disk <size-gb> --vcpus <no_reserved_vcpus> pinned_cpus
To request pinned CPUs, set the hw:cpu_policy property of the flavor to dedicated:
(overcloud)$ openstack flavor set \
  --property hw:cpu_policy=dedicated pinned_cpus
To place each vCPU on thread siblings, set the hw:cpu_thread_policy property of the flavor to require:
(overcloud)$ openstack flavor set \
  --property hw:cpu_thread_policy=require pinned_cpus
Note:
- If the host does not have an SMT architecture or enough CPU cores with available thread siblings, scheduling will fail. To prevent this, set hw:cpu_thread_policy to prefer instead of require. The (default) prefer policy ensures that thread siblings are used when available.
- If you use hw:cpu_thread_policy=isolate, you must have SMT disabled or use a platform that does not support SMT.
Verification
To verify the flavor creates an instance with dedicated CPUs, use your new flavor to launch an instance:
(overcloud)$ openstack server create --flavor pinned_cpus \ --image <image> pinned_cpu_instance
To verify correct placement of the new instance, enter the following command and check for OS-EXT-SRV-ATTR:hypervisor_hostname in the output:
(overcloud)$ openstack server show pinned_cpu_instance
2.2. Configuring huge pages on Compute nodes
As a cloud administrator, you can configure Compute nodes to enable instances to request huge pages.
Procedure
- Open your Compute environment file.
Configure the amount of huge page memory to reserve on each NUMA node for processes that are not instances:
parameter_defaults:
  NovaReservedHugePages: ["node:0,size:2048,count:64","node:1,size:1GB,count:1"]
- Replace the size value for each node with the size of the allocated huge page. Set to one of the following valid values:
  - 2048 (for 2MB)
  - 1GB
- Replace the count value for each node with the number of huge pages used by OVS per NUMA node. For example, for 4096 of socket memory used by Open vSwitch, set this to 2.
Optional: To allow instances to allocate 1GB huge pages, configure the CPU feature flags, NovaLibvirtCPUModelExtraFlags, to include pdpe1gb:
parameter_defaults:
  ComputeParameters:
    NovaLibvirtCPUMode: 'custom'
    NovaLibvirtCPUModels: 'Haswell-noTSX'
    NovaLibvirtCPUModelExtraFlags: 'vmx, pdpe1gb'
Note:
- CPU feature flags do not need to be configured to allow instances to only request 2 MB huge pages.
- You can only allocate 1G huge pages to an instance if the host supports 1G huge page allocation.
- You only need to set NovaLibvirtCPUModelExtraFlags to pdpe1gb when NovaLibvirtCPUMode is set to host-model or custom.
- If the host supports pdpe1gb, and host-passthrough is used as the NovaLibvirtCPUMode, then you do not need to set pdpe1gb as a NovaLibvirtCPUModelExtraFlags. The pdpe1gb flag is only included in Opteron_G4 and Opteron_G5 CPU models; it is not included in any of the Intel CPU models supported by QEMU.
- To mitigate CPU hardware issues, such as Microarchitectural Data Sampling (MDS), you might need to configure other CPU flags. For more information, see RHOS Mitigation for MDS ("Microarchitectural Data Sampling") Security Flaws.
To avoid loss of performance after applying Meltdown protection, configure the CPU feature flags, NovaLibvirtCPUModelExtraFlags, to include +pcid:
parameter_defaults:
  ComputeParameters:
    NovaLibvirtCPUMode: 'custom'
    NovaLibvirtCPUModels: 'Haswell-noTSX'
    NovaLibvirtCPUModelExtraFlags: 'vmx, pdpe1gb, +pcid'
Tip: For more information, see Reducing the performance impact of Meltdown CVE fixes for OpenStack guests with "PCID" CPU feature flag.
- Add NUMATopologyFilter to the NovaSchedulerDefaultFilters parameter, if not already present.
- Add your Compute environment file to the stack with your other environment files and deploy the overcloud:
(undercloud)$ openstack overcloud deploy --templates \ -e [your environment files] \ -e /home/stack/templates/<compute_environment_file>.yaml
2.2.1. Creating a huge pages flavor for instances
To enable your cloud users to create instances that use huge pages, you can create a flavor with the hw:mem_page_size extra spec key for launching instances.
Prerequisites
- The Compute node is configured for huge pages. For more information, see Configuring huge pages on Compute nodes.
Procedure
Create a flavor for instances that require huge pages:
$ openstack flavor create --ram <size-mb> --disk <size-gb> \ --vcpus <no_reserved_vcpus> huge_pages
To request huge pages, set the hw:mem_page_size property of the flavor to the required size:
$ openstack flavor set huge_pages --property hw:mem_page_size=1GB
Set hw:mem_page_size to one of the following valid values:
- large - Selects the largest page size supported on the host, which may be 2 MB or 1 GB on x86_64 systems.
- small - (Default) Selects the smallest page size supported on the host. On x86_64 systems this is 4 kB (normal pages).
- any - Selects the largest available huge page size, as determined by the libvirt driver.
- <pagesize> - (String) Set an explicit page size if the workload has specific requirements. Use an integer value for the page size in KB, or any standard suffix. For example: 4KB, 2MB, 2048, 1GB.
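For example, to request 2 MB pages explicitly instead of 1 GB pages, you can set the property on the same huge_pages flavor from this procedure:
$ openstack flavor set huge_pages --property hw:mem_page_size=2MB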
- To verify the flavor creates an instance with huge pages, use your new flavor to launch an instance:
$ openstack server create --flavor huge_pages \ --image <image> huge_pages_instance
The Compute scheduler identifies a host with enough free huge pages of the required size to back the memory of the instance. If the scheduler is unable to find a host and NUMA node with enough pages, then the request will fail with a NoValidHost error.
2.3. Configuring Compute nodes to use file-backed memory for instances
You can use file-backed memory to expand your Compute node memory capacity, by allocating files within the libvirt memory backing directory as instance memory. You can configure the amount of host disk that is available for instance memory, and the location on the disk of the instance memory files.
The Compute service reports the capacity configured for file-backed memory to the Placement service as the total system memory capacity. This allows the Compute node to host more instances than would normally fit within the system memory.
To use file-backed memory for instances, you must enable file-backed memory on the Compute node.
Limitations
- You cannot live migrate instances between Compute nodes that have file-backed memory enabled and Compute nodes that do not have file-backed memory enabled.
- File-backed memory is not compatible with huge pages. Instances that use huge pages cannot start on a Compute node with file-backed memory enabled. Use host aggregates to ensure that instances that use huge pages are not placed on Compute nodes with file-backed memory enabled.
- File-backed memory is not compatible with memory overcommit.
- You cannot reserve memory for host processes using NovaReservedHostMemory. When file-backed memory is in use, reserved memory corresponds to disk space not set aside for file-backed memory. File-backed memory is reported to the Placement service as the total system memory, with RAM used as cache memory.
Prerequisites
- NovaRAMAllocationRatio must be set to "1.0" on the node and any host aggregate the node is added to.
- NovaReservedHostMemory must be set to "0".
Procedure
- Open your Compute environment file.
Configure the amount of host disk space, in MiB, to make available for instance RAM, by adding the following parameter to your Compute environment file:
parameter_defaults:
  NovaLibvirtFileBackedMemory: 102400
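For example, a minimal Compute environment file that also sets the prerequisite values listed above might look like the following sketch; the 102400 MiB figure is illustrative:
parameter_defaults:
  NovaRAMAllocationRatio: 1.0
  NovaReservedHostMemory: 0
  NovaLibvirtFileBackedMemory: 102400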
Optional: To configure the directory to store the memory backing files, set the QemuMemoryBackingDir parameter in your Compute environment file. If not set, the memory backing directory defaults to /var/lib/libvirt/qemu/ram/.
Note: You must locate your backing store in a directory at or above the default directory location, /var/lib/libvirt/qemu/ram/.
You can also change the host disk for the backing store. For more information, see Changing the memory backing directory host disk.
- Save the updates to your Compute environment file.
Add your Compute environment file to the stack with your other environment files and deploy the overcloud:
(undercloud)$ openstack overcloud deploy --templates \ -e [your environment files] \ -e /home/stack/templates/<compute_environment_file>.yaml
2.3.1. Changing the memory backing directory host disk
You can move the memory backing directory from the default primary disk location to an alternative disk.
Procedure
Create a file system on the alternative backing device. For example, enter the following command to create an ext4 filesystem on /dev/sdb:
# mkfs.ext4 /dev/sdb
Mount the backing device. For example, enter the following command to mount /dev/sdb on the default libvirt memory backing directory:
# mount /dev/sdb /var/lib/libvirt/qemu/ram
Note: The mount point must match the value of the QemuMemoryBackingDir parameter.
Chapter 3. Configuring PCI passthrough
You can use PCI passthrough to attach a physical PCI device, such as a graphics card or a network device, to an instance. If you use PCI passthrough for a device, the instance reserves exclusive access to the device for performing tasks, and the device is not available to the host.
Using PCI passthrough with routed provider networks
The Compute service does not support single networks that span multiple provider networks. When a network contains multiple physical networks, the Compute service only uses the first physical network. Therefore, if you are using routed provider networks you must use the same physical_network name across all the Compute nodes.
If you use routed provider networks with VLAN or flat networks, you must use the same physical_network name for all segments. You then create multiple segments for the network and map the segments to the appropriate subnets.
To enable your cloud users to create instances with PCI devices attached, you must complete the following:
- Designate Compute nodes for PCI passthrough.
- Configure the Compute nodes for PCI passthrough that have the required PCI devices.
- Deploy the overcloud.
- Create a flavor for launching instances with PCI devices attached.
Prerequisites
- The Compute nodes have the required PCI devices.
3.1. Designating Compute nodes for PCI passthrough
To designate Compute nodes for instances with physical PCI devices attached, you must:
- create a new role file to configure the PCI passthrough role
- configure a new overcloud flavor for PCI passthrough to use to tag the Compute nodes for PCI passthrough
Procedure
Generate a new roles data file named roles_data_pci_passthrough.yaml that includes the Controller, Compute, and ComputePCI roles:
(undercloud)$ openstack overcloud roles \
  generate -o /home/stack/templates/roles_data_pci_passthrough.yaml \
  Compute:ComputePCI Compute Controller
Open roles_data_pci_passthrough.yaml and edit or add the following parameters and sections:
| Section/Parameter | Current value | New value |
| --- | --- | --- |
| Role comment | Role: Compute | Role: ComputePCI |
| Role name | name: Compute | name: ComputePCI |
| description | Basic Compute Node role | PCI Passthrough Compute Node role |
| HostnameFormatDefault | %stackname%-novacompute-%index% | %stackname%-novacomputepci-%index% |
| deprecated_nic_config_name | compute.yaml | compute-pci-passthrough.yaml |
- Register the PCI passthrough Compute nodes for the overcloud by adding them to your node definition template, node.json or node.yaml. For more information, see Registering nodes for the overcloud in the Director Installation and Usage guide.
- Inspect the node hardware:
(undercloud)$ openstack overcloud node introspect \ --all-manageable --provide
For more information, see Inspecting the hardware of nodes in the Director Installation and Usage guide.
Create the compute-pci-passthrough bare metal flavor to use to tag nodes that you want to designate for PCI passthrough:
(undercloud)$ openstack flavor create --id auto \
  --ram <ram_size_mb> --disk <disk_size_gb> \
  --vcpus <no_vcpus> compute-pci-passthrough
- Replace <ram_size_mb> with the RAM of the bare metal node, in MB.
- Replace <disk_size_gb> with the size of the disk on the bare metal node, in GB.
- Replace <no_vcpus> with the number of CPUs on the bare metal node.
Note: These properties are not used for scheduling instances. However, the Compute scheduler does use the disk size to determine the root partition size.
Tag each bare metal node that you want to designate for PCI passthrough with a custom PCI passthrough resource class:
(undercloud)$ openstack baremetal node set \ --resource-class baremetal.PCI-PASSTHROUGH <node>
- Replace <node> with the ID of the bare metal node.
Associate the compute-pci-passthrough flavor with the custom PCI passthrough resource class:
(undercloud)$ openstack flavor set \
  --property resources:CUSTOM_BAREMETAL_PCI_PASSTHROUGH=1 \
  compute-pci-passthrough
To determine the name of a custom resource class that corresponds to a resource class of a Bare Metal service node, convert the resource class to uppercase, replace all punctuation with an underscore, and prefix with CUSTOM_.
Note: A flavor can request only one instance of a bare metal resource class.
Set the following flavor properties to prevent the Compute scheduler from using the bare metal flavor properties to schedule instances:
(undercloud)$ openstack flavor set \ --property resources:VCPU=0 --property resources:MEMORY_MB=0 \ --property resources:DISK_GB=0 compute-pci-passthrough
Add the following parameters to the node-info.yaml file to specify the number of PCI passthrough Compute nodes, and the flavor to use for the PCI passthrough designated Compute nodes:
parameter_defaults:
  OvercloudComputePCIFlavor: compute-pci-passthrough
  ComputePCICount: 3
To verify that the role was created, enter the following command:
(undercloud)$ openstack overcloud profiles list
3.2. Configuring a PCI passthrough Compute node
To enable your cloud users to create instances with PCI devices attached, you must configure both the Compute nodes that have the PCI devices and the Controller nodes.
Procedure
- Create an environment file to configure the Controller node on the overcloud for PCI passthrough, for example, pci_passthrough_controller.yaml.
- Add PciPassthroughFilter to the NovaSchedulerDefaultFilters parameter in pci_passthrough_controller.yaml:
parameter_defaults:
  NovaSchedulerDefaultFilters: ['AvailabilityZoneFilter','ComputeFilter','ComputeCapabilitiesFilter','ImagePropertiesFilter','ServerGroupAntiAffinityFilter','ServerGroupAffinityFilter','PciPassthroughFilter','NUMATopologyFilter']
To specify the PCI alias for the devices on the Controller node, add the following configuration to pci_passthrough_controller.yaml:
parameter_defaults:
  ...
  ControllerExtraConfig:
    nova::pci::aliases:
      - name: "a1"
        product_id: "1572"
        vendor_id: "8086"
        device_type: "type-PF"
For more information about configuring the device_type field, see PCI passthrough device type field.
Note: If the nova-api service is running in a role different from the Controller role, replace ControllerExtraConfig with the user role in the format <Role>ExtraConfig.
Optional: To set a default NUMA affinity policy for PCI passthrough devices, add numa_policy to the nova::pci::aliases: configuration from step 3:
parameter_defaults:
  ...
  ControllerExtraConfig:
    nova::pci::aliases:
      - name: "a1"
        product_id: "1572"
        vendor_id: "8086"
        device_type: "type-PF"
        numa_policy: "preferred"
- To configure the Compute node on the overcloud for PCI passthrough, create an environment file, for example, pci_passthrough_compute.yaml.
- To specify the available PCI devices on the Compute node, use the vendor_id and product_id options to add all matching PCI devices to the pool of PCI devices available for passthrough to instances. For example, to add all Intel® Ethernet Controller X710 devices to the pool of PCI devices available for passthrough to instances, add the following configuration to pci_passthrough_compute.yaml:
parameter_defaults:
  ...
  ComputePCIParameters:
    NovaPCIPassthrough:
      - vendor_id: "8086"
        product_id: "1572"
For more information about how to configure NovaPCIPassthrough, see Guidelines for configuring NovaPCIPassthrough.
You must create a copy of the PCI alias on the Compute node for instance migration and resize operations. To specify the PCI alias for the devices on the PCI passthrough Compute node, add the following to pci_passthrough_compute.yaml:
parameter_defaults:
  ...
  ComputePCIExtraConfig:
    nova::pci::aliases:
      - name: "a1"
        product_id: "1572"
        vendor_id: "8086"
        device_type: "type-PF"
Note: The Compute node aliases must be identical to the aliases on the Controller node. Therefore, if you added numa_policy to nova::pci::aliases in pci_passthrough_controller.yaml, then you must also add it to nova::pci::aliases in pci_passthrough_compute.yaml.
To enable IOMMU in the server BIOS of the Compute nodes to support PCI passthrough, add the KernelArgs parameter to pci_passthrough_compute.yaml. For example, use the following KernelArgs settings to enable an Intel IOMMU:
parameter_defaults:
  ...
  ComputePCIParameters:
    KernelArgs: "intel_iommu=on iommu=pt"
To enable an AMD IOMMU, set KernelArgs to "amd_iommu=on iommu=pt".
Add your custom environment files to the stack with your other environment files and deploy the overcloud:
(undercloud)$ openstack overcloud deploy --templates \ -e [your environment files] \ -e /home/stack/templates/pci_passthrough_controller.yaml \ -e /home/stack/templates/pci_passthrough_compute.yaml \
Create and configure the flavors that your cloud users can use to request the PCI devices. The following example requests two devices, each with a vendor ID of 8086 and a product ID of 1572, using the alias defined in step 7:
(overcloud)# openstack flavor set \
  --property "pci_passthrough:alias"="a1:2" device_passthrough
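The device_passthrough flavor must exist before you set the alias property on it. If you have not created it yet, you can create it first; the resource values here are placeholders:
(overcloud)# openstack flavor create --ram <size_mb> \
  --disk <size_gb> --vcpus <no_vcpus> device_passthrough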
Optional: To override the default NUMA affinity policy for PCI passthrough devices, you can add the NUMA affinity policy property key to the flavor or the image:
To override the default NUMA affinity policy by using the flavor, add the hw:pci_numa_affinity_policy property key:
(overcloud)# openstack flavor set \
  --property "hw:pci_numa_affinity_policy"="required" \
  device_passthrough
For more information about the valid values for hw:pci_numa_affinity_policy, see Flavor metadata.
To override the default NUMA affinity policy by using the image, add the hw_pci_numa_affinity_policy property key:
(overcloud)# openstack image set \
  --property hw_pci_numa_affinity_policy=required \
  device_passthrough_image
Note: If you set the NUMA affinity policy on both the image and the flavor then the property values must match. The flavor setting takes precedence over the image and default settings. Therefore, the configuration of the NUMA affinity policy on the image only takes effect if the property is not set on the flavor.
Verification
Create an instance with a PCI passthrough device:
# openstack server create --flavor device_passthrough \ --image <image> --wait test-pci
- Log in to the instance as a cloud user. For more information, see Log in to an Instance.
To verify that the PCI device is accessible from the instance, enter the following command from the instance:
$ lspci -nn | grep <device_name>
3.3. PCI passthrough device type field
The Compute service categorizes PCI devices into one of three types, depending on the capabilities the devices report. The following lists the valid values that you can set the device_type field to:
- type-PF - The device supports SR-IOV and is the parent or root device. Specify this device type to pass through a device that supports SR-IOV in its entirety.
- type-VF - The device is a child device of a device that supports SR-IOV.
- type-PCI - The device does not support SR-IOV. This is the default device type if the device_type field is not set.
You must configure the Compute and Controller nodes with the same device_type.
3.4. Guidelines for configuring NovaPCIPassthrough
- Do not use the devname parameter when configuring PCI passthrough, as the device name of a NIC can change. Instead, use vendor_id and product_id because they are more stable, or use the address of the NIC.
- To use the product_id parameter to pass through a Physical Function (PF), you must also specify the address of the PF. However, you can use just the address parameter to specify PFs, because the address is unique on each host.
- To pass through all the Virtual Functions (VFs) you must specify only the product_id and vendor_id. You must also specify the address if you are using SR-IOV for NIC partitioning and you are running OVS on a VF.
- To pass through only the VFs for a PF but not the PF itself, you can use the address parameter to specify the PCI address of the PF and product_id to specify the product ID of the VF.
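For illustration, the following sketch combines two of these patterns in a single NovaPCIPassthrough configuration. The vendor and product IDs reuse the X710 example from this chapter, and the PCI address is a hypothetical placeholder:
parameter_defaults:
  ComputePCIParameters:
    NovaPCIPassthrough:
      # Pass through every device that matches this vendor and product ID
      - vendor_id: "8086"
        product_id: "1572"
      # Pass through only the device at this specific PCI address
      - address: "0000:05:00.0"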
Chapter 4. Creating and managing host aggregates
As a cloud administrator, you can partition a Compute deployment into logical groups for performance or administrative purposes. Red Hat OpenStack Platform (RHOSP) provides the following mechanisms for partitioning logical groups:
- Host aggregate
A host aggregate is a grouping of Compute nodes into a logical unit based on attributes such as the hardware or performance characteristics. You can assign a Compute node to one or more host aggregates.
You can map flavors and images to host aggregates by setting metadata on the host aggregate, and then matching flavor extra specs or image metadata properties to the host aggregate metadata. The Compute scheduler can use this metadata to schedule instances when the required filters are enabled. Metadata that you specify in a host aggregate limits the use of that host to any instance that has the same metadata specified in its flavor or image.
You can configure weight multipliers for each host aggregate by setting the xxx_weight_multiplier configuration option in the host aggregate metadata.
You can use host aggregates to handle load balancing, enforce physical isolation or redundancy, group servers with common attributes, or separate classes of hardware.
When you create a host aggregate, you can specify a zone name. This name is presented to cloud users as an availability zone that they can select.
- Availability zones
An availability zone is the cloud user view of a host aggregate. A cloud user cannot view the Compute nodes in the availability zone, or view the metadata of the availability zone. The cloud user can only see the name of the availability zone.
You can assign each Compute node to only one availability zone. You can configure a default availability zone where instances will be scheduled when the cloud user does not specify a zone. You can direct cloud users to use availability zones that have specific capabilities.
4.1. Enabling scheduling on host aggregates
To schedule instances on host aggregates that have specific attributes, update the configuration of the Compute scheduler to enable filtering based on the host aggregate metadata.
Procedure
- Open your Compute environment file.
Add the following values to the NovaSchedulerDefaultFilters parameter, if they are not already present:
- AggregateInstanceExtraSpecsFilter: Add this value to filter Compute nodes by host aggregate metadata that match flavor extra specs.
Note: For this filter to perform as expected, you must scope the flavor extra specs by prefixing the extra_specs key with the aggregate_instance_extra_specs: namespace.
- AggregateImagePropertiesIsolation: Add this value to filter Compute nodes by host aggregate metadata that match image metadata properties.
Note: To filter host aggregate metadata by using image metadata properties, the host aggregate metadata key must match a valid image metadata property. For information about valid image metadata properties, see Image metadata.
- AvailabilityZoneFilter: Add this value to filter by availability zone when launching an instance.
Note: Instead of using the AvailabilityZoneFilter Compute scheduler service filter, you can use the Placement service to process availability zone requests. For more information, see Filtering by availability zone using the Placement service.
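For example, appending these values to the default filter list results in a setting similar to the following sketch; the exact list depends on the other filters that your deployment uses:
parameter_defaults:
  NovaSchedulerDefaultFilters:
    - AvailabilityZoneFilter
    - ComputeFilter
    - ComputeCapabilitiesFilter
    - ImagePropertiesFilter
    - ServerGroupAntiAffinityFilter
    - ServerGroupAffinityFilter
    - AggregateInstanceExtraSpecsFilter
    - AggregateImagePropertiesIsolation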
- Save the updates to your Compute environment file.
Add your Compute environment file to the stack with your other environment files and deploy the overcloud:
(undercloud)$ openstack overcloud deploy --templates \ -e [your environment files] \ -e /home/stack/templates/<compute_environment_file>.yaml
4.2. Creating a host aggregate
As a cloud administrator, you can create as many host aggregates as you require.
Procedure
To create a host aggregate, enter the following command:
(overcloud)# openstack aggregate create <aggregate_name>
Replace
<aggregate_name>
with the name you want to assign to the host aggregate.Add metadata to the host aggregate:
(overcloud)# openstack aggregate set \ --property <key=value> \ --property <key=value> \ <aggregate_name>
-
Replace
<key=value>
with the metadata key-value pair. If you are using theAggregateInstanceExtraSpecsFilter
filter, the key can be any arbitrary string, for example,ssd=true
. If you are using theAggregateImagePropertiesIsolation
filter, the key must match a valid image metadata property. For more information about valid image metadata properties, see Image metadata. -
Replace
<aggregate_name>
with the name of the host aggregate.
-
Replace
Add the Compute nodes to the host aggregate:
(overcloud)# openstack aggregate add host \ <aggregate_name> \ <host_name>
-
Replace
<aggregate_name>
with the name of the host aggregate to add the Compute node to. -
Replace
<host_name>
with the name of the Compute node to add to the host aggregate.
-
Replace
Create a flavor or image for the host aggregate:
Create a flavor:
(overcloud)$ openstack flavor create \ --ram <size-mb> \ --disk <size-gb> \ --vcpus <no_reserved_vcpus> \ host-agg-flavor
Create an image:
(overcloud)$ openstack image create host-agg-image
Set one or more key-value pairs on the flavor or image that match the key-value pairs on the host aggregate.
To set the key-value pairs on a flavor, use the scope
aggregate_instance_extra_specs
:(overcloud)# openstack flavor set \ --property aggregate_instance_extra_specs:ssd=true \ host-agg-flavor
To set the key-value pairs on an image, use valid image metadata properties as the key:
(overcloud)# openstack image set \ --property os_type=linux \ host-agg-image
4.3. Creating an availability zone
As a cloud administrator, you can create an availability zone that cloud users can select when they create an instance.
Procedure
To create an availability zone, you can create a new availability zone host aggregate, or make an existing host aggregate an availability zone:
To create a new availability zone host aggregate, enter the following command:
(overcloud)# openstack aggregate create \ --zone <availability_zone> \ <aggregate_name>
-
Replace
<availability_zone>
with the name you want to assign to the availability zone. -
Replace
<aggregate_name>
with the name you want to assign to the host aggregate.
-
Replace
To make an existing host aggregate an availability zone, enter the following command:
(overcloud)# openstack aggregate set --zone <availability_zone> \ <aggregate_name>
-
Replace
<availability_zone>
with the name you want to assign to the availability zone. -
Replace
<aggregate_name>
with the name of the host aggregate.
-
Replace
Optional: Add metadata to the availability zone:
(overcloud)# openstack aggregate set --property <key=value> \ <aggregate_name>
-
Replace
<key=value>
with your metadata key-value pair. You can add as many key-value properties as required. -
Replace
<aggregate_name>
with the name of the availability zone host aggregate.
-
Replace
Add Compute nodes to the availability zone host aggregate:
(overcloud)# openstack aggregate add host <aggregate_name> \ <host_name>
-
Replace
<aggregate_name>
with the name of the availability zone host aggregate to add the Compute node to. -
Replace
<host_name>
with the name of the Compute node to add to the availability zone.
-
Replace
4.4. Deleting a host aggregate
To delete a host aggregate, you first remove all the Compute nodes from the host aggregate.
Procedure
To view a list of all the Compute nodes assigned to the host aggregate, enter the following command:
(overcloud)# openstack aggregate show <aggregate_name>
To remove all assigned Compute nodes from the host aggregate, enter the following command for each Compute node:
(overcloud)# openstack aggregate remove host <aggregate_name> \ <host_name>
-
Replace
<aggregate_name>
with the name of the host aggregate to remove the Compute node from. -
Replace
<host_name>
with the name of the Compute node to remove from the host aggregate.
-
Replace
After you remove all the Compute nodes from the host aggregate, enter the following command to delete the host aggregate:
(overcloud)# openstack aggregate delete <aggregate_name>
4.5. Creating a project-isolated host aggregate
You can create a host aggregate that is available only to specific projects. Only the projects that you assign to the host aggregate can launch instances on the host aggregate.
Project isolation uses the Placement service to filter host aggregates for each project. This process supersedes the functionality of the AggregateMultiTenancyIsolation
filter. You therefore do not need to use the AggregateMultiTenancyIsolation
filter.
Procedure
- Open your Compute environment file.
-
To schedule project instances on the project-isolated host aggregate, set the
NovaSchedulerLimitTenantsToPlacementAggregate
parameter toTrue
in the Compute environment file. Optional: To ensure that only the projects that you assign to a host aggregate can create instances on your cloud, set the
NovaSchedulerPlacementAggregateRequiredForTenants
parameter toTrue
.NoteNovaSchedulerPlacementAggregateRequiredForTenants
isFalse
by default. When this parameter isFalse
, projects that are not assigned to a host aggregate can create instances on any host aggregate.- Save the updates to your Compute environment file.
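For example, a Compute environment file that sets both parameters might contain the following sketch; enabling both values is an assumption for this example:
parameter_defaults:
  NovaSchedulerLimitTenantsToPlacementAggregate: true
  NovaSchedulerPlacementAggregateRequiredForTenants: true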
Add your Compute environment file to the stack with your other environment files and deploy the overcloud:
(undercloud)$ openstack overcloud deploy --templates \ -e [your environment files] \ -e /home/stack/templates/<compute_environment_file>.yaml \
- Create the host aggregate.
Retrieve the list of project IDs:
(overcloud)# openstack project list
Use the
filter_tenant_id<suffix>
metadata key to assign projects to the host aggregate:(overcloud)# openstack aggregate set \ --property filter_tenant_id<ID0>=<project_id0> \ --property filter_tenant_id<ID1>=<project_id1> \ ... --property filter_tenant_id<IDn>=<project_idn> \ <aggregate_name>
-
Replace
<ID0>
,<ID1>
, and all IDs up to<IDn>
with unique values for each project filter that you want to create. -
Replace
<project_id0>
,<project_id1>
, and all project IDs up to<project_idn>
with the ID of each project that you want to assign to the host aggregate. Replace
<aggregate_name>
with the name of the project-isolated host aggregate.For example, use the following syntax to assign projects
78f1
,9d3t
, andaa29
to the host aggregateproject-isolated-aggregate
:(overcloud)# openstack aggregate set \ --property filter_tenant_id0=78f1 \ --property filter_tenant_id1=9d3t \ --property filter_tenant_id2=aa29 \ project-isolated-aggregate
TipYou can create a host aggregate that is available only to a single specific project by omitting the suffix from the
filter_tenant_id
metadata key:(overcloud)# openstack aggregate set \ --property filter_tenant_id=78f1 \ single-project-isolated-aggregate
-
Replace
Additional resources
- For more information on creating a host aggregate, see Creating and managing host aggregates.
Chapter 5. Configuring instance scheduling and placement
The Compute scheduler service determines on which Compute node or host aggregate to place an instance. When the Compute (nova) service receives a request to launch or move an instance, it uses the specifications provided in the request, the flavor, and the image to find a suitable host. For example, a flavor can specify the traits an instance requires a host to have, such as the type of storage disk, or the Intel CPU instruction set extension.
The Compute scheduler service uses the configuration of the following components, in the following order, to determine on which Compute node to launch or move an instance:
- Placement service prefilters: The Compute scheduler service uses the Placement service to filter the set of candidate Compute nodes based on specific attributes. For example, the Placement service automatically excludes disabled Compute nodes.
- Filters: Used by the Compute scheduler service to determine the initial set of Compute nodes on which to launch an instance.
- Weights: The Compute scheduler service prioritizes the filtered Compute nodes using a weighting system. The highest weight has the highest priority.
In the following diagram, hosts 1 and 3 are eligible after filtering. Host 1 has the highest weight and therefore has the highest priority for scheduling.
5.1. Prefiltering using the Placement service
The Compute (nova) service interacts with the Placement service when it creates and manages instances. The Placement service tracks the inventory and usage of resource providers, such as a Compute node, a shared storage pool, or an IP allocation pool, and their available quantitative resources, such as the available vCPUs. Any service that needs to manage the selection and consumption of resources can use the Placement service.
The Placement service also tracks the mapping of available qualitative resources to resource providers, such as the type of storage disk trait a resource provider has.
The Placement service applies prefilters to the set of candidate Compute nodes based on Placement service resource provider inventories and traits. You can create prefilters based on the following criteria:
- Supported image types
- Traits
- Projects or tenants
- Availability zone
5.1.1. Filtering by requested image type support
You can exclude Compute nodes that do not support the disk format of the image used to launch an instance. This is useful when your environment uses Red Hat Ceph Storage as an ephemeral backend, which does not support QCOW2 images. Enabling this feature ensures that the scheduler does not send requests to launch instances using a QCOW2 image to Compute nodes backed by Red Hat Ceph Storage.
Procedure
- Open your Compute environment file.
- To exclude Compute nodes that do not support the disk format of the image used to launch an instance, set the NovaSchedulerQueryImageType parameter to True in the Compute environment file.
- Save the updates to your Compute environment file.
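With this parameter set, the Compute environment file contains an entry similar to the following:
parameter_defaults:
  NovaSchedulerQueryImageType: true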
Add your Compute environment file to the stack with your other environment files and deploy the overcloud:
(undercloud)$ openstack overcloud deploy --templates \ -e [your environment files] \ -e /home/stack/templates/<compute_environment_file>.yaml
5.1.2. Filtering by resource provider traits
Each resource provider has a set of traits. Traits are the qualitative aspects of a resource provider, for example, the type of storage disk, or the Intel CPU instruction set extension.
The Compute node reports its capabilities to the Placement service as traits. An instance can specify which of these traits it requires, or which traits the resource provider must not have. The Compute scheduler can use these traits to identify a suitable Compute node or host aggregate to host an instance.
To enable your cloud users to create instances on hosts that have particular traits, you can define a flavor that requires or forbids a particular trait, and you can create an image that requires or forbids a particular trait.
For a list of the available traits, see the os-traits library. You can also create custom traits, as required.
5.1.2.1. Creating an image that requires or forbids a resource provider trait
You can create an instance image that your cloud users can use to launch instances on hosts that have particular traits.
Prerequisites
- To query the placement service, install the python3-osc-placement package on the undercloud.
Procedure
Create a new image:
(overcloud)$ openstack image create ... trait-image
Identify the trait you require a host or host aggregate to have. You can select an existing trait, or create a new trait:
To use an existing trait, list the existing traits to retrieve the trait name:
(overcloud)$ openstack --os-placement-api-version 1.6 trait list
To create a new trait, enter the following command:
(overcloud)$ openstack --os-placement-api-version 1.6 trait \ create CUSTOM_TRAIT_NAME
Custom traits must begin with the prefix CUSTOM_ and contain only the letters A through Z, the numbers 0 through 9 and the underscore "_" character.
To schedule instances on a host or host aggregate that has a required trait, add the trait to the image extra specs. For example, to schedule instances on a host or host aggregate that supports AVX-512, add the following trait to the image extra specs:
(overcloud)$ openstack image set \ --property trait:HW_CPU_X86_AVX512BW=required \ trait-image
To filter out hosts or host aggregates that have a forbidden trait, add the trait to the image extra specs. For example, to prevent instances from being scheduled on a host or host aggregate that supports multi-attach volumes, add the following trait to the image extra specs:
(overcloud)$ openstack image set \ --property trait:COMPUTE_VOLUME_MULTI_ATTACH=forbidden \ trait-image
5.1.2.2. Creating a flavor that requires or forbids a resource provider trait
You can create flavors that your cloud users can use to launch instances on hosts that have particular traits.
Prerequisites
- To query the placement service, install the python3-osc-placement package on the undercloud.
Procedure
Create a flavor:
(overcloud)$ openstack flavor create --vcpus 1 --ram 512 \ --disk 2 trait-flavor
Identify the trait you require a host or host aggregate to have. You can select an existing trait, or create a new trait:
To use an existing trait, list the existing traits to retrieve the trait name:
(overcloud)$ openstack --os-placement-api-version 1.6 trait list
To create a new trait, enter the following command:
(overcloud)$ openstack --os-placement-api-version 1.6 trait \ create CUSTOM_TRAIT_NAME
Custom traits must begin with the prefix CUSTOM_ and contain only the letters A through Z, the numbers 0 through 9 and the underscore "_" character.
To schedule instances on a host or host aggregate that has a required trait, add the trait to the flavor extra specs. For example, to schedule instances on a host or host aggregate that supports AVX-512, add the following trait to the flavor extra specs:
(overcloud)$ openstack flavor set \ --property trait:HW_CPU_X86_AVX512BW=required \ trait-flavor
To filter out hosts or host aggregates that have a forbidden trait, add the trait to the flavor extra specs. For example, to prevent instances from being scheduled on a host or host aggregate that supports multi-attach volumes, add the following trait to the flavor extra specs:
(overcloud)$ openstack flavor set \ --property trait:COMPUTE_VOLUME_MULTI_ATTACH=forbidden \ trait-flavor
5.1.3. Filtering by isolating host aggregates
You can restrict scheduling on a host aggregate to only those instances whose flavor and image traits match the metadata of the host aggregate. The combination of flavor and image metadata must require all the host aggregate traits to be eligible for scheduling on Compute nodes in that host aggregate.
Prerequisites
- To query the placement service, install the python3-osc-placement package on the undercloud.
Procedure
- Open your Compute environment file.
- To isolate host aggregates to host only instances whose flavor and image traits match the aggregate metadata, set the NovaSchedulerEnableIsolatedAggregateFiltering parameter to True in the Compute environment file.
- Save the updates to your Compute environment file.
Add your Compute environment file to the stack with your other environment files and deploy the overcloud:
(undercloud)$ openstack overcloud deploy --templates \ -e [your environment files] \ -e /home/stack/templates/<compute_environment_file>.yaml
Identify the traits you want to isolate the host aggregate for. You can select an existing trait, or create a new trait:
To use an existing trait, list the existing traits to retrieve the trait name:
(overcloud)$ openstack --os-placement-api-version 1.6 trait list
To create a new trait, enter the following command:
(overcloud)$ openstack --os-placement-api-version 1.6 trait \ create CUSTOM_TRAIT_NAME
Custom traits must begin with the prefix CUSTOM_ and contain only the letters A through Z, the numbers 0 through 9 and the underscore "_" character.
Collect the existing traits of each Compute node:
(overcloud)$ traits=$(openstack --os-placement-api-version 1.6 resource provider trait list -f value <host_uuid> | sed 's/^/--trait /')
Add the traits to the resource providers for each Compute node in the host aggregate:
(overcloud)$ openstack --os-placement-api-version 1.6 \ resource provider trait set --trait $traits \ --trait CUSTOM_TRAIT_NAME \ <host_uuid>
- Repeat steps 6 and 7 for each Compute node in the host aggregate.
Add the metadata property for the trait to the host aggregate:
(overcloud)$ openstack --os-compute-api-version 2.53 aggregate set \ --property trait:TRAIT_NAME=required <aggregate_name>
Add the trait to a flavor or an image:
(overcloud)$ openstack flavor set \ --property trait:<TRAIT_NAME>=required <flavor> (overcloud)$ openstack image set \ --property trait:<TRAIT_NAME>=required <image>
5.1.4. Filtering by availability zone using the Placement service
You can use the Placement service to honor availability zone requests. To use the Placement service to filter by availability zone, placement aggregates must exist that match the membership and UUID of the availability zone host aggregates.
When using the Placement service to filter by availability zone, you can remove the AvailabilityZoneFilter filter from NovaSchedulerDefaultFilters.
Prerequisites
- To query the placement service, install the python3-osc-placement package on the undercloud.
Procedure
- Open your Compute environment file.
- To use the Placement service to filter by availability zone, set the NovaSchedulerQueryPlacementForAvailabilityZone parameter to True in the Compute environment file.
- Save the updates to your Compute environment file.
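With this parameter set, the Compute environment file contains an entry similar to the following:
parameter_defaults:
  NovaSchedulerQueryPlacementForAvailabilityZone: true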
Add your Compute environment file to the stack with your other environment files and deploy the overcloud:
(undercloud)$ openstack overcloud deploy --templates \ -e [your environment files] \ -e /home/stack/templates/<compute_environment_file>.yaml
- Create the host aggregate to use as an availability zone.
Add the availability zone host aggregate as a resource provider to the placement aggregate:
$ openstack --os-placement-api-version=1.2 resource provider \ aggregate set --aggregate <az_agg_uuid> <resource_provider_uuid>
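If you need to look up the UUIDs, the following commands sketch one way to create the availability zone host aggregate and list the resource provider UUIDs of the Compute nodes. The aggregate name az-east-agg, the zone name az-east, and the host name are placeholder values for your environment:
(overcloud)$ openstack aggregate create --zone az-east az-east-agg
(overcloud)$ openstack aggregate add host az-east-agg <compute_host>
(overcloud)$ openstack --os-placement-api-version 1.2 resource provider list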
Additional resources
- For more information on creating a host aggregate to use as an availability zone, see Creating an availability zone.
5.2. Configuring filters and weights for the Compute scheduler service
You need to configure the filters and weights for the Compute scheduler service to determine the initial set of Compute nodes on which to launch an instance.
Procedure
- Open your Compute environment file.
Add the filters you want the scheduler to use to the
NovaSchedulerDefaultFilters
parameter, for example:parameter_defaults: NovaSchedulerDefaultFilters: AggregateInstanceExtraSpecsFilter,RetryFilter,ComputeFilter,ComputeCapabilitiesFilter,ImagePropertiesFilter
Specify which attribute to use to calculate the weight of each Compute node, for example:
parameter_defaults: ComputeExtraConfig: nova::config::nova_config: DEFAULT/scheduler_weight_classes: value: nova.scheduler.weights.all_weighers
- Save the updates to your Compute environment file.
Add your Compute environment file to the stack with your other environment files and deploy the overcloud:
(undercloud)$ openstack overcloud deploy --templates \ -e [your environment files] \ -e /home/stack/templates/<compute_environment_file>.yaml
Additional resources
- For a list of the available Compute scheduler service filters, see Compute scheduler filters.
- For a list of the available weight configuration options, see Compute scheduler weights.
5.3. Compute scheduler filters
You configure the NovaSchedulerDefaultFilters
parameter in your Compute environment file to specify the filters the Compute scheduler must apply when selecting an appropriate Compute node to host an instance. The default configuration applies the following filters:
- AvailabilityZoneFilter: The Compute node must be in the requested availability zone.
- ComputeFilter: The Compute node can service the request.
- ComputeCapabilitiesFilter: The Compute node satisfies the flavor extra specs.
- ImagePropertiesFilter: The Compute node satisfies the requested image properties.
- ServerGroupAntiAffinityFilter: The Compute node is not already hosting an instance in a specified group.
- ServerGroupAffinityFilter: The Compute node is already hosting instances in a specified group.
You can add and remove filters. The following table describes all the available filters.
Table 5.1. Compute scheduler filters
Filter | Description |
---|---|
AggregateImagePropertiesIsolation | Use this filter to match the image metadata of an instance with host aggregate metadata. If any of the host aggregate metadata matches the metadata of the image, then the Compute nodes that belong to that host aggregate are candidates for launching instances from that image. The scheduler only recognizes valid image metadata properties. For details on valid image metadata properties, see Image metadata properties. |
AggregateInstanceExtraSpecsFilter | Use this filter to match namespaced properties defined in the flavor extra specs of an instance with host aggregate metadata.
You must scope your flavor extra specs by prefixing them with the aggregate_instance_extra_specs: namespace. If any of the host aggregate metadata matches the metadata of the flavor extra spec, then the Compute nodes that belong to that host aggregate are candidates for launching instances with that flavor. |
AggregateIoOpsFilter |
Use this filter to filter hosts by I/O operations with a per-aggregate max_io_ops_per_host value. |
AggregateMultiTenancyIsolation |
Use this filter to limit the availability of Compute nodes in project-isolated host aggregates to a specified set of projects. Only projects specified by using the filter_tenant_id metadata key of the host aggregate can launch instances on Compute nodes in the host aggregate. Note
The project can still place instances on other hosts. To restrict this, use the NovaSchedulerPlacementAggregateRequiredForTenants parameter. |
AggregateNumInstancesFilter |
Use this filter to limit the number of instances each Compute node in an aggregate can host. You can configure the maximum number of instances per-aggregate by using the max_instances_per_host parameter. |
AggregateTypeAffinityFilter |
Use this filter to pass hosts if no flavor metadata key is set, or the flavor aggregate metadata value contains the name of the requested flavor. The value of the flavor metadata entry is a string that may contain either a single flavor name or a comma-separated list of flavor names, such as m1.nano or m1.nano,m1.small. |
AllHostsFilter | Use this filter to consider all available Compute nodes for instance scheduling. Note Using this filter does not disable other filters. |
AvailabilityZoneFilter | Use this filter to launch instances on a Compute node in the availability zone specified by the instance. |
ComputeCapabilitiesFilter |
Use this filter to match namespaced properties defined in the flavor extra specs of an instance against the Compute node capabilities. You must prefix the flavor extra specs with the capabilities: namespace.
A more efficient alternative to the ComputeCapabilitiesFilter filter is to use CPU traits in your flavors, which are reported to the Placement service. For more information, see Filtering by using resource provider traits. |
ComputeFilter | Use this filter to pass all Compute nodes that are operational and enabled. This filter should always be present. |
DifferentHostFilter |
Use this filter to enable scheduling of an instance on a different Compute node from a set of specific instances. To specify these instances when launching an instance, use the --hint argument with different_host as the key and the instance UUIDs as the values: $ openstack server create --image cedef40a-ed67-4d10-800e-17455edce175 \ --flavor 1 --hint different_host=a0cf03a5-d921-4877-bb5c-86d26cf818e1 \ --hint different_host=8c19174f-4220-44f0-824a-cd1eeef10287 server-1 |
ImagePropertiesFilter | Use this filter to filter Compute nodes based on the following properties defined on the instance image: hw_architecture, img_hv_type, img_hv_requested_version, and hw_vm_mode.
Compute nodes that can support the specified image properties contained in the instance are passed to the scheduler. For more information on image properties, see Image metadata properties. |
IsolatedHostsFilter |
Use this filter to only schedule instances with isolated images on isolated Compute nodes. You can also prevent non-isolated images from being used to build instances on isolated Compute nodes by configuring filter_scheduler/restrict_isolated_hosts_to_isolated_images.
To specify the isolated set of images and hosts, use the filter_scheduler/isolated_hosts and filter_scheduler/isolated_images configuration options: parameter_defaults: ComputeExtraConfig: nova::config::nova_config: filter_scheduler/isolated_hosts: value: server1, server2 filter_scheduler/isolated_images: value: 342b492c-128f-4a42-8d3a-c5088cf27d13, ebd267a6-ca86-4d6c-9a0e-bd132d6b7d09 |
IoOpsFilter |
Use this filter to filter out hosts that have concurrent I/O operations that exceed the configured max_io_ops_per_host value. |
MetricsFilter |
Use this filter to limit scheduling to Compute nodes that report the metrics configured by using metrics/weight_setting. To use this filter, add the following configuration to your Compute environment file: parameter_defaults: ComputeExtraConfig: nova::config::nova_config: DEFAULT/compute_monitors: value: 'cpu.virt_driver'
By default, the Compute scheduler service updates the metrics every 60 seconds. To ensure the metrics are up-to-date, you can increase the frequency at which the metrics data is refreshed by using the update_resources_interval configuration option. For example, the following configuration refreshes the metrics data every 2 seconds: parameter_defaults: ComputeExtraConfig: nova::config::nova_config: DEFAULT/update_resources_interval: value: '2' |
NUMATopologyFilter |
Use this filter to schedule instances with a NUMA topology on NUMA-capable Compute nodes. Use flavor extra specs and image properties to define the requested NUMA topology for the instance. |
NumInstancesFilter |
Use this filter to filter out Compute nodes that have more instances running than specified by the max_instances_per_host option. |
PciPassthroughFilter |
Use this filter to schedule instances on Compute nodes that have the devices that the instance requests by using the flavor extra_specs. Use this filter if you want to reserve nodes with PCI devices, which are typically expensive and limited, for instances that request them. |
RetryFilter |
Deprecated. Use this filter to filter out Compute nodes that have failed a scheduling attempt. This filter is valid when scheduler/max_attempts is greater than 1. |
SameHostFilter |
Use this filter to enable scheduling of an instance on the same Compute node as a set of specific instances. To specify these instances when launching an instance, use the --hint argument with same_host as the key and the instance UUIDs as the values: $ openstack server create --image cedef40a-ed67-4d10-800e-17455edce175 \ --flavor 1 --hint same_host=a0cf03a5-d921-4877-bb5c-86d26cf818e1 \ --hint same_host=8c19174f-4220-44f0-824a-cd1eeef10287 server-1 |
ServerGroupAffinityFilter | Use this filter to schedule instances in an affinity server group on the same Compute node. To create the server group, enter the following command: $ openstack server group create --policy affinity <group-name>
To launch an instance in this group, use the --hint argument to pass the server group UUID: $ openstack server create --image <image> \ --flavor <flavor> \ --hint group=<group-uuid> <vm-name> |
ServerGroupAntiAffinityFilter | Use this filter to schedule instances that belong to an anti-affinity server group on different Compute nodes. To create the server group, enter the following command: $ openstack server group create --policy anti-affinity <group-name>
To launch an instance in this group, use the --hint argument to pass the server group UUID: $ openstack server create --image <image> \ --flavor <flavor> \ --hint group=<group-uuid> <vm-name> |
SimpleCIDRAffinityFilter |
Use this filter to schedule instances on Compute nodes that have a specific IP subnet range. To specify the required range, use the --hint argument to pass the keys build_near_host_ip and cidr: $ openstack server create --image <image> \ --flavor <flavor> \ --hint build_near_host_ip=<ip-address> \ --hint cidr=<subnet-mask> <vm-name> |
5.4. Compute scheduler weights
Each Compute node has a weight that the scheduler can use to prioritize instance scheduling. After the scheduler applies the filters, it selects the Compute node with the largest weight from the remaining candidate Compute nodes.
Each weigher has a multiplier that the scheduler applies after normalizing the weight of the Compute node. The Compute scheduler uses the following formula to calculate the weight of a Compute node:
w1_multiplier * norm(w1) + w2_multiplier * norm(w2) + ...
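For example, assuming only the RAM and vCPU weighers are enabled, each with a multiplier of 1.0, a Compute node with a normalized RAM weight of 1.0 and a normalized vCPU weight of 0.5 receives a total weight of 1.0 * 1.0 + 1.0 * 0.5 = 1.5, and is preferred over any candidate node with a lower total.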
The following table describes the available configuration options for weights.
Weights can be set on host aggregates using the aggregate metadata key with the same name as the options detailed in the following table. If set on the host aggregate, the host aggregate value takes precedence.
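For example, the following command sketches how you might prefer to stack instances on the hosts in one host aggregate by setting a negative RAM weigher multiplier as aggregate metadata. The aggregate name is a placeholder and the value is illustrative only:
(overcloud)$ openstack aggregate set --property ram_weight_multiplier=-1.0 <aggregate_name>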
Table 5.2. Compute scheduler weights
Configuration option | Type | Description |
---|---|---|
filter_scheduler/weight_classes | String | Use this parameter to configure which of the following attributes to use for calculating the weight of each Compute node:
nova.scheduler.weights.all_weighers - (Default) Uses all the available weighers. You can also specify individual weigher classes, such as nova.scheduler.weights.ram.RAMWeigher and nova.scheduler.weights.cpu.CPUWeigher. |
| Floating point | Use this parameter to specify the multiplier to use to weigh hosts based on the available RAM. Set to a positive value to prefer hosts with more available RAM, which spreads instances across many hosts. Set to a negative value to prefer hosts with less available RAM, which fills up (stacks) hosts as much as possible before scheduling to a less-used host. The absolute value, whether positive or negative, controls how strong the RAM weigher is relative to other weighers. Default: 1.0 - The scheduler spreads instances across all hosts evenly. |
| Floating point | Use this parameter to specify the multiplier to use to weigh hosts based on the available disk space. Set to a positive value to prefer hosts with more available disk space, which spreads instances across many hosts. Set to a negative value to prefer hosts with less available disk space, which fills up (stacks) hosts as much as possible before scheduling to a less-used host. The absolute value, whether positive or negative, controls how strong the disk weigher is relative to other weighers. Default: 1.0 - The scheduler spreads instances across all hosts evenly. |
| Floating point | Use this parameter to specify the multiplier to use to weigh hosts based on the available vCPUs. Set to a positive value to prefer hosts with more available vCPUs, which spreads instances across many hosts. Set to a negative value to prefer hosts with less available vCPUs, which fills up (stacks) hosts as much as possible before scheduling to a less-used host. The absolute value, whether positive or negative, controls how strong the vCPU weigher is relative to other weighers. Default: 1.0 - The scheduler spreads instances across all hosts evenly. |
| Floating point | Use this parameter to specify the multiplier to use to weigh hosts based on the host workload. Set to a negative value to prefer hosts with lighter workloads, which distributes the workload across more hosts. Set to a positive value to prefer hosts with heavier workloads, which schedules instances onto hosts that are already busy. The absolute value, whether positive or negative, controls how strong the I/O operations weigher is relative to other weighers. Default: -1.0 - The scheduler distributes the workload across more hosts. |
| Floating point | Use this parameter to specify the multiplier to use to weigh hosts based on recent build failures. Set to a positive value to increase the significance of build failures recently reported by the host. Hosts with recent build failures are then less likely to be chosen.
Set to 0.0 to disable weighing based on recent build failures. Default: 1000000.0 |
| Floating point | Use this parameter to specify the multiplier to use to weigh hosts during a cross-cell move. This option determines how much weight is placed on a host which is within the same source cell when moving an instance. By default, the scheduler prefers hosts within the same source cell when migrating an instance. Set to a positive value to prefer hosts within the same cell the instance is currently running. Set to a negative value to prefer hosts located in a different cell from that where the instance is currently running. Default: 1000000.0 |
| Positive floating point | Use this parameter to specify the multiplier to use to weigh hosts based on the number of PCI devices on the host and the number of PCI devices requested by an instance. If an instance requests PCI devices, then the more PCI devices a Compute node has the higher the weight allocated to the Compute node. For example, if there are three hosts available, one with a single PCI device, one with multiple PCI devices and one without any PCI devices, then the Compute scheduler prioritizes these hosts based on the demands of the instance. The scheduler should prefer the first host if the instance requests one PCI device, the second host if the instance requires multiple PCI devices and the third host if the instance does not request a PCI device. Configure this option to prevent non-PCI instances from occupying resources on hosts with PCI devices. Default: 1.0 |
| Integer | Use this parameter to specify the size of the subset of filtered hosts from which to select the host. You must set this option to at least 1. A value of 1 selects the first host returned by the weighing functions. The scheduler ignores any value less than 1 and uses 1 instead. Set to a value greater than 1 to prevent multiple scheduler processes handling similar requests selecting the same host, creating a potential race condition. By selecting a host randomly from the N hosts that best fit the request, the chance of a conflict is reduced. However, the higher you set this value, the less optimal the chosen host may be for a given request. Default: 1 |
| Positive floating point | Use this parameter to specify the multiplier to use to weigh hosts for group soft-affinity. Note You need to specify the microversion when creating a group with this policy: $ openstack --os-compute-api-version 2.15 server group create --policy soft-affinity <group-name> Default: 1.0 |
filter_scheduler/soft_anti_affinity_weight_multiplier | Positive floating point | Use this parameter to specify the multiplier to use to weigh hosts for group soft-anti-affinity. Note You need to specify the microversion when creating a group with this policy: $ openstack --os-compute-api-version 2.15 server group create --policy soft-anti-affinity <group-name> Default: 1.0 |
metrics/weight_multiplier | Floating point |
Use this parameter to specify the multiplier to use for weighting metrics. By default, weight_multiplier=1.0. Set to a number greater than 1.0 to increase the effect of the metric on the overall weight. Set to a number between 0.0 and 1.0 to reduce the effect of the metric on the overall weight.
Set to 0.0 to ignore the metric value and return the value of the weight_of_unavailable option. Set to a negative number to prioritize the host with lower metrics, and stack instances in hosts. Default: 1.0 |
metrics/weight_setting |
Comma-separated list of metric=ratio pairs | Use this parameter to specify the metrics to use for weighting, and the ratio to use to calculate the weight of each metric. Valid metric names: cpu.frequency, cpu.user.time, cpu.kernel.time, cpu.idle.time, cpu.iowait.time, cpu.user.percent, cpu.kernel.percent, cpu.idle.percent, cpu.iowait.percent, and cpu.percent.
Example: weight_setting=cpu.user.time=1.0 |
metrics/required | Boolean |
Use this parameter to specify how to handle configured metrics/weight_setting metrics that are unavailable:
True - The metrics are required. If a metric is unavailable, an exception is raised. To avoid the exception, use the MetricsFilter filter in NovaSchedulerDefaultFilters. False - The unavailable metric is treated as a negative factor in the weighing process. The returned value is set by the weight_of_unavailable option. |
metrics/weight_of_unavailable | Floating point |
Use this parameter to specify the weight to use if any metrics/weight_setting metric is unavailable, and metrics/required is False. Default: -10000.0 |
Chapter 6. Creating flavors for launching instances
An instance flavor is a resource template that specifies the virtual hardware profile for the instance. Cloud users must specify a flavor when they launch an instance.
A flavor can specify the quantity of the following resources the Compute service must allocate to an instance:
- The number of vCPUs.
- The RAM, in MB.
- The root disk, in GB.
- The virtual storage, including secondary ephemeral storage and swap disk.
You can specify who can use flavors by making the flavor public to all projects, or private to specific projects or domains.
Flavors can use metadata, also referred to as "extra specs", to specify instance hardware support and quotas. The flavor metadata influences the instance placement, resource usage limits, and performance. For a complete list of available metadata properties, see Flavor metadata.
You can also use the flavor metadata keys to find a suitable host aggregate to host the instance, by matching the extra_specs
metadata set on the host aggregate. To schedule an instance on a host aggregate, you must scope the flavor metadata by prefixing the extra_specs
key with the aggregate_instance_extra_specs:
namespace. For more information, see Creating and managing host aggregates.
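As an illustrative sketch, the following commands set an ssd key on a hypothetical host aggregate and scope the matching flavor extra spec with the aggregate_instance_extra_specs: namespace. The aggregate name, flavor name, and the ssd key are placeholder values, and matching on aggregate metadata requires the AggregateInstanceExtraSpecsFilter filter to be enabled:
(overcloud)$ openstack aggregate set --property ssd=true ssd-hosts-agg
(overcloud)$ openstack flavor set --property aggregate_instance_extra_specs:ssd=true ssd-flavor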
A Red Hat OpenStack Platform (RHOSP) deployment includes the following set of default public flavors that your cloud users can use.
Table 6.1. Default Flavors
Name | vCPUs | RAM | Root Disk Size |
---|---|---|---|
m1.nano | 1 | 128 MB | 1 GB |
m1.micro | 1 | 192 MB | 1 GB |
Behavior set using flavor properties overrides behavior set using images. When a cloud user launches an instance, the properties of the flavor they specify override the properties of the image they specify.
6.1. Creating a flavor
You can create and manage specialized flavors for specific functionality or behaviors, for example:
- Change default memory and capacity to suit the underlying hardware needs.
- Add metadata to force a specific I/O rate for the instance or to match a host aggregate.
Procedure
Create a flavor that specifies the basic resources to make available to an instance:
(overcloud)$ openstack flavor create --ram <size-mb> \ --disk <size-gb> --vcpus <no-vcpus> <flavor-name>
-
Replace
<size-mb>
with the size of RAM to allocate to an instance created with this flavor. -
Replace
<size-gb>
with the size of root disk to allocate to an instance created with this flavor. -
Replace
<no-vcpus>
with the number of vCPUs to reserve for an instance created with this flavor. Replace
<flavor-name>
with a unique name for your flavor.For more information about the arguments, see Flavor arguments.
-
Replace
Optional: To specify flavor metadata, set the required properties by using key-value pairs:
(overcloud)$ openstack flavor set \ --property <key=value> --property <key=value> ... <flavor-name>
-
Replace
<key>
with the metadata key of the property you want to allocate to an instance that is created with this flavor. For a list of available metadata keys, see Flavor metadata. -
Replace
<value>
with the value of the metadata key you want to allocate to an instance that is created with this flavor. Replace
<flavor-name>
with the name of your flavor.For example, an instance that is launched by using the following flavor has two CPU sockets, each with two CPUs:
(overcloud)$ openstack flavor set \ --property hw:cpu_sockets=2 \ --property hw:cpu_cores=2 processor_topology_flavor
-
Replace
Optional: To make the flavor accessible only by a particular project or group of users, set the following attributes:
(overcloud)$ openstack flavor set --private --project <project-id> <flavor-name>
-
Replace
<project-id>
with the ID of the project that can use this flavor to create instances. -
Replace
<flavor-name>
with the name of your flavor.
-
Replace
6.2. Flavor arguments
The openstack flavor create
command has one positional argument, <flavor-name>
, to specify the name of your new flavor.
The following table details the optional arguments that you can specify as required when you create a new flavor.
Table 6.2. Optional flavor arguments
Optional argument | Description |
---|---|
--id |
Unique ID for the flavor. The default value, auto, generates a UUID4 value. |
--ram | (Mandatory) Size of memory to make available to the instance, in MB. Default: 256 MB |
--disk | (Mandatory) Amount of disk space to use for the root (/) partition, in GB. The root disk is an ephemeral disk that the base image is copied into. When an instance boots from a persistent volume, the root disk is not used. Note
Creation of an instance with a flavor that has --disk set to 0 requires the instance to boot from a volume. Default: 0 GB |
--ephemeral | Amount of disk space to use for the ephemeral disks, in GB. Defaults to 0 GB, which means that no secondary ephemeral disk is created. Ephemeral disks offer machine local disk storage linked to the lifecycle of the instance. Ephemeral disks are not included in any snapshots. This disk is destroyed and all data is lost when the instance is deleted. Default: 0 GB |
--swap | Swap disk size in MB. Default: 0 MB |
--vcpus | (Mandatory) Number of virtual CPUs for the instance. Default: 1 |
--public | The flavor is available to all projects. By default, a flavor is public and available to all projects. |
--private |
The flavor is only available to the projects specified by using the --project option. |
| Metadata, or "extra specs", specified by using key-value pairs in the following format:
Repeat this option to set multiple properties. |
--project |
Specifies the project that can use the private flavor. You must use this argument with the --private option. Repeat this option to allow access to multiple projects. |
--project-domain |
Specifies the project domain that can use the private flavor. You must use this argument with the --private option. Repeat this option to allow access to multiple project domains. |
--description | Description of the flavor. Limited to 65535 characters in length. You can use only printable characters. |
6.3. Flavor metadata
Use the --property
option to specify flavor metadata when you create a flavor. Flavor metadata is also referred to as “extra specs”. Flavor metadata determines instance hardware support and quotas, which influence instance placement, instance limits, and performance.
Instance resource usage
Use the property keys in the following table to configure limits on CPU, memory and disk I/O usage by instances.
Table 6.3. Flavor metadata for resource usage
Key | Description |
---|---|
quota:cpu_shares |
Specifies the proportional weighted share of CPU time for the domain. Defaults to the OS provided defaults. The Compute scheduler weighs this value relative to the setting of this property on other instances in the same domain. For example, an instance that is configured with quota:cpu_shares=2048 is allocated double the CPU time of an instance that is configured with quota:cpu_shares=1024. |
quota:cpu_period |
Specifies the period of time within which to enforce the cpu_quota, in microseconds. Within a period, each vCPU cannot consume more than cpu_quota of runtime. |
quota:cpu_quota |
Specifies the maximum allowed bandwidth for the vCPU in each cpu_period.
You can use cpu_quota and cpu_period to ensure that all vCPUs run at the same speed. For example: $ openstack flavor set cpu_limits_flavor \ --property quota:cpu_quota=10000 \ --property quota:cpu_period=20000 |
Instance disk tuning
Use the property keys in the following table to tune the instance disk performance.
Table 6.4. Flavor metadata for disk tuning
Key | Description |
---|---|
quota:disk_read_bytes_sec | Specifies the maximum disk reads available to an instance, in bytes per second. |
quota:disk_read_iops_sec | Specifies the maximum disk reads available to an instance, in IOPS. |
quota:disk_write_bytes_sec | Specifies the maximum disk writes available to an instance, in bytes per second. |
quota:disk_write_iops_sec | Specifies the maximum disk writes available to an instance, in IOPS. |
quota:disk_total_bytes_sec | Specifies the maximum I/O operations available to an instance, in bytes per second. |
quota:disk_total_iops_sec | Specifies the maximum I/O operations available to an instance, in IOPS. |
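For example, the following command is a sketch that limits instances launched with a hypothetical flavor named iops-limited to 500 read IOPS and 500 write IOPS:
(overcloud)$ openstack flavor set \ --property quota:disk_read_iops_sec=500 \ --property quota:disk_write_iops_sec=500 iops-limited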
Instance network traffic bandwidth
Use the property keys in the following table to configure bandwidth limits on the instance network traffic by configuring the VIF I/O options.
The quota:vif_*
properties are deprecated. Instead, you should use the Networking (neutron) service Quality of Service (QoS) policies. For more information about QoS policies, see Configuring Quality of Service (QoS) policies in the Networking Guide. The quota:vif_*
properties are only supported when you use the ML2/OVS mechanism driver with NeutronOVSFirewallDriver
set to iptables_hybrid
.
Table 6.5. Flavor metadata for bandwidth limits
Key | Description |
---|---|
quota:vif_inbound_average | (Deprecated) Specifies the required average bit rate on the traffic incoming to the instance, in kbps. |
quota:vif_inbound_burst | (Deprecated) Specifies the maximum amount of incoming traffic that can be burst at peak speed, in KB. |
quota:vif_inbound_peak | (Deprecated) Specifies the maximum rate at which the instance can receive incoming traffic, in kbps. |
quota:vif_outbound_average | (Deprecated) Specifies the required average bit rate on the traffic outgoing from the instance, in kbps. |
quota:vif_outbound_burst | (Deprecated) Specifies the maximum amount of outgoing traffic that can be burst at peak speed, in KB. |
quota:vif_outbound_peak | (Deprecated) Specifies the maximum rate at which the instance can send outgoing traffic, in kbps. |
Hardware video RAM
Use the property key in the following table to configure limits on the instance RAM to use for video devices.
Table 6.6. Flavor metadata for video devices
Key | Description |
---|---|
hw_video:ram_max_mb |
Specifies the maximum RAM to use for video devices, in MB. Use with the hw_video_ram image property. hw_video_ram must be less than or equal to the value of hw_video:ram_max_mb. |
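For example, assuming an image that sets the hw_video_ram image property and a hypothetical flavor named video-flavor, the following command caps the video RAM at 64 MB:
(overcloud)$ openstack flavor set --property hw_video:ram_max_mb=64 video-flavor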
Watchdog behavior
Use the property key in the following table to enable the virtual hardware watchdog device on the instance.
Table 6.7. Flavor metadata for watchdog behavior
Key | Description |
---|---|
hw:watchdog_action |
Specify to enable the virtual hardware watchdog device and set its behavior. Watchdog devices perform the configured action if the instance hangs or fails. The watchdog uses the i6300esb device, which emulates a PCI Intel 6300ESB. If hw:watchdog_action is not specified, the watchdog is disabled. Set to one of the following valid values:
disabled - (Default) The device is not attached. reset - Force instance reset. poweroff - Force the instance to shut down. pause - Pause the instance. none - Enable the watchdog, but do nothing if the instance hangs or fails. |
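For example, the following command is a sketch that forces a reset of instances launched with a hypothetical flavor named watchdog-flavor when they hang:
(overcloud)$ openstack flavor set --property hw:watchdog_action=reset watchdog-flavor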
Random number generator
Use the property keys in the following table to enable the random number generator device on the instance.
Table 6.8. Flavor metadata for random number generator
Key | Description |
---|---|
hw_rng:allowed |
Set to True to enable the random number generator device that is added to the instance through its image properties. |
hw_rng:rate_bytes | Specifies the maximum number of bytes that the instance can read from the entropy of the host, per period. |
hw_rng:rate_period | Specifies the duration of the read period in milliseconds. |
Virtual Performance Monitoring Unit (vPMU)
Use the property key in the following table to enable the vPMU for the instance.
Table 6.9. Flavor metadata for vPMU
Key | Description |
---|---|
hw:pmu |
Set to True to enable a virtual performance monitoring unit (vPMU) for the instance.
Tools such as perf use the vPMU on the instance to provide more accurate information to profile and monitor instance performance. For real-time workloads, the emulation of a vPMU can introduce additional latency, which might be undesirable. If the telemetry it provides is not required, set hw:pmu=False. |
Instance CPU topology
Use the property keys in the following table to define the topology of the processors in the instance.
Table 6.10. Flavor metadata for CPU topology
Key | Description |
---|---|
hw:cpu_sockets | Specifies the preferred number of sockets for the instance. Default: the number of vCPUs requested |
hw:cpu_cores | Specifies the preferred number of cores per socket for the instance.
Default: 1 |
hw:cpu_threads | Specifies the preferred number of threads per core for the instance.
Default: 1 |
hw:cpu_max_sockets | Specifies the maximum number of sockets that users can select for their instances by using image properties.
Example: hw:cpu_max_sockets=2 |
hw:cpu_max_cores | Specifies the maximum number of cores per socket that users can select for their instances by using image properties. |
hw:cpu_max_threads | Specifies the maximum number of threads per core that users can select for their instances by using image properties. |
Serial ports
Use the property key in the following table to configure the number of serial ports per instance.
Table 6.11. Flavor metadata for serial ports
Key | Description |
---|---|
hw:serial_port_count | Maximum serial ports per instance. |
CPU pinning policy
By default, instance virtual CPUs (vCPUs) are sockets with one core and one thread. You can use properties to create flavors that pin the vCPUs of instances to the physical CPU cores (pCPUs) of the host. You can also configure the behavior of hardware CPU threads in a simultaneous multithreading (SMT) architecture where one or more cores have thread siblings.
Use the property keys in the following table to define the CPU pinning policy of the instance.
Table 6.12. Flavor metadata for CPU pinning
Key | Description |
---|---|
hw:cpu_policy | Specifies the CPU policy to use. Set to one of the following valid values:
shared - (Default) The instance vCPUs float across host pCPUs. dedicated - Pin the instance vCPUs to a set of host pCPUs. This creates an instance CPU topology that matches the topology of the CPUs to which the instance is pinned. |
hw:cpu_thread_policy |
Specifies the CPU thread policy to use when hw:cpu_policy=dedicated. Set to one of the following valid values:
prefer - (Default) The host might or might not have an SMT architecture. If an SMT architecture is present, the Compute scheduler gives preference to thread siblings. require - The host must have an SMT architecture and enough CPU cores with available thread siblings, otherwise scheduling fails. isolate - The host must not have an SMT architecture, or must emulate a non-SMT architecture. |
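For example, the following command sketches a hypothetical flavor named pinned-flavor that pins instance vCPUs to dedicated pCPUs and requires thread siblings on SMT hosts:
(overcloud)$ openstack flavor set \ --property hw:cpu_policy=dedicated \ --property hw:cpu_thread_policy=require pinned-flavor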
Instance PCI NUMA affinity policy
Use the property key in the following table to create flavors that specify the NUMA affinity policy for PCI passthrough devices and SR-IOV interfaces.
Table 6.13. Flavor metadata for PCI NUMA affinity policy
Key | Description |
---|---|
hw:pci_numa_affinity_policy | Specifies the NUMA affinity policy for PCI passthrough devices and SR-IOV interfaces. Set to one of the following valid values:
required - The Compute service creates an instance that requests a PCI device only when at least one of the NUMA nodes of the instance has affinity with the PCI device. preferred - The Compute service uses a best-effort selection of PCI devices based on NUMA affinity. legacy - (Default) The instance must have affinity with the PCI device when NUMA affinity information is available; PCI devices that do not report NUMA affinity information are also allowed. |
Instance NUMA topology
You can use properties to create flavors that define the host NUMA placement for the instance vCPU threads, and the allocation of instance vCPUs and memory from the host NUMA nodes.
Defining a NUMA topology for the instance improves the performance of the instance OS for flavors whose memory and vCPU allocations are larger than the size of NUMA nodes in the Compute hosts.
The Compute scheduler uses these properties to determine a suitable host for the instance. For example, a cloud user launches an instance by using the following flavor:
$ openstack flavor set numa_top_flavor \ --property hw:numa_nodes=2 \ --property hw:numa_cpus.0=0,1,2,3,4,5 \ --property hw:numa_cpus.1=6,7 \ --property hw:numa_mem.0=3072 \ --property hw:numa_mem.1=1024
The Compute scheduler searches for a host that has two NUMA nodes, one with 3GB of RAM and the ability to run six CPUs, and the other with 1GB of RAM and two CPUS. If a host has a single NUMA node with capability to run eight CPUs and 4GB of RAM, the Compute scheduler does not consider it a valid match.
NUMA topologies defined by a flavor cannot be overridden by NUMA topologies defined by the image. The Compute service raises an ImageNUMATopologyForbidden
error if the image NUMA topology conflicts with the flavor NUMA topology.
You cannot use this feature to constrain instances to specific host CPUs or NUMA nodes. Use this feature only after you complete extensive testing and performance measurements. You can use the hw:pci_numa_affinity_policy
property instead.
Use the property keys in the following table to define the instance NUMA topology.
Table 6.14. Flavor metadata for NUMA topology
Key | Description |
---|---|
hw:numa_nodes | Specifies the number of host NUMA nodes to restrict execution of instance vCPU threads to. If not specified, the vCPU threads can run on any number of the available host NUMA nodes. |
hw:numa_cpus.N | A comma-separated list of instance vCPUs to map to instance NUMA node N. If this key is not specified, vCPUs are evenly divided among available NUMA nodes. N starts from 0. Use *.N values with caution, and only if you have at least two NUMA nodes.
This property is valid only if you have set hw:numa_nodes. |
hw:numa_mem.N | The number of MB of instance memory to map to instance NUMA node N. If this key is not specified, memory is evenly divided among available NUMA nodes. N starts from 0. Use *.N values with caution, and only if you have at least two NUMA nodes.
This property is valid only if you have set hw:numa_nodes. |
If the combined values of hw:numa_cpus.N
or hw:numa_mem.N
are greater than the available number of CPUs or memory respectively, the Compute service raises an exception.
Instance memory encryption
Use the property key in the following table to enable encryption of instance memory.
Table 6.15. Flavor metadata for memory encryption
Key | Description |
---|---|
hw:mem_encryption |
Set to True to enable memory encryption for the instance. |
CPU real-time policy
Use the property keys in the following table to define the real-time policy of the processors in the instance.
- Although most of your instance vCPUs can run with a real-time policy, you must mark at least one vCPU as non-real-time to use for both non-real-time guest processes and emulator overhead processes.
- To use this extra spec, you must enable pinned CPUs.
Table 6.16. Flavor metadata for CPU real-time policy
Key | Description |
---|---|
hw:cpu_realtime |
Set to yes to create a flavor that assigns a real-time policy to the instance vCPUs.
Default: no |
hw:cpu_realtime_mask | Specifies the vCPUs to not assign a real-time policy to. You must prepend the mask value with a caret symbol (^). The following example indicates that all vCPUs except vCPUs 0 and 1 have a real-time policy: $ openstack flavor set <flavor> \ --property hw:cpu_realtime="yes" \ --property hw:cpu_realtime_mask=^0-1 Note
If the hw:cpu_realtime parameter is set to no, the Compute service ignores this parameter. |
Emulator threads policy
You can assign a pCPU to an instance to use for emulator threads. Emulator threads are emulator processes that are not directly related to the instance. A dedicated emulator thread pCPU is required for real-time workloads. To use the emulator threads policy, you must enable pinned CPUs by setting the following property:
--property hw:cpu_policy=dedicated
Use the property key in the following table to define the emulator threads policy of the instance.
Table 6.17. Flavor metadata for the emulator threads policy
Key | Description |
---|---|
hw:emulator_threads_policy | Specifies the emulator threads policy to use for instances. Set to one of the following valid values:
share - The emulator threads run on the pCPUs defined in the NovaComputeCpuSharedSet heat parameter. isolate - Reserves an additional dedicated pCPU per instance for emulator threads. |
Instance memory page size
Use the property keys in the following table to create an instance with an explicit memory page size.
Table 6.18. Flavor metadata for memory page size
Key | Description |
---|---|
hw:mem_page_size |
Specifies the size of large pages to use to back the instances. Use of this option creates an implicit NUMA topology of 1 NUMA node unless otherwise specified by hw:numa_nodes. Set to one of the following valid values:
large - Use the largest page size supported on the host, for example 2 MB or 1 GB on x86_64 systems. small - Use the smallest page size supported on the host, for example 4 kB on x86_64 systems. any - Use the largest available huge page size, as determined by the libvirt driver. <pagesize> - Set an explicit page size, for example 2MB, 2048, or 1GB. |
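For example, assuming the Compute nodes are configured with 1 GB huge pages, the following command is a sketch that backs instances launched with a hypothetical flavor named hugepage-flavor with 1 GB pages:
(overcloud)$ openstack flavor set --property hw:mem_page_size=1GB hugepage-flavor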
PCI passthrough
Use the property key in the following table to attach a physical PCI device, such as a graphics card or a network device, to an instance. For more information about using PCI passthrough, see Configuring PCI passthrough.
Table 6.19. Flavor metadata for PCI passthrough
Key | Description |
---|---|
pci_passthrough:alias | Specifies the PCI device to assign to an instance by using the following format: <alias>:<count>
<alias> - The alias that corresponds to a particular PCI device class, as configured for the Compute service. <count> - The number of PCI devices of type <alias> to assign to the instance. |
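For example, assuming a PCI alias named a1 is configured for the Compute service, the following command is a sketch that requests two devices of that type for a hypothetical flavor named pci-flavor:
(overcloud)$ openstack flavor set --property "pci_passthrough:alias"="a1:2" pci-flavor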
Hypervisor signature
Use the property key in the following table to hide the hypervisor signature from the instance.
Table 6.20. Flavor metadata for hiding hypervisor signature
Key | Description |
---|---|
hide_hypervisor_id |
Set to True to hide the hypervisor signature from the instance, to allow all drivers to load and work on the instance. |
Instance resource traits
Each resource provider has a set of traits. Traits are the qualitative aspects of a resource provider, for example, the type of storage disk, or the Intel CPU instruction set extension. An instance can specify which of these traits it requires.
The traits that you can specify are defined in the os-traits
library. Example traits include the following:
-
COMPUTE_TRUSTED_CERTS
-
COMPUTE_NET_ATTACH_INTERFACE_WITH_TAG
-
COMPUTE_IMAGE_TYPE_RAW
-
HW_CPU_X86_AVX
-
HW_CPU_X86_AVX512VL
-
HW_CPU_X86_AVX512CD
For details about how to use the os-traits
library, see https://docs.openstack.org/os-traits/latest/user/index.html.
Use the property key in the following table to define the resource traits of the instance.
Table 6.21. Flavor metadata for resource traits
Key | Description |
---|---|
trait:<trait_name> | Specifies Compute node traits. Set the trait to one of the following valid values: required - The Compute node selected to host the instance must have the trait. forbidden - The Compute node selected to host the instance must not have the trait.
Example: $ openstack flavor set --property trait:HW_CPU_X86_AVX512BW=required avx512-flavor |
Instance bare-metal resource class
Use the property key in the following table to request a bare-metal resource class for an instance.
Table 6.22. Flavor metadata for bare-metal resource class
Key | Description |
---|---|
resources:<resource_class_name> | Use this property to specify standard bare-metal resource classes to override the values of, or to specify custom bare-metal resource classes that the instance requires.
The standard resource classes that you can override are VCPU, MEMORY_MB, and DISK_GB.
The name of custom resource classes must start with CUSTOM_.
For example, to schedule instances on a node that has the CUSTOM_BAREMETAL_SMALL resource class, create the following flavor: $ openstack flavor set \ --property resources:CUSTOM_BAREMETAL_SMALL=1 \ --property resources:VCPU=0 --property resources:MEMORY_MB=0 \ --property resources:DISK_GB=0 compute-small |
Chapter 7. Adding metadata to instances
The Compute (nova) service uses metadata to pass configuration information to instances on launch. The instance can access the metadata by using a config drive or the metadata service.
- Config drive
- Config drives are special drives that you can attach to an instance when it boots. The config drive is presented to the instance as a read-only drive. The instance can mount this drive and read files from it to get information that is normally available through the metadata service.
- Metadata service
-
The Compute service provides the metadata service as a REST API, which can be used to retrieve data specific to an instance. Instances access this service at
169.254.169.254
or atfe80::a9fe:a9fe
.
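For example, from inside a running instance you can retrieve the instance metadata with a plain HTTP request; the exact keys that are returned depend on your deployment:
$ curl http://169.254.169.254/openstack/latest/meta_data.json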
7.1. Types of instance metadata
Cloud users, cloud administrators, and the Compute service can pass metadata to instances:
- Cloud user provided data
- Cloud users can specify additional data to use when they launch an instance, such as a shell script that the instance runs on boot. The cloud user can pass data to instances by using the user data feature, and by passing key-value pairs as required properties when creating or updating an instance.
- Cloud administrator provided data
The RHOSP administrator uses the vendordata feature to pass data to instances. The Compute service provides the vendordata modules
StaticJSON
andDynamicJSON
to allow administrators to pass metadata to instances:-
StaticJSON
: (Default) Use for metadata that is the same for all instances. -
DynamicJSON
: Use for metadata that is different for each instance. This module makes a request to an external REST service to determine what metadata to add to an instance.
Vendordata configuration is located in one of the following read-only files on the instance:
-
/openstack/{version}/vendor_data.json
-
/openstack/{version}/vendor_data2.json
-
- Compute service provided data
- The Compute service uses its internal implementation of the metadata service to pass information to the instance, such as the requested hostname for the instance, and the availability zone the instance is in. This happens by default and requires no configuration by the cloud user or administrator.
7.2. Adding a config drive to all instances
As an administrator, you can configure the Compute service to always create a config drive for instances, and populate the config drive with metadata that is specific to your deployment. For example, you might use a config drive for the following reasons:
- To pass a networking configuration when your deployment does not use DHCP to assign IP addresses to instances. You can pass the IP address configuration for the instance through the config drive, which the instance can mount and access before you configure the network settings for the instance.
- To pass data to an instance that is not known to the user starting the instance, for example, a cryptographic token to be used to register the instance with Active Directory post boot.
- To create a local cached disk read to manage the load of instance requests, which reduces the impact of instances accessing the metadata servers regularly to check in and build facts.
Any instance operating system that is capable of mounting an ISO 9660 or VFAT file system can use the config drive.
Procedure
- Open your Compute environment file.
To always attach a config drive when launching an instance, set the following parameter to
True
:parameter_defaults: ComputeExtraConfig: nova::compute::force_config_drive: 'true'
Optional: To change the format of the config drive from the default value of
iso9660
tovfat
, add theconfig_drive_format
parameter to your configuration:parameter_defaults: ComputeExtraConfig: nova::compute::force_config_drive: 'true' nova::compute::config_drive_format: vfat
- Save the updates to your Compute environment file.
Add your Compute environment file to the stack with your other environment files and deploy the overcloud:
(undercloud)$ openstack overcloud deploy --templates \ -e [your environment files] \ -e /home/stack/templates/<compute_environment_file>.yaml
Verification
Create an instance:
(overcloud)$ openstack server create --flavor m1.tiny \ --image cirros test-config-drive-instance
- Log in to the instance.
Mount the config drive:
If the instance OS uses
udev
:# mkdir -p /mnt/config # mount /dev/disk/by-label/config-2 /mnt/config
If the instance OS does not use
udev
, you need to first identify the block device that corresponds to the config drive:# blkid -t LABEL="config-2" -odevice /dev/vdb # mkdir -p /mnt/config # mount /dev/vdb /mnt/config
-
Inspect the files in the mounted config drive directory,
/mnt/config/openstack/{version}/
, for your metadata.
7.3. Adding static metadata to instances
You can make static metadata available to all instances in your deployment.
Procedure
- Create the JSON file for the metadata.
- Open your Compute environment file.
Add the path to the JSON file to your environment file:
parameter_defaults: ComputeExtraConfig: nova::config::nova_config: ... api/vendordata_jsonfile_path: value: <path_to_the_JSON_file>
- Save the updates to your Compute environment file.
Add your Compute environment file to the stack with your other environment files and deploy the overcloud:
(undercloud)$ openstack overcloud deploy --templates \ -e [your environment files] \ -e /home/stack/templates/<compute_environment_file>.yaml
7.4. Adding dynamic metadata to instances
You can configure your deployment to create instance-specific metadata, and make the metadata available to that instance through a JSON file.
You can use dynamic metadata on the undercloud to integrate director with a Red Hat Identity Management (IdM) server. An IdM server can be used as a certificate authority and manage the overcloud certificates when SSL/TLS is enabled on the overcloud. For more information, see Add the undercloud to IdM.
Procedure
- Open your Compute environment file.
Add
DynamicJSON
to the vendordata provider module:parameter_defaults: ComputeExtraConfig: nova::config::nova_config: ... api/vendordata_providers: value: StaticJSON,DynamicJSON
Specify the REST services to contact to generate the metadata. You can specify as many target REST services as required, for example:
parameter_defaults: ComputeExtraConfig: nova::config::nova_config: ... api/vendordata_providers: value: StaticJSON,DynamicJSON api/vendordata_dynamic_targets: value: target1@http://127.0.0.1:125,target2@http://127.0.0.1:126
The Compute service generates the JSON file,
vendordata2.json
, to contain the metadata retrieved from the configured target services, and stores it in the config drive directory.NoteDo not use the same name for a target service more than once.
- Save the updates to your Compute environment file.
Add your Compute environment file to the stack with your other environment files and deploy the overcloud:
(undercloud)$ openstack overcloud deploy --templates \ -e [your environment files] \ -e /home/stack/templates/<compute_environment_file>.yaml
Chapter 8. Configuring SEV-capable Compute nodes to provide memory encryption for instances
This feature is available in this release as a Technology Preview, and therefore is not fully supported by Red Hat. It should only be used for testing, and should not be deployed in a production environment. For more information about Technology Preview features, see Scope of Coverage Details.
As a cloud administrator, you can provide cloud users the ability to create instances that run on SEV-capable Compute nodes with memory encryption enabled.
To enable your cloud users to create instances that use memory encryption, you must complete the following procedures:
- Configure the Compute nodes that have the SEV-capable hardware.
- Create a SEV-enabled flavor or image for launching instances.
8.1. Secure Encrypted Virtualization (SEV)
Secure Encrypted Virtualization (SEV), provided by AMD, protects the data in DRAM that a running virtual machine instance is using. SEV encrypts the memory of each instance with a unique key.
SEV increases security when you use non-volatile memory technology (NVDIMM), because an NVDIMM chip can be physically removed from a system with the data intact, similar to a hard drive. Without encryption, any stored information such as sensitive data, passwords, or secret keys can be compromised.
For more information, see the AMD Secure Encrypted Virtualization (SEV) documentation.
Limitations of SEV-encrypted instances
- You cannot live migrate, or suspend and resume SEV-encrypted instances.
- You cannot use PCI passthrough on SEV-encrypted instances to directly access devices.
You cannot use virtio-blk as the boot disk of SEV-encrypted instances.
NoteYou can use virtio-scsi or SATA as the boot disk, or virtio-blk for non-boot disks.
- The operating system running in an encrypted instance must contain SEV support.
- Machines that support SEV have a limited number of slots in their memory controller for storing encryption keys. Each running instance with encrypted memory consumes one of these slots. Therefore, the number of SEV instances that can run concurrently is limited to the number of slots in the memory controller. For example, on AMD EPYC Zen 1 the limit is 16, and on AMD EPYC Zen 2, the limit is 255.
- Memory-encrypted instances pin pages in RAM. The Compute service cannot swap these pages, therefore you cannot safely overcommit a Compute node that hosts memory-encrypted instances.
8.2. Configuring a SEV-capable Compute node
To enable your cloud users to create instances that use memory encryption, you must configure the Compute nodes that have the SEV-capable hardware.
Prerequisites
Your deployment must include a Compute node that runs on AMD hardware capable of supporting SEV, such as an AMD EPYC CPU. You can use the following command to determine if your deployment is SEV-capable:
$ lscpu | grep sev
- Your deployment must include libvirt 4.5 or later, which includes support for SEV.
Procedure
- Open your Compute environment file.
Optional: Add the following configuration to your Compute environment file to specify the maximum number of memory-encrypted instances the SEV-capable Compute node can host concurrently:
parameter_defaults: ComputeExtraConfig: nova::config::nova_config: libvirt/num_memory_encrypted_guests: value: 15
NoteIf not set,
libvirt/num_memory_encrypted_guests
defaults tonone
, which means the SEV-capable Compute node does not impose a limit on the number of memory-encrypted instances that can be hosted concurrently. Instead, the hardware determines the maximum number of memory-encrypted instances the SEV-capable Compute node can host concurrently, which might cause some memory-encrypted instances to fail to launch.Optional: To specify that all x86_64 images use the q35 machine type by default, add the
NovaHWMachineType
parameter to the Compute environment file, and set it tox86_64=q35
.This configuration removes the need to set the
hw_machine_type
property toq35
on every SEV-enabled instance image.-
To prevent memory overcommit, set the
NovaRAMAllocationRatio
parameter to1.0
in the Compute environment file. -
To ensure that the SEV-capable Compute nodes reserve enough memory for host-level services to function, add 16MB for each potential SEV instance (the maximum number of concurrent SEV instances), to your value for
NovaReservedHostMemory
in the Compute environment file. Add the following configuration to your Compute environment file to schedule memory-encrypted instances on a SEV-capable Compute node aggregate:
parameter_defaults: ControllerExtraConfig: nova::config::nova_config: scheduler/enable_isolated_aggregate_filtering: value: 'True'
- Save the updates to your Compute environment file.
Add your Compute environment file to the stack with your other environment files and deploy the overcloud:
(undercloud)$ openstack overcloud deploy --templates \ -e [your environment files] \ -e /home/stack/templates/<compute_environment_file>.yaml
Create a host aggregate for SEV Compute nodes to ensure that instances that do not request memory encryption are not created on SEV-capable hosts:
(undercloud)$ source ~/overcloudrc (overcloud)$ openstack aggregate create sev_agg (overcloud)$ openstack aggregate add host sev_agg hostA (overcloud)$ openstack aggregate add host sev_agg hostB (overcloud)$ openstack --os-compute-api-version 2.53 aggregate set --property trait:HW_CPU_X86_AMD_SEV=required sev_agg
8.3. Creating a SEV-enabled image for instances
When the overcloud contains SEV-capable Compute nodes, you can create a SEV-enabled instance image that your cloud users can use to launch instances that have memory encryption.
Procedure
Create a new image for SEV:
(overcloud)$ openstack image create ... \ --property hw_firmware_type=uefi sev-image
NoteIf you use an existing image, the image must have the
hw_firmware_type
property set touefi
.Optional: Add the property
hw_mem_encryption=True
to the image to enable SEV memory encryption on the image:(overcloud)$ openstack image set \ --property hw_mem_encryption=True sev-image
TipYou can enable SEV memory encryption on the flavor. For more information, see Creating a SEV-enabled flavor for instances.
Optional: Set the machine type to
q35
, if not already set in the Compute node configuration:(overcloud)$ openstack image set \ --property hw_machine_type=q35 sev-image
Optional: To schedule memory-encrypted instances on a SEV-capable host aggregate, add the following trait to the image extra specs:
(overcloud)$ openstack image set \ --property trait:HW_CPU_X86_AMD_SEV=required sev-image
TipYou can also specify this trait on the flavor. For more information, see Creating a SEV-enabled flavor for instances.
8.4. Creating a SEV-enabled flavor for instances
When the overcloud contains SEV-capable Compute nodes, you can create one or more SEV-enabled flavors that your cloud users can use to launch instances that have memory encryption.
Procedure
Create a flavor for SEV:
(overcloud)$ openstack flavor create --vcpus 1 --ram 512 --disk 2 \ --property hw:mem_encryption=True m1.small-sev
To schedule memory-encrypted instances on a SEV-capable host aggregate, add the following trait to the flavor extra specs:
(overcloud)$ openstack flavor set \ --property trait:HW_CPU_X86_AMD_SEV=required m1.small-sev
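After you create the SEV-enabled flavor and image, cloud users can launch memory-encrypted instances. The network name in the following sketch is a placeholder for your environment:
(overcloud)$ openstack server create --flavor m1.small-sev \ --image sev-image --network <network> sev-instance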
Chapter 9. Configuring virtual GPUs for instances
To support GPU-based rendering on your instances, you can define and manage virtual GPU (vGPU) resources according to your available physical GPU devices and your hypervisor type. You can use this configuration to divide the rendering workloads between all your physical GPU devices more effectively, and to have more control over scheduling your vGPU-enabled instances.
To enable vGPU in the Compute (nova) service, create flavors that your cloud users can use to create Red Hat Enterprise Linux (RHEL) instances with vGPU devices. Each instance can then support GPU workloads with virtual GPU devices that correspond to the physical GPU devices.
The Compute service tracks the number of vGPU devices that are available for each GPU profile you define on each host. The Compute service schedules instances to these hosts based on the flavor, attaches the devices, and monitors usage on an ongoing basis. When an instance is deleted, the Compute service adds the vGPU devices back to the available pool.
9.1. Supported configurations and limitations
Supported GPU cards
For a list of supported NVIDIA GPU cards, see Virtual GPU Software Supported Products on the NVIDIA website.
Limitations when using vGPU devices
- You can enable only one vGPU type on each Compute node.
- Each instance can use only one vGPU resource.
- Live migration of vGPU between hosts is not supported.
- Suspend operations on a vGPU-enabled instance are not supported due to a libvirt limitation. Instead, you can snapshot or shelve the instance.
- Resize and cold migration operations on an instance with a vGPU flavor do not automatically re-allocate the vGPU resources to the instance. After you resize or migrate the instance, you must rebuild it manually to re-allocate the vGPU resources.
- By default, vGPU types on Compute hosts are not exposed to API users. To grant access, add the hosts to a host aggregate. For more information, see Creating and managing host aggregates.
- If you use NVIDIA accelerator hardware, you must comply with the NVIDIA licensing requirements. For example, NVIDIA vGPU GRID requires a licensing server. For more information about the NVIDIA licensing requirements, see NVIDIA License Server Release Notes on the NVIDIA website.
9.2. Configuring vGPU on the Compute nodes
To enable your cloud users to create instances that use a virtual GPU (vGPU), you must configure the Compute nodes that have the physical GPUs:
- Build a custom GPU-enabled overcloud image.
- Prepare the GPU role, profile, and flavor for designating Compute nodes for vGPU.
- Configure the Compute node for vGPU.
- Deploy the overcloud.
To use an NVIDIA GRID vGPU, you must comply with the NVIDIA GRID licensing requirements and you must have the URL of your self-hosted license server. For more information, see the NVIDIA License Server Release Notes web page.
9.2.1. Building a custom GPU overcloud image
Perform the following steps on the director node to install the NVIDIA GRID host driver on an overcloud Compute image and upload the image to the Image (glance) service.
Procedure
-
Log in to the undercloud as the
stack
user. Copy the overcloud image and add the
gpu
suffix to the copied image.(undercloud)$ cp overcloud-full.qcow2 overcloud-full-gpu.qcow2
Install an ISO image generator tool from YUM.
(undercloud)$ sudo yum install genisoimage -y
Download the NVIDIA GRID host driver RPM package that corresponds to your GPU device from the NVIDIA website. To determine which driver you need, see the NVIDIA Driver Downloads Portal.
NoteYou must be a registered NVIDIA customer to download the drivers from the portal.
Create an ISO image from the driver RPM package and save the image in the
nvidia-host
directory.(undercloud)$ genisoimage -o nvidia-host.iso -R -J -V NVIDIA nvidia-host/ I: -input-charset not specified, using utf-8 (detected in locale settings) 9.06% done, estimate finish Wed Oct 31 11:24:46 2018 18.08% done, estimate finish Wed Oct 31 11:24:46 2018 27.14% done, estimate finish Wed Oct 31 11:24:46 2018 36.17% done, estimate finish Wed Oct 31 11:24:46 2018 45.22% done, estimate finish Wed Oct 31 11:24:46 2018 54.25% done, estimate finish Wed Oct 31 11:24:46 2018 63.31% done, estimate finish Wed Oct 31 11:24:46 2018 72.34% done, estimate finish Wed Oct 31 11:24:46 2018 81.39% done, estimate finish Wed Oct 31 11:24:46 2018 90.42% done, estimate finish Wed Oct 31 11:24:46 2018 99.48% done, estimate finish Wed Oct 31 11:24:46 2018 Total translation table size: 0 Total rockridge attributes bytes: 358 Total directory bytes: 0 Path table size(bytes): 10 Max brk space used 0 55297 extents written (108 MB)
Create a driver installation script for your Compute nodes. This script installs the NVIDIA GRID host driver on each Compute node that you run it on. The following example creates a script named
install_nvidia.sh
#!/bin/bash # NVIDIA GRID package mkdir /tmp/mount mount LABEL=NVIDIA /tmp/mount rpm -ivh /tmp/mount/NVIDIA-vGPU-rhel-8.1-430.27.x86_64.rpm
Customize the overcloud image by attaching the ISO image that you generated in step 4, and running the driver installation script that you created in step 5:
(undercloud)$ virt-customize --attach nvidia-host.iso -a overcloud-full-gpu.qcow2 -v --run install_nvidia.sh [ 0.0] Examining the guest ... libguestfs: launch: program=virt-customize libguestfs: launch: version=1.36.10rhel=8,release=6.el8_5.2,libvirt libguestfs: launch: backend registered: unix libguestfs: launch: backend registered: uml libguestfs: launch: backend registered: libvirt
Relabel the customized image with SELinux:
(undercloud)$ virt-customize -a overcloud-full-gpu.qcow2 --selinux-relabel [ 0.0] Examining the guest ... [ 2.2] Setting a random seed [ 2.2] SELinux relabelling [ 27.4] Finishing off
Prepare the custom image files for upload to the Image service:
(undercloud)$ mkdir /var/image/x86_64/image (undercloud)$ guestmount -a overcloud-full-gpu.qcow2 -i --ro image (undercloud)$ cp image/boot/vmlinuz-3.10.0-862.14.4.el8.x86_64 ./overcloud-full-gpu.vmlinuz (undercloud)$ cp image/boot/initramfs-3.10.0-862.14.4.el8.x86_64.img ./overcloud-full-gpu.initrd
Upload the custom image to the Image service:
(undercloud)$ openstack overcloud image upload --update-existing --os-image-name overcloud-full-gpu.qcow2
9.2.2. Designating Compute nodes for vGPU
To designate Compute nodes for vGPU workloads, you must create a new role file to configure the vGPU role, and configure a new flavor to use to tag the GPU-enabled Compute nodes.
Procedure
Generate a new roles data file named
roles_data_gpu.yaml
that includes theController
,Compute
, andComputeGpu
roles:(undercloud)$ openstack overcloud roles \ generate -o /home/stack/templates/roles_data_gpu.yaml \ Compute:ComputeGpu Compute Controller
Open roles_data_gpu.yaml and edit or add the following parameters and sections:
Section/Parameter | Current value | New value |
---|---|---|
Role comment | Role: Compute | Role: ComputeGpu |
Role name | name: Compute | name: ComputeGpu |
description | Basic Compute Node role | GPU Compute Node role |
ImageDefault | n/a | overcloud-full-gpu |
HostnameFormatDefault | -compute- | -computegpu- |
deprecated_nic_config_name | compute.yaml | compute-gpu.yaml |
Register the GPU-enabled Compute nodes for the overcloud by adding them to your node definition template,
node.json
ornode.yaml
. For more information, see Registering nodes for the overcloud in the Director Installation and Usage guide. Inspect the node hardware:
(undercloud)$ openstack overcloud node introspect --all-manageable \ --provide
For more information, see Inspecting the hardware of nodes in the Director Installation and Usage guide.
Create the
compute-vgpu-nvidia
bare metal flavor to use to tag nodes that you want to designate for vGPU workloads:(undercloud)$ openstack flavor create --id auto \ --ram <ram_size_mb> --disk <disk_size_gb> \ --vcpus <no_vcpus> compute-vgpu-nvidia
-
Replace
<ram_size_mb>
with the RAM of the bare metal node, in MB. -
Replace
<disk_size_gb>
with the size of the disk on the bare metal node, in GB. Replace
<no_vcpus>
with the number of CPUs on the bare metal node.NoteThese properties are not used for scheduling instances. However, the Compute scheduler does use the disk size to determine the root partition size.
Tag each bare metal node that you want to designate for GPU workloads with a custom GPU resource class:
(undercloud)$ openstack baremetal node set \ --resource-class baremetal.GPU <node>
Replace
<node>
with the ID of the baremetal node.Associate the
compute-vgpu-nvidia
flavor with the custom GPU resource class:(undercloud)$ openstack flavor set \ --property resources:CUSTOM_BAREMETAL_GPU=1 \ compute-vgpu-nvidia
To determine the name of a custom resource class that corresponds to a resource class of a Bare Metal service node, convert the resource class to uppercase, replace all punctuation with an underscore, and prefix with
CUSTOM_
.NoteA flavor can request only one instance of a bare metal resource class.
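The following minimal shell sketch illustrates that naming rule for the baremetal.GPU resource class used above; the tr and sed calls are one possible way to express the conversion, not a required tool:
# Convert a Bare Metal service resource class to its custom resource class name:
# uppercase, punctuation replaced with underscores, prefixed with CUSTOM_.
RESOURCE_CLASS="baremetal.GPU"
CUSTOM_CLASS="CUSTOM_$(echo "$RESOURCE_CLASS" | tr '[:lower:]' '[:upper:]' | sed 's/[^A-Z0-9_]/_/g')"
echo "$CUSTOM_CLASS"   # prints CUSTOM_BAREMETAL_GPU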
Set the following flavor properties to prevent the Compute scheduler from using the bare metal flavor properties to schedule instances:
(undercloud)$ openstack flavor set \ --property resources:VCPU=0 --property resources:MEMORY_MB=0 \ --property resources:DISK_GB=0 compute-vgpu-nvidia
To verify that the role was created, enter the following command:
(undercloud)$ openstack overcloud profiles list
9.2.3. Configuring the Compute node for vGPU and deploying the overcloud
You need to retrieve and assign the vGPU type that corresponds to the physical GPU device in your environment, and prepare the environment files to configure the Compute node for vGPU.
Procedure
- Install Red Hat Enterprise Linux and the NVIDIA GRID driver on a temporary Compute node and launch the node. For more information about installing the NVIDIA GRID driver, see Building a custom GPU overcloud image.
On the Compute node, locate the vGPU type of the physical GPU device that you want to enable. For libvirt, virtual GPUs are mediated devices, or
mdev
type devices. To discover the supportedmdev
devices, enter the following command:[root@overcloud-computegpu-0 ~]# ls /sys/class/mdev_bus/0000\:06\:00.0/mdev_supported_types/ nvidia-11 nvidia-12 nvidia-13 nvidia-14 nvidia-15 nvidia-16 nvidia-17 nvidia-18 nvidia-19 nvidia-20 nvidia-21 nvidia-210 nvidia-22 [root@overcloud-computegpu-0 ~]# cat /sys/class/mdev_bus/0000\:06\:00.0/mdev_supported_types/nvidia-18/description num_heads=4, frl_config=60, framebuffer=2048M, max_resolution=4096x2160, max_instance=4
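If the device supports many types, you can print the name and description of every supported type with a short loop. This is a convenience sketch only; it assumes the same PCI address as the example above and that each type directory exposes the standard name and description files:
# Print the name and description of every mdev type supported by the GPU at 0000:06:00.0
for type_dir in /sys/class/mdev_bus/0000:06:00.0/mdev_supported_types/*/; do
    echo "== $(basename "$type_dir") =="
    cat "${type_dir}name" "${type_dir}description" 2>/dev/null
done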
Register the
Net::SoftwareConfig
of theComputeGpu
role in yournetwork-environment.yaml
file:resource_registry: OS::TripleO::Compute::Net::SoftwareConfig: /home/stack/templates/nic-configs/compute.yaml OS::TripleO::ComputeGpu::Net::SoftwareConfig: /home/stack/templates/nic-configs/compute-gpu.yaml OS::TripleO::Controller::Net::SoftwareConfig: /home/stack/templates/nic-configs/controller.yaml
Add the following parameters to the
node-info.yaml
file to specify the number of GPU Compute nodes, and the flavor to use for the GPU-designated Compute nodes:parameter_defaults: OvercloudControllerFlavor: control OvercloudComputeFlavor: compute OvercloudComputeGpuFlavor: compute-vgpu-nvidia ControllerCount: 1 ComputeCount: 0 ComputeGpuCount: 1
Create a
gpu.yaml
file to specify the vGPU type of your GPU device:parameter_defaults: ComputeGpuExtraConfig: nova::compute::vgpu::enabled_vgpu_types: - nvidia-18
NoteEach physical GPU supports only one virtual GPU type. If you specify multiple vGPU types in this property, only the first type is used.
Add your new role and environment files to the stack with your other environment files and deploy the overcloud:
(undercloud)$ openstack overcloud deploy --templates \ -e [your environment files] \ -r /home/stack/templates/roles_data_gpu.yaml \ -e /home/stack/templates/network-environment.yaml \ -e /home/stack/templates/gpu.yaml -e /home/stack/templates/node-info.yaml \
9.3. Creating a custom GPU instance image
To enable your cloud users to create instances that use a virtual GPU (vGPU), you can create a custom vGPU-enabled image for launching instances. Use the following procedure to create a custom vGPU-enabled instance image with the NVIDIA GRID guest driver and license file.
Prerequisites
- You have configured and deployed the overcloud with GPU-enabled Compute nodes.
Procedure
-
Log in to the undercloud as the
stack
user. Source the
overcloudrc
credential file:$ source ~/overcloudrc
Create an instance with the hardware and software profile that your vGPU instances require:
(overcloud)$ openstack server create --flavor <flavor> \ --image <image> temp_vgpu_instance
-
Replace
<flavor>
with the name or ID of the flavor that has the hardware profile that your vGPU instances require. For information about creating a vGPU flavor, see Creating a vGPU flavor for instances. -
Replace
<image>
with the name or ID of the image that has the software profile that your vGPU instances require. For information about downloading RHEL cloud images, see Image service.
- Log in to the instance as a cloud-user. For more information, see Log in to an Instance.
-
Create the
gridd.conf
NVIDIA GRID license file on the instance, following the NVIDIA guidance: Licensing an NVIDIA vGPU on Linux by Using a Configuration File. Install the GPU driver on the instance. For more information about installing an NVIDIA driver, see Installing the NVIDIA vGPU Software Graphics Driver on Linux.
NoteUse the
hw_video_model
image property to define the GPU driver type. You can choosenone
if you want to disable the emulated GPUs for your vGPU instances. For more information about supported drivers, see Image metadata.Create an image snapshot of the instance:
(overcloud)$ openstack server image create \ --name vgpu_image temp_vgpu_instance
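If you want to apply the hw_video_model setting mentioned in the note above, you can set it as a property on the snapshot image. The following command is an illustrative sketch that disables the emulated GPU on the vgpu_image snapshot:
(overcloud)$ openstack image set --property hw_video_model=none vgpu_image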
- Optional: Delete the instance.
9.4. Creating a vGPU flavor for instances
To enable your cloud users to create instances for GPU workloads, you can create a GPU flavor that can be used to launch vGPU instances, and assign the vGPU resource to that flavor.
Prerequisites
- You have configured and deployed the overcloud with GPU-designated Compute nodes.
Procedure
Create an NVIDIA GPU flavor, for example:
(overcloud)$ openstack flavor create --vcpus 6 \ --ram 8192 --disk 100 m1.small-gpu +----------------------------+--------------------------------------+ | Field | Value | +----------------------------+--------------------------------------+ | OS-FLV-DISABLED:disabled | False | | OS-FLV-EXT-DATA:ephemeral | 0 | | disk | 100 | | id | a27b14dd-c42d-4084-9b6a-225555876f68 | | name | m1.small-gpu | | os-flavor-access:is_public | True | | properties | | | ram | 8192 | | rxtx_factor | 1.0 | | swap | | | vcpus | 6 | +----------------------------+--------------------------------------+
Assign a vGPU resource to the flavor that you created. You can assign only one vGPU for each instance.
(overcloud)$ openstack flavor set m1.small-gpu \ --property "resources:VGPU=1" (overcloud)$ openstack flavor show m1.small-gpu +----------------------------+--------------------------------------+ | Field | Value | +----------------------------+--------------------------------------+ | OS-FLV-DISABLED:disabled | False | | OS-FLV-EXT-DATA:ephemeral | 0 | | access_project_ids | None | | disk | 100 | | id | a27b14dd-c42d-4084-9b6a-225555876f68 | | name | m1.small-gpu | | os-flavor-access:is_public | True | | properties | resources:VGPU='1' | | ram | 8192 | | rxtx_factor | 1.0 | | swap | | | vcpus | 6 | +----------------------------+--------------------------------------+
9.5. Launching a vGPU instance
You can create a GPU-enabled instance for GPU workloads.
Procedure
Create an instance using a GPU flavor and image, for example:
(overcloud)$ openstack server create --flavor m1.small-gpu \ --image vgpu_image --security-group web --nic net-id=internal0 \ --key-name lambda vgpu-instance
- Log in to the instance as a cloud-user. For more information, see Log in to an Instance.
To verify that the GPU is accessible from the instance, enter the following command from the instance:
$ lspci -nn | grep <gpu_name>
9.6. Enabling PCI passthrough for a GPU device
You can use PCI passthrough to attach a physical PCI device, such as a graphics card, to an instance. If you use PCI passthrough for a device, the instance reserves exclusive access to the device for performing tasks, and the device is not available to the host.
Prerequisites
-
The
pciutils
package is installed on the physical servers that have the PCI cards. - The driver for the GPU device must be installed on the instance that the device is passed through to. Therefore, you need to have created a custom instance image that has the required GPU driver installed. For more information about how to create a custom instance image with the GPU driver installed, see Creating a custom GPU instance image.
Procedure
To determine the vendor ID and product ID for each passthrough device type, enter the following command on the physical server that has the PCI cards:
# lspci -nn | grep -i <gpu_name>
For example, to determine the vendor and product ID for an NVIDIA GPU, enter the following command:
# lspci -nn | grep -i nvidia 3b:00.0 3D controller [0302]: NVIDIA Corporation TU104GL [Tesla T4] [10de:1eb8] (rev a1) d8:00.0 3D controller [0302]: NVIDIA Corporation TU104GL [Tesla T4] [10de:1db4] (rev a1)
To determine if each PCI device has Single Root I/O Virtualization (SR-IOV) capabilities, enter the following command on the physical server that has the PCI cards:
# lspci -v -s 3b:00.0 3b:00.0 3D controller: NVIDIA Corporation TU104GL [Tesla T4] (rev a1) ... Capabilities: [bcc] Single Root I/O Virtualization (SR-IOV) ...
-
To configure the Controller node on the overcloud for PCI passthrough, create an environment file, for example,
pci_passthru_controller.yaml
. Add
PciPassthroughFilter
to theNovaSchedulerDefaultFilters
parameter inpci_passthru_controller.yaml
:parameter_defaults: NovaSchedulerDefaultFilters: ['RetryFilter','AvailabilityZoneFilter','ComputeFilter','ComputeCapabilitiesFilter','ImagePropertiesFilter','ServerGroupAntiAffinityFilter','ServerGroupAffinityFilter','PciPassthroughFilter','NUMATopologyFilter']
To specify the PCI alias for the devices on the Controller node, add the following configuration to
pci_passthru_controller.yaml
:If the PCI device has SR-IOV capabilities:
ControllerExtraConfig: nova::pci::aliases: - name: "t4" product_id: "1eb8" vendor_id: "10de" device_type: "type-PF" - name: "v100" product_id: "1db4" vendor_id: "10de" device_type: "type-PF"
If the PCI device does not have SR-IOV capabilities:
ControllerExtraConfig: nova::pci::aliases: - name: "t4" product_id: "1eb8" vendor_id: "10de" - name: "v100" product_id: "1db4" vendor_id: "10de"
For more information on configuring the
device_type
field, see PCI passthrough device type field.NoteIf the
nova-api
service is running in a role other than the Controller, then replaceControllerExtraConfig
with the user role, in the format<Role>ExtraConfig
.
-
To configure the Compute node on the overcloud for PCI passthrough, create an environment file, for example,
pci_passthru_compute.yaml
. To specify the available PCIs for the devices on the Compute node, add the following to
pci_passthru_compute.yaml
:parameter_defaults: NovaPCIPassthrough: - vendor_id: "10de" product_id: "1eb8"
You must create a copy of the PCI alias on the Compute node for instance migration and resize operations. To specify the PCI alias for the devices on the Compute node, add the following to
pci_passthru_compute.yaml
:If the PCI device has SR-IOV capabilities:
ComputeExtraConfig: nova::pci::aliases: - name: "t4" product_id: "1eb8" vendor_id: "10de" device_type: "type-PF" - name: "v100" product_id: "1db4" vendor_id: "10de" device_type: "type-PF"
If the PCI device does not have SR-IOV capabilities:
ComputeExtraConfig: nova::pci::aliases: - name: "t4" product_id: "1eb8" vendor_id: "10de" - name: "v100" product_id: "1db4" vendor_id: "10de"
NoteThe Compute node aliases must be identical to the aliases on the Controller node.
To enable IOMMU in the server BIOS of the Compute nodes to support PCI passthrough, add the
KernelArgs
parameter topci_passthru_compute.yaml
:parameter_defaults: ... ComputeParameters: KernelArgs: "intel_iommu=on iommu=pt"
Add your custom environment files to the stack with your other environment files and deploy the overcloud:
(undercloud)$ openstack overcloud deploy --templates \ -e [your environment files] \ -e /home/stack/templates/pci_passthru_controller.yaml \ -e /home/stack/templates/pci_passthru_compute.yaml
Configure a flavor to request the PCI devices. The following example requests two devices that use the t4 alias defined in the previous steps, which corresponds to a vendor ID of 10de and a product ID of 1eb8:
# openstack flavor set m1.large \ --property "pci_passthrough:alias"="t4:2"
Verification
Create an instance with a PCI passthrough device:
# openstack server create --flavor m1.large \ --image <custom_gpu> --wait test-pci
Replace
<custom_gpu>
with the name of your custom instance image that has the required GPU driver installed.- Log in to the instance as a cloud user.
To verify that the GPU is accessible from the instance, enter the following command from the instance:
$ lspci -nn | grep <gpu_name>
To check the NVIDIA System Management Interface status, enter the following command from the instance:
$ nvidia-smi
Example output:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.33.01    Driver Version: 440.33.01    CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla T4            Off  | 00000000:01:00.0 Off |                    0 |
| N/A   43C    P0    20W /  70W |      0MiB / 15109MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
Chapter 10. Configuring real-time compute
As a cloud administrator, you might need instances on your Compute nodes to adhere to low-latency policies and perform real-time processing. Real-time Compute nodes include a real-time capable kernel, specific virtualization modules, and optimized deployment parameters, to facilitate real-time processing requirements and minimize latency.
The process to enable Real-time Compute includes:
- configuring the BIOS settings of the Compute nodes
- building a real-time image with real-time kernel and Real-Time KVM (RT-KVM) kernel module
-
assigning the
ComputeRealTime
role to the Compute nodes
For a use-case example of real-time Compute deployment for NFV workloads, see the Example: Configuring OVS-DPDK with ODL and VXLAN tunnelling section in the Network Functions Virtualization Planning and Configuration Guide.
Real-time Compute nodes are supported only with Red Hat Enterprise Linux version 7.5 or later.
10.1. Preparing Compute nodes for real-time
Before you can deploy Real-time Compute in your overcloud, you must enable Red Hat Enterprise Linux Real-Time KVM (RT-KVM), configure your BIOS to support real-time, and build the real-time overcloud image.
Prerequisites
- You must use Red Hat certified servers for your RT-KVM Compute nodes. See Red Hat Enterprise Linux for Real Time 7 certified servers for details.
-
You need a separate subscription to Red Hat OpenStack Platform for Real Time to access the
rhel-8-for-x86_64-nfv-rpms
repository. For details on managing repositories and subscriptions for your undercloud, see the Registering and updating your undercloud section in the Director Installation and Usage guide.
Procedure
To build the real-time overcloud image, you must enable the
rhel-8-for-x86_64-nfv-rpms
repository for RT-KVM. To check which packages will be installed from the repository, enter the following command:$ dnf repo-pkgs rhel-8-for-x86_64-nfv-rpms list Loaded plugins: product-id, search-disabled-repos, subscription-manager Available Packages kernel-rt.x86_64 4.18.0-80.7.1.rt9.153.el8_0 rhel-8-for-x86_64-nfv-rpms kernel-rt-debug.x86_64 4.18.0-80.7.1.rt9.153.el8_0 rhel-8-for-x86_64-nfv-rpms kernel-rt-debug-devel.x86_64 4.18.0-80.7.1.rt9.153.el8_0 rhel-8-for-x86_64-nfv-rpms kernel-rt-debug-kvm.x86_64 4.18.0-80.7.1.rt9.153.el8_0 rhel-8-for-x86_64-nfv-rpms kernel-rt-devel.x86_64 4.18.0-80.7.1.rt9.153.el8_0 rhel-8-for-x86_64-nfv-rpms kernel-rt-doc.noarch 4.18.0-80.7.1.rt9.153.el8_0 rhel-8-for-x86_64-nfv-rpms kernel-rt-kvm.x86_64 4.18.0-80.7.1.rt9.153.el8_0 rhel-8-for-x86_64-nfv-rpms [ output omitted…]
To build the overcloud image for Real-time Compute nodes, install the
libguestfs-tools
package on the undercloud to get thevirt-customize
tool:(undercloud)$ sudo dnf install libguestfs-tools
ImportantIf you install the
libguestfs-tools
package on the undercloud, disableiscsid.socket
to avoid port conflicts with thetripleo_iscsid
service on the undercloud:$ sudo systemctl disable --now iscsid.socket
Extract the images:
(undercloud)$ tar -xf /usr/share/rhosp-director-images/overcloud-full.tar (undercloud)$ tar -xf /usr/share/rhosp-director-images/ironic-python-agent.tar
Copy the default image:
(undercloud)$ cp overcloud-full.qcow2 overcloud-realtime-compute.qcow2
Register the image and configure the required subscriptions:
(undercloud)$ virt-customize -a overcloud-realtime-compute.qcow2 --run-command 'subscription-manager register --username=<username> --password=<password>' [ 0.0] Examining the guest ... [ 10.0] Setting a random seed [ 10.0] Running: subscription-manager register --username=<username> --password=<password> [ 24.0] Finishing off
Replace the
username
andpassword
values with your Red Hat customer account details.For general information about building a Real-time overcloud image, see the knowledgebase article Modifying the Red Hat Enterprise Linux OpenStack Platform Overcloud Image with virt-customize.
Find the SKU of the Red Hat OpenStack Platform for Real Time subscription. The SKU might be located on a system that is already registered to the Red Hat Subscription Manager with the same account and credentials:
$ sudo subscription-manager list
Attach the Red Hat OpenStack Platform for Real Time subscription to the image:
(undercloud)$ virt-customize -a overcloud-realtime-compute.qcow2 --run-command 'subscription-manager attach --pool [subscription-pool]'
Create a script to configure
rt
on the image:(undercloud)$ cat rt.sh #!/bin/bash set -eux subscription-manager repos --enable=[REPO_ID] dnf -v -y --setopt=protected_packages= erase kernel.$(uname -m) dnf -v -y install kernel-rt kernel-rt-kvm tuned-profiles-nfv-host # END OF SCRIPT
Run the script to configure the real-time image:
(undercloud)$ virt-customize -a overcloud-realtime-compute.qcow2 -v --run rt.sh 2>&1 | tee virt-customize.log
Re-label SELinux:
(undercloud)$ virt-customize -a overcloud-realtime-compute.qcow2 --selinux-relabel
Extract
vmlinuz
andinitrd
. For example:(undercloud)$ mkdir image (undercloud)$ guestmount -a overcloud-realtime-compute.qcow2 -i --ro image (undercloud)$ cp image/boot/vmlinuz-4.18.0-80.7.1.rt9.153.el8_0.x86_64 ./overcloud-realtime-compute.vmlinuz (undercloud)$ cp image/boot/initramfs-4.18.0-80.7.1.rt9.153.el8_0.x86_64.img ./overcloud-realtime-compute.initrd (undercloud)$ guestunmount image
NoteThe software version in the
vmlinuz
andinitramfs
filenames vary with the kernel version.Upload the image:
(undercloud)$ openstack overcloud image upload \ --update-existing --os-image-name overcloud-realtime-compute.qcow2
You now have a real-time image you can use with the
ComputeRealTime
composable role on select Compute nodes.To reduce latency on your Real-time Compute nodes, you must modify the BIOS settings in the Compute nodes. You should disable all options for the following components in your Compute node BIOS settings:
- Power Management
- Hyper-Threading
- CPU sleep states
- Logical processors
See Setting BIOS parameters for descriptions of these settings and the impact of disabling them. See your hardware manufacturer documentation for complete details on how to change BIOS settings.
10.2. Deploying the Real-time Compute role
Red Hat OpenStack Platform (RHOSP) director provides the template for the ComputeRealTime
role, which you can use to deploy real-time Compute nodes. You must perform additional steps to designate Compute nodes for real-time.
Procedure
Based on the
/usr/share/openstack-tripleo-heat-templates/environments/compute-real-time-example.yaml
file, create acompute-real-time.yaml
environment file that sets the parameters for theComputeRealTime
role.cp /usr/share/openstack-tripleo-heat-templates/environments/compute-real-time-example.yaml /home/stack/templates/compute-real-time.yaml
The file must include values for the following parameters:
-
IsolCpusList
andNovaComputeCpuDedicatedSet
: List of isolated CPU cores and virtual CPU pins to reserve for real-time workloads. This value depends on the CPU hardware of your real-time Compute nodes. -
NovaComputeCpuSharedSet
: List of host CPUs to reserve for emulator threads. -
KernelArgs
: Arguments to pass to the kernel of the Real-time Compute nodes. For example, you can usedefault_hugepagesz=1G hugepagesz=1G hugepages=<number_of_1G_pages_to_reserve> hugepagesz=2M hugepages=<number_of_2M_pages>
to define the memory requirements of guests that have huge pages with multiple sizes. In this example, the default size is 1GB but you can also reserve 2M huge pages.
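The following snippet is a hedged example of how these settings might look in compute-real-time.yaml; the CPU ranges and huge page counts are placeholders that you must replace with values that match your real-time Compute node hardware:
parameter_defaults:
  ComputeRealTimeParameters:
    IsolCpusList: "4-23"
    NovaComputeCpuDedicatedSet: "4-23"
    NovaComputeCpuSharedSet: "0-3"
    KernelArgs: "default_hugepagesz=1G hugepagesz=1G hugepages=16 hugepagesz=2M hugepages=512"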
-
Add the
ComputeRealTime
role to your roles data file and regenerate the file. For example:$ openstack overcloud roles generate -o /home/stack/templates/rt_roles_data.yaml Controller Compute ComputeRealTime
This command generates a
ComputeRealTime
role with contents similar to the following example, and also sets theImageDefault
option toovercloud-realtime-compute
.- name: ComputeRealTime description: | Compute role that is optimized for real-time behaviour. When using this role it is mandatory that an overcloud-realtime-compute image is available and the role specific parameters IsolCpusList, NovaComputeCpuDedicatedSet and NovaComputeCpuSharedSet are set accordingly to the hardware of the real-time compute nodes. CountDefault: 1 networks: InternalApi: subnet: internal_api_subnet Tenant: subnet: tenant_subnet Storage: subnet: storage_subnet HostnameFormatDefault: '%stackname%-computerealtime-%index%' ImageDefault: overcloud-realtime-compute RoleParametersDefault: TunedProfileName: "realtime-virtual-host" KernelArgs: "" # these must be set in an environment file IsolCpusList: "" # or similar according to the hardware NovaComputeCpuDedicatedSet: "" # of real-time nodes NovaComputeCpuSharedSet: "" # NovaLibvirtMemStatsPeriodSeconds: 0 ServicesDefault: - OS::TripleO::Services::Aide - OS::TripleO::Services::AuditD - OS::TripleO::Services::BootParams - OS::TripleO::Services::CACerts - OS::TripleO::Services::CephClient - OS::TripleO::Services::CephExternal - OS::TripleO::Services::CertmongerUser - OS::TripleO::Services::Collectd - OS::TripleO::Services::ComputeCeilometerAgent - OS::TripleO::Services::ComputeNeutronCorePlugin - OS::TripleO::Services::ComputeNeutronL3Agent - OS::TripleO::Services::ComputeNeutronMetadataAgent - OS::TripleO::Services::ComputeNeutronOvsAgent - OS::TripleO::Services::Docker - OS::TripleO::Services::Fluentd - OS::TripleO::Services::IpaClient - OS::TripleO::Services::Ipsec - OS::TripleO::Services::Iscsid - OS::TripleO::Services::Kernel - OS::TripleO::Services::LoginDefs - OS::TripleO::Services::MetricsQdr - OS::TripleO::Services::MySQLClient - OS::TripleO::Services::NeutronBgpVpnBagpipe - OS::TripleO::Services::NeutronLinuxbridgeAgent - OS::TripleO::Services::NeutronVppAgent - OS::TripleO::Services::NovaCompute - OS::TripleO::Services::NovaLibvirt - OS::TripleO::Services::NovaLibvirtGuests - OS::TripleO::Services::NovaMigrationTarget - OS::TripleO::Services::ContainersLogrotateCrond - OS::TripleO::Services::OpenDaylightOvs - OS::TripleO::Services::Podman - OS::TripleO::Services::Rhsm - OS::TripleO::Services::RsyslogSidecar - OS::TripleO::Services::Securetty - OS::TripleO::Services::SensuClient - OS::TripleO::Services::SkydiveAgent - OS::TripleO::Services::Snmp - OS::TripleO::Services::Sshd - OS::TripleO::Services::Timesync - OS::TripleO::Services::Timezone - OS::TripleO::Services::TripleoFirewall - OS::TripleO::Services::TripleoPackages - OS::TripleO::Services::Vpp - OS::TripleO::Services::OVNController - OS::TripleO::Services::OVNMetadataAgent
For general information about custom roles and about the
roles-data.yaml
, see Roles.Create the
compute-realtime
flavor to tag nodes that you want to designate for real-time workloads. For example:$ source ~/stackrc $ openstack flavor create --id auto --ram 6144 --disk 40 --vcpus 4 compute-realtime $ openstack flavor set --property "cpu_arch"="x86_64" --property "capabilities:boot_option"="local" --property "capabilities:profile"="compute-realtime" compute-realtime
Tag each node that you want to designate for real-time workloads with the
compute-realtime
profile.$ openstack baremetal node set --property capabilities='profile:compute-realtime,boot_option:local' <NODE UUID>
Map the
ComputeRealTime
role to thecompute-realtime
flavor by creating an environment file with the following content:parameter_defaults: OvercloudComputeRealTimeFlavor: compute-realtime
Add your environment files and the new roles file to the stack with your other environment files and deploy the overcloud:
(undercloud)$ openstack overcloud deploy --templates \ -e [your environment files] \ -r /home/stack/templates/rt_roles_data.yaml \ -e /home/stack/templates/compute-real-time.yaml
10.3. Sample deployment and testing scenario
The following example procedure uses a simple single-node deployment to test that the environment variables and other supporting configuration is set up correctly. Actual performance results might vary, depending on the number of nodes and guests that you deploy in your cloud.
Procedure
Create the
compute-real-time.yaml
file with the following parameters:parameter_defaults: ComputeRealTimeParameters: IsolCpusList: "1" NovaComputeCpuDedicatedSet: "1" NovaComputeCpuSharedSet: "0" KernelArgs: "default_hugepagesz=1G hugepagesz=1G hugepages=16"
Create a new
rt_roles_data.yaml
file with theComputeRealTime
role:$ openstack overcloud roles generate \ -o ~/rt_roles_data.yaml Controller ComputeRealTime
Deploy the overcloud, adding both your new real-time roles data file and your real-time environment file to the stack along with your other environment files:
(undercloud)$ openstack overcloud deploy --templates \ -r /home/stack/rt_roles_data.yaml -e [your environment files] -e /home/stack/templates/compute-real-time.yaml
This command deploys one Controller node and one Real-time Compute node.
Log into the Real-time Compute node and check the following parameters. Replace
<...>
with the values of the relevant parameters from thecompute-real-time.yaml
.[root@overcloud-computerealtime-0 ~]# uname -a Linux overcloud-computerealtime-0 4.18.0-80.7.1.rt9.153.el8_0.x86_64 #1 SMP PREEMPT RT Wed Dec 13 13:37:53 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux [root@overcloud-computerealtime-0 ~]# cat /proc/cmdline BOOT_IMAGE=/boot/vmlinuz-4.18.0-80.7.1.rt9.153.el8_0.x86_64 root=UUID=45ae42d0-58e7-44fe-b5b1-993fe97b760f ro console=tty0 crashkernel=auto console=ttyS0,115200 default_hugepagesz=1G hugepagesz=1G hugepages=16 [root@overcloud-computerealtime-0 ~]# tuned-adm active Current active profile: realtime-virtual-host [root@overcloud-computerealtime-0 ~]# grep ^isolated_cores /etc/tuned/realtime-virtual-host-variables.conf isolated_cores=<IsolCpusList> [root@overcloud-computerealtime-0 ~]# cat /usr/lib/tuned/realtime-virtual-host/lapic_timer_adv_ns X (X != 0) [root@overcloud-computerealtime-0 ~]# cat /sys/module/kvm/parameters/lapic_timer_advance_ns X (X != 0) [root@overcloud-computerealtime-0 ~]# cat /sys/devices/system/node/node0/hugepages/hugepages-1048576kB/nr_hugepages X (X != 0) [root@overcloud-computerealtime-0 ~]# crudini --get /var/lib/config-data/puppet-generated/nova_libvirt/etc/nova/nova.conf compute cpu_dedicated_set <NovaComputeCpuDedicatedSet> [root@overcloud-computerealtime-0 ~]# crudini --get /var/lib/config-data/puppet-generated/nova_libvirt/etc/nova/nova.conf compute cpu_shared_set <NovaComputeCpuSharedSet>
10.4. Launching and tuning real-time instances
After you deploy and configure Real-time Compute nodes, you can launch real-time instances on those nodes. You can further configure these real-time instances with CPU pinning, NUMA topology filters, and huge pages.
Prerequisites
-
The
compute-realtime
flavor exists on the overcloud, as described in Deploying the Real-time Compute role.
Procedure
Launch the real-time instance:
# openstack server create --image <rhel> \ --flavor r1.small --nic net-id=<dpdk-net> test-rt
Optional: Verify that the instance uses the assigned emulator threads:
# virsh dumpxml <instance-id> | grep vcpu -A1 <vcpu placement='static'>4</vcpu> <cputune> <vcpupin vcpu='0' cpuset='1'/> <vcpupin vcpu='1' cpuset='3'/> <vcpupin vcpu='2' cpuset='5'/> <vcpupin vcpu='3' cpuset='7'/> <emulatorpin cpuset='0-1'/> <vcpusched vcpus='2-3' scheduler='fifo' priority='1'/> </cputune>
Pinning CPUs and setting emulator thread policy
To ensure that there are enough CPUs on each Real-time Compute node for real-time workloads, you need to pin at least one virtual CPU (vCPU) for an instance to a physical CPU (pCPUs) on the host. The emulator threads for that vCPU then remain dedicated to that pCPU.
Configure your flavor to use a dedicated CPU policy. To do so, set the hw:cpu_policy
parameter to dedicated
on the flavor. For example:
# openstack flavor set --property hw:cpu_policy=dedicated 99
Make sure that your resource quota has enough pCPUs for the Real-time Compute nodes to consume.
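If you reserved host CPUs for emulator threads with NovaComputeCpuSharedSet, you can also offload the emulator threads of pinned instances onto that shared set. The following command is an illustrative sketch that combines the dedicated CPU policy with the share emulator thread policy on the example flavor 99:
# openstack flavor set 99 --property hw:cpu_policy=dedicated --property hw:emulator_threads_policy=share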
Optimizing your network configuration
Depending on the needs of your deployment, you might need to set parameters in the network-environment.yaml
file to tune your network for certain real-time workloads.
To review an example configuration optimized for OVS-DPDK, see the Configuring the OVS-DPDK parameters section of the Network Functions Virtualization Planning and Configuration Guide.
Configuring huge pages
It is recommended to set the default huge pages size to 1GB. Otherwise, TLB flushes might create jitter in the vCPU execution. For general information about using huge pages, see the Running DPDK applications web page.
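To request huge pages for an instance, you can set the hw:mem_page_size property on the flavor. The following command is a hedged example that requests 1 GB pages for the r1.small flavor used earlier; adjust the flavor name and page size to your environment:
# openstack flavor set r1.small --property hw:mem_page_size=1GB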
Disabling Performance Monitoring Unit (PMU) emulation
Instances can provide PMU metrics by specifying an image or flavor with a vPMU. Providing PMU metrics introduces latency.
The vPMU defaults to enabled when NovaLibvirtCPUMode
is set to host-passthrough
.
If you do not need PMU metrics, then disable the vPMU to reduce latency by setting the PMU property to "False" in the image or flavor used to create the instance:
-
Image:
hw_pmu=False
-
Flavor:
hw:pmu=False
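For example, the following commands sketch both options; <image> and <flavor> are placeholders for the image or flavor that you use to create real-time instances:
(overcloud)$ openstack image set --property hw_pmu=False <image>
(overcloud)$ openstack flavor set --property hw:pmu=False <flavor>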
Chapter 11. Scaling deployments with Compute cells
You can use cells to divide Compute nodes in large deployments into groups, each with a message queue and dedicated database that contains instance information.
By default, director installs the overcloud with a single cell for all Compute nodes. This cell contains all the Compute services and databases, and all the instances and instance metadata. For larger deployments, you can deploy the overcloud with multiple cells to accommodate a larger number of Compute nodes. You can add cells to your environment when you install a new overcloud or at any time afterwards.
In multi-cell deployments, each cell runs standalone copies of the cell-specific Compute services and databases, and stores instance metadata only for instances in that cell. Global information and cell mappings are stored in the global Controller cell, which helps with security and recovery in case one of the cells fails.
If you add cells to an existing overcloud, the conductor in the default cell also performs the role of the super conductor. This has a negative effect on conductor communication with the cells in the deployment, and on the performance of the overcloud. Also, if you take the default cell offline, you take the super conductor offline as well, which stops the entire overcloud deployment. Therefore, to scale an existing overcloud, do not add any Compute nodes to the default cell. Instead, add Compute nodes to the new cells you create, allowing the default cell to act as the super conductor.
To deploy a multi-cell overcloud you must complete the following stages:
- Configure your RHOSP deployment to handle multiple cells.
- Create and provision the new cells that you require within your deployment.
- Add Compute nodes to each cell.
11.1. Global components and services
The following components are deployed in a Controller cell once for each overcloud, regardless of the number of Compute cells.
- Compute API
- Provides the external REST API to users.
- Compute scheduler
- Determines on which Compute node to assign the instances.
- Placement service
- Monitors and allocates Compute resources to the instances.
- API database
Used by the Compute API and the Compute scheduler services to track location information about instances, and provides a temporary location for instances that are built but not scheduled.
In multi-cell deployments, this database also contains cell mappings that specify the database connection for each cell.
cell0
database- Dedicated database for information about instances that failed to be scheduled.
- Super conductor
-
This service exists only in multi-cell deployments to coordinate between the global services and each Compute cell. This service also sends failed instance information to the
cell0
database.
11.2. Cell-specific components and services
The following components are deployed in each Compute cell.
- Cell database
- Contains most of the information about instances. Used by the global API, the conductor, and the Compute services.
- Conductor
- Coordinates database queries and long-running tasks from the global services, and insulates Compute nodes from direct database access.
- Message queue
- Messaging service used by all services to communicate with each other within the cell and with the global services.
11.3. Cell deployments architecture
The default overcloud that director installs has a single cell for all Compute nodes. You can scale your overcloud by adding more cells, as illustrated by the following architecture diagrams.
Single-cell deployment architecture
The following diagram shows an example of the basic structure and interaction in a default single-cell overcloud.
In this deployment, all services are configured to use a single conductor to communicate between the Compute API and the Compute nodes, and a single database stores all live instance data.
In smaller deployments this configuration might be sufficient, but if any global API service or database fails, the entire Compute deployment cannot send or receive information, regardless of high availability configurations.
Multi-cell deployment architecture
The following diagram shows an example of the basic structure and interaction in a custom multi-cell overcloud.
In this deployment, the Compute nodes are divided to multiple cells, each with their own conductor, database, and message queue. The global services use the super conductor to communicate with each cell, and the global database contains only information required for the whole overcloud.
The cell-level services cannot access global services directly. This isolation provides additional security and fail-safe capabilities in case of cell failure.
In Edge deployments, you must deploy the first cell on the central site; do not deploy the first cell on any of the edge sites. Do not run any Compute services in the first cell. Instead, deploy each new cell that contains the Compute nodes separately on the edge sites.
11.4. Considerations for multi-cell deployments
- Maximum number of Compute nodes in a multi-cell deployment
- The maximum number of Compute nodes is 500 across all cells.
- SSL/TLS
- You cannot enable SSL/TLS on the overcloud.
- Cross-cell instance migrations
Migrating an instance from a host in one cell to a host in another cell is not supported. This limitation affects the following operations:
- cold migration
- live migration
- unshelve
- resize
- evacuation
- Service quotas
Compute service quotas are calculated dynamically at each resource consumption point, instead of statically in the database. In multi-cell deployments, unreachable cells cannot provide usage information in real-time, which might cause the quotas to be exceeded when the cell is reachable again.
You can use the Placement service and API database to configure the quota calculation to withstand failed or unreachable cells.
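One way to do this is to count quota usage from the Placement service and the API database instead of querying each cell database. The following nova.conf snippet is a hedged sketch, assuming the count_usage_from_placement option is available in your release; it is not an officially mandated configuration:
[quota]
# Count cores and RAM usage from the Placement service, and instance counts from
# the API database mappings, so that unreachable cells do not block quota checks.
count_usage_from_placement = True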
- API database
- The Compute API database is always global for all cells and cannot be duplicated for each cell.
- Console proxies
-
You must configure console proxies for each cell, because console token authorizations are stored in cell databases. Each console proxy server needs to access the
database.connection
information of the corresponding cell database. - Compute metadata API
You can run the Compute metadata API globally or in each cell. Choose one of the following:
-
If you have networks that cover multiple cells, you need to run the metadata API globally so that it can bridge between the cells. In this case, the metadata API needs to access the
api_database.connection
information. -
If you have networks in separate segments for each cell, you can run the metadata API separately in each cell. This configuration can improve performance and data isolation. In this case,
neutron-metadata-agent
service point to the correspondingnova-api-metadata
service.
You use the
api.local_metadata_per_cell
configuration option to set which method to implement. For details on configuring this option, see the Create environment files with cell parameters section in Deploying a multi-cell overcloud.-
If you have networks that cover multiple cells, you need to run the metadata API globally so that it can bridge between the cells. In this case, the metadata API needs to access the
11.5. Deploying a multi-cell overcloud
To configure your RHOSP deployment to handle multiple cells, you must complete the following stages:
- Extract parameter information from the default first cell in the basic overcloud. This cell becomes the global Controller after you redeploy the overcloud.
- Configure a custom role and flavor for each cell.
- Create environment files to configure cell-specific parameters.
- Deploy the overcloud with the new cell stack.
- This process adds one cell to the overcloud. Repeat these steps for each additional cell you want to deploy in the overcloud.
-
In this procedure, the name of the new cell is
cell1
. Replace the name in all commands with the actual cell name.
Prerequisites
- You have deployed a basic overcloud with the required number of Controller and Compute nodes.
Procedure: Extract parameter information from the overcloud
Create a new directory for the new cell, and export the directory to the DIR environment variable. The example used throughout this procedure is
cell1
:$ source ~/stackrc (undercloud)$ mkdir cell1 (undercloud)$ export DIR=cell1
Export the default cell configuration and password information from the overcloud to a new environment file for the cell:
(undercloud)$ openstack overcloud cell export \ --output-file cell1/cell1-ctrl-input.yaml cell1
NoteIf the environment file already exists, enter the command with the
--force-overwrite
or-f
option.This command exports the
EndpointMap
,HostsEntry
,AllNodesConfig
,GlobalConfig
parameters, and the password information, to the new environment file for the cell,cell1/cell1-ctrl-input.yaml
.
Procedure: Configure a custom role for a cell
Generate a new roles data file named
cell_roles_data.yaml
that includes the Compute and CellController roles:(undercloud)$ openstack overcloud roles generate \ --roles-path /usr/share/openstack-tripleo-heat-templates/roles \ -o $DIR/cell_roles_data.yaml Compute CellController
Optional: To divide your network between the global Controller and the cells, configure network access in the roles file that you created using segmented network role parameters:
name: Compute description: | Basic Compute Node role CountDefault: 1 # Create external Neutron bridge (unset if using ML2/OVS without DVR) tags: - external_bridge networks: InternalApi: subnet: internal_api_cell1 Tenant: subnet: tenant_subnet Storage: subnet: storage_cell1 ... - name: CellController description: | CellController role for the nova cell_v2 controller services CountDefault: 1 tags: - primary - controller networks: External: subnet: external_cell1 InternalApi: subnet: internal_api_cell1 Storage: subnet: storage_cell1 StorageMgmt: subnet: storage_mgmt_cell1 Tenant: subnet: tenant_subnet
Procedure: Configure the cell flavor and tag nodes to a cell
Create the
cellcontroller
flavor to use to tag nodes that you want to allocate to the cell. For example:(undercloud)$ openstack flavor create --id auto --ram 4096 \ --disk 40 --vcpus 1 cellcontroller (undercloud)$ openstack flavor set --property "cpu_arch"="x86_64" \ --property "capabilities:boot_option"="local" \ --property "capabilities:profile"="cellcontroller" \ --property "resources:CUSTOM_BAREMETAL=1" \ --property "resources:DISK_GB=0" \ --property "resources:MEMORY_MB=0" \ --property "resources:VCPU=0" \ cellcontroller
Tag each node that you want to assign to the cell with the
cellcontroller
profile.(undercloud)$ openstack baremetal node set --property \ capabilities='profile:cellcontroller,boot_option:local' <node_uuid>
Replace
<node_uuid>
with the ID of the Compute node that you want to assign to the cell.
Procedure: Create environment files with cell parameters
-
Create a new environment file in the cell directory cell for cell-specific parameters, for example,
/cell1/cell1.yaml
. Add the following parameters, updating the parameter values for your deployment:
resource_registry: # since the same networks are used in this example, the # creation of the different networks is omitted OS::TripleO::Network::External: OS::Heat::None OS::TripleO::Network::InternalApi: OS::Heat::None OS::TripleO::Network::Storage: OS::Heat::None OS::TripleO::Network::StorageMgmt: OS::Heat::None OS::TripleO::Network::Tenant: OS::Heat::None OS::TripleO::Network::Management: OS::Heat::None OS::TripleO::Network::Ports::OVNDBsVipPort: /usr/share/openstack-tripleo-heat-templates/network/ports/noop.yaml OS::TripleO::Network::Ports::RedisVipPort: /usr/share/openstack-tripleo-heat-templates/network/ports/noop.yaml parameter_defaults: # CELL Parameter to reflect that this is an additional CELL NovaAdditionalCell: True # mapping of the CellController flavor to the CellController role CellControllerFlavor: cellcontroller # The DNS names for the VIPs for the cell CloudName: cell1.ooo.test CloudNameInternal: cell1.internalapi.ooo.test CloudNameStorage: cell1.storage.ooo.test CloudNameStorageManagement: cell1.storagemgmt.ooo.test CloudNameCtlplane: cell1.ctlplane.ooo.test # Flavors used for the cell controller and computes OvercloudCellControllerFlavor: cellcontroller OvercloudComputeFlavor: compute # Number of controllers/computes in the cell CellControllerCount: 1 ComputeCount: 1 # Compute node name (must be unique across all cells) ComputeHostnameFormat: 'cell1-compute-%index%' # default gateway ControlPlaneStaticRoutes: - ip_netmask: 0.0.0.0/0 next_hop: 192.168.24.1 default: true DnsServers: - x.x.x.x
Optional: To allocate a network resource to the cell and register cells to the network, add the following parameter to your environment file:
resource_registry: OS::TripleO::CellController::Net::SoftwareConfig: single-nic-vlans/controller.yaml OS::TripleO::Compute::Net::SoftwareConfig: single-nic-vlans/compute.yaml
Optional: If you divide your network between the global Controller and the cells and want to run the Compute metadata API in each cell instead of in the global Controller, add the following parameter:
parameter_defaults: NovaLocalMetadataPerCell: True
Note- The parameters in this file restrict the overcloud to use a single network for all cells.
- The Compute host names must be unique across all cells.
Make a copy of the
network_data.yaml
file and rename the copy to include the cell name, for example:(undercloud)$ cp /usr/share/openstack-tripleo-heat-templates/network_data.yaml cell1/network_data-ctrl.yaml
Retrieve the UUIDs of the network components that you want to reuse for the cells:
(undercloud)$ openstack network show <network> (undercloud)$ openstack subnet show <subnet> (undercloud)$ openstack network segment show <network_segment> (undercloud)$ openstack port show <port>
-
Replace
<network>
with the name of the network whose UUID you want to retrieve. -
Replace
<subnet>
with the name of the subnet whose UUID you want to retrieve. -
Replace
<network_segment>
with the name of the segment whose UUID you want to retrieve. -
Replace
<port>
with the name of the port whose UUID you want to retrieve.
-
Replace
To reuse the network components for the cells, add the following configuration to your new cell
network_data.yaml
file:external_resource_network_id: <network_uuid> external_resource_subnet_id: <subnet_uuid> external_resource_segment_id: <segment_uuid> external_resource_vip_id: <port_uuid>
-
Replace
<network_uuid>
with the UUID of the network you retrieved in the previous step. -
Replace
<subnet_uuid>
with the UUID of the subnet you retrieved in the previous step. -
Replace
<segment_uuid>
with the UUID of the segment you retrieved in the previous step. -
Replace
<port_uuid>
with the UUID of the port you retrieved in the previous step.
-
Replace
Optional: To configure segmented networks for the global Controller cell and the Compute cells, create an environment file and add the routing information and virtual IP address (VIP) information for the cell. For example, you can add the following configuration for
cell1
to the filecell1_routes.yaml
:parameter_defaults: InternalApiInterfaceRoutes: - destination: 172.17.2.0/24 nexthop: 172.16.2.254 StorageInterfaceRoutes: - destination: 172.17.1.0/24 nexthop: 172.16.1.254 StorageMgmtInterfaceRoutes: - destination: 172.17.3.0/24 nexthop: 172.16.3.254 parameter_defaults: VipSubnetMap: InternalApi: internal_api_cell1 Storage: storage_cell1 StorageMgmt: storage_mgmt_cell1 External: external_cell1
Add the environment files to the stack with your other environment files and deploy the overcloud, for example:
(undercloud)$ openstack overcloud deploy --templates \ --stack cell1 \ -e [your environment files] \ -r $HOME/$DIR/cell_roles_data.yaml \ -e $HOME/$DIR/cell1_routes.yaml \ -e $HOME/$DIR/network_data-ctrl.yaml \ -e $HOME/$DIR/cell1-ctrl-input.yaml \ -e $HOME/$DIR/cell1.yaml
NoteIf you deploy Compute cells in Edge sites, enter the
overcloud deploy
command in each site with the environment files and configuration for each Compute cell in that site.
(Optional) Configure networking for Edge sites
To distribute Compute nodes across Edge sites, create one environment file for the main Controller cell and separate environment files for each Compute cell in that Edge site.
-
In the primary environment file, set the
ComputeCount
parameter to0
in the Controller cell. This cell is separate from the Edge site Compute cells, which will contain the actual Compute nodes. In the Compute cell environment files, add the following parameter to disable external VIP ports:
resource_registry: # Since the compute stack deploys only compute nodes ExternalVIPPorts are not required. OS::TripleO::Network::Ports::ExternalVipPort: /usr/share/openstack-tripleo-heat-templates/network/ports/noop.yaml
11.6. Creating and provisioning a cell
After you deploy the overcloud with a new cell stack, you create and provision the Compute cells.
You must repeat this process for each cell that you create and launch. You can automate the steps in an Ansible playbook. For an example of an Ansible playbook, see the Create the cell and discover Compute nodes section of the OpenStack community documentation. Community documentation is provided as-is and is not officially supported.
Procedure
Get the IP addresses of the control plane and cell controller.
$ CTRL_IP=$(openstack server list -f value -c Networks --name overcloud-controller-0 | sed 's/ctlplane=//') $ CELL_CTRL_IP=$(openstack server list -f value -c Networks --name cellcontroller-0 | sed 's/ctlplane=//')
Add the cell information to all Controller nodes. This information is used to connect to the cell endpoint from the undercloud.
(undercloud)$ CELL_INTERNALAPI_INFO=$(ssh heat-admin@${CELL_CTRL_IP} egrep \ cellcontrol.*\.internalapi /etc/hosts) (undercloud)$ ansible -i /usr/bin/tripleo-ansible-inventory Controller -b \ -m lineinfile -a "dest=/etc/hosts line=\"$CELL_INTERNALAPI_INFO\""
Get the message queue endpoint for the controller cell from the
transport_url
parameter, and the database connection for the controller cell from thedatabase.connection
parameter:(undercloud)$ CELL_TRANSPORT_URL=$(ssh heat-admin@${CELL_CTRL_IP} sudo \ crudini --get /var/lib/config-data/nova/etc/nova/nova.conf DEFAULT transport_url) (undercloud)$ CELL_MYSQL_VIP=$(ssh heat-admin@${CELL_CTRL_IP} sudo \ crudini --get /var/lib/config-data/nova/etc/nova/nova.conf database connection \ | perl -nle'/(\d+\.\d+\.\d+\.\d+)/ && print $1')
Log in to one of the global Controller nodes to create the cell:
$ export CONTAINERCLI='podman' $ ssh heat-admin@${CTRL_IP} sudo ${CONTAINERCLI} \ exec -i -u root nova_api \ nova-manage cell_v2 create_cell --name computecell1 \ --database_connection "{scheme}://{username}:{password}@$CELL_MYSQL_VIP/nova?{query}" \ --transport-url "$CELL_TRANSPORT_URL"
Check that the cell is created and appears in the cell list.
$ ssh heat-admin@${CTRL_IP} sudo ${CONTAINERCLI} \ exec -i -u root nova_api \ nova-manage cell_v2 list_cells --verbose
Restart the Compute services on the Controller nodes.
$ ansible -i /usr/bin/tripleo-ansible-inventory Controller -b -a \ "systemctl restart tripleo_nova_api tripleo_nova_conductor tripleo_nova_scheduler"
Check that the cell controller services are provisioned.
(overcloud)$ nova service-list
11.7. Adding Compute nodes to a cell
Procedure
- Log into one of the Controller nodes.
Get the IP address of the control plane for the cell and enter the host discovery command to expose and assign Compute hosts to the cell.
$ CTRL=overcloud-controller-0 $ CTRL_IP=$(openstack server list -f value -c Networks --name $CTRL | sed 's/ctlplane=//') $ export CONTAINERCLI='podman' $ ssh heat-admin@${CTRL_IP} sudo ${CONTAINERCLI} exec -i -u root nova_api \ nova-manage cell_v2 discover_hosts --by-service --verbose
Verify that the Compute hosts were assigned to the cell.
$ ssh heat-admin@${CTRL_IP} sudo ${CONTAINERCLI} exec -i -u root nova_api \ nova-manage cell_v2 list_hosts
11.8. Configuring the cell availability zone
You must assign each cell to an availability zone to ensure that instances created on the Compute nodes in that cell are only migrated to other Compute nodes in the same cell. Migrating instances between cells is not supported.
The Controller cell must be in a different availability zone from the Compute cells.
You can use host aggregates to configure the availability zone for the Compute cell. The following example shows the command to create a host aggregate for the cell cell1
, define the availability zone for the host aggregate, and add the Compute nodes within the cell to the availability zone:
(undercloud)$ source ~/overcloudrc (overcloud)$ openstack aggregate create --zone cell1 cell1 (overcloud)$ openstack aggregate add host cell1 hostA (overcloud)$ openstack aggregate add host cell1 hostB
-
You cannot use the
OS::TripleO::Services::NovaAZConfig
parameter to automatically create the AZ during deployment, because the cell is not created at this stage. - Migrating instances between cells is not supported. To move an instance to a different cell, you must delete it from the old cell and re-create it in the new cell.
For more information on host aggregates and availability zones, see Creating and managing host aggregates.
11.9. Deleting a Compute node from a cell
To delete a Compute node from a cell, you must delete all instances from the cell and delete the host names from the Placement database.
Procedure
Delete all instances from the Compute nodes in the cell.
NoteMigrating instances between cells is not supported. You must delete the instances and re-create them in another cell.
On one of the global Controllers, delete all Compute nodes from the cell.
$ CTRL=overcloud-controller-0 $ CTRL_IP=$(openstack server list -f value -c Networks --name $CTRL | sed 's/ctlplane=//') $ export CONTAINERCLI='podman' $ ssh heat-admin@${CTRL_IP} sudo ${CONTAINERCLI} \ exec -i -u root nova_api \ nova-manage cell_v2 list_hosts $ ssh heat-admin@${CTRL_IP} sudo ${CONTAINERCLI} \ exec -i -u root nova_api \ nova-manage cell_v2 delete_host --cell_uuid <uuid> --host <compute>
Delete the resource providers for the cell from the Placement service, to ensure that the host name is available in case you want to add Compute nodes with the same host name to another cell later. For example:
(undercloud)$ source ~/overcloudrc (overcloud)$ openstack resource provider list +--------------------------------------+---------------------------------------+------------+ | uuid | name | generation | +--------------------------------------+---------------------------------------+------------+ | 9cd04a8b-5e6c-428e-a643-397c9bebcc16 | computecell1-novacompute-0.site1.test | 11 | +--------------------------------------+---------------------------------------+------------+ (overcloud)$ openstack resource provider \ delete 9cd04a8b-5e6c-428e-a643-397c9bebcc16
11.10. Deleting a cell
To delete a cell, you must first delete all instances and Compute nodes from the cell, as described in Deleting a Compute node from a cell. Then, you delete the cell itself and the cell stack.
Procedure
On one of the global Controllers, delete the cell.
$ CTRL=overcloud-controller-0 $ CTRL_IP=$(openstack server list -f value -c Networks --name $CTRL | sed 's/ctlplane=//') $ export CONTAINERCLI='podman' $ ssh heat-admin@${CTRL_IP} sudo ${CONTAINERCLI} \ exec -i -u root nova_api \ nova-manage cell_v2 list_cells $ ssh heat-admin@${CTRL_IP} sudo ${CONTAINERCLI} \ exec -i -u root nova_api \ nova-manage cell_v2 delete_cell --cell_uuid <uuid>
Delete the cell stack from the overcloud.
$ openstack stack delete <stack name> --wait --yes && openstack \ overcloud plan delete <STACK_NAME>
NoteIf you deployed separate cell stacks for a Controller and Compute cell, delete the Compute cell stack first and then the Controller cell stack.
11.11. Template URLs in cell mappings
You can create templates for the --database_connection
and --transport-url
in cell mappings with variables that are dynamically updated each time you query the global database. The values are taken from the configuration files of the Compute nodes.
The format of a template URL is as follows:
{scheme}://{username}:{password}@{hostname}/{path}
The following table shows the variables that you can use in cell mapping URLs:
Variable | Description
---|---
scheme | Prefix before ://
username | User name
password | Password
hostname | Host name or IP address
port | Port number (must be specified)
path | Path to the directory in the host (without leading slash)
query | Full query with string arguments (without leading question mark)
fragment | Path after the first hash
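For example, the following command is a minimal sketch of how you might apply templated URLs to an existing cell mapping. Run it inside the nova_api container, as in the previous procedures. The cell UUID is a placeholder, and the exact template strings depend on your database and messaging configuration:
# <uuid> and the template strings below are placeholders; adjust them to your environment
$ nova-manage cell_v2 update_cell --cell_uuid <uuid> \
    --database_connection '{scheme}://{username}:{password}@{hostname}/{path}' \
    --transport-url '{scheme}://{username}:{password}@{hostname}:{port}/'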
Chapter 12. Managing instances
As a cloud administrator, you can monitor and manage the instances running on your cloud, and you can pass metadata to each instance as it is launched.
12.1. Database cleaning
The Compute service includes an administrative tool, nova-manage, that you can use to perform deployment, upgrade, clean-up, and maintenance-related tasks, such as applying database schemas, performing online data migrations during an upgrade, and managing and cleaning up the database.
Director automates the following database management tasks on the overcloud by using cron:
- Archives deleted instance records by moving the deleted rows from the production tables to shadow tables.
- Purges deleted rows from the shadow tables after archiving is complete.
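These cron jobs run the corresponding nova-manage db commands. If you need to perform the same clean-up manually, for example before the next scheduled run, you can run the commands yourself from inside the nova_api container on a Controller node. This is a sketch; the flags shown are optional:
# run inside the nova_api container; the flags shown are optional
$ nova-manage db archive_deleted_rows --until-complete --all-cells
$ nova-manage db purge --all --all-cells --verbose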
12.1.1. Configuring database management
The cron jobs use default settings to perform database management tasks. By default, the database archiving cron jobs run daily at 00:01, and the database purging cron jobs run daily at 05:00, both with a jitter between 0 and 3600 seconds. You can modify these settings as required by using heat parameters.
Procedure
- Open your Compute environment file.
Add the heat parameter that controls the cron job that you want to add or modify. For example, to purge the shadow tables immediately after they are archived, set the following parameter to "True":
parameter_defaults:
  ...
  NovaCronArchiveDeleteRowsPurge: True
For a complete list of the heat parameters to manage database cron jobs, see Configuration options for the Compute service automated database management.
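For example, you might combine several of these parameters in the same environment file to move the archive job to a different time and to purge immediately after archiving. The values shown here are illustrative only:
parameter_defaults:
  # illustrative values only; adjust the schedule to your maintenance window
  NovaCronArchiveDeleteRowsHour: 2
  NovaCronArchiveDeleteRowsMinute: 30
  NovaCronArchiveDeleteRowsPurge: True
  NovaCronArchiveDeleteRowsUntilComplete: True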
- Save the updates to your Compute environment file.
Add your Compute environment file to the stack with your other environment files and deploy the overcloud:
(undercloud)$ openstack overcloud deploy --templates \
    -e [your environment files] \
    -e /home/stack/templates/<compute_environment_file>.yaml
12.1.2. Configuration options for the Compute service automated database management
Use the following heat parameters to enable and modify the automated cron jobs that manage the database.
Table 12.1. Compute (nova) service cron parameters
Parameter | Description
---|---
NovaCronArchiveDeleteAllCells | Set this parameter to "True" to archive deleted instance records from all cells. Default:
NovaCronArchiveDeleteRowsAge | Use this parameter to archive deleted instance records based on their age in days. Set to 0 to archive records older than today in shadow tables. Default:
NovaCronArchiveDeleteRowsDestination | Use this parameter to configure the file for logging deleted instance records. Default:
NovaCronArchiveDeleteRowsHour | Use this parameter to configure the hour at which to run the cron command to move deleted instance records to another table. Default: 0
NovaCronArchiveDeleteRowsMaxDelay | Use this parameter to configure the maximum delay, in seconds, before moving deleted instance records to another table. Default: 3600
NovaCronArchiveDeleteRowsMaxRows | Use this parameter to configure the maximum number of deleted instance records that can be moved to another table. Default:
NovaCronArchiveDeleteRowsMinute | Use this parameter to configure the minute past the hour at which to run the cron command to move deleted instance records to another table. Default: 1
NovaCronArchiveDeleteRowsMonthday | Use this parameter to configure on which day of the month to run the cron command to move deleted instance records to another table. Default:
NovaCronArchiveDeleteRowsMonth | Use this parameter to configure in which month to run the cron command to move deleted instance records to another table. Default:
NovaCronArchiveDeleteRowsPurge | Set this parameter to "True" to purge shadow tables immediately after scheduled archiving. Default:
NovaCronArchiveDeleteRowsUntilComplete | Set this parameter to "True" to continue to move deleted instance records to another table until all records are moved. Default:
NovaCronArchiveDeleteRowsUser | Use this parameter to configure the user that owns the crontab that archives deleted instance records and that has access to the log file the crontab uses. Default:
NovaCronArchiveDeleteRowsWeekday | Use this parameter to configure on which day of the week to run the cron command to move deleted instance records to another table. Default:
NovaCronPurgeShadowTablesAge | Use this parameter to purge shadow tables based on their age in days. Set to 0 to purge shadow tables older than today. Default:
NovaCronPurgeShadowTablesAllCells | Set this parameter to "True" to purge shadow tables from all cells. Default:
NovaCronPurgeShadowTablesDestination | Use this parameter to configure the file for logging purged shadow tables. Default:
NovaCronPurgeShadowTablesHour | Use this parameter to configure the hour at which to run the cron command to purge shadow tables. Default: 5
NovaCronPurgeShadowTablesMaxDelay | Use this parameter to configure the maximum delay, in seconds, before purging shadow tables. Default: 3600
NovaCronPurgeShadowTablesMinute | Use this parameter to configure the minute past the hour at which to run the cron command to purge shadow tables. Default: 0
NovaCronPurgeShadowTablesMonth | Use this parameter to configure in which month to run the cron command to purge the shadow tables. Default:
NovaCronPurgeShadowTablesMonthday | Use this parameter to configure on which day of the month to run the cron command to purge the shadow tables. Default:
NovaCronPurgeShadowTablesUser | Use this parameter to configure the user that owns the crontab that purges the shadow tables and that has access to the log file the crontab uses. Default:
NovaCronPurgeShadowTablesVerbose | Use this parameter to enable verbose logging in the log file for purged shadow tables. Default:
NovaCronPurgeShadowTablesWeekday | Use this parameter to configure on which day of the week to run the cron command to purge the shadow tables. Default:
12.2. Migrating virtual machine instances between Compute nodes
You sometimes need to migrate instances from one Compute node to another Compute node in the overcloud, to perform maintenance, rebalance the workload, or replace a failed or failing node.
- Compute node maintenance
- If you need to temporarily take a Compute node out of service, for example to perform hardware maintenance or repair, kernel upgrades, or software updates, you can migrate the instances running on that Compute node to another Compute node.
- Failing Compute node
- If a Compute node is about to fail and you need to service it or replace it, you can migrate instances from the failing Compute node to a healthy Compute node.
- Failed Compute nodes
- If a Compute node has already failed, you can evacuate the instances. You can rebuild instances from the original image on another Compute node, using the same name, UUID, network addresses, and any other allocated resources the instance had before the Compute node failed.
- Workload rebalancing
- You can migrate one or more instances to another Compute node to rebalance the workload. For example, you can consolidate instances on a Compute node to conserve power, migrate instances to a Compute node that is physically closer to other networked resources to reduce latency, or distribute instances across Compute nodes to avoid hot spots and increase resiliency.
Director configures all Compute nodes to provide secure migration. All Compute nodes also require a shared SSH key to provide the users of each host with access to other Compute nodes during the migration process. Director creates this key using the OS::TripleO::Services::NovaCompute
composable service. This composable service is one of the main services included on all Compute roles by default. For more information, see Composable Services and Custom Roles in the Advanced Overcloud Customization guide.
If you have a functioning Compute node, and you want to make a copy of an instance for backup purposes, or to copy the instance to a different environment, follow the procedure in Importing virtual machines into the overcloud in the Director Installation and Usage guide.
12.2.1. Migration types
Red Hat OpenStack Platform (RHOSP) supports the following types of migration.
Cold migration
Cold migration, or non-live migration, involves shutting down a running instance before migrating it from the source Compute node to the destination Compute node.

Cold migration involves some downtime for the instance. The migrated instance maintains access to the same volumes and IP addresses.
Cold migration requires that both the source and destination Compute nodes are running.
Live migration
Live migration involves moving the instance from the source Compute node to the destination Compute node without shutting it down, and while maintaining state consistency.

Live migrating an instance involves little or no perceptible downtime. However, live migration does impact performance for the duration of the migration operation. Therefore, instances should be taken out of the critical path while being migrated.
Live migration requires that both the source and destination Compute nodes are running.
In some cases, instances cannot use live migration. For more information, see Migration constraints.
Evacuation
If you need to migrate instances because the source Compute node has already failed, you can evacuate the instances.
12.2.2. Migration constraints
Migration constraints typically arise with block migration, configuration disks, or when one or more instances access physical hardware on the Compute node.
CPU constraints
The source and destination Compute nodes must have the same CPU architecture. For example, Red Hat does not support migrating an instance from an x86_64
CPU to a ppc64le
CPU. In some cases, the CPU of the source and destination Compute node must match exactly, such as instances that use CPU host passthrough. In all cases, the CPU features of the destination node must be a superset of the CPU features on the source node.
Memory constraints
The destination Compute node must have sufficient available RAM. Memory oversubscription can cause migration to fail.
Block migration constraints
Migrating instances that use disks that are stored locally on a Compute node takes significantly longer than migrating volume-backed instances that use shared storage, such as Red Hat Ceph Storage. This latency arises because OpenStack Compute (nova) migrates local disks block-by-block between the Compute nodes over the control plane network by default. By contrast, volume-backed instances that use shared storage, such as Red Hat Ceph Storage, do not have to migrate the volumes, because each Compute node already has access to the shared storage.
Network congestion in the control plane network caused by migrating local disks or instances that consume large amounts of RAM might impact the performance of other systems that use the control plane network, such as RabbitMQ.
Read-only drive migration constraints
Migrating a drive is supported only if the drive has both read and write capabilities. For example, OpenStack Compute (nova) cannot migrate a CD-ROM drive or a read-only config drive. However, OpenStack Compute (nova) can migrate a drive with both read and write capabilities, including a config drive with a drive format such as vfat.
Live migration constraints
In some cases, live migrating instances involves additional constraints.
- No new operations during migration
- To achieve state consistency between the copies of the instance on the source and destination nodes, RHOSP must prevent new operations during live migration. Otherwise, live migration might take a long time or potentially never end if writes to memory occur faster than live migration can replicate the state of the memory.
- CPU pinning with NUMA
- The NovaSchedulerDefaultFilters parameter in the Compute configuration must include the values AggregateInstanceExtraSpecsFilter and NUMATopologyFilter. For an environment file sketch, see the example after this list.
- Multi-cell clouds
- In a multi-cell cloud, instances can be live migrated to a different host in the same cell, but not across cells.
- Floating instances
- When live migrating floating instances, if the configuration of NovaComputeCpuSharedSet on the destination Compute node is different from the configuration of NovaComputeCpuSharedSet on the source Compute node, the instances will not be allocated to the CPUs configured for shared (unpinned) instances on the destination Compute node. Therefore, if you need to live migrate floating instances, you must configure all the Compute nodes with the same CPU mappings for dedicated (pinned) and shared (unpinned) instances, or use a host aggregate for the shared instances.
- Destination Compute node capacity
- The destination Compute node must have sufficient capacity to host the instance that you want to migrate.
- SR-IOV live migration
- Instances with SR-IOV-based network interfaces can be live migrated. Live migrating instances with direct mode SR-IOV network interfaces attached incurs network downtime while the direct mode interfaces are being detached and re-attached.
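The following environment file snippet is a minimal sketch of the scheduler filter configuration described in the CPU pinning constraint above. The filter list is illustrative; it must retain all of the other filters that your deployment relies on, in addition to AggregateInstanceExtraSpecsFilter and NUMATopologyFilter:
parameter_defaults:
  # illustrative filter list; keep the filters your deployment already uses
  NovaSchedulerDefaultFilters:
    - AvailabilityZoneFilter
    - ComputeFilter
    - ComputeCapabilitiesFilter
    - ImagePropertiesFilter
    - ServerGroupAntiAffinityFilter
    - ServerGroupAffinityFilter
    - AggregateInstanceExtraSpecsFilter
    - NUMATopologyFilter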
Constraints that preclude live migration
You cannot live migrate an instance that uses the following features.
- PCI passthrough
- QEMU/KVM hypervisors support attaching PCI devices on the Compute node to an instance. Use PCI passthrough to give an instance exclusive access to PCI devices, which appear and behave as if they are physically attached to the operating system of the instance. However, because PCI passthrough involves physical addresses, OpenStack Compute does not support live migration of instances using PCI passthrough.
- Port resource requests
You cannot live migrate an instance that uses a port that has resource requests, such as a guaranteed minimum bandwidth QoS policy. Use the following command to check if a port has resource requests:
$ openstack port show <port_name/port_id>
12.2.3. Preparing to migrate
Before you migrate one or more instances, you need to determine the Compute node names and the IDs of the instances to migrate.
Procedure
Identify the source Compute node host name and the destination Compute node host name:
(undercloud)$ source ~/overcloudrc
(overcloud)$ openstack compute service list
List the instances on the source Compute node and locate the ID of the instance or instances that you want to migrate:
(overcloud)$ openstack server list --host <source> --all-projects
Replace <source> with the name or ID of the source Compute node.
Optional: If you are migrating instances from a source Compute node to perform maintenance on the node, you must disable the node to prevent the scheduler from assigning new instances to the node during maintenance:
(overcloud)$ source ~/stackrc
(undercloud)$ openstack compute service set <source> nova-compute --disable
Replace <source> with the name or ID of the source Compute node.
You are now ready to perform the migration. Follow the required procedure detailed in Cold migrating an instance or Live migrating an instance.
12.2.4. Cold migrating an instance
Cold migrating an instance involves stopping the instance and moving it to another Compute node. Cold migration facilitates migration scenarios that live migrating cannot facilitate, such as migrating instances that use PCI passthrough. The scheduler automatically selects the destination Compute node. For more information, see Migration constraints.
Procedure
To cold migrate an instance, enter the following command to power off and move the instance:
(overcloud)$ openstack server migrate <vm> --wait
- Replace <vm> with the name or ID of the instance to migrate.
- Specify the --block-migration flag if migrating a locally stored volume.
- Wait for migration to complete. While you wait for the instance migration to complete, you can check the migration status. For more information, see Checking migration status.
Check the status of the instance:
(overcloud)$ openstack server list --all-projects
A status of "VERIFY_RESIZE" indicates you need to confirm or revert the migration:
If the migration worked as expected, confirm it:
(overcloud)$ openstack server resize --confirm <vm>
Replace <vm> with the name or ID of the instance to migrate. A status of "ACTIVE" indicates that the instance is ready to use.
If the migration did not work as expected, revert it:
(overcloud)$ openstack server resize --revert <vm>
Replace <vm> with the name or ID of the instance.
Restart the instance:
(overcloud)$ openstack server start <vm>
Replace <vm> with the name or ID of the instance.
Optional: If you disabled the source Compute node for maintenance, you must re-enable the node so that new instances can be assigned to it:
(overcloud)$ source ~/stackrc
(undercloud)$ openstack compute service set <source> nova-compute --enable
Replace <source> with the host name of the source Compute node.
12.2.5. Live migrating an instance
Live migration moves an instance from a source Compute node to a destination Compute node with a minimal amount of downtime. Live migration might not be appropriate for all instances. For more information, see Migration constraints.
Procedure
To live migrate an instance, specify the instance and the destination Compute node:
(overcloud)$ openstack server migrate <vm> --live-migration [--host <dest>] --wait
- Replace <vm> with the name or ID of the instance.
- Replace <dest> with the name or ID of the destination Compute node.
Note: The openstack server migrate command covers migrating instances with shared storage, which is the default. Specify the --block-migration flag to migrate a locally stored volume:
(overcloud)$ openstack server migrate <vm> --live-migration [--host <dest>] --wait --block-migration
Confirm that the instance is migrating:
(overcloud)$ openstack server show <vm>
+----------------------+--------------------------------------+
| Field                | Value                                |
+----------------------+--------------------------------------+
| ...                  | ...                                  |
| status               | MIGRATING                            |
| ...                  | ...                                  |
+----------------------+--------------------------------------+
- Wait for migration to complete. While you wait for the instance migration to complete, you can check the migration status. For more information, see Checking migration status.
Check the status of the instance to confirm if the migration was successful:
(overcloud)$ openstack server list --host <dest> --all-projects
Replace <dest> with the name or ID of the destination Compute node.
Optional: If you disabled the source Compute node for maintenance, you must re-enable the node so that new instances can be assigned to it:
(overcloud)$ source ~/stackrc
(undercloud)$ openstack compute service set <source> nova-compute --enable
Replace <source> with the host name of the source Compute node.
12.2.6. Checking migration status
Migration involves several state transitions before migration is complete. During a healthy migration, the migration state typically transitions as follows:
- Queued: The Compute service has accepted the request to migrate an instance, and migration is pending.
- Preparing: The Compute service is preparing to migrate the instance.
- Running: The Compute service is migrating the instance.
- Post-migrating: The Compute service has built the instance on the destination Compute node and is releasing resources on the source Compute node.
- Completed: The Compute service has completed migrating the instance and finished releasing resources on the source Compute node.
Procedure
Retrieve the list of migration IDs for the instance:
$ nova server-migration-list <vm>
+----+-------------+-----------+ (...)
| Id | Source Node | Dest Node | (...)
+----+-------------+-----------+ (...)
| 2  | -           | -         | (...)
+----+-------------+-----------+ (...)
Replace <vm> with the name or ID of the instance.
Show the status of the migration:
$ nova server-migration-show <vm> <migration-id>
- Replace <vm> with the name or ID of the instance.
- Replace <migration-id> with the ID of the migration.
Running the nova server-migration-show command returns the following example output:
+------------------------+--------------------------------------+
| Property               | Value                                |
+------------------------+--------------------------------------+
| created_at             | 2017-03-08T02:53:06.000000           |
| dest_compute           | controller                           |
| dest_host              | -                                    |
| dest_node              | -                                    |
| disk_processed_bytes   | 0                                    |
| disk_remaining_bytes   | 0                                    |
| disk_total_bytes       | 0                                    |
| id                     | 2                                    |
| memory_processed_bytes | 65502513                             |
| memory_remaining_bytes | 786427904                            |
| memory_total_bytes     | 1091379200                           |
| server_uuid            | d1df1b5a-70c4-4fed-98b7-423362f2c47c |
| source_compute         | compute2                             |
| source_node            | -                                    |
| status                 | running                              |
| updated_at             | 2017-03-08T02:53:47.000000           |
+------------------------+--------------------------------------+
Tip: The OpenStack Compute service measures progress of the migration by the number of remaining memory bytes to copy. If this number does not decrease over time, the migration might be unable to complete, and the Compute service might abort it.
Sometimes instance migration can take a long time or encounter errors. For more information, see Troubleshooting migration.
12.2.7. Evacuating an instance
If you want to move an instance from a dead or shut-down Compute node to a new host in the same environment, you can evacuate it.
The evacuate process destroys the original instance and rebuilds it on another Compute node using the original image, instance name, UUID, network addresses, and any other resources the original instance had allocated to it.
If the instance uses shared storage, the instance root disk is not rebuilt during the evacuate process, as the disk remains accessible by the destination Compute node. If the instance does not use shared storage, then the instance root disk is also rebuilt on the destination Compute node.
- You can only perform an evacuation when the Compute node is fenced, and the API reports that the state of the Compute node is "down" or "forced-down". If the Compute node is not reported as "down" or "forced-down", the evacuate command fails. You can check the reported state with the command shown after this list.
- To perform an evacuation, you must be a cloud administrator.
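For example, you can check the state that the API reports for the Compute node before you attempt an evacuation. The affected node is expected to appear as "down" in the State column:
(overcloud)$ openstack compute service list --service nova-compute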
12.2.7.1. Evacuating one instance
You can evacuate instances one at a time.
Procedure
- Log onto the failed Compute node as an administrator.
Disable the Compute node:
(overcloud)[stack@director ~]$ openstack compute service set \
    <host> <service> --disable
- Replace <host> with the name of the Compute node to evacuate the instance from.
- Replace <service> with the name of the service to disable, for example nova-compute.
To evacuate an instance, enter the following command:
(overcloud)[stack@director ~]$ nova evacuate [--password <pass>] <vm> [<dest>]
- Replace <pass> with the admin password to set for the evacuated instance. If a password is not specified, a random password is generated and output when the evacuation is complete.
- Replace <vm> with the name or ID of the instance to evacuate.
- Replace <dest> with the name of the Compute node to evacuate the instance to. If you do not specify the destination Compute node, the Compute scheduler selects one for you. You can find possible Compute nodes by using the following command:
(overcloud)[stack@director ~]$ openstack hypervisor list
12.2.7.2. Evacuating all instances on a host
You can evacuate all instances on a specified Compute node.
Procedure
- Log onto the failed Compute node as an administrator.
Disable the Compute node:
(overcloud)[stack@director ~]$ openstack compute service set \
    <host> <service> --disable
- Replace <host> with the name of the Compute node to evacuate the instances from.
- Replace <service> with the name of the service to disable, for example nova-compute.
Evacuate all instances on a specified Compute node:
(overcloud)[stack@director ~]$ nova host-evacuate [--target_host <dest>] [--force] <host>
- Replace <dest> with the name of the destination Compute node to evacuate the instances to. If you do not specify the destination, the Compute scheduler selects one for you. You can find possible Compute nodes by using the following command:
(overcloud)[stack@director ~]$ openstack hypervisor list
- Replace <host> with the name of the Compute node to evacuate the instances from.
12.2.8. Troubleshooting migration
The following issues can arise during instance migration:
- The migration process encounters errors.
- The migration process never ends.
- Performance of the instance degrades after migration.
12.2.8.1. Errors during migration
The following issues can send the migration operation into an error state:
- Running a cluster with different versions of Red Hat OpenStack Platform (RHOSP).
- Specifying an instance ID that cannot be found.
- The instance you are trying to migrate is in an error state.
- The Compute service is shutting down.
- A race condition occurs.
- Live migration enters a failed state.
When live migration enters a failed state, it is typically followed by an error state. The following common issues can cause a failed state:
- A destination Compute host is not available.
- A scheduler exception occurs.
- The rebuild process fails due to insufficient computing resources.
- A server group check fails.
- The instance on the source Compute node gets deleted before migration to the destination Compute node is complete.
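To investigate why a migration entered the error state, you can start by inspecting the status and fault fields of the affected instance. The column selection shown here is optional:
$ openstack server show <vm> -c status -c fault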
12.2.8.2. Never-ending live migration
Live migration can fail to complete, which leaves the migration in a perpetual running state. A common reason for a live migration that never completes is that client requests to the instance running on the source Compute node create changes that occur faster than the Compute service can replicate them to the destination Compute node.
Use one of the following methods to address this situation:
- Abort the live migration.
- Force the live migration to complete.
Aborting live migration
If the instance state changes faster than the migration procedure can copy it to the destination node, and you do not want to temporarily suspend the instance operations, you can abort the live migration.
Procedure
Retrieve the list of migrations for the instance:
$ nova server-migration-list <vm>
Replace <vm> with the name or ID of the instance.
Abort the live migration:
$ nova live-migration-abort <vm> <migration-id>
- Replace <vm> with the name or ID of the instance.
- Replace <migration-id> with the ID of the migration.
Forcing live migration to complete
If the instance state changes faster than the migration procedure can copy it to the destination node, and you want to temporarily suspend the instance operations to force migration to complete, you can force the live migration procedure to complete.
Forcing live migration to complete might lead to perceptible downtime.
Procedure
Retrieve the list of migrations for the instance:
$ nova server-migration-list <vm>
Replace <vm> with the name or ID of the instance.
Force the live migration to complete:
$ nova live-migration-force-complete <vm> <migration-id>
- Replace <vm> with the name or ID of the instance.
- Replace <migration-id> with the ID of the migration.
12.2.8.3. Instance performance degrades after migration
For instances that use a NUMA topology, the source and destination Compute nodes must have the same NUMA topology and configuration, and the NUMA topology of the destination Compute node must have sufficient resources available. If the NUMA configuration of the source and destination Compute nodes is not the same, live migration might succeed while the performance of the instance degrades. For example, if the source Compute node maps NIC 1 to NUMA node 0, but the destination Compute node maps NIC 1 to NUMA node 5, then after migration the instance might route network traffic from a CPU on NUMA node 0 across the bus to a CPU on NUMA node 5 to reach NIC 1. The instance behaves as expected, but performance degrades. Similarly, if NUMA node 0 on the source Compute node has sufficient available CPU and RAM, but NUMA node 0 on the destination Compute node already has instances using some of those resources, the instance might run correctly but suffer performance degradation. For more information, see Migration constraints.
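If you suspect this kind of degradation, it can help to compare the NUMA layout of the source and destination Compute nodes, for example by listing the NUMA nodes and checking which NUMA node a given NIC is attached to. The NIC name is illustrative:
# run on both the source and destination Compute nodes; <nic> is a placeholder interface name
$ lscpu | grep -i numa
$ cat /sys/class/net/<nic>/device/numa_node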