Chapter 3. Red Hat OpenStack deployment best practices
Review the following best practices when you plan and prepare to deploy OpenStack. You can apply one or more of these practices in your environment.
3.1. Red Hat OpenStack deployment preparation
Before you deploy Red Hat OpenStack Platform (RHOSP), review the following list of deployment preparation tasks. You can apply one or more of the deployment preparation tasks in your environment:
- Set a subnet range for introspection to accommodate the maximum overcloud nodes for which you want to perform introspection at a time
- When you use director to deploy and configure RHOSP, use CIDR notations for the control plane network to accommodate all overcloud nodes that you add now or in the future.
- Set the root password on your overcloud image to allow console access to the overcloud image
Use the console to troubleshoot failed deployments when networking is set incorrectly. For more information, see Installing virt-customize to the director and Setting the Root Password in the Partner Integration Guide. Adhere to the information security policies of your organization for password management when you implement this recommendation.
Alternatively, you can use the userdata_root_password.yaml template to configure the root password by using the NodeUserData parameter. The following example registers the template by its full path and sets the root password:
resource_registry:
  OS::TripleO::NodeUserData: /usr/share/openstack-tripleo-heat-templates/firstboot/userdata_root_password.yaml
parameter_defaults:
  NodeRootPassword: '<password>'
- Use scheduler hints to assign hardware to a role
Use scheduler hints to assign hardware to a role, such as CephStorage and others. Scheduler hints make it easier to identify deployment issues that affect only a specific piece of hardware.
nova-scheduler, which is a single process, can become overloaded when it schedules a large number of nodes. Scheduler hints reduce the load on nova-scheduler when the hints implement tag matching. As a result, nova-scheduler encounters fewer scheduling errors during the deployment, and the deployment takes less time.
- Do not use profile tagging when you use scheduler hints.
- In performance testing, use identical hardware for specific roles to reduce variability in testing and performance results.
- For more information, see Assigning Specific Node IDs in the Advanced Overcloud Customization Guide.
- Set the World Wide Name (WWN) as the root disk hint for each node to prevent nodes from using the wrong disk during deployment and booting
- When nodes contain multiple disks, use the introspection data to set the WWN as the root disk hint for each node. This prevents the node from using the wrong disk during deployment and booting. For more information, see Defining the Root Disk for Nodes in the Director Installation and Usage guide.
- Enable the Bare Metal service (ironic) automated cleaning on nodes that have more than one disk
Use the Bare Metal service automated cleaning to erase metadata on nodes that have more than one disk and are likely to have multiple boot loaders. Nodes might become inconsistent with the boot disk due to the presence of multiple bootloaders on disks, which leads to node deployment failure when you attempt to pull the metadata that uses the wrong URL.
To enable the Bare Metal service automated cleaning, on the undercloud node, edit the undercloud.conf file and add the following line:
clean_nodes = true
- Limit the number of nodes for Bare Metal (ironic) introspection
If you perform introspection on all nodes at the same time, failures might occur due to network constraints. Perform introspection on up to 50 nodes at a time.
Ensure that the dhcp_end range in the undercloud.conf file is large enough for the number of nodes that you expect to have in the environment.
If the range does not contain enough available IP addresses, do not introspect more nodes at a time than the range contains addresses. This limits the number of simultaneous introspection operations. To allow the introspection DHCP leases to expire, do not issue more IP addresses for a few minutes after the introspection completes.
- Prepare Ceph for different types of configurations
The following list is a set of recommendations for different types of configurations:
All-flash OSD configuration
Each OSD requires additional CPUs according to the IOPS capacity of the device type, so Ceph IOPS are CPU-limited at a lower number of OSDs. This is true for NVM SSDs, which can have two orders of magnitude higher IOPS capacity than traditional HDDs. For SATA/SAS SSDs, expect one order of magnitude greater random IOPS per OSD than HDDs, but only about a two to four times increase in sequential IOPS. You can supply fewer CPU resources to Ceph than Ceph needs for OSD devices.
Hyper Converged Infrastructure (HCI)
Reserve at least half of your CPU capacity, memory, and network for the OpenStack Compute (nova) guests. Ensure that you have enough CPU capacity and memory to support both OpenStack Compute (nova) guests and Ceph Storage. Observe memory consumption because Ceph Storage memory consumption is not elastic. On a multi-socket system, limit Ceph CPU consumption by NUMA-pinning Ceph to a single socket. For example, use the numactl -N 0 -p 0 command. Do not hard-pin Ceph memory consumption to one socket.
Latency-sensitive applications such as NFV
Place Ceph on the same CPU socket as the network card that Ceph uses and limit the network card interruptions to that CPU socket if possible, with a network application that runs on a different NUMA socket and network card.
If you use dual bootloaders, use disk-by-path for the OSD map. This gives the user consistent deployments, unlike using the device name. The following snippet is an example of the CephAnsibleDisksConfig for a disk-by-path mapping.
CephAnsibleDisksConfig:
  osd_scenario: non-collocated
  devices:
    - /dev/disk/by-path/pci-0000:03:00.0-scsi-0:2:0:0
    - /dev/disk/by-path/pci-0000:03:00.0-scsi-0:2:1:0
  dedicated_devices:
    - /dev/nvme0n1
    - /dev/nvme0n1
  journal_size: 512
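As a rough aid for the introspection guidance earlier in this section, the following Python sketch counts the addresses in an inclusive DHCP range and splits a node list into batches that respect both that count and the 50-node recommendation. The function names and the sample addresses are hypothetical and are not part of director. The sketch only does arithmetic and does not run any OpenStack command.

```python
import ipaddress
from typing import Iterator, List, Sequence

# Upper bound per introspection run, taken from the recommendation in this section.
MAX_INTROSPECTION_BATCH = 50


def dhcp_range_size(dhcp_start: str, dhcp_end: str) -> int:
    """Count the addresses in an inclusive start-to-end range (hypothetical helper)."""
    start = ipaddress.ip_address(dhcp_start)
    end = ipaddress.ip_address(dhcp_end)
    if end < start:
        raise ValueError("dhcp_end must not be lower than dhcp_start")
    return int(end) - int(start) + 1


def introspection_batches(
    nodes: Sequence[str],
    range_size: int,
    max_batch: int = MAX_INTROSPECTION_BATCH,
) -> Iterator[List[str]]:
    """Yield node batches no larger than the DHCP range or the batch limit."""
    batch_size = min(range_size, max_batch)
    if batch_size < 1:
        raise ValueError("the DHCP range must contain at least one address")
    for index in range(0, len(nodes), batch_size):
        yield list(nodes[index:index + batch_size])
```

For example, a range from 192.168.24.100 to 192.168.24.120 contains 21 addresses, so the sketch would split 120 nodes into batches of at most 21 nodes rather than 50.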
3.2. Red Hat OpenStack deployment configuration
Review the following list of recommendations for your Red Hat OpenStack Platform (RHOSP) deployment configuration:
- Validate the heat templates with a small scale deployment
- Deploy a small environment that consists of at least three Controllers, one Compute node, and three Ceph Storage nodes. You can use this configuration to ensure that all of your heat templates are correct.
- Disable telemetry notifications on the undercloud
You can disable telemetry notifications on the undercloud for the following OpenStack services to decrease the RabbitMQ queue:
- Compute (nova)
- Networking (neutron)
- Orchestration (heat)
- Identity (keystone)
To disable the notifications, in the /usr/share/openstack-tripleo-heat-templates/environments/disable-telemetry.yaml template, set the notification driver setting to
- Limit the number of nodes that are provisioned at the same time
Fifty is the typical number of servers that fit within an average enterprise-level rack; therefore, you can deploy an average of one rack of nodes at one time.
To minimize the debugging necessary to diagnose issues with the deployment, deploy no more than 50 nodes at one time. However, if you want to deploy a higher number of nodes, Red Hat has successfully tested up to 100 nodes simultaneously.
To scale Compute nodes in batches, use the openstack overcloud deploy command with the --limit option. This can save time and lower resource consumption on the undercloud.
- Disable unused NICs
If the overcloud has any unused NICs during the deployment, you must define the unused interfaces in the NIC configuration templates and set the interfaces to
If you do not define unused interfaces, there might be routing issues and IP allocation problems during introspection and scaling operations. By default, the NICs set
BOOTPROTO=dhcp, which means the unused overcloud NICs consume IP addresses that are needed for the PXE provisioning. This can reduce the pool of available IP addresses for your nodes.
- Power off unused Bare Metal Provisioning (ironic) nodes
- Ensure that you power off any unused Bare Metal Provisioning (ironic) nodes that are in maintenance mode. Red Hat has identified cases where nodes from previous deployments are left in maintenance mode in a powered-on state. This can occur with Bare Metal automated cleaning, where a node that fails cleaning is set to maintenance mode. Bare Metal Provisioning does not track the power state of nodes in maintenance mode and incorrectly reports the power state as off. This can cause problems with ongoing deployments. When you redeploy after a failed deployment, ensure that you power off all unused nodes by using the power management device of each node.
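To help locate the nodes that the last recommendation refers to, the following Python sketch filters a JSON listing of Bare Metal nodes for entries that are flagged as being in maintenance mode. It assumes that you exported the node list to JSON and that each entry uses the keys "Name", "UUID", and "Maintenance"; these key names and the function name are assumptions for illustration, so adjust them to match your actual output. Because the reported power state of such nodes can be wrong, the result is only a list of nodes to check through their power management devices.

```python
import json
from typing import List


def nodes_in_maintenance(node_list_json: str) -> List[str]:
    """Return the name (or UUID for unnamed nodes) of each node in maintenance.

    The "Maintenance", "Name", and "UUID" keys are assumed, not confirmed.
    """
    flagged = []
    for node in json.loads(node_list_json):
        if node.get("Maintenance"):
            flagged.append(node.get("Name") or node.get("UUID", "<unknown>"))
    return flagged
```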
3.3. Tuning the undercloud
Review this section when you plan to scale your RHOSP deployment and apply tuning to your default undercloud settings.
- If you use the Telemetry service (ceilometer), improve the performance of the service
Because the Telemetry service is CPU-intensive, telemetry is not enabled by default in RHOSP 16.2. If you want to use Telemetry, you can improve the performance of the service.
For more information, see Telemetry in the Deployment Recommendations for Specific Red Hat OpenStack Platform Services Guide.
- Separate the provisioning and configuration processes
To create only the stack and associated RHOSP resources, you can run the deployment command with the --stack-only option. Include any environment files that are required for your overcloud:
$ openstack overcloud deploy \
  --templates \
  -e <environment-file1.yaml> \
  -e <environment-file2.yaml> \
  ...
  --stack-only
After you have provisioned the stack, you can enable SSH access for the tripleo-admin user from the undercloud to the overcloud. The config-download process uses the tripleo-admin user to perform the Ansible-based configuration:
$ openstack overcloud admin authorize
To disable the overcloud stack creation and run only the config-download workflow to apply the software configuration, you can run the deployment command with the --config-download-only option. Include any environment files that are required for your overcloud:
$ openstack overcloud deploy \
  --templates \
  -e <environment-file1.yaml> \
  -e <environment-file2.yaml> \
  ...
  --config-download-only
To limit the config-download playbook execution to a specific node or set of nodes, you can use the --limit option. For scale-up operations, when you want to apply software configuration on the new nodes only, use the --limit option together with --config-download-only, as in the following example:
$ openstack overcloud deploy \
  --templates \
  -e <environment-file1.yaml> \
  -e <environment-file2.yaml> \
  ...
  --config-download-only --config-download-timeout --limit <Undercloud>,<Controller>,<Compute-1>,<Compute-2>
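If you script the separate provisioning and configuration runs described in this section, a small helper can assemble the command arguments consistently. The following Python sketch is one possible way to do that. The build_deploy_command function and its parameters are hypothetical, and it uses only the options that appear in this section (--templates, -e, --stack-only, --config-download-only, and --limit). It returns an argument list and does not run anything.

```python
from typing import List, Optional, Sequence


def build_deploy_command(
    env_files: Sequence[str],
    stack_only: bool = False,
    config_download_only: bool = False,
    limit: Optional[Sequence[str]] = None,
) -> List[str]:
    """Assemble an 'openstack overcloud deploy' argument list (hypothetical helper)."""
    if stack_only and config_download_only:
        raise ValueError("choose either stack_only or config_download_only")
    command = ["openstack", "overcloud", "deploy", "--templates"]
    for env_file in env_files:
        command += ["-e", env_file]
    if stack_only:
        command.append("--stack-only")
    if config_download_only:
        command.append("--config-download-only")
    if limit:
        # Host names are joined with commas, as in the --limit example above.
        command += ["--limit", ",".join(limit)]
    return command
```

Returning a list rather than a single string keeps each argument separate, which avoids shell quoting problems if you later pass the result to a process runner.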
3.4. Tuning the overcloud
Review the following section when you plan to scale your Red Hat OpenStack Platform (RHOSP) deployment and apply tuning to your default overcloud settings:
- Increase OVN OVSDB client probe intervals to prevent failover
Increase OVSDB client probe intervals for large RHOSP deployments. Pacemaker triggers a failover of the ovn-dbs-bundle when it does not get a response from OVN within the configured timeout. To increase the OVN OVSDB client probe intervals to 360 seconds, edit the OVNDBSPacemakerTimeout parameter in your heat templates:
OVNDBSPacemakerTimeout: 360
On each Compute and Controller node, the OVN controller periodically probes the OVN SBDB, and if these requests time out, the OVN controller resynchronizes. When multiple Compute and Controller nodes are loaded with requests to create resources, the default 60-second timeout values are not sufficient. To increase the OVN SBDB client probe intervals to 180 seconds, edit the OVNRemoteProbeInterval and OVNOpenflowProbeInterval parameters in your heat templates:
ControllerParameters:
  OVNRemoteProbeInterval: 180000
  OVNOpenflowProbeInterval: 180
Note
During RHOSP user and service triggered operations, due to resource constraints, such as CPU or memory resource constraints, multiple components can reach their configured timeout values. This can result in timeout request failures to the haproxy front end or back end, messaging timeout, db query-related failures, cluster instability, and so on. Benchmark your overcloud environment after initial deployment to help identify timeout-related bottlenecks.
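The two probe interval values in the example in this section differ by a factor of 1000 for the same 180-second target, which suggests that OVNRemoteProbeInterval is expressed in milliseconds and OVNOpenflowProbeInterval in seconds. The following Python sketch derives both values from a single interval under that assumption. The function name is hypothetical, and you should confirm the units against the parameter descriptions in your heat templates before you rely on it.

```python
from typing import Dict


def ovn_probe_parameters(interval_seconds: int) -> Dict[str, int]:
    """Derive both probe interval values from one interval in seconds."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return {
        # Assumed to be in milliseconds, based on the example value 180000.
        "OVNRemoteProbeInterval": interval_seconds * 1000,
        # Assumed to be in seconds, based on the example value 180.
        "OVNOpenflowProbeInterval": interval_seconds,
    }
```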