Chapter 2. Recommended specifications for your large Red Hat OpenStack deployment
You can use the provided recommendations to scale your large cluster deployment.
The values in the following procedures are based on testing that the Red Hat OpenStack Platform Performance & Scale Team performed and can vary according to individual environments. For more information, see Scaling Red Hat OpenStack Platform 16.1 to more than 700 nodes.
2.1. Undercloud system requirements
For best performance, install the undercloud node on a physical server. However, if you use a virtualized undercloud node, ensure that the virtual machine has resources similar to those of the physical machine described in the following table.
Table 2.1. Recommended specifications for the undercloud node
System requirement | Description |
---|---|
Counts | 1 |
CPUs | 32 cores, 64 threads |
Disk | 500 GB root disk (1x SSD, or 2x 7200 RPM hard drives in RAID 1); 500 GB disk for Object Storage (swift) (1x SSD, or 2x 7200 RPM hard drives in RAID 1) |
Memory | 256 GB |
Network | 25 Gbps network interfaces or 10 Gbps network interfaces |
2.2. Overcloud Controller nodes system requirements
All control plane services must run on exactly 3 nodes. Typically, all control plane services are deployed across 3 Controller nodes.
Scaling controller services
To increase the resources available for controller services, you can scale these services to additional nodes. For example, you can deploy the db or messaging controller services on dedicated nodes to reduce the load on the Controller nodes.
To scale controller services, use composable roles to define the set of services that you want to scale. When you use composable roles, each service must run on exactly 3 additional dedicated nodes and the total number of nodes in the control plane must be odd to maintain Pacemaker quorum.
The control plane in this example consists of the following 9 nodes:
- 3 Controller nodes
- 3 Database nodes
- 3 Messaging nodes
For more information, see Composable services and custom roles in Advanced Overcloud Customization.
For questions about scaling controller services with composable roles, contact Red Hat Global Professional Services.
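As an illustration only, you can compose a control plane like this with the ControllerOpenstack, Database, and Messaging roles that director provides, for example by running openstack overcloud roles generate -o roles_data.yaml ControllerOpenstack Database Messaging Compute, and then pinning the node counts in a custom environment file. The following sketch assumes those predefined role names; the file name node-counts.yaml is a placeholder.

```yaml
# node-counts.yaml (placeholder name): keep each control plane role at
# 3 nodes so that Pacemaker quorum is maintained.
parameter_defaults:
  ControllerOpenstackCount: 3   # API, orchestration, and remaining controller services
  DatabaseCount: 3              # database services split out of the Controller role
  MessagingCount: 3             # messaging services split out of the Controller role
```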
Storage considerations
Include sufficient storage when you plan Controller nodes in your overcloud deployment. The OpenStack Telemetry Metrics (gnocchi) and Image (glance) services are I/O intensive. Use Ceph Storage for the Telemetry Metrics and Image services because the overcloud then moves the I/O load to the Ceph OSD servers.
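For example, a minimal environment-file sketch that points the Image and Telemetry Metrics back ends at Ceph, assuming the standard GlanceBackend and GnocchiBackend director parameters; the file name is a placeholder:

```yaml
# ceph-backends.yaml (placeholder name): move Image and Telemetry I/O
# from the Controller disks to the Ceph OSD servers.
parameter_defaults:
  GlanceBackend: rbd     # Image service stores images in Ceph RBD
  GnocchiBackend: rbd    # Telemetry Metrics stores measures in Ceph RBD
```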
If your deployment does not include Ceph storage, use a dedicated disk or node for Object Storage (swift) that the Telemetry Metrics (gnocchi) and Image (glance) services can use. If you use Object Storage on Controller nodes, use an NVMe device separate from the root disk to reduce disk utilization during object data storage.
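A minimal sketch of dedicating such a device to Object Storage on the Controller nodes, assuming the SwiftRawDisks director parameter and a placeholder device name nvme0n1 (adjust to your hardware):

```yaml
# swift-disk.yaml (placeholder name): keep Object Storage data off the
# root disk by dedicating a separate NVMe device to swift.
parameter_defaults:
  SwiftRawDisks:
    nvme0n1: {}
```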
Extensive concurrent Block Storage service (cinder) operations that upload volumes to the Image service (glance) as images put considerable I/O load on the Controller disks. Use SSD disks to provide higher throughput.
CPU considerations
The number of API calls, AMQP messages, and database queries that the Controller nodes receive influences the CPU and memory consumption on the Controller nodes. The ability of each Red Hat OpenStack Platform (RHOSP) component to process tasks concurrently is also limited by the number of worker threads that are configured for that component. The number of worker threads that RHOSP director configures for each component on a Controller node is limited by the CPU count.
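If you need to override the derived values, you can pin the worker counts per service in an environment file. The following is a sketch only: the parameter names are the standard *Workers director parameters, the file name is a placeholder, and the values are illustrative rather than tested recommendations.

```yaml
# workers.yaml (placeholder name): cap worker processes per service so that
# API, RPC, and database load stays within the Controller CPU budget.
parameter_defaults:
  KeystoneWorkers: 16
  NovaWorkers: 16
  NeutronWorkers: 16
  GlanceWorkers: 8
  CinderWorkers: 8
  HeatWorkers: 8
```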
The following specifications are recommended for large scale environments with more than 700 nodes when you use Ceph Storage nodes in your deployment:
Table 2.2. Recommended specifications for Controller nodes when you use Ceph Storage nodes
System requirement | Description |
---|---|
Counts | 3 Controller nodes with controller services contained within the Controller role. Optionally, to scale controller services on dedicated nodes, use composable services. For more information, see Composable services and custom roles in Advanced Overcloud Customization. |
CPUs | 2 sockets each with 32 cores, 64 threads |
Disk | 500 GB root disk (1x SSD, or 2x 7200 RPM hard drives in RAID 1); 500 GB dedicated disk for Object Storage (swift) (1x SSD or 1x NVMe) |
Memory | 384 GB |
Network | 25 Gbps network interfaces or 10 Gbps network interfaces. If you use 10 Gbps network interfaces, use network bonding to create two bonds (see the example after this table). |
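With 10 Gbps interfaces, the two bonds from the Network row are typically defined in the NIC configuration templates. The following os-net-config fragment is a sketch only: the NIC names, bond names, and LACP bonding options are assumptions, and the networks that you place on each bond depend on your network layout.

```yaml
# Fragment of a NIC configuration template (sketch only).
network_config:
  - type: linux_bond
    name: bond0                       # first 2x 10 Gbps bond
    bonding_options: "mode=802.3ad"   # LACP; must match the switch configuration
    members:
      - type: interface
        name: nic2
      - type: interface
        name: nic3
  - type: linux_bond
    name: bond1                       # second 2x 10 Gbps bond
    bonding_options: "mode=802.3ad"
    members:
      - type: interface
        name: nic4
      - type: interface
        name: nic5
```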
The following specifications are recommended for large scale environments with more than 700 nodes when you do not use Ceph Storage nodes in your deployment:
Table 2.3. Recommended specifications for Controller nodes when you do not use Ceph Storage nodes
System requirement | Description |
---|---|
Counts | 3 Controller nodes with controller services contained within the Controller role. Optionally, to scale controller services on dedicated nodes, use composable services. For more information, see Composable services and custom roles in Advanced Overcloud Customization. |
CPUs | 2 sockets each with 32 cores, 64 threads |
Disk | 500 GB root disk (1x SSD); 500 GB dedicated disk for Object Storage (swift) (1x SSD or 1x NVMe) |
Memory | 384 GB |
Network | 25 Gbps network interfaces or 10 Gbps network interfaces. If you use 10 Gbps network interfaces, use network bonding to create two bonds (see the example that follows Table 2.2). |
2.3. Overcloud Compute nodes system requirements
When you plan your overcloud deployment, review the recommended system requirements for Compute nodes.
Table 2.4. Recommended specifications for Compute nodes
System requirement | Description |
---|---|
Counts | Red Hat has tested a scale of 700 nodes with various composable compute roles. |
CPUs | 2 sockets each with 12 cores, 24 threads |
Disk | 500 GB root disk (1x SSD, or 2x 7200 RPM hard drives in RAID 1); 500 GB disk for Image service (glance) image cache (1x SSD, or 2x 7200 RPM hard drives in RAID 1) |
Memory | 128 GB (64 GB per NUMA node); 2 GB of RAM is reserved for the host by default. With Distributed Virtual Routing (DVR), increase the reserved RAM to 5 GB (see the example after this table). |
Network | 25 Gbps network interfaces or 10 Gbps network interfaces. If you use 10 Gbps network interfaces, use network bonding to create two bonds (see the example that follows Table 2.2). |
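The reserved host memory from the Memory row maps to the NovaReservedHostMemory director parameter, which takes a value in MB. A minimal sketch for a DVR deployment; the file name is a placeholder:

```yaml
# reserved-memory.yaml (placeholder name): reserve 5 GB of RAM on each
# Compute node for the host when Distributed Virtual Routing is enabled.
parameter_defaults:
  NovaReservedHostMemory: 5120   # value in MB
```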
2.4. Red Hat Ceph Storage nodes system requirements
When you plan your overcloud deployment, review the following recommended system requirements for Ceph storage nodes.
Table 2.5. Recommended specifications for Ceph Storage nodes
System requirement | Description |
---|---|
Counts | You must have a minimum of 5 nodes with three-way replication. With an all-flash configuration, you must have a minimum of 3 nodes with two-way replication. |
CPUs | 1 Intel Broadwell CPU core per OSD to support storage I/O requirements. If you are using a light I/O workload, you might not need Ceph to run at the speed of your block devices. For example, for some NFV applications, Ceph supplies data durability, high availability, and low latency but throughput is not a target, so it is acceptable to supply less CPU power. |
Memory | Ensure that you have 5 GB RAM per OSD. This is required for caching OSD data and metadata to optimize performance, not just for the OSD process memory. For hyper-converged infrastructure (HCI) environments, calculate the required memory in conjunction with the Compute node specifications. |
Network | Ensure that the network capacity in MB/s is higher than the total MB/s capacity of the Ceph devices to support workloads that use a large I/O transfer size. Use a cluster network to lower write latency by shifting inter-OSD traffic onto a separate set of physical network ports. To do this in Red Hat OpenStack Platform, configure separate VLANs for networks and assign the VLANs to separate physical network interfaces. |
Disk | Use Solid-State Drive (SSD) disks for the bluestore back end. |
For more information about hardware prerequisites for Ceph nodes, see General principles for selecting hardware in the Red Hat Ceph Storage 4 Hardware Guide.
For more information about deployment configuration for Ceph nodes, see Deploying an overcloud with containerized Red Hat Ceph.
For more information about changing the storage replication number, see Pools, placement groups, and CRUSH Configuration Reference in the Red Hat Ceph Storage Configuration Guide.
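As a sketch of how the replication number and pool defaults can be set when director deploys Ceph through ceph-ansible, the following environment file uses the standard CephPoolDefault* parameters; the file name is a placeholder and the values mirror the replication guidance in Table 2.5.

```yaml
# ceph-replication.yaml (placeholder name): default replication for Ceph pools.
parameter_defaults:
  CephPoolDefaultSize: 3      # three-way replication; use 2 only for the all-flash, 3-node minimum
  CephPoolDefaultPgNum: 128   # tune the placement group count to your OSD count
```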