Chapter 3. Regions and Zones

When planning a large CloudForms implementation, consideration must be given to the number and size of regions required, and to the layout of zones within those regions.[6] Figure 3.1, “Regions and Zones” shows an example of multiple regions working together in a Red Hat CloudForms environment.

Figure 3.1. Regions and Zones

In this example there are two geographical "subordinate" regions containing providers (US West and US East), and one master region. Each of the subordinate regions has its own VMDB database, managed in its own dedicated VMDB zone. The two subordinate region VMDBs are replicated to the master region VMDB.

Each region has a dedicated WebUI zone containing two CFME appliances (load-balanced), that users local to the region connect to for interactive management. The two subordinate regions each have one or more provider-specific zones, containing the CFME appliances that manage the workload for their respective OpenShift Container Platform providers.

This section describes some of the considerations when designing regions and zones, and presents some guidelines and suggestions for implementation.

3.1. Regions

A region is a self-contained collection of CloudForms Management Engine (CFME) appliances. Each region has a database - the VMDB - and one or more appliances running the evmserverd service with an associated set of configured worker processes. Regions are often used for organisational or geographical separation of resources, and the choice of region count, location and size is often based on both operational and technical factors.

3.1.1. Region Size

All CFME appliances or pods in a region access the same PostgreSQL database, and so the I/O and CPU performance of the database server is a significant factor in determining the maximum size to which a region can grow (in terms of numbers of managed objects) whilst maintaining acceptable performance.
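
As a simple health check, basic PostgreSQL statistics can be inspected directly on the database server. The commands below are a minimal sketch, assuming the default vmdb_production database name; run them as the postgres user on the database appliance:

psql -d vmdb_production -c "SELECT count(*) FROM pg_stat_activity;"
psql -d vmdb_production -c "SELECT pg_size_pretty(pg_database_size('vmdb_production'));"

The first query shows the number of open database connections (each worker process holds at least one); the second shows the total on-disk size of the VMDB.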

3.1.1.1. Database Load Factors

The VMDB database load is determined by many factors including:

  • The total number of managed objects (for example images, nodes, routes, port configs, projects, services, pods, containers, etc.) in the region
  • The number of running or "active" objects generating events (typically nodes, pods and containers)
  • The number of OpenShift Container Platform providers/clusters added to the region
  • The overall "busyness" of the OpenShift Container Platform cluster, which determines the rate at which Kubernetes events are received and processed, and thus the rate at which inventory refreshes are requested and loaded
  • Whether or not Capacity and Utilization (C&U) metric collection is enabled for the region
    • The frequency of collection
  • Whether or not SmartState Analysis and OpenSCAP scanning are enabled for the region
    • The frequency of analysis
  • The complexity of reports and widgets, and frequency of generation
  • The frequency of running automate requests and tasks, including service requests
  • The number of control or compliance policies in use
  • The number of concurrent users accessing the "classic" WebUI, especially when displaying large numbers of objects such as containers
  • The frequency and load profile of connections to the RESTful API (including the Self-Service UI)
  • The number of CFME appliances (more accurately, worker processes) in the region
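
Several of these factors scale with raw inventory counts, which can be checked directly in the VMDB. The following query is an illustrative sketch only; it assumes the default vmdb_production database and the standard container-provider inventory tables (container_nodes, container_groups for pods, and containers), which may differ between releases:

psql -d vmdb_production -c "
SELECT (SELECT count(*) FROM container_nodes)  AS nodes,
       (SELECT count(*) FROM container_groups) AS pods,
       (SELECT count(*) FROM containers)       AS containers;"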

3.1.1.2. Sizing Estimation

It is very difficult to define a representative load for simulation purposes, due to the many permutations of workload factors. However, analysis of existing large CloudForms installations has shown that for an "average" mix of the workload factors listed above, an optimally tuned and maintained PostgreSQL server should be able to handle the load from managing up to 20000 active or running objects (for example nodes, pods and containers). Larger regions are possible if the overall database workload is lighter, but as with any large database system, performance should be carefully monitored.

When sizing a region, some thought must be also given to the number of CloudForms worker processes that are likely to be needed to handle the expected workload, and hence the number of CFME appliances (or cloudforms-backend StatefulSet replicas) and active database connections.
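
Each worker process holds at least one open connection to the VMDB, so connection counts grow quickly. As a purely illustrative calculation, 10 CFME appliances each running 20 workers would consume in the order of 200 PostgreSQL connections, which must fit within the configured limit on the database server:

psql -d vmdb_production -c "SHOW max_connections;"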

When planning regions it is often useful to under-size a region rather than over-size. It is usually easier to add capacity to a smaller region that is performing well, than it is to split an under-performing large single region into multiple regions.

Note

A 'global' region is generally capable of handling considerably more objects as it has no active providers of its own, and has a lower database load.

3.1.1.3. Scaling Workers in a Podified CloudForms Region

An initial deployment of the podified distribution of CloudForms 4.6 starts a single cloudforms StatefulSet, containing all workers. This is the functional equivalent of a single CFME appliance. The numbers of workers in this StatefulSet replica can be increased up to their limits, after which there will be a requirement to add further pods to scale out the numbers of worker processes (the equivalent of adding further CFME appliances). For this the cloudforms-backend StatefulSet should be used, which contains no externally accessible Web or UI components.
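
For example, assuming the default object names created by the CloudForms 4.6 deployment templates (the StatefulSet name and project may differ in a customised deployment), additional worker pods can be added as follows:

oc scale statefulset cloudforms-backend --replicas=2 -n cloudforms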

3.1.2. Region Design

There are a number of considerations for region design and layout, but the most important are the anticipated number of managed objects (discussed above), and the location of the components being managed.

3.1.2.1. Centrally Located Infrastructure

With a single, centrally located small or medium sized OpenShift Container Platform cluster, the selection of region design is simpler. A single region is usually the most suitable option, with high availability and fault tolerance built into the design.

3.1.2.2. Distributed Infrastructure

With several distributed or large OpenShift Container Platform clusters, the most obvious choice of region design might seem to be to allocate a region to each distributed location or cluster. There are, however, advantages to both single-region and multi-region implementations for distributed infrastructures.

3.1.2.2.1. Wide Area Network Factors - Intra-Region

Network latency between CFME appliances and the database within a region is a significant factor in overall CloudForms "system" responsiveness. A utility, db_ping, supplied on each CFME appliance, can check the latency between an existing appliance and its own regional database. It is run as follows:

cd /var/www/miq/vmdb
tools/db_ping.rb
0.358361 ms
1.058845 ms
0.996966 ms
1.029908 ms
1.048192 ms

Average: 0.898454 ms

Note

On CFME versions prior to 5.8, this tool should be prefixed by bin/rails runner, for example:

bin/rails runner tools/db_ping.rb

The architecture of CloudForms assumes LAN-speed latency (≈ 1 ms) between CFME appliances and their regional database for optimal performance. As latency increases, overall system responsiveness decreases.

Typical symptoms of a high latency connection are as follows:

  • WebUI operations appear to be slow, especially viewing screens that display a large number of objects such as VMs
  • Database-intensive actions such as complex report or widget generation take longer to run
  • CFME appliance restarts are slower since the startup seeding acquires an exclusive lock
  • Worker tasks such as EMS refresh or C&U metrics collection that load data into the VMDB run more slowly
    • Longer EMS refreshes may have a detrimental effect on other operations such as detecting short-lived pods.
    • Metrics collection might not keep up with Prometheus metrics retention time.[7]

When considering deploying a CloudForms region spanning a WAN, it is important to establish acceptable performance criteria for the installation. Although in general a higher latency will result in slower but error-free performance, it has been observed that a latency of 5 ms can cause the VMDB update transaction from an EMS refresh to time out in very large regions. A latency as high as 42 ms can cause failures in database seeding operations.[8]

3.1.2.2.2. Wide Area Network Factors - Inter-Region

Network latency between subordinate and master regions is less critical as database replication occurs asynchronously. Latencies of 100 ms have been tested and shown to present no performance problems.
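
Inter-region replication in recent CloudForms releases is implemented using the pglogical PostgreSQL extension. Assuming that mechanism is in use, the health of each subordinate region's subscription can be checked on the global region database with a query such as:

psql -d vmdb_production -c "SELECT subscription_name, status FROM pglogical.show_subscription_status();"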

A second utility, db_ping_remote, is designed to check inter-region latency. It requires external PostgreSQL server details and credentials, and is run as follows:

tools/db_ping_remote.rb 10.3.0.22 5432 root vmdb_production
Enter the password for database user root on host 10.3.0.22
Password:
10.874407 ms
10.984994 ms
11.040376 ms
11.119602 ms
11.031609 ms

Average: 11.010198 ms

3.1.2.2.3. Single Region

Where WAN latency is deemed acceptable, the advantages of deploying a single region to manage all objects in a distributed infrastructure are as follows:

  • Simplified appliance upgrade procedures (no multiple regions or global region upgrade coordination issues)
  • Simplified disaster recovery when there is only one database to manage
  • Simpler architectural design, and therefore more straightforward operational procedures and documentation
  • Easier to manage the deployment of customisations such as automate code, policies, or reports (there is a single point of import)

3.1.2.2.4. Multi-Region

The advantages of deploying multiple regions to manage the objects in a distributed infrastructure are as follows:

  • Operational resiliency; no single point of failure to cause outage to the entire CloudForms managed environment
  • Continuous database maintenance runs faster in a smaller database
  • Database reorganisations (backup & restore) run faster and do not take an entire CloudForms installation offline
  • More intuitive alignment between CloudForms WebUI view, and physical and virtual infrastructure
  • Reduced dependence on wide-area networking to maintain CloudForms performance
  • Region isolation (for performance)

    • Infrastructure issues such as event storms that might adversely affect the local region database will not impact any other region
    • Customisations can be tested in a development or test region before deploying to a production environment

3.1.3. Connecting Regions

As illustrated in Figure 3.1, “Regions and Zones” regions can be linked in such a way that several subordinate regions replicate their object data to a single global region. The global region has no providers of its own, and is typically used for enterprise-wide reporting as it has visibility of all objects. A new feature introduced with CloudForms 4.2 allows some management operations such as service provisioning to be performed directly from the global region, utilising a RESTful API connection to the correct child region to perform the action.
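
For example, an enterprise-wide inventory query can be issued against the global region's RESTful API. The hostname and credentials below are placeholders:

curl -k -u admin:smartvm "https://global-region.example.com/api/providers?expand=resources&attributes=name,type"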

3.1.4. Region Numbering

Each region has an associated region number, allocated when the VMDB appliance for the region is first initialised. When several regions are linked in a global/subregion hierarchy, all of the region numbers must be unique. Region numbers can be up to three digits long, and the region number is encoded into the leading digits of every object ID in the region. For example, the following three message IDs are from different regions:

  • Message id: [1000000933021] (region 1)
  • Message id: [9900023878436] (region 99)
  • Message id: [398451] (region 0)

Global regions are often allocated a higher region number (99 is frequently used) to distinguish them from subordinate regions whose numbers often start with 0 and increase as regions are added. There is no technical restriction on region number allocation in a connected multi-region CloudForms deployment, other than uniqueness.
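
In recent ManageIQ-based releases the per-region ID offset is one trillion (10^12), so the region number and region-local ID can be recovered from an object ID with integer division and remainder. Taking the region 1 message ID above as a worked example:

1000000933021 / 1000000000000 = 1        (region number)
1000000933021 % 1000000000000 = 933021   (region-local ID)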

3.1.5. Region Summary and Recommendations

The following guidelines can be used when designing a region topology:

  • Beware of over-sizing regions. Several slightly smaller interconnected regions will generally perform better than a single very large region
  • Network latency from CFME appliances or pods to the VMDB within the region should be close to LAN speed
  • Database performance is critical to the overall performance of the region
  • All CFME appliances in a region should be NTP synchronized to the same time source
  • Identify any external management system (EMS) hosts or hypervisors where steady-state or peak utilization exceeds 50%, and avoid placing CFME appliances, especially the VMDB appliance, on those hosts

3.2. Zones

Zones are a way of logically subdividing the resources and worker processing within a region. They perform a number of useful functions, particularly for larger CloudForms installations.

3.2.1. Zone Advantages

The following sections describe some of the advantages of implementing zones within a CloudForms region.

3.2.1.1. Provider Isolation

Zones are a convenient way of isolating providers (i.e. OpenShift Container Platform clusters). Each provider has a number of workers associated with it that run on any appliance running the Provider Inventory and Event Monitor roles. These include:

  • One Refresh worker
  • Two or more Metrics Collector workers
  • One Event Catcher

In addition to these provider-specific workers, the two roles add a further two worker types that handle the events and process the metrics for all providers in the zone:

  • One Event Handler
  • Two or more Metrics Processor workers

Each worker has a minimum startup memory footprint of approximately 250-300 MB, and the memory demands of each may vary depending on the number of managed objects for each provider. Having one provider per zone reduces the memory footprint of the workers running on the CFME appliances or pods in the zone, and allows for dedicated per-provider Event Handler and Metrics Processor workers. This prevents an event surge from one OpenShift Container Platform cluster from adversely affecting the handling of events from another cluster, for example.
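
As a rough worked example of the per-provider footprint, using the worker counts listed above:

1 Refresh + 2 Metrics Collectors + 1 Event Catcher + 1 Event Handler + 2 Metrics Processors = 7 workers
7 workers × 250-300 MB ≈ 1.75-2.1 GB of memory per provider, before any growth under load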

3.2.1.2. Appliance Maintenance

Shutting down or restarting a CFME appliance or pod in a zone because of upgrade or update is less disruptive if only a single provider is affected.

3.2.1.3. Cluster-Specific Appliance Tuning

Zones allow for more predictable and provider-instance-specific sizing of CFME appliances and appliance settings based on the requirement of individual OpenShift Container Platform clusters. For example small clusters can have significantly different resourcing requirements to very large clusters, especially for C&U collection and processing.
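
Worker counts and memory limits are normally tuned per appliance from the WebUI (Configuration → Workers), which edits the advanced settings underneath. The fragment below is an illustrative sketch of the relevant keys for increasing the number of C&U Metrics Collector workers; the exact key paths vary between CFME releases, so verify them against the appliance's Configuration → Advanced view before making changes:

:workers:
  :worker_base:
    :queue_worker_base:
      :ems_metrics_collector_worker:
        :defaults:
          :count: 4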

3.2.1.4. VMDB Isolation

If the VMDB is running on a CFME appliance (as opposed to a dedicated or external PostgreSQL appliance), putting the VMDB appliance in its own zone is a convenient way to isolate the appliance from non database-related activities.

3.2.1.5. Logical Association of Resources

A zone is a natural and intuitive way of associating a provider with a corresponding set of physical or logical resources, either in the same or remote location. For example there might be a requirement to open firewall ports to enable access to a particular provider’s EMS on a restricted or remote network. Isolating the specific CFME appliances to their own zone simplifies this task.

Note

Not all worker processes are zone-aware. Some workers process messages originating from, or relevant to, the entire region.

3.2.1.6. Improved and Simplified Diagnostics Gathering

Specifying a log depot per zone in Configuration → Settings allows log collection to be initiated for all appliances in the zone, in a single action. When requested, each appliance in the zone is notified to generate and deposit the specified logs into the zone-specific depot.

3.2.2. Number of CFME Appliances or Pods in a Zone

One of CloudForms' most resource-intensive tasks is metrics collection, performed by the C&U Data Collector workers. It has been established through testing that a single CFME 5.9.2 C&U Data Collector worker can retrieve and process the Hawkular metrics from approximately 1500 objects (750 pods, each with 1 container) in the default 50 minute capture_threshold period.

This 1:1500 guideline ratio is a convenient starting point for scaling the number of CFME appliance/pods required for a zone containing an OpenShift Container Platform provider. For example a default CFME appliance with 2 C&U Data Collector workers can manage approximately 3000 pods and/or containers. If the number of C&U Data Collector workers is increased to 4, the appliance should be able to manage approximately 6000 pods and/or containers (e.g. 3000 pods, each with 1 container; or 2000 pods, each with 2 containers).
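
Applying the same guideline ratio to a larger, purely illustrative cluster:

9000 active objects ÷ 1500 objects per worker = 6 C&U Data Collector workers
6 workers ÷ 2 workers per appliance (default) = 3 CFME appliances (or 2 appliances, each with 3 workers)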

Note

The limiting factor in determining C&U data collection performance can often be the configuration of Hawkular itself, or the resources given to the nodes running the openshift-infra project components. For example the number of CloudForms C&U Data Collector workers can be increased linearly by scaling out worker processes and CFME appliances, but each concurrent C&U Data Collector worker process contributes to the workload of the Hawkular and Cassandra pods, and therefore CPU, I/O and memory load on the nodes.

It has been observed that if the nodes hosting the openshift-infra project components are already highly utilised, adding further C&U Data Collector workers will have an adverse rather than beneficial effect on C&U collection performance.

3.2.3. Zone Summary and Recommendations

The following guidelines can be used when designing a zone topology:

  • Use a separate zone per OpenShift Container Platform cluster
  • Never span a zone across physical boundaries or locations
  • Use a minimum of two appliances per zone for resiliency of zone-aware workers and processes
  • Isolate the VMDB appliance in its own zone (unless it is a standalone PostgreSQL server)
  • At least one CFME appliance or pod in each zone should have the 'Automate Engine' role enabled, to process zone-specific events
  • At least one CFME appliance or pod in each zone should have the 'Provider Operations' role enabled to ensure that service provisioning request tasks are processed correctly
  • Isolating the CFME appliances that general users interact with (running the User Interface and Web Services workers) into their own zone can allow for additional security measures to be taken to protect these servers
    • At least one CFME appliance in a WebUI zone should have the 'Reporting' role enabled to ensure that reports interactively scheduled by users are correctly processed (see Section 2.5.11, “Reporting” for more details)