Chapter 2. Planning a Distributed Compute Node (DCN) deployment

When you plan your DCN architecture, check that the technologies that you need are available and supported.

2.1. Considerations for storage on DCN architecture

The following features are not currently supported for DCN architectures:

  • Fast forward updates (FFU) on a distributed compute node architecture from Red Hat OpenStack Platform 13 to 16.
  • Copying a volume snapshot between edge sites. You can work around this by creating an image from the volume and using glance to copy the image. After the image is copied, you can create a volume from it; see the example command sequence after this list.
  • Migrating or retyping a volume between sites.
  • Ceph Rados Gateway (RGW) at the edge.
  • CephFS at the edge.
  • Instance high availability (HA) at the edge sites.
  • Live migration between edge sites or from the central location to edge sites. You can still live migrate instances within a site boundary.
  • RBD mirroring between sites.
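
For illustration, a minimal sketch of the snapshot workaround described in the list above. The volume, image, store, and availability zone names are placeholders, and the sketch assumes that the Image service copy-image import method and multiple glance stores are configured in your environment:

    # Create an image from the source volume at the originating site.
    openstack image create --volume source-volume snapshot-image

    # Copy the image from the central store to the destination edge store by
    # using the Image service copy-image import method.
    glance image-import <image-id> --stores dcn1 --import-method copy-image

    # Create a new volume from the copied image in the availability zone of
    # the destination edge site.
    openstack volume create --size 10 --image snapshot-image \
      --availability-zone dcn1 restored-volume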

Additionally, you must consider the following:

  • You must upload images to the central location before copying them to edge sites; a copy of each image must exist in the Image service (glance) at the central location.
  • Before you create an instance at an edge site, you must have a local copy of the image at that edge site.
  • You must use the RBD storage driver for the Image, Compute, and Block Storage services.
  • For each site, assign a unique availability zone, and use the same value for the NovaComputeAvailabilityZone and CinderStorageAvailabilityZone parameters, as shown in the example that follows.
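
As an illustration, a minimal environment file sketch that sets both parameters to the same value for one site. The site name dcn0 and the file path are placeholders, not values defined in this guide:

    # Assign the same availability zone name to the Compute (nova) and
    # Block Storage (cinder) services for this edge site.
    cat > ~/dcn0/dcn0-az.yaml <<'EOF'
    parameter_defaults:
      NovaComputeAvailabilityZone: dcn0
      CinderStorageAvailabilityZone: dcn0
    EOF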

2.2. Considerations for networking on DCN architecture

The following features are not currently supported for DCN architectures:

  • Octavia
  • DHCP on DPDK nodes
  • Conntrack for TC Flower Hardware Offload

Conntrack for TC Flower Hardware Offload is available on DCN as a Technology Preview, and therefore using these solutions together is not fully supported by Red Hat. This feature should only be used with DCN for testing, and should not be deployed in a production environment. For more information about Technology Preview features, see Scope of Coverage Details.

The following ML2/OVS technologies are fully supported:

  • OVS-DPDK without DHCP on the DPDK nodes
  • SR-IOV
  • TC Flower hardware offload, without conntrack
  • Neutron availability zones (AZs) with networker nodes at the edge, with one AZ per site
  • Routed provider networks

The following ML2/OVN networking technologies are fully supported:

  • OVS-DPDK without DHCP on the DPDK nodes
  • SR-IOV (without DHCP)
  • TC Flower hardware offload, without conntrack
  • Routed provider networks
  • OVN GW (networker node) with Neutron availability zones (AZs)

Additionally, you must consider the following:

  • Network latency: Balance the latency, as measured in round-trip time (RTT), with the expected number of concurrent API operations to maintain acceptable performance. Maximum TCP/IP throughput is inversely proportional to RTT. You can mitigate some issues with high-bandwidth, high-latency connections by tuning kernel TCP parameters. Contact Red Hat support if cross-site communication exceeds 100 ms RTT.
  • Network drop outs: If the edge site temporarily loses its connection to the central site, then no OpenStack control plane API or CLI operations can be executed at the impacted edge site for the duration of the outage. For example, Compute nodes at the edge site cannot create a snapshot of an instance, issue an auth token, or delete an image. General OpenStack control plane API and CLI operations remain functional at the central location during the outage, and can continue to serve any other edge sites that have a working connection.
  • Image type: You must use raw images when deploying a DCN architecture with Ceph storage.
  • Image sizing:

    • Overcloud node images: Overcloud node images are downloaded from the central undercloud node. These images are potentially large files that are transferred across all necessary networks from the central site to the edge site during provisioning.
    • Instance images: If there is no block storage at the edge, then the Image service images traverse the WAN during first use. The images are copied or cached locally to the target edge nodes for all subsequent use. There is no size limit for glance images. Transfer times vary with available bandwidth and network latency.

      If there is block storage at the edge, then the image is copied over the WAN asynchronously for faster boot times at the edge.

  • Provider networks: This is the recommended networking approach for DCN deployments. If you use provider networks at remote sites, then you must consider that the Networking service (neutron) does not place any limits or checks on where you can attach available networks. For example, if you use a provider network only in edge site A, you must ensure that you do not try to attach to the provider network in edge site B. This is because there are no validation checks on the provider network when binding it to a Compute node.
  • Site-specific networks: A limitation in DCN networking arises if you use networks that are specific to a certain site: when you deploy centralized neutron controllers with Compute nodes, there are no triggers in neutron to identify a certain Compute node as a remote node. Consequently, the Compute nodes receive a list of other Compute nodes and automatically form tunnels between each other; the tunnels are formed from edge to edge through the central site. If you use VXLAN or Geneve, every Compute node at every site forms a tunnel with every other Compute node and Controller node, whether they are local or remote. This is not an issue if you are using the same neutron networks everywhere. When you use VLANs, neutron expects that all Compute nodes have the same bridge mappings, and that all VLANs are available at every site.
  • Additional sites: If you need to expand from a central site to additional remote sites, you can use the openstack CLI on Red Hat OpenStack Platform director to add new network segments and subnets, as shown in the example after this list.
  • If edge servers are not pre-provisioned, you must configure DHCP relay for introspection and provisioning on routed segments.
  • Routing must be configured either on the cloud or within the networking infrastructure that connects each edge site to the hub. You should implement a networking design that allocates an L3 subnet for each Red Hat OpenStack Platform cluster network (external, internal API, and so on), unique to each site.
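
The following sketch shows one way you might add a routed segment and subnet for a new edge site with the openstack CLI. The network, segment, and subnet names, the physical network, and the address range are placeholders and must match your own network design:

    # Add a routed segment for the new edge site to an existing provider network.
    openstack network segment create --network-type flat \
      --physical-network dcn2-provider --network provider-net dcn2-segment

    # Add a subnet that is bound to the new segment.
    openstack subnet create --network provider-net --network-segment dcn2-segment \
      --subnet-range 192.168.42.0/24 dcn2-subnet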

2.3. Storage topologies and roles at the edge

When you deploy Red Hat OpenStack Platform with a distributed compute node architecture, you must decide if you need storage at the edge. Based on storage and performance needs, you can deploy each site with one of three configurations. Not all edge sites must have an identical configuration.

DCN without storage

To deploy this architecture, use the Compute role.

Figure: DCN with compute only example

Without block storage at the edge:

  • The Object Storage (swift) service at the control plane is used as an Image (glance) service backend.
  • Multi-backend Image service (glance) is not available.
  • The instances are stored locally on the Compute nodes.
  • Volume services such as Block Storage (cinder) are not available at edge sites.

    Important: If you do not deploy the central location with Red Hat Ceph Storage, you will not have the option of deploying an edge site with storage at a later time.

    For more information about deploying without block storage at the edge, see Deploying the edge without storage.

DCN with storage

To deploy DCN with storage, you must also deploy Red Hat Ceph Storage at the central location. Use the dcn-storage.yaml and ceph-ansible.yaml environment files. For edge sites that include non-hyperconverged Red Hat Ceph Storage nodes, use the DistributedCompute, DistributedComputeScaleOut, CephAll, and CephStorage roles.
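
The following sketch shows one way these roles and environment files might be combined for an edge site stack. The stack name, output paths, and the site-specific environment file are examples, and the tripleo-heat-templates paths shown are typical defaults that can differ in your environment:

    # Generate a roles file for an edge site with non-hyperconverged Ceph nodes.
    openstack overcloud roles generate DistributedCompute DistributedComputeScaleOut \
      CephAll CephStorage -o ~/dcn-site/roles_data.yaml

    # Include the DCN storage and ceph-ansible environment files in the deployment.
    openstack overcloud deploy --stack dcn-site \
      --templates /usr/share/openstack-tripleo-heat-templates \
      -r ~/dcn-site/roles_data.yaml \
      -e /usr/share/openstack-tripleo-heat-templates/environments/dcn-storage.yaml \
      -e /usr/share/openstack-tripleo-heat-templates/environments/ceph-ansible/ceph-ansible.yaml \
      -e ~/dcn-site/dcn-site-parameters.yaml  # site-specific parameters, such as availability zones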

Figure: DCN with non-HCI storage at the edge example

With block storage at the edge:

  • Red Hat Ceph Block Devices (RBD) is used as an Image (glance) service backend.
  • Multi-backend Image service (glance) is available so that images may be copied between the central and DCN sites.
  • The Block Storage (cinder) service is available at all sites and is accessed by using the Red Hat Ceph Block Devices (RBD) driver.
  • The Block Storage (cinder) service runs on the Compute nodes, and Red Hat Ceph Storage runs separately on dedicated storage nodes.
  • Compute (nova) ephemeral storage is backed by Ceph (RBD).

    For more information, see Section 5.2, “Deploying the central site with storage”.

DCN with hyperconverged storage

To deploy this configuration, you must also deploy Red Hat Ceph Storage at the central location. Configure the dcn-storage.yaml and ceph-ansible.yaml environment files. Use the DistributedComputeHCI and DistributedComputeHCIScaleOut roles. You can also use the DistributedComputeScaleOut role to add Compute nodes that do not participate in providing Red Hat Ceph Storage services.
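
As a sketch, the roles file for a hyperconverged edge site might be generated as follows. The output path is a placeholder, and the DistributedComputeScaleOut role is needed only if you plan to add Compute-only nodes:

    # Generate a roles file for a hyperconverged edge site.
    # Include DistributedComputeScaleOut only if you need Compute nodes that do
    # not provide Red Hat Ceph Storage services.
    openstack overcloud roles generate DistributedComputeHCI \
      DistributedComputeHCIScaleOut DistributedComputeScaleOut \
      -o ~/dcn-hci-site/roles_data.yaml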

Figure: DCN with HCI storage at the edge example

With hyperconverged storage at the edge:

  • Red Hat Ceph Block Devices (RBD) is used as an Image (glance) service backend.
  • Multi-backend Image service (glance) is available so that images may be copied between the central and DCN sites.
  • The Block Storage (cinder) service is available at all sites and is accessed by using the Red Hat Ceph Block Devices (RBD) driver.
  • Both the Block Storage service and Red Hat Ceph Storage run on the Compute nodes.

    For more information, see Section 7.1, “Deploying edge sites with storage”.

When you deploy Red Hat OpenStack Platform in a distributed compute architecture, you have the option of deploying multiple storage topologies, with a unique configuration at each site. You must deploy the central location with Red Hat Ceph Storage to deploy any of the edge sites with storage.

Figure: DCN with mixed storage topologies example

2.3.1. Roles for edge deployments

The following roles are available for edge deployments. Select the appropriate roles for your environment based on your chosen configuration.

Compute
The Compute role is used for edge deployments without storage.
DistributedCompute
The DistributedCompute role is used at the edge for storage deployments without hyperconverged nodes. The DistributedCompute role includes the GlanceApiEdge service, which ensures that Image services are consumed at the local edge site rather than at the central hub location. You can deploy up to three nodes using the DistributedCompute role. For any additional nodes, use the DistributedComputeScaleOut role.
DistributedComputeScaleOut
The DistributedComputeScaleOut role includes the HAproxyEdge service, which enables instances created on nodes with the DistributedComputeScaleOut role to proxy requests for Image services to nodes that provide that service at the edge site. After you deploy three nodes with the DistributedCompute role, you can use the DistributedComputeScaleOut role to scale compute resources. There is no minimum number of hosts required to deploy with the DistributedComputeScaleOut role. This role is used at the edge for storage deployments without hyperconverged nodes.
DistributedComputeHCI
The DistributedComputeHCI role enables a hyperconverged deployment at the edge by including Ceph Management and OSD services. You must use exactly three nodes when using the DistributedComputeHCI role. This role is used for storage deployments with fully converged nodes.
DistributedComputeHCIScaleOut
The DistributedComputeHCIScaleOut role includes the Ceph OSD service, which allows storage capacity to be scaled with compute capacity when more nodes are added to the edge. This role also includes the HAproxyEdge service to redirect image download requests to the GlanceApiEdge nodes at the edge site. This role enables a hyperconverged deployment at the edge. You must use exactly three nodes when using the DistributedComputeHCI role. This role is used at the edge for storage deployments with hyperconverged nodes.
CephAll
The CephAll role includes the Ceph OSD, Ceph Mon, and Ceph Mgr services. This role is used at the edge for storage deployments without hyperconverged nodes. You can deploy up to three nodes using the CephAll role. For any additional storage capacity, use the CephStorage role.
CephStorage
The CephStorage role includes the Ceph OSD service. This role is used at the edge for storage deployments without hyperconverged nodes. If three CephAll nodes do not provide enough storage capacity, then add as many CephStorage nodes as needed.