Chapter 2. Planning a Distributed Compute Node (DCN) deployment

When you plan your DCN architecture, check that the technologies that you need are available and supported.

2.1. Considerations for storage on DCN architecture

The following features are not currently supported for DCN architectures:

  • Copying a volume snapshot between edge sites. You can work around this by creating an image from the volume and using glance to copy the image. After the image is copied, you can create a volume from it at the destination site. See the command sketch after this list.
  • Ceph Rados Gateway (RGW) at the edge.
  • CephFS at the edge.
  • Instance high availability (HA) at the edge sites.
  • RBD mirroring between sites.
  • Instance migration, live or cold, either between edge sites, or from the central location to edge sites. You can still migrate instances within a site boundary. To move an instance between sites, create a snapshot image of the instance and use glance image-import to copy the image, as shown in the sketch after this list. For more information, see Confirming image snapshots can be created and copied between sites.
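
The following command sketch illustrates these workarounds. It assumes that the Image service (glance) is configured with multiple stores and that an availability zone exists for the destination edge site; the store name dcn2 and all other names are examples only.

    # Volume workaround: create an image from the source volume at the originating site.
    openstack image create --volume <source-volume-id> snapshot-copy-image

    # Instance workaround: create a snapshot image of the instance instead.
    openstack server image create --name instance-snapshot-image <instance>

    # Copy the image to the store that backs the destination edge site.
    glance image-import <image-id> --import-method copy-image --stores dcn2

    # At the destination site, create a volume from the copied image if required.
    openstack volume create --size <size-in-gb> --image <image-id> \
      --availability-zone <destination-az> restored-volume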

Additionally, you must consider the following:

  • You must upload images to the central location before copying them to edge sites; a copy of each image must exist in the Image service (glance) at the central location.
  • You must use the RBD storage driver for the Image, Compute, and Block Storage services.
  • For each site, assign a unique availability zone, and use the same value for the NovaComputeAvailabilityZone and CinderStorageAvailabilityZone parameters, as shown in the sketch after this list.
  • You can migrate an offline volume from an edge site to the central location, or vice versa. You cannot migrate volumes directly between edge sites.
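
The following heat environment fragment is a minimal sketch of the availability zone parameters for a single edge site. The site name dcn1 and the file name are illustrative only; substitute the values for your deployment.

    # dcn1-site-parameters.yaml (illustrative file name)
    parameter_defaults:
      # Use the same value for both parameters so that the Compute and
      # Block Storage services at this site share one availability zone.
      NovaComputeAvailabilityZone: dcn1
      CinderStorageAvailabilityZone: dcn1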

2.2. Considerations for networking on DCN architecture

The following features are not currently supported for DCN architectures:

  • DHCP on DPDK nodes
  • Conntrack for TC Flower Hardware Offload

Conntrack for TC Flower Hardware Offload is available on DCN as a Technology Preview, and therefore using these solutions together is not fully supported by Red Hat. Use this combination with DCN for testing only, and do not deploy it in a production environment. For more information about Technology Preview features, see Scope of Coverage Details.

The following ML2/OVS technologies are fully supported:

  • OVS-DPDK without DHCP on the DPDK nodes
  • SR-IOV
  • TC flower hardware offload, without conntrack
  • Neutron availability zones (AZs) with networker nodes at the edge, with one AZ per site
  • Routed provider networks

The following ML2/OVN networking technologies are fully supported:

  • OVS-DPDK without DHCP on the DPDK nodes
  • SR-IOV (without DHCP)
  • TC flower hardware offload, without conntrack
  • Routed provider networks
  • OVN GW (networker node) with Neutron availability zones (AZs)

    Important

    Ensure that all router gateway ports reside on the OpenStack Controller nodes by setting OVNCMSOptions: 'enable-chassis-as-gw' and by providing one or more AZ values for the OVNAvailabilityZone parameter. These settings prevent the Networking service from scheduling all chassis as potential hosts for the router gateway ports. For more information, see Configuring Network service availability zones with ML2/OVN in Configuring Red Hat OpenStack Platform networking. A minimal parameter sketch follows this note.
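
    The following parameter_defaults fragment sketches these settings. The availability zone name az-central is an example, and scoping OVNCMSOptions to the Controller role through ControllerParameters is an assumption; adjust both to match your own role definitions.

        parameter_defaults:
          # Advertise only the Controller nodes as gateway chassis
          # (role scoping through ControllerParameters is assumed here).
          ControllerParameters:
            OVNCMSOptions: 'enable-chassis-as-gw'
          # One or more Networking service availability zones for this site
          # (example value).
          OVNAvailabilityZone: 'az-central'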

Additionally, you must consider the following:

  • Network latency: Balance the latency, as measured in round-trip time (RTT), against the expected number of concurrent API operations to maintain acceptable performance. Maximum TCP/IP throughput is inversely proportional to RTT. You can mitigate some issues with high-latency, high-bandwidth connections by tuning kernel TCP parameters; see the sysctl sketch at the end of this section. Contact Red Hat support if cross-site round-trip time exceeds 100 ms.
  • Network drop outs: If the edge site temporarily loses connection to the central site, then no OpenStack control plane API or CLI operations can be executed at the impacted edge site for the duration of the outage. For example, Compute nodes at the edge site are consequently unable to create a snapshot of an instance, issue an auth token, or delete an image. General OpenStack control plane API and CLI operations remain functional at the central site during the outage, and continue to serve any other edge sites that have a working connection.
  • Image type: You must use raw images when deploying a DCN architecture with Ceph storage.
  • Image sizing:

    • Overcloud node images: Overcloud node images are downloaded from the central undercloud node. These potentially large files are transferred across all necessary networks from the central site to the edge site during provisioning.
    • Instance images: If there is no block storage at the edge, then the Image service (glance) images traverse the WAN on first use. The images are copied or cached locally to the target edge nodes for all subsequent use. There is no size limit for glance images. Transfer times vary with available bandwidth and network latency.

      If there is block storage at the edge, then the image is copied over the WAN asynchronously for faster boot times at the edge.

  • Provider networks: This is the recommended networking approach for DCN deployments. If you use provider networks at remote sites, then you must consider that the Networking service (neutron) does not place any limits or checks on where you can attach available networks. For example, if you use a provider network only in edge site A, you must ensure that you do not try to attach to the provider network in edge site B. This is because there are no validation checks on the provider network when binding it to a Compute node.
  • Site-specific networks: A limitation in DCN networking arises if you use networks that are specific to a certain site: When you deploy centralized neutron controllers with Compute nodes, there are no triggers in neutron to identify a certain Compute node as a remote node. Consequently, the Compute nodes receive a list of other Compute nodes and automatically form tunnels between each other; the tunnels are formed from edge to edge through the central site. If you use VXLAN or Geneve, every Compute node at every site forms a tunnel with every other Compute node and Controller node, whether they are local or remote. This is not an issue if you are using the same neutron networks everywhere. When you use VLANs, neutron expects that all Compute nodes have the same bridge mappings, and that all VLANs are available at every site.
  • Additional sites: If you need to expand from a central site to additional remote sites, you can use the openstack CLI on Red Hat OpenStack Platform director to add new network segments and subnets; see the command sketch at the end of this section.
  • If edge servers are not pre-provisioned, you must configure DHCP relay for introspection and provisioning on routed segments.
  • Routing must be configured either on the cloud or within the networking infrastructure that connects each edge site to the hub. You should implement a networking design that allocates an L3 subnet for each Red Hat OpenStack Platform cluster network (external, internal API, and so on), unique to each site.
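
The following sysctl fragment is a minimal sketch of the kind of kernel TCP tuning referenced in the network latency consideration above. The file name and values are illustrative only and are not Red Hat recommendations; validate any tuning against your own WAN characteristics.

    # /etc/sysctl.d/99-wan-tuning.conf (illustrative file name and values)
    # Larger socket buffers help keep a high-bandwidth, high-RTT link full,
    # because maximum TCP throughput is approximately window size / RTT.
    net.core.rmem_max = 16777216
    net.core.wmem_max = 16777216
    net.ipv4.tcp_rmem = 4096 87380 16777216
    net.ipv4.tcp_wmem = 4096 65536 16777216

Apply the settings with sysctl --system after you place the file on the affected nodes.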
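
The following command sketch shows how new network segments and subnets can be added for an additional site on an existing routed provider network. All names, VLAN IDs, and subnet ranges are examples.

    # Add a segment for the new site to an existing routed provider network.
    openstack network segment create --network provider1 \
      --network-type vlan --physical-network dcn2-physnet \
      --segment 130 provider1-dcn2

    # Add a subnet that is bound to the new segment.
    openstack subnet create --network provider1 \
      --network-segment provider1-dcn2 \
      --subnet-range 192.168.130.0/24 provider1-dcn2-subnet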