Chapter 2. Hardware

When you deploy Red Hat OpenStack Platform with distributed compute nodes, your control plane stays at the hub. Compute nodes at the hub site are optional. At edge sites, you can have the following:

  • Compute nodes
  • Hyperconverged nodes with both Compute services and Ceph storage

2.1. Limitations to consider

  • Network latency: The edge Compute nodes are integrated with the wider overcloud deployment and must meet network latency requirements: the round-trip time (RTT) between the central and edge sites must not exceed 100ms.
  • Network dropouts: If an edge site temporarily loses its connection to the central site, then no OpenStack control plane API or CLI operations can run at the affected edge site for the duration of the outage. For example, Compute nodes at the edge site cannot create a snapshot of an instance, issue an auth token, or delete an image. Control plane API and CLI operations at the central site remain functional during the outage, and continue to serve any other edge sites that have a working connection.
  • Image sizing:

    • Overcloud node images: Overcloud node images are downloaded from the central undercloud node. These images are potentially large files that are transferred across all necessary networks from the central site to the edge site during provisioning.
    • Instance images: If block storage is not deployed at the edge, then Glance images traverse the WAN on first use. The images are copied or cached locally to the target edge nodes for all subsequent use. There is no size limit for Glance images; transfer times vary with available bandwidth and network latency.

      When block storage is deployed at the edge, the image is copied over the WAN asynchronously for faster boot times at the edge.

  • Provider networks: This is the recommended networking approach for distributed compute node (DCN) deployments. If you use provider networks at remote sites, consider that neutron does not place any limits or checks on where you can attach available networks. For example, if you use a provider network only in edge site A, you must ensure that you do not attempt to attach to that provider network from edge site B, because there are no validation checks on the provider network when you bind it to a Compute node.
  • Site-specific networks: A limitation in DCN networking arises if you use networks that are specific to a certain site: when you deploy centralized neutron controllers with Compute nodes, there are no triggers in neutron to identify a certain Compute node as a remote node. Consequently, the Compute nodes receive a list of the other Compute nodes and automatically form tunnels between each other; the tunnels are formed from edge to edge through the central site. If you use VXLAN or Geneve, every Compute node at every site forms a tunnel with every other Compute node and Controller node, regardless of whether they are local or remote. This is not an issue if you use the same neutron networks everywhere. When you use VLANs, neutron expects that all Compute nodes have the same bridge mappings, and that all VLANs are available at every site.
  • Additional sites: If you need to expand from a central site to additional remote sites, you can use the openstack CLI on the undercloud to add new network segments and subnets, as shown in the example after this list.
  • Autonomy: Edge sites might have specific autonomy requirements. These requirements vary depending on your use case.
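
For example, you might register a new leaf on the undercloud by creating a segment on the provisioning (ctlplane) network and a routed subnet on that segment. The following is a minimal sketch only; the segment name, physical network name, subnet name, and address ranges are placeholder values that you replace with your own:

    openstack network segment create --network ctlplane \
      --network-type flat --physical-network leaf2 leaf2
    openstack subnet create --network ctlplane --network-segment leaf2 \
      --subnet-range 192.168.14.0/24 \
      --allocation-pool start=192.168.14.100,end=192.168.14.150 \
      --gateway 192.168.14.1 --dhcp leaf2-subnet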

2.2. Networking

When you design the network for a distributed compute node architecture, be aware of the supported technologies and constraints described in this section.

The following networking technologies are supported at the edge:

  • ML2/OVS
  • ML2/SR-IOV
  • ML2/OVN as a technology preview

The following fast datapaths for NFV are supported at the edge:

  • OVS-DPDK without neutron DHCP services
  • SR-IOV
  • TC/Flower offload
Note

Fast datapaths at the edge require ML2/OVS.

  • If you deploy distributed storage, the maximum storage network latency between central and edge sites for Ceph RBD traffic is 100ms round-trip time (RTT).
  • If edge servers are not preprovisioned, you must configure DHCP relay for introspection and provisioning on routed segments.
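
For example, a minimal ISC DHCP relay (dhcrelay) invocation on a router or host in the routed remote segment might look like the following; the interface name (eth1) and the undercloud provisioning IP address (192.168.24.1) are example values only:

    # Listen on the remote provisioning segment and relay DHCP/PXE requests
    # to the DHCP servers on the undercloud provisioning network.
    dhcrelay -d --no-pid -i eth1 192.168.24.1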

Routing must be configured either on the cloud or within the networking infrastructure that connects each edge site to the hub. You should implement a networking design that allocates an L3 subnet for each Red Hat OpenStack Platform cluster network (external, internal API, and so on), unique to each site.

2.2.1. Routing between edge sites

If you need full mesh connectivity between the central location and the edge sites, as well as routing between the edge sites themselves, you must account for the inherent complexity of that design, either in the network infrastructure or on the Red Hat OpenStack Platform nodes.

There are two ways to create full mesh connectivity between the central and edge sites. Both approaches provide full mesh connectivity for control plane signaling between every logical site, both central and edge; tenant (overlay) networks are terminated on site:

2.2.1.1. Push complexity to the hardware network infrastructure

Allocate a supernet for each network function, allocate a block from each supernet for every edge site or leaf, and then use dynamic routing on your network infrastructure to advertise and route each locally connected block.

The benefit of this procedure is that it requires only a single route on each OpenStack node per network function interface to reach the corresponding interfaces at local or remote edge sites.

  1. Reserve 16-bit (/16) address blocks for the OpenStack endpoints:

    Provisioning:       10.10.0.0/16
    Internal API:       10.20.0.0/16
    Storage Front-End:  10.40.0.0/16
    Storage Back-End:   10.50.0.0/16
  2. Allocate a smaller block from each of these supernets to every edge site or leaf, using the site (pod) number as the third octet. Each per-site block can be summarized as follows:

    10.10.[pod#].0/24

    For example, the following blocks could be used for a site designated as leaf 2:

    External:           10.1.2.0/24
    Provisioning:       10.10.2.0/24
    Internal API:       10.20.2.0/24
    Storage Front-End:  10.40.2.0/24
    Storage Back-End:   10.50.2.0/24
    Provider:           10.60.2.0/24
  3. Define common static routes for the network function summaries, using the local gateway at each site as the next hop. Consider the following example:

    Provisioning: 10.10.0.0/16 > 10.10.[pod#].1
    Internal API: 10.20.0.0/16 > 10.20.[pod#].1
    Storage Front-End: 10.40.0.0/16 > 10.40.[pod#].1
    Storage Back-End: 10.50.0.0/16 > 10.50.[pod#].1
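
As a rough illustration, the resulting summary route for the Internal API function on a node at leaf 2 would resemble the following; the interface name is a placeholder, and in practice you define such routes in the deployment's network configuration rather than adding them by hand:

    # One summarized route per network function reaches the corresponding
    # interfaces at every other site through the local gateway.
    ip route add 10.20.0.0/16 via 10.20.2.1 dev vlan20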

2.2.1.2. Push complexity to the Red Hat OpenStack Platform cluster

  1. Allocate a route per edge site for each network function (internal API, overlay, storage, and so on) by defining the corresponding per-site subnets, with their gateways, in the network_data.yaml file for the cluster:

    ###INTERNAL API NETWORKS
    - name: InternalApi
      name_lower: internal_api
      vip: true
      ip_subnet: '10.20.2.0/24'
      allocation_pools: [{'start': '10.20.2.100', 'end': '10.20.2.199'}]
      gateway_ip: '10.20.2.1'
      vlan: 0
      subnets:
        internal_api3:
          ip_subnet: '10.20.3.0/24'
          allocation_pools: [{'start': '10.20.3.100', 'end': '10.20.3.199'}]
          vlan: 0
          gateway_ip: '10.20.3.1'
        internal_api4:
          ip_subnet: '10.20.4.0/24'
          allocation_pools: [{'start': '10.20.4.100', 'end': '10.20.4.199'}]
          vlan: 0
          gateway_ip: '10.20.4.1'
    ...

With this method, the Red Hat OpenStack Platform nodes carry the per-site routes, so the network infrastructure requires only summarized static routing, which is easier to configure. You can use dynamic routing on the networking infrastructure to simplify the configuration further.