Chapter 7. Deploying storage at the edge

You can use Red Hat OpenStack Platform director to extend distributed compute node (DCN) deployments with distributed image management and persistent storage at the edge, combining the benefits of Red Hat OpenStack Platform and Red Hat Ceph Storage.

[Figure: DCN architecture]

7.1. Roles for edge deployments with storage

The following roles are available for edge deployments with storage. Select the appropriate roles for your environment based on your chosen configuration.

7.1.1. Storage without hyperconverged nodes

When you deploy an edge site with storage and you are not deploying hyperconverged nodes, use the following four roles. A sketch of generating a roles file from these roles follows the list.

DistributedCompute
The DistributedCompute role is used for the first three compute nodes in storage deployments. The DistributedCompute role includes the GlanceApiEdge service, which ensures that Image services are consumed at the local edge site rather than at the central hub location. For any additional nodes use the DistributedComputeScaleOut role.
DistributedComputeScaleOut
The DistributedComputeScaleOut role includes the HAproxyEdge service, which enables instances created on DistributedComputeScaleOut nodes to proxy requests for Image services to the nodes that provide that service at the edge site. After you deploy three nodes with the DistributedCompute role, you can use the DistributedComputeScaleOut role to scale compute resources. There is no minimum number of hosts required to deploy with the DistributedComputeScaleOut role.
CephAll
The CephAll role includes the Ceph OSD, Ceph Mon, and Ceph Mgr services. You can deploy up to three nodes with the CephAll role. For any additional storage capacity, use the CephStorage role.
CephStorage
The CephStorage role includes the Ceph OSD service. If three CephAll nodes do not provide enough storage capacity, then add as many CephStorage nodes as needed.
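
For example, assuming you keep the files for an edge site named dcn0 in a ~/dcn0/ directory (the directory and file names are assumptions), you can generate a roles file that combines these four roles:

    openstack overcloud roles generate DistributedCompute DistributedComputeScaleOut \
    CephAll CephStorage -o ~/dcn0/dcn0_roles.yaml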

7.1.2. Storage with hyperconverged nodes

When you deploy an edge site with storage and you plan to have hyperconverged nodes that combine compute and storage, use the following two roles.

DistributedComputeHCI
The DistributedComputeHCI role enables a hyperconverged deployment at the edge by including Ceph Management and OSD services. You must use exactly three nodes when using the DistributedComputeHCI role.
DistributedComputeHCIScaleOut
The DistributedComputeHCIScaleOut role includes the Ceph OSD service, which allows storage capacity to be scaled with compute when more nodes are added to the edge. This role also includes the HAproxyEdge service to redirect image download requests to the GlanceApiEdge nodes at the edge site. Use this role to scale a hyperconverged edge deployment beyond the initial three DistributedComputeHCI nodes.

7.2. Architecture of a DCN edge site with storage

To deploy DCN with storage, you must also deploy Red Hat Ceph Storage at the central location. You must use the dcn-storage.yaml and cephadm.yaml environment files. For edge sites that include non-hyperconverged Red Hat Ceph Storage nodes, use the DistributedCompute, DistributedComputeScaleOut, CephAll, and CephStorage roles.

[Figure: DCN with non-HCI storage at the edge example]

With block storage at the edge
  • Red Hat Ceph Block Devices (RBD) is used as the Image service (glance) backend.
  • Multi-backend Image service (glance) is available so that images may be copied between the central and DCN sites.
  • The Block Storage (cinder) service is available at all sites and is accessed by using the Red Hat Ceph Block Devices (RBD) driver. A brief usage sketch follows this list.
  • The Block Storage (cinder) service runs on the Compute nodes, and Red Hat Ceph Storage runs separately on dedicated storage nodes.
  • Nova ephemeral storage is backed by Ceph (RBD).

    For more information, see Section 5.2, “Deploying the central site with storage”.
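
As an illustration of how these services are consumed locally, the following hypothetical command creates a volume in the dcn0 availability zone from an image that is already available in the dcn0 Image service store; the availability zone name, image ID, volume size, and volume name are assumptions:

    openstack volume create --availability-zone dcn0 --image <image_id> --size 10 dcn0-volume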

7.3. Architecture of a DCN edge site with hyperconverged storage

To deploy this configuration, you must also deploy Red Hat Ceph Storage at the central location. You must configure the dcn-storage.yaml and cephadm.yaml environment files. Use the DistributedComputeHCI and DistributedComputeHCIScaleOut roles. You can also use the DistributedComputeScaleOut role to add Compute nodes that do not participate in providing Red Hat Ceph Storage services.

[Figure: DCN with HCI storage at the edge example]

With hyperconverged storage at the edge
  • Red Hat Ceph Block Devices (RBD) is used as the Image service (glance) backend.
  • Multi-backend Image service (glance) is available so that images may be copied between the central and DCN sites.
  • The Block Storage (cinder) service is available at all sites and is accessed by using the Red Hat Ceph Block Devices (RBD) driver.
  • Both the Block Storage service and Red Hat Ceph Storage run on the Compute nodes.

    For more information, see Section 7.4, “Deploying edge sites with hyperconverged storage”.

When you deploy Red Hat OpenStack Platform in a distributed compute node architecture, you can deploy multiple storage topologies, with a unique configuration at each site. You must deploy the central location with Red Hat Ceph Storage to deploy any of the edge sites with storage.

[Figure: DCN with mixed storage topologies example]

7.4. Deploying edge sites with hyperconverged storage

After you deploy the central site, build out the edge sites and ensure that each edge location connects primarily to its own storage back end, as well as to the storage back end at the central location. Include a spine and leaf networking configuration, with the addition of the storage and storage_mgmt networks that Ceph requires. For more information, see Spine Leaf Networking. You must have connectivity between the storage network at the central location and the storage network at each edge site so that you can move Image service (glance) images between sites.

Ensure that the central location can communicate with the mons and OSDs at each of the edge sites. However, you should terminate the storage management network at site location boundaries because the storage management network is used for OSD rebalancing.
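
The composable network definitions for an edge site therefore include storage and storage_mgmt entries. The following is a minimal sketch only; the subnet ranges, VLAN IDs, and subnet names are assumptions, and you should base your file on the samples in /usr/share/openstack-tripleo-heat-templates/network-data-samples:

    - name: Storage
      name_lower: storage
      vip: true
      subnets:
        storage_subnet:
          ip_subnet: 172.16.1.0/24
          vlan: 30
    - name: StorageMgmt
      name_lower: storage_mgmt
      vip: true
      subnets:
        storage_mgmt_subnet:
          ip_subnet: 172.16.3.0/24
          vlan: 40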

Prerequisites

  • You must create the network_data.yaml file specific to your environment. You can find sample files in /usr/share/openstack-tripleo-heat-templates/network-data-samples.
  • You must create an overcloud-baremetal-deploy.yaml file specific to your environment. For more information, see Provisioning bare metal nodes for the overcloud. A minimal sketch follows this list.
  • You have hardware for three Image Service (glance) servers at a central location and in each availability zone, or in each geographic location where storage services are required. At edge locations, the Image service is deployed to the DistributedComputeHCI nodes.
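
For illustration, an overcloud-baremetal-deploy.yaml entry for three hyperconverged nodes at dcn0 might look like the following sketch; the hostnames, node names, and network configuration template path are assumptions:

    - name: DistributedComputeHCI
      count: 3
      defaults:
        network_config:
          template: /home/stack/dcn0/nic-config.j2
      instances:
        - hostname: dcn0-distributed-compute-hci-0
          name: node00
        - hostname: dcn0-distributed-compute-hci-1
          name: node01
        - hostname: dcn0-distributed-compute-hci-2
          name: node02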

Procedure

  1. Log in to the undercloud as the stack user.
  2. Source the stackrc file:

    [stack@director ~]$ source ~/stackrc
  3. Generate an environment file ~/dcn0/dcn0-images-env.yaml:

    sudo openstack tripleo container image prepare \
    -e containers.yaml \
    --output-env-file /home/stack/dcn0/dcn0-images-env.yaml
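
    The containers.yaml file that you pass with -e is a ContainerImagePrepare parameter file. The following is a minimal sketch; the registry namespace, image names, and tags are assumptions and must match your environment:

    parameter_defaults:
      ContainerImagePrepare:
        - push_destination: true
          set:
            namespace: registry.redhat.io/rhosp-rhel9
            name_prefix: openstack-
            tag: '17.1'
            ceph_namespace: registry.redhat.io/rhceph
            ceph_image: rhceph-6-rhel9
            ceph_tag: latest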
  4. Generate the appropriate roles for the dcn0 edge location:

    openstack overcloud roles generate DistributedComputeHCI DistributedComputeHCIScaleOut \
    -o ~/dcn0/dcn0_roles.yaml
  5. Provision networks for the overcloud. This command takes a definition file for overcloud networks as input. You must use the output file in your command to deploy the overcloud:

    (undercloud)$ openstack overcloud network provision \
    --output /home/stack/dcn0/overcloud-networks-deployed.yaml \
    /home/stack/network_data.yaml
  6. Provision bare metal instances. This command takes a definition file for bare metal nodes as input. You must use the output file in your command to deploy the overcloud:

    (undercloud)$ openstack overcloud node provision \
    --stack dcn0 \
    --network-config \
    -o /home/stack/dcn0/deployed_metal.yaml \
    /home/stack/overcloud-baremetal-deploy.yaml
  7. If you are deploying the edge site with hyperconverged storage, you must create an initial-ceph.conf configuration file with the following parameters. For more information, see Configuring the Red Hat Ceph Storage cluster for HCI:

    [osd]
    osd_memory_target_autotune = true
    osd_numa_auto_affinity = true
    [mgr]
    mgr/cephadm/autotune_memory_target_ratio = 0.2
  8. Use the deployed_metal.yaml file as input to the openstack overcloud ceph deploy command. The openstack overcloud ceph deploy command outputs a yaml file that describes the deployed Ceph cluster:

    openstack overcloud ceph deploy \
    /home/stack/dcn0/deployed_metal.yaml \
    --stack dcn0 \
    --config ~/dcn0/initial-ceph.conf \ 1
    --output ~/dcn0/deployed_ceph.yaml \
    --container-image-prepare ~/containers.yaml \
    --network-data ~/network-data.yaml \
    --cluster dcn0 \
    --roles-data dcn_roles.yaml
    1
    Include initial-ceph.conf only when deploying hyperconverged infrastructure.
  9. Configure the naming conventions for your site in the site-name.yaml environment file. The Nova availability zone and the Cinder storage availability zone must match:

    parameter_defaults:
        NovaComputeAvailabilityZone: dcn0
        ControllerExtraConfig:
            nova::availability_zone::default_schedule_zone: dcn0
        NovaCrossAZAttach: false
        CinderStorageAvailabilityZone: dcn0
        CinderVolumeCluster: dcn0
        GlanceBackendID: dcn0
  10. Configure a glance.yaml template with contents similar to the following:

    parameter_defaults:
        GlanceEnabledImportMethods: web-download,copy-image
        GlanceBackend: rbd
        GlanceStoreDescription: 'dcn0 rbd glance store'
        GlanceBackendID: dcn0
        GlanceMultistoreConfig:
          central:
            GlanceBackend: rbd
            GlanceStoreDescription: 'central rbd glance store'
            CephClusterName: central
  11. Deploy the stack for the dcn0 location:

    openstack overcloud deploy \
    --deployed-server \
    --stack dcn0 \
    --templates /usr/share/openstack-tripleo-heat-templates/ \
    -r ~/dcn0/dcn0_roles.yaml \
    -n ~/dcn0/network-data.yaml \
    -e /usr/share/openstack-tripleo-heat-templates/environments/network-environment.yaml \
    -e /usr/share/openstack-tripleo-heat-templates/environments/podman.yaml \
    -e /usr/share/openstack-tripleo-heat-templates/environments/dcn-storage.yaml \
    -e /usr/share/openstack-tripleo-heat-templates/environments/cephadm/cephadm-rbd-only.yaml \
    -e /usr/share/openstack-tripleo-heat-templates/environments/nova-az-config.yaml \
    -e /home/stack/overcloud-deploy/central/central-export.yaml \
    -e /home/stack/dcn0/deployed_ceph.yaml \
    -e /home/stack/dcn-common/central_ceph_external.yaml \
    -e /home/stack/dcn0/overcloud-vip-deployed.yaml \
    -e /home/stack/dcn0/deployed_metal.yaml \
    -e /home/stack/dcn0/overcloud-networks-deployed.yaml \
    -e ~/dcn0/glance.yaml

7.5. Using a pre-installed Red Hat Ceph Storage cluster at the edge

You can configure Red Hat OpenStack Platform to use a pre-existing Ceph cluster. This is called an external Ceph deployment.

Prerequisites

  • You must have a preinstalled Ceph cluster that is local to your DCN site so that latency requirements are not exceeded.

Procedure

  1. Create the following pools in your Ceph cluster. If you are deploying at the central location, include the backups and metrics pools:

    [root@ceph ~]# ceph osd pool create volumes <_PGnum_>
    [root@ceph ~]# ceph osd pool create images <_PGnum_>
    [root@ceph ~]# ceph osd pool create vms <_PGnum_>
    [root@ceph ~]# ceph osd pool create backups <_PGnum_>
    [root@ceph ~]# ceph osd pool create metrics <_PGnum_>

    Replace <_PGnum_> with the number of placement groups. You can use the Ceph Placement Groups (PGs) per Pool Calculator to determine a suitable value.

  2. Create the OpenStack client user in Ceph to provide the Red Hat OpenStack Platform environment access to the appropriate pools:

    ceph auth add client.openstack mon 'allow r' osd 'allow class-read object_prefix rbd_children, allow rwx pool=volumes, allow rwx pool=vms, allow rwx pool=images'

    Save the provided Ceph client key that is returned. Use this key as the value for the CephClientKey parameter when you configure the undercloud.

    Note

    If you run this command at the central location and plan to use Cinder backup or telemetry services, add allow rwx pool=backups, allow rwx pool=metrics to the command.
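
    For example, at the central location the full command might look like the following sketch:

    ceph auth add client.openstack mon 'allow r' osd 'allow class-read object_prefix rbd_children, allow rwx pool=volumes, allow rwx pool=vms, allow rwx pool=images, allow rwx pool=backups, allow rwx pool=metrics'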

  3. Save the file system ID of your Ceph Storage cluster. The value of the fsid parameter in the [global] section of your Ceph configuration file is the file system ID:

    [global]
    fsid = 4b5c8c0a-ff60-454b-a1b4-9747aa737d19
    ...

    Use this value as the value for the CephClusterFSID parameter when you configure the undercloud.
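
    Alternatively, if you have access to a node that can reach the cluster, the same value is reported by the ceph fsid command, for example:

    [root@ceph ~]# ceph fsid
    4b5c8c0a-ff60-454b-a1b4-9747aa737d19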

  4. On the undercloud, create an environment file to configure your nodes to connect to the unmanaged Ceph cluster. Use a recognizable naming convention, such as ceph-external-<SITE>.yaml where SITE is the location for your deployment, such as ceph-external-central.yaml, ceph-external-dcn1.yaml, and so on.

      parameter_defaults:
        # The cluster FSID
        CephClusterFSID: '4b5c8c0a-ff60-454b-a1b4-9747aa737d19'
        # The CephX user auth key
        CephClientKey: 'AQDLOh1VgEp6FRAAFzT7Zw+Y9V6JJExQAsRnRQ=='
        # The list of IPs or hostnames of the Ceph monitors
        CephExternalMonHost: '172.16.1.7, 172.16.1.8, 172.16.1.9'
        # The desired name of the generated key and conf files
        CephClusterName: dcn1
    1. Use the previously saved values for the CephClusterFSID and CephClientKey parameters.
    2. Use a comma-delimited list of IP addresses from the Ceph Monitors as the value for the CephExternalMonHost parameter.
    3. You must select a value for the CephClusterName parameter that is unique among edge sites. Reusing a name results in the configuration file being overwritten.
  5. If you deployed Red Hat Ceph Storage by using Red Hat OpenStack Platform director at the central location, you can export the Ceph configuration to an environment file named central_ceph_external.yaml. This environment file connects the DCN sites to the central hub Ceph cluster, so the information is specific to the Ceph cluster deployed in the previous steps:

    sudo -E openstack overcloud export ceph \
    --stack central \
    --output-file /home/stack/dcn-common/central_ceph_external.yaml

    If the central location has Red Hat Ceph Storage deployed externally, then you cannot use the openstack overcloud export ceph command to generate the central_ceph_external.yaml file. You must create the central_ceph_external.yaml file manually instead:

    parameter_defaults:
      CephExternalMultiConfig:
        - cluster: "central"
          fsid: "3161a3b4-e5ff-42a0-9f53-860403b29a33"
          external_cluster_mon_ips: "172.16.11.84, 172.16.11.87, 172.16.11.92"
          keys:
            - name: "client.openstack"
              caps:
                mgr: "allow *"
                mon: "profile rbd"
                osd: "profile rbd pool=vms, profile rbd pool=volumes, profile rbd pool=images"
              key: "AQD29WteAAAAABAAphgOjFD7nyjdYe8Lz0mQ5Q=="
              mode: "0600"
          dashboard_enabled: false
          ceph_conf_overrides:
            client:
              keyring: /etc/ceph/central.client.openstack.keyring
  6. Create an environment file with similar details about each site that has an unmanaged Red Hat Ceph Storage cluster for the central location. The openstack overcloud export ceph command does not work for sites with unmanaged Red Hat Ceph Storage clusters. When you update the central location, this file allows the central location to use the storage clusters at your edge sites as secondary locations:

    parameter_defaults:
      CephExternalMultiConfig:
        - cluster: dcn1
          ...
        - cluster: dcn2
          ...
  7. Use the external-ceph.yaml, ceph-external-<SITE>.yaml, and the central_ceph_external.yaml environment files when deploying the overcloud:

    openstack overcloud deploy \
        --stack dcn1 \
        --templates /usr/share/openstack-tripleo-heat-templates/ \
        -r ~/dcn1/roles_data.yaml \
        -e /usr/share/openstack-tripleo-heat-templates/environments/external-ceph.yaml \
        -e /usr/share/openstack-tripleo-heat-templates/environments/dcn-storage.yaml \
        -e /usr/share/openstack-tripleo-heat-templates/environments/nova-az-config.yaml \
        -e /home/stack/dcn1/ceph-external-dcn1.yaml \
        ....
        -e /home/stack/overcloud-deploy/central/central-export.yaml \
        -e /home/stack/dcn-common/central_ceph_external.yaml \
        -e /home/stack/dcn1/dcn_ceph_keys.yaml \
        -e /home/stack/dcn1/role-counts.yaml \
        -e /home/stack/dcn1/ceph.yaml \
        -e /home/stack/dcn1/site-name.yaml \
        -e /home/stack/dcn1/tuning.yaml \
        -e /home/stack/dcn1/glance.yaml
  8. Redeploy the central location after all edge locations have been deployed.

7.6. Updating the central location

After you configure and deploy all of the edge sites using the sample procedure, update the configuration at the central location so that the central Image service can push images to the edge sites.

Warning

This procedure restarts the Image service (glance) and interrupts any long running Image service process. For example, if an image is being copied from the central Image service server to a DCN Image service server, that image copy is interrupted and you must restart it. For more information, see Clearing residual data after interrupted Image service processes.

Procedure

  1. Create a ~/central/glance_update.yaml file similar to the following. This example includes a configuration for two edge sites, dcn0 and dcn1:

      parameter_defaults:
        GlanceEnabledImportMethods: web-download,copy-image
        GlanceBackend: rbd
        GlanceStoreDescription: 'central rbd glance store'
        CephClusterName: central
        GlanceBackendID: central
        GlanceMultistoreConfig:
          dcn0:
            GlanceBackend: rbd
            GlanceStoreDescription: 'dcn0 rbd glance store'
            CephClientUserName: 'openstack'
            CephClusterName: dcn0
            GlanceBackendID: dcn0
          dcn1:
            GlanceBackend: rbd
            GlanceStoreDescription: 'dcn1 rbd glance store'
            CephClientUserName: 'openstack'
            CephClusterName: dcn1
            GlanceBackendID: dcn1
  2. Create the dcn_ceph.yaml file. In the following example, this file configures the glance service at the central site as a client of the Ceph clusters of the edge sites, dcn0 and dcn1.

    openstack overcloud export ceph \
    --stack dcn0,dcn1 \
    --output-file ~/central/dcn_ceph.yaml
  3. Redeploy the central site using the original templates and include the newly created dcn_ceph.yaml and glance_update.yaml files.

    openstack overcloud deploy \
    --deployed-server \
    --stack central \
    --templates /usr/share/openstack-tripleo-heat-templates/ \
    -r ~/control-plane/central_roles.yaml \
    -n ~/network-data.yaml \
    -e /usr/share/openstack-tripleo-heat-templates/environments/network-environment.yaml \
    -e /usr/share/openstack-tripleo-heat-templates/environments/podman.yaml \
    -e /usr/share/openstack-tripleo-heat-templates/environments/dcn-storage.yaml \
    -e /usr/share/openstack-tripleo-heat-templates/environments/cephadm/cephadm.yaml \
    -e /usr/share/openstack-tripleo-heat-templates/environments/nova-az-config.yaml \
    -e /home/stack/central/overcloud-networks-deployed.yaml \
    -e /home/stack/central/overcloud-vip-deployed.yaml \
    -e /home/stack/central/deployed_metal.yaml \
    -e /home/stack/central/deployed_ceph.yaml \
    -e /home/stack/central/dcn_ceph.yaml \
    -e /home/stack/central/glance_update.yaml
  4. On a controller at the central location, restart the cinder-volume service. If you deployed the central location with the cinder-backup service, then restart the cinder-backup service too:

    ssh tripleo-admin@controller-0 sudo pcs resource restart openstack-cinder-volume
    ssh tripleo-admin@controller-0 sudo pcs resource restart openstack-cinder-backup
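
With the central location updated and the services restarted, the central Image service can copy images to the edge stores. As an illustrative sketch, with the image ID and store names as placeholders, an image that is active in the central store can be copied to the dcn0 and dcn1 stores with a command similar to the following:

    glance image-import <image_id> --stores dcn0,dcn1 --import-method copy-image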

7.6.1. Clearing residual data after interrupted Image service processes

When you restart the central location, any long-running Image service (glance) processes are interrupted. Before you can restart these processes, you must first clean up residual data on the Controller node that you rebooted, and in the Ceph and Image service databases.

Procedure

  1. Check and clear residual data on the Controller node that was rebooted. Compare the files in the staging store, as configured in the glance-api.conf file, with the corresponding images in the Image service database, for example <image_ID>.raw.

    • If these corresponding images show importing status, you must recreate the image.
    • If the images show active status, you must delete the data from staging and restart the copy import.
  2. Check and clear residual data in Ceph stores. The images that you cleaned from the staging area must have matching records in their stores property, which lists the Ceph stores that contain the image. The image name in Ceph is the image ID in the Image service database. A sketch of checking a Ceph store follows this procedure.
  3. Clear the Image service database. Clear any images that are in importing status from the import jobs that were interrupted:

    $ glance image-delete <image_id>
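
As a sketch of how you might perform the check in step 2, the following rbd commands list and remove a leftover image object in the images pool of a dcn0 cluster; the cluster name and image ID are placeholders, and the snapshot handling assumes the default glance snapshot named snap:

    # Check whether the image object still exists in the dcn0 images pool.
    rbd --cluster dcn0 -p images ls | grep <image_id>
    # If it does, remove its protected snapshot and then the image itself.
    rbd --cluster dcn0 snap unprotect images/<image_id>@snap
    rbd --cluster dcn0 snap purge images/<image_id>
    rbd --cluster dcn0 rm images/<image_id>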

7.7. Deploying Red Hat Ceph Storage Dashboard on DCN

Procedure

To deploy the Red Hat Ceph Storage Dashboard to the central location, see Adding the Red Hat Ceph Storage Dashboard to an overcloud deployment. These steps should be completed prior to deploying the central location.

To deploy Red Hat Ceph Storage Dashboard to edge locations, complete the same steps that you completed for the central location; however, note the following differences:

  • Ensure that the ManageNetworks parameter has a value of false in your templates for deploying the edge site. When you set ManageNetworks to false, edge sites use the existing networks that were already created in the central stack:

    parameter_defaults:
      ManageNetworks: false
  • You must deploy your own solution for load balancing to create a highly available virtual IP. Edge sites do not deploy haproxy or pacemaker. When you deploy Red Hat Ceph Storage Dashboard to edge locations, the deployment is exposed on the storage network. The dashboard is installed on each of the three DistributedComputeHCI nodes with distinct IP addresses, without a load balancing solution.

You can create an additional network to host a virtual IP where the Ceph Dashboard can be exposed. You must not reuse network resources for multiple stacks. For more information on reusing network resources, see Reusing network resources in multiple stacks.

To create this additional network resource, use the provided network_data_dashboard.yaml heat template. The name of the created network is StorageDashboard.

Procedure

  1. Log in to Red Hat OpenStack Platform director as the stack user.
  2. Generate the DistributedComputeHCIDashboard role and any other roles appropriate for your environment:

    openstack overcloud roles generate DistributedComputeHCIDashboard -o ~/dcn0/roles.yaml
  3. Include the roles.yaml and the network_data_dashboard.yaml in the overcloud deploy command:

    $ openstack overcloud deploy --templates \
    -r ~/<dcn>/<dcn_site_roles>.yaml \
    -n /usr/share/openstack-tripleo-heat-templates/network_data_dashboard.yaml \
    -e <overcloud_environment_files> \
    ...
    -e /usr/share/openstack-tripleo-heat-templates/environments/cephadm/cephadm-rbd-only.yaml \
    -e /usr/share/openstack-tripleo-heat-templates/environments/cephadm/ceph-dashboard.yaml
Note

The deployment provides the three IP addresses where the dashboard is enabled on the storage network.

Verification

To confirm the dashboard is operational at the central location and that the data it displays from the Ceph cluster is correct, see Accessing Ceph Dashboard.

You can confirm that the dashboard is operating at an edge location through similar steps; however, because there is no load balancer at edge locations, there are some differences:

  1. Retrieve dashboard admin login credentials specific to the selected stack:

    grep grafana_admin /home/stack/config-download/<stack>/cephadm/cephadm-extra-vars-heat.yml
  2. Within the inventory specific to the selected stack, /home/stack/config-download/<stack>/cephadm/inventory.yml, locate the DistributedComputeHCI role hosts list and save all three of the storage_ip values. In the following example, the first two dashboard IPs are 172.16.11.84 and 172.16.11.87:

    DistributedComputeHCI:
      hosts:
        dcn1-distributed-compute-hci-0:
          ansible_host: 192.168.24.16
          ...
          storage_hostname: dcn1-distributed-compute-hci-0.storage.localdomain
          storage_ip: 172.16.11.84
        dcn1-distributed-compute-hci-1:
          ansible_host: 192.168.24.22
          ...
          storage_hostname: dcn1-distributed-compute-hci-1.storage.localdomain
          storage_ip: 172.16.11.87
  3. You can check that the Ceph Dashboard is active at one of these IP addresses if they are accessible to you. These IP addresses are on the storage network and are not routed. If these IP addresses are not available, you must configure a load balancer for the three IP addresses that you get from the inventory to obtain a virtual IP address for verification.
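
If you choose HAProxy as your load balancing solution, a minimal sketch of the relevant haproxy.cfg sections might look like the following; the virtual IP, the third back-end address, and the port (the default Ceph Dashboard SSL port of 8443 is assumed) must be adjusted to match your environment:

    frontend ceph_dashboard
        bind <dashboard_vip>:8443
        mode tcp
        default_backend ceph_dashboard_nodes

    backend ceph_dashboard_nodes
        mode tcp
        balance source
        server dcn1-hci-0 172.16.11.84:8443 check
        server dcn1-hci-1 172.16.11.87:8443 check
        server dcn1-hci-2 <third_storage_ip>:8443 check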