Chapter 5. Deploying storage at the edge

You can use Red Hat OpenStack Platform director to extend distributed compute node (DCN) deployments with distributed image management and persistent storage at the edge, combining the benefits of Red Hat OpenStack Platform and Ceph Storage.

Figure: Distributed compute node (DCN) architecture

5.1. Limitations of distributed compute node (DCN) architecture

The following features are not currently supported for DCN architectures:

  • Fast forward updates (FFU) on a distributed compute node architecture from Red Hat OpenStack Platform 13 to 16.
  • Non-hyperconverged storage nodes at edge sites.
  • Copying a volume snapshot between edge sites. You can work around this by creating an image from the volume and using the Image service (glance) to copy the image to the other site. After the image is copied, you can create a volume from it. See the example after this list.
  • Migrating or retyping a volume between sites.
  • Ceph Dashboard at the edge.
  • Ceph Rados Gateway (RGW) at the edge.
  • CephFS at the edge.
  • Block storage (cinder) backup for edge volumes.
  • Instance high availability (HA) at the edge sites.
  • Mixed storage environments, wherein only some edge sites are deployed with Ceph as a storage backend.
  • Live migration between edge sites or from the central location to edge sites. You can still live migrate instances within a site boundary.
  • RBD mirroring between sites.
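
The following is a minimal sketch of the volume snapshot workaround described above. The volume, image, and size values are placeholders, and the image must pass through the central location before it can be copied to another edge site, as noted in the requirements that follow:

    # Create an image from the source volume at the first edge site.
    openstack image create --volume <source-volume> <image-name>

    # Copy the image to the central store, then to the target edge store.
    glance image-import <image-id> --stores central --import-method copy-image
    glance image-import <image-id> --stores dcn1 --import-method copy-image

    # Create a new volume from the image at the target edge site.
    openstack volume create --image <image-id> --size <size> --availability-zone dcn1 <volume-name>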

The following overlay networking technologies are available as a Technology Preview, and therefore are not fully supported by Red Hat. These features should only be used for testing, and should not be deployed in a production environment. For more information about Technology Preview features, see Scope of Coverage Details.

  • ML2/OVN
  • DPDK at the edge.

5.1.1. Requirements of storage edge architecture

  • A copy of each image must exist in the Image service at the central location.
  • Prior to creating an instance at an edge site, you must have a local copy of the image at that edge site. See the example after this list.
  • Source the centralrc authentication file to schedule workloads at edge sites as well as at the central location. You do not need the authentication files that are automatically generated for edge sites.
  • Images uploaded to an edge site must be copied to the central location before they can be copied to other edge sites.
  • Use the Image service RBD driver for all edge sites. Mixed architecture is not supported.
  • Multistack must be used with a single stack at each site.
  • RBD must be the storage driver for the Image, Compute and Block Storage services.
  • For each site, you must assign the same value to the NovaComputeAvailabilityZone and CinderStorageAvailabilityZone parameters.
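
The following is a minimal sketch of scheduling a workload at an edge site, assuming the centralrc file described above and placeholder image, flavor, and server names:

    source ~/centralrc

    # Copy the image from the central store to the edge store before booting from it.
    glance image-import <image-id> --stores dcn0 --import-method copy-image

    # Schedule the instance in the availability zone of the edge site.
    openstack server create --image <image-id> --flavor <flavor> \
        --availability-zone dcn0 <server-name>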

5.2. Deploying the central site

To deploy the Image service with multiple stores and Ceph Storage as the back end, complete the following steps:

Prerequisites

  • Hardware for a Ceph cluster at the hub and in each availability zone, or in each geographic location where storage services are required.
  • You must deploy edge sites in a hyperconverged architecture.
  • Hardware for three Image service servers at the hub and in each availability zone, or in each geographic location where storage services are required.

The following is an example deployment of two or more stacks:

  • One stack at the central (hub) location, called central.
  • One stack at an edge site called dcn0.
  • Additional stacks deployed similarly to dcn0, such as dcn1, dcn2, and so on.

Procedure

The following procedure outlines the steps for the initial deployment of the central location.

Note

The following steps detail the deployment commands and environment files associated with an example DCN deployment that uses the Image service with multiple stores. These steps do not include unrelated, but necessary, aspects of configuration, such as networking.

  1. In the home directory, create directories for each stack that you plan to deploy.

    mkdir /home/stack/central
    mkdir /home/stack/dcn0
    mkdir /home/stack/dcn1
  2. Generate a Ceph key.

    python3 -c 'import os,struct,time,base64; key = os.urandom(16); header = struct.pack("<hiih", 1, int(time.time()), 0, len(key)) ; print(base64.b64encode(header + key).decode())'
  3. Create the ceph_keys.yaml file for the central stack so that the OpenStack services at the edge sites that you deploy later can connect to the central Ceph cluster. Use the Ceph key generated in the previous step as the value of the key parameter. This key is sensitive; protect it accordingly.

    cat > /home/stack/central/ceph_keys.yaml << EOF
    parameter_defaults:
      CephExtraKeys:
          - name: "client.external"
            caps:
              mgr: "allow *"
              mon: "profile rbd"
              osd: "profile rbd pool=vms, profile rbd pool=volumes, profile rbd pool=images"
            key: "AQD29WteAAAAABAAphgOjFD7nyjdYe8Lz0mQ5Q=="
            mode: "0600"
    EOF
  4. Set the name of the Ceph cluster, as well as configuration parameters relative to the available hardware. For more information, see Configuring Ceph with Custom Config Settings:

    cat > /home/stack/central/ceph.yaml << EOF
    parameter_defaults:
      CephClusterName: central
      CephAnsibleDisksConfig:
        osd_scenario: lvm
        osd_objectstore: bluestore
        devices:
          - /dev/sda
          - /dev/sdb
      CephPoolDefaultSize: 3
      CephPoolDefaultPgNum: 128
    EOF
  5. Generate roles for the central location using roles appropriate for your environment:

    openstack overcloud roles generate Compute Controller CephStorage \
    -o ~/central/central_roles.yaml
    
    cat > /home/stack/central/role-counts.yaml << EOF
    parameter_defaults:
      ControllerCount: 3
      ComputeCount: 2
      CephStorageCount: 3
    EOF
  6. Generate the environment file ~/central/central-images-env.yaml:

    openstack tripleo container image prepare \
    -e containers.yaml \
    --output-env-file ~/central/central-images-env.yaml
  7. Configure the naming conventions for your site in the site-name.yaml environment file. The Nova availability zone and the Cinder storage availability zone must match:

    cat > /home/stack/central/site-name.yaml << EOF
    parameter_defaults:
        NovaComputeAvailabilityZone: central
        ControllerExtraConfig:
            nova::availability_zone::default_schedule_zone: central
        NovaCrossAZAttach: false
        CinderStorageAvailabilityZone: central
        GlanceBackendID: central
    EOF
  8. Configure a glance.yaml template with contents similar to the following:

    parameter_defaults:
        GlanceEnabledImportMethods: web-download,copy-image
        GlanceBackend: rbd
        GlanceStoreDescription: 'central rbd glance store'
        GlanceBackendID: central
        CephClusterName: central
  9. After you prepare all of the other templates, deploy the central stack:

    openstack overcloud deploy \
           --stack central \
           --templates /usr/share/openstack-tripleo-heat-templates/ \
           -r ~/central/central_roles.yaml \
        ...
           -e /usr/share/openstack-tripleo-heat-templates/environments/ceph-ansible/ceph-ansible.yaml \
           -e ~/central/central-images-env.yaml \
           -e ~/central/role-counts.yaml \
           -e ~/central/site-name.yaml \
           -e ~/central/ceph.yaml \
           -e ~/central/ceph_keys.yaml \
           -e ~/central/glance.yaml
Note

You must include heat templates for the configuration of networking in your openstack overcloud deploy command. Designing for edge architecture requires spine and leaf networking. See Spine Leaf Networking for more details.

The ceph-ansible.yaml file is configured with the following parameters:

  • NovaEnableRbdBackend: true
  • GlanceBackend: rbd

When you use these settings together, heat configures the Image service image_import_plugins parameter with the value image_conversion, which automatically converts QCOW2 images to RAW format when you import them with commands such as glance image-create-via-import --disk-format qcow2…

RAW is the optimal format for Ceph RBD. If you want to disable image conversion, you can do so with the GlanceImageImportPlugin parameter:

   parameter_defaults:
     GlanceImageImportPlugin: []
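
For example, the following web-download import of a QCOW2 image is converted automatically when the image_conversion plugin is enabled. The image name and URL are placeholders:

    glance image-create-via-import \
        --disk-format qcow2 \
        --container-format bare \
        --name <image-name> \
        --uri <image-url> \
        --import-method web-download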

5.3. Deploying edge sites with storage

After you deploy the central site, build out the edge sites and ensure that each edge location connects primarily to its own storage back end, as well as to the storage back end at the central location.

Include a spine and leaf networking configuration, with the addition of the storage and storage_mgmt networks that Ceph Storage requires. For more information, see Spine Leaf Networking.

You must have connectivity between the storage network at the central location and the storage network at each edge site so that you can move glance images between sites.

Ensure that the central location can communicate with the Ceph Monitors and OSDs at each of the edge sites. However, terminate the storage management network at site boundaries, because the storage management network is used for OSD rebalancing.

Procedure

  1. Export stack information from the central stack. You must deploy the central stack before running this command:

    openstack overcloud export \
            --config-download-dir /var/lib/mistral/central/ \
            --stack central \
            --output-file ~/dcn-common/central-export.yaml
    Note

    The config-download-dir value defaults to /var/lib/mistral/<stack>/.

  2. Create the central_ceph_external.yaml file. This environment file connects DCN sites to the central hub Ceph cluster, so the information is specific to the central Ceph cluster that you deployed in the previous section.

    • The keys section must contain the same values that you passed to the CephExtraKeys parameter in the ceph_keys.yaml file when you deployed the central location.
    • The value for external_cluster_mon_ips can be obtained from the tripleo-ansible-inventory.yaml file in the directory specified by the --config-download-dir parameter. Use the IP addresses or hostnames of the nodes that run the CephMons service. See the sketch at the end of this step.
    • Additional information, such as the FSID, can be obtained from the all.yaml file in the directory specified by the --config-download-dir parameter.
    • You must set the dashboard_enabled parameter to false when using the CephExternalMultiConfig parameter because you cannot deploy the Ceph dashboard when you configure an overcloud as a client of an external Ceph cluster. Relative to the edge site dcn0, central is an external Ceph cluster.

      cat > central_ceph_external.yaml << EOF
      parameter_defaults:
        CephExternalMultiConfig:
          - cluster: "central"
            fsid: "3161a3b4-e5ff-42a0-9f53-860403b29a33"
            external_cluster_mon_ips: "172.16.11.84, 172.16.11.87, 172.16.11.92"
            keys:
              - name: "client.external"
                caps:
                  mgr: "allow *"
                  mon: "profile rbd"
                  osd: "profile rbd pool=vms, profile rbd pool=volumes, profile rbd pool=images"
                key: "AQD29WteAAAAABAAphgOjFD7nyjdYe8Lz0mQ5Q=="
                mode: "0600"
            dashboard_enabled: false
            ceph_conf_overrides:
              client:
                keyring: /etc/ceph/central.client.external.keyring
      EOF
      Note

      Do not use the CephExternalMultiConfig parameter when you configure a single external Ceph cluster. This parameter is only supported when you deploy the following:

    • An external Ceph cluster, configured normally, in addition to multiple external Ceph clusters
    • An internal Ceph cluster, configured normally, in addition to multiple external Ceph clusters
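
    The following sketch shows one way to locate the fsid and the Ceph Mon addresses referenced in this step. It assumes the default config-download directory; adjust the paths if you set --config-download-dir to another location:

      # FSID of the central Ceph cluster
      grep fsid /var/lib/mistral/central/ceph-ansible/group_vars/all.yaml

      # Nodes that run the Ceph Mon service; use their storage network IP addresses or hostnames
      grep -i -A 10 mon /var/lib/mistral/central/tripleo-ansible-inventory.yaml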
  3. Create a new ceph_keys.yaml file for dcn0. Generate a new Ceph key and create the file in the same way that you created the ceph_keys.yaml file for the central site. For example:

    cat > ~/dcn0/ceph_keys.yaml <<EOF
    parameter_defaults:
      CephExtraKeys:
        - name: "client.external"
          caps:
            mgr: "allow *"
            mon: "profile rbd"
            osd: "profile rbd pool=vms, profile rbd pool=volumes, profile rbd pool=images"
          key: "AQBO/mteAAAAABAAc4mVMTpq7OFtrPlRFqN+FQ=="
          mode: "0600"
    EOF
  4. Create the ~/dcn0/glance.yaml file for glance configuration overrides:

    parameter_defaults:
      GlanceEnabledImportMethods: web-download,copy-image
      GlanceBackend: rbd
      GlanceStoreDescription: 'dcn0 rbd glance store'
      GlanceBackendID: dcn0
      GlanceMultistoreConfig:
        central:
          GlanceBackend: rbd
          GlanceStoreDescription: 'central rbd glance store'
          CephClientUserName: 'external'
          CephClusterName: central
  5. Configure the ceph.yaml file with configuration parameters relative to the available hardware. For more information, see Configuring Ceph with Custom Config Settings:

    cat > /home/stack/dcn0/ceph.yaml << EOF
    parameter_defaults:
      CephClusterName: dcn0
      CephAnsibleDisksConfig:
        osd_scenario: lvm
        osd_objectstore: bluestore
        devices:
          - /dev/sda
          - /dev/sdb
      CephPoolDefaultSize: 3
      CephPoolDefaultPgNum: 128
    EOF
  6. Implement system tuning by using a file that contains the following parameters, tuned to the requirements of your environment:

    cat > /home/stack/dcn0/tuning.yaml << EOF
    parameter_defaults:
      CephAnsibleExtraConfig:
        is_hci: true
      CephConfigOverrides:
        osd_recovery_op_priority: 3
        osd_recovery_max_active: 3
        osd_max_backfills: 1
      ## Set relative to your hardware:
      # DistributedComputeHCIParameters:
      #   NovaReservedHostMemory: 181000
      # DistributedComputeHCIExtraConfig:
      #   nova::cpu_allocation_ratio: 8.2
    EOF
    • For more information about setting the values for the parameters CephAnsibleExtraConfig and DistributedComputeHCIParameters, see Configure resource allocation.
    • For more information about setting the values for the parameters CephPoolDefaultPgNum, CephPoolDefaultSize, and DistributedComputeHCIExtraConfig, see Configuring Ceph Storage cluster settings.
  7. Configure the naming conventions for your site in the site-name.yaml environment file. The Nova availability zone and the Cinder storage availability zone must match. The CinderVolumeCluster parameter is included when deploying an edge site with storage. This parameter is used when cinder-volume is deployed as active/active, which is required at edge sites. As a best practice, set the Cinder cluster name to match the availability zone:

    cat > /home/stack/dcn0/site-name.yaml << EOF
    parameter_defaults:
        ...
        NovaComputeAvailabilityZone: dcn0
        NovaCrossAZAttach: false
        CinderStorageAvailabilityZone: dcn0
        CinderVolumeCluster: dcn0
    EOF
  8. Generate the roles.yaml file to be used for the dcn0 deployment, for example:

    openstack overcloud roles generate DistributedComputeHCI DistributedComputeHCIScaleOut -o ~/dcn0/roles_data.yaml
  9. Set the number of systems in each role by creating the ~/dcn0/role-counts.yaml file with the desired values for each role.

    When using hyperconverged infrastructure (HCI), you must allocate three nodes to the DistributedComputeHCI role to satisfy requirements for the Ceph Mon and GlanceApiEdge services.

    parameter_defaults:
      ControllerCount: 0
      ComputeCount: 0
      DistributedComputeHCICount: 3
      DistributedComputeHCIScaleOutCount: 1
  10. Retrieve the container images for the edge site:

    openstack tripleo container image prepare \
    --environment-directory dcn0 \
    -r ~/dcn0/roles_data.yaml \
    -e /usr/share/openstack-tripleo-heat-templates/environments/ceph-ansible/ceph-ansible.yaml \
    ...
    -e /home/stack/dcn-common/central-export.yaml \
    -e /home/stack/containers-prepare-parameter.yaml \
    --output-env-file ~/dcn0/dcn0-images-env.yaml
    Note

    You must include all environment files to be used for the deployment in the openstack tripleo container image prepare command.

  11. Deploy the edge site:

    openstack overcloud deploy \
        --stack dcn0 \
        --templates /usr/share/openstack-tripleo-heat-templates/ \
        -r ~/dcn0/roles_data.yaml \
        -e /usr/share/openstack-tripleo-heat-templates/environments/ceph-ansible/ceph-ansible.yaml \
        -e /usr/share/openstack-tripleo-heat-templates/environments/dcn-hci.yaml \
        -e ~/dcn0/dcn0-images-env.yaml \
        ...
        -e ~/dcn-common/central-export.yaml \
        -e ~/dcn0/central_ceph_external.yaml \
        -e ~/dcn0/ceph_keys.yaml \
        -e ~/dcn0/role-counts.yaml \
        -e ~/dcn0/ceph.yaml \
        -e ~/dcn0/site-name.yaml \
        -e ~/dcn0/tuning.yaml \
        -e ~/dcn0/glance.yaml
    Note

    You must include heat templates for the configuration of networking in your openstack overcloud deploy command. Designing for edge architecture requires spine and leaf networking. See Spine Leaf Networking for more details.
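
After the deployment completes, you can confirm from the central location that the services of the new availability zone are registered. This is a minimal check that assumes the centralrc authentication file described earlier:

    source ~/centralrc
    openstack availability zone list
    openstack compute service list
    openstack volume service list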

5.4. Creating additional distributed compute node sites

A new distributed compute node (DCN) site has its own directory of YAML files on the undercloud. For more information, see Section 3.10, “Managing separate heat stacks”. This procedure contains example commands.

Procedure

  1. As the stack user on the undercloud, create a new directory for dcn1:

    $ cd ~
    $ mkdir dcn1
  2. Copy the existing dcn0 templates to the new directory and replace the dcn0 strings with dcn1:

    $ cp dcn0/ceph.yaml dcn1/ceph.yaml
    $ sed s/dcn0/dcn1/g -i dcn1/ceph.yaml
    $ cp dcn0/overrides.yaml dcn1/overrides.yaml
    $ sed s/dcn0/dcn1/g -i dcn1/overrides.yaml
    $ sed s/"0-ceph-%index%"/"1-ceph-%index%"/g -i dcn1/overrides.yaml
    $ cp dcn0/deploy.sh dcn1/deploy.sh
    $ sed s/dcn0/dcn1/g -i dcn1/deploy.sh
  3. Review the files in the dcn1 directory to confirm that they suit your requirements.
  4. Verify that your nodes are in the available provision state:

    $ openstack baremetal node list
  5. When your nodes are available, run the deploy.sh for the dcn1 site:

    $ bash dcn1/deploy.sh
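
    When the deployment completes, each site has its own heat stack on the undercloud. You can confirm this with the following commands, assuming the standard stackrc undercloud credentials file and stacks named central, dcn0, and dcn1:

    $ source ~/stackrc
    $ openstack stack list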

5.5. Updating the central location

After you configure and deploy all of the edge sites using the sample procedure, update the configuration at the central location so that the central Image service can push images to the edge sites.

Procedure

  1. Create a ~/central/glance_update.yaml file similar to the following. This example includes a configuration for two edge sites, dcn0 and dcn1:

      parameter_defaults:
        GlanceEnabledImportMethods: web-download,copy-image
        GlanceBackend: rbd
        GlanceStoreDescription: 'central rbd glance store'
        CephClusterName: central
        GlanceBackendID: central
        GlanceMultistoreConfig:
          dcn0:
            GlanceBackend: rbd
            GlanceStoreDescription: 'dcn0 rbd glance store'
            CephClientUserName: 'glance'
            CephClusterName: dcn0
            GlanceBackendID: dcn0
          dcn1:
            GlanceBackend: rbd
            GlanceStoreDescription: 'dcn1 rbd glance store'
            CephClientUserName: 'glance'
            CephClusterName: dcn1
            GlanceBackendID: dcn1
  2. Create the dcn_ceph_external.yaml file. In the following example, this file configures the glance service at the central site as a client of the Ceph clusters of the edge sites, dcn0 and dcn1.

    You can find the value for external_cluster_mon_ips from the tripleo-ansible-inventory.yaml file located in the following directories. Use the IP addresses or hostnames of the nodes that run the CephMons service.

    • /var/lib/mistral/dcn0/tripleo-ansible-inventory.yaml
    • /var/lib/mistral/dcn1/tripleo-ansible-inventory.yaml

      Find additional required values for this template, such as the FSID and cluster name, in the following files:

    • /var/lib/mistral/dcn0/ceph-ansible/group_vars/all.yaml
    • /var/lib/mistral/dcn1/ceph-ansible/group_vars/all.yaml

        parameter_defaults:
          CephExternalMultiConfig:
            - cluster: "dcn0"
              fsid: "539e2b96-316e-4c23-b7df-035a3037ddd1"
              external_cluster_mon_ips: "172.16.11.61, 172.16.11.64, 172.16.11.66"
              keys:
                - name: "client.external"
                  caps:
                    mgr: "allow *"
                    mon: "profile rbd"
                    osd: "profile rbd pool=images"
                  key: "AQBO/mteAAAAABAAc4mVMTpq7OFtrPlRFqN+FQ=="
                  mode: "0600"
              dashboard_enabled: false
              ceph_conf_overrides:
                  client:
                      keyring: /etc/ceph/dcn0.client.external.keyring
            - cluster: "dcn1"
              fsid: "7504a91e-5a0f-4408-bb55-33c3ee2c67e9"
              external_cluster_mon_ips: "172.16.11.182, 172.16.11.185, 172.16.11.187"
              keys:
                - name: "client.external"
                  caps:
                    mgr: "allow *"
                    mon: "profile rbd"
                    osd: "profile rbd pool=images"
                  key: "AQACCGxeAAAAABAAHocX/cnygrVnLBrKiZHJfw=="
                  mode: "0600"
              dashboard_enabled: false
              ceph_conf_overrides:
                  client:
                      keyring: /etc/ceph/dcn1.client.external.keyring
  3. Redeploy the central site using the original templates and include the newly created dcn_ceph_external.yaml and glance_update.yaml files.
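
    The following is a sketch of the redeployment command, based on the example deployment of the central stack earlier in this chapter. It assumes that you created dcn_ceph_external.yaml in the ~/central directory; your exact list of environment files, including the networking templates, will differ:

    openstack overcloud deploy \
           --stack central \
           --templates /usr/share/openstack-tripleo-heat-templates/ \
           -r ~/central/central_roles.yaml \
        ...
           -e /usr/share/openstack-tripleo-heat-templates/environments/ceph-ansible/ceph-ansible.yaml \
           -e ~/central/central-images-env.yaml \
           -e ~/central/role-counts.yaml \
           -e ~/central/site-name.yaml \
           -e ~/central/ceph.yaml \
           -e ~/central/ceph_keys.yaml \
           -e ~/central/glance.yaml \
           -e ~/central/dcn_ceph_external.yaml \
           -e ~/central/glance_update.yaml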