Chapter 5. Storage

5.1. Red Hat Ceph Storage

OpenStack private clouds leverage both block and object storage. Ceph is the preferred storage platform to provide these services.

The director provides the ability to configure extra features for an Overcloud. One of these extra features includes integration with Red Hat Ceph Storage. This includes both Ceph Storage clusters created with the director and existing Ceph Storage clusters.

Red Hat Ceph Storage is a distributed data object store designed to provide excellent performance, reliability, and scalability. Distributed object stores are the future of storage, because they accommodate unstructured data, and because clients can use modern object interfaces and legacy interfaces simultaneously. At the heart of every Ceph deployment is the Ceph Storage Cluster, which consists of two types of daemons:

  • Ceph Object Storage Daemon (OSD) - Ceph OSDs store data on behalf of Ceph clients. Additionally, Ceph OSDs utilize the CPU and memory of Ceph nodes to perform data replication, rebalancing, recovery, monitoring, and reporting functions.
  • Ceph Monitor - A Ceph monitor maintains a master copy of the Ceph storage cluster map with the current state of the storage cluster. For more information about Red Hat Ceph Storage, see the Red Hat Ceph Storage Architecture Guide.
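
On a running cluster, both daemon types can be seen at a glance. The following is a minimal check, assuming access to a node that holds the Ceph admin keyring (for example, an Overcloud Controller node after deployment):

# Summarize cluster health, monitor quorum, and OSD counts
sudo ceph -s
# List the monitors and their quorum status
sudo ceph mon stat
# Show the OSD tree (hosts and the OSDs they carry)
sudo ceph osd tree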

Red Hat OpenStack Platform director provides two main methods for integrating Red Hat Ceph Storage into an Overcloud.

  • Creating an Overcloud with its own Ceph Storage Cluster. The director has the ability to create a Ceph Storage Cluster during the creation of the Overcloud. The director creates a set of Ceph Storage nodes that use Ceph OSDs to store the data. In addition, the director installs the Ceph Monitor service on the Overcloud’s Controller nodes. This means that if an organization creates an Overcloud with three highly available controller nodes, the Ceph Monitor also becomes a highly available service.
  • Integrating an Existing Ceph Storage into an Overcloud. If you already have an existing Ceph Storage Cluster, you can integrate this during an Overcloud deployment. This means you manage and scale the cluster outside of the Overcloud configuration.

In the NFV mobile validation lab, the first approach is used: the overcloud is created with its own Ceph Storage cluster.
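
For reference, director-managed Ceph is enabled at deployment time by passing a storage environment file to the overcloud deploy command. The following is a sketch only; the local template path, node counts, and the full list of environment files are deployment-specific:

# Illustrative deploy command; additional environment files and options are omitted
openstack overcloud deploy --templates \
  -e /usr/share/openstack-tripleo-heat-templates/environments/storage-environment.yaml \
  -e /home/stack/templates/storage-environment.yaml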

5.1.1. Ceph Node Details

To create a Ceph cluster using Red Hat OpenStack Platform director, the following requirements should be met for the Ceph nodes:

Processor: 64-bit x86 processor with support for the Intel 64 or AMD64 CPU extensions.

Memory: Memory requirements depend on the amount of storage space. Red Hat recommends a baseline of 16 GB of RAM, with an additional 2 GB of RAM per OSD.

Disk Space: Storage requirements depend on the OpenStack usage: ephemeral storage, image storage, and volumes.

Disk Layout: The recommended Red Hat Ceph Storage node configuration requires a disk layout similar to the following:

  • /dev/sda - The root disk. The director copies the main Overcloud image to the disk.
  • /dev/sdb - The journal disk. This disk divides into partitions for Ceph OSD journals. For example, /dev/sdb1, /dev/sdb2, /dev/sdb3, and onward. The journal disk is usually a solid state drive (SSD) to aid with system performance.
  • /dev/sdc and onward - The OSD disks. Use as many disks as necessary for your storage requirements.
Note

The journal-to-OSD ratio depends on the number of OSDs and on the type of device used for the journal (SSD, enterprise SSD, or NVMe). In this deployment, six OSDs share two NVMe journal devices, so each NVMe device carries three journals.

If you need to set the root_device, Ceph journals/OSDs, and so on for your deployment, the nodes’ introspected data is now available for consumption. For instance:

[stack@undercloud-nfv-vepc2 ~]$ cd swift-data/
[stack@undercloud-nfv-vepc2 swift-data]$ for node in $(ironic node-list | awk '!/UUID/ {print $2}'); do echo "NODE: $node" ; cat inspector_data-$node | jq '.inventory.disks' ; echo "-----" ; done

NODE: 2a16ae03-ae92-41ba-937b-b31163f8536e
[
  {
    "size": 1000204886016,
    "rotational": true,
    "vendor": "SEAGATE",
    "name": "/dev/sda",
    "wwn_vendor_extension": null,
    "wwn_with_extension": "0x5000c5008f32caab",
    "model": "ST1000NX0333",
    "wwn": "0x5000c5008f32caab",
    "serial": "5000c5008f32caab"
  },
  {
    "size": 1000204886016,
    "rotational": true,
    "vendor": "SEAGATE",
    "name": "/dev/sdb",
    "wwn_vendor_extension": null,
    "wwn_with_extension": "0x5000c5008f33ca5f",
    "model": "ST1000NX0333",
    "wwn": "0x5000c5008f33ca5f",
    "serial": "5000c5008f33ca5f"
  },
  {
    "size": 1000204886016,
    "rotational": true,
    "vendor": "SEAGATE",
    "name": "/dev/sdc",
    "wwn_vendor_extension": null,
    "wwn_with_extension": "0x5000c5008f32a15f",
    "model": "ST1000NX0333",
    "wwn": "0x5000c5008f32a15f",
    "serial": "5000c5008f32a15f"
  },
  {
    "size": 1000204886016,
    "rotational": true,
    "vendor": "SEAGATE",
    "name": "/dev/sdd",
    "wwn_vendor_extension": null,
    "wwn_with_extension": "0x5000c5008f2e2e9b",
    "model": "ST1000NX0333",
    "wwn": "0x5000c5008f2e2e9b",
    "serial": "5000c5008f2e2e9b"
  },
  {
    "size": 1000204886016,
    "rotational": true,
    "vendor": "SEAGATE",
    "name": "/dev/sde",
    "wwn_vendor_extension": null,
    "wwn_with_extension": "0x5000c5008f2be95b",
    "model": "ST1000NX0333",
    "wwn": "0x5000c5008f2be95b",
    "serial": "5000c5008f2be95b"
  },
  {
    "size": 1000204886016,
    "rotational": true,
    "vendor": "SEAGATE",
    "name": "/dev/sdf",
    "wwn_vendor_extension": null,
    "wwn_with_extension": "0x5000c5008f337f83",
    "model": "ST1000NX0333",
    "wwn": "0x5000c5008f337f83",
    "serial": "5000c5008f337f83"
  },
  {
    "size": 599013720064,
    "rotational": true,
    "vendor": "LSI",
    "name": "/dev/sdg",
    "wwn_vendor_extension": "0x200150182f9945c7",
    "wwn_with_extension": "0x6001636001aa98c0200150182f9945c7",
    "model": "MRROMB",
    "wwn": "0x6001636001aa98c0",
    "serial": "6001636001aa98c0200150182f9945c7"
  },
  {
    "size": 400088457216,
    "rotational": false,
    "vendor": null,
    "name": "/dev/nvme0n1",
    "wwn_vendor_extension": null,
    "wwn_with_extension": null,
    "model": "INTEL SSDPE2MD400G4",
    "wwn": null,
    "serial": "CVFT5481009D400GGN"
  },
  {
    "size": 400088457216,
    "rotational": false,
    "vendor": null,
    "name": "/dev/nvme1n1",
    "wwn_vendor_extension": null,
    "wwn_with_extension": null,
    "model": "INTEL SSDPE2MD400G4",
    "wwn": null,
    "serial": "CVFT548100AP400GGN"
  }
]
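
For example, the serial number of the disk intended as the root device (/dev/sdg in the output above) can be extracted with jq:

# Pull the serial of /dev/sdg out of the introspection data for this node
cat inspector_data-2a16ae03-ae92-41ba-937b-b31163f8536e | jq -r '.inventory.disks[] | select(.name == "/dev/sdg") | .serial'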

You can then use this data to set root devices on nodes (in this case, it is being set to /dev/sdg):

openstack baremetal node set --property root_device='{"serial": "6001636001aa98c0200150182f9945c7"}' 2a16ae03-ae92-41ba-937b-b31163f8536e
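
To confirm that the hint was stored, the node properties can be listed afterwards; the root_device entry should contain the serial chosen above:

openstack baremetal node show 2a16ae03-ae92-41ba-937b-b31163f8536e -f json -c properties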

Additional configuration is set in the storage-environment.yaml file as follows:

 #### CEPH SETTINGS ####

  ## Whether to deploy Ceph OSDs on the controller nodes. By default
  ## OSDs are deployed on dedicated ceph-storage nodes only.
  # ControllerEnableCephStorage: false

  ## When deploying Ceph Nodes through the oscplugin CLI, the following
  ## parameters are set automatically by the CLI. When deploying via
  ## heat stack-create or ceph on the controller nodes only,
  ## they need to be provided manually.

  ## Number of Ceph storage nodes to deploy
  # CephStorageCount: 0
  ## Ceph FSID, e.g. '4b5c8c0a-ff60-454b-a1b4-9747aa737d19'
  #CephClusterFSID: 'ab47eae6-83a4-4e2b-a630-9f3a4bb7f055'
  ## Ceph monitor key, e.g. 'AQC+Ox1VmEr3BxAALZejqeHj50Nj6wJDvs96OQ=='
  # CephMonKey: ''
  ## Ceph admin key, e.g. 'AQDLOh1VgEp6FRAAFzT7Zw+Y9V6JJExQAsRnRQ=='
  # CephAdminKey: ''
  #

  ExtraConfig:
    ceph::profile::params::osds:
      '/dev/sda':
         journal: '/dev/nvme0n1'
      '/dev/sdb':
         journal: '/dev/nvme0n1'
      '/dev/sdc':
         journal: '/dev/nvme0n1'
      '/dev/sdd':
         journal: '/dev/nvme1n1'
      '/dev/sde':
         journal: '/dev/nvme1n1'
      '/dev/sdf':
         journal: '/dev/nvme1n1'
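
Once the overcloud is deployed, the resulting layout on a Ceph Storage node can be checked against this mapping. A minimal sketch, assuming SSH access to the nodes:

# On a Ceph Storage node: the six data disks and the journal partitions on the two NVMe devices
lsblk /dev/sd{a..f} /dev/nvme0n1 /dev/nvme1n1
# From a Controller node (which runs a Ceph monitor): confirm all OSDs are up and in
sudo ceph osd tree
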
Note

Erase all existing partitions on the disks targeted for journaling and OSDs before deploying the Overcloud. In addition, the Ceph Storage OSDs and journal disks require GPT disk labels, which can be configured as a part of the deployment.

Disk wiping is performed through first-boot.yaml using the wipe_disks resource, as shown below:

resources:
  userdata:
    type: OS::Heat::MultipartMime
    properties:
      parts:
      …
      …
      - config: {get_resource: wipe_disks}

  wipe_disks:
    type: OS::Heat::SoftwareConfig
    properties:
      config: |
        #!/bin/bash
        if [[ `hostname` = *"ceph"* ]]
        then
          echo "Number of disks detected: $(lsblk -no NAME,TYPE,MOUNTPOINT | grep "disk" | awk '{print $1}' | wc -l)"
          for DEVICE in `lsblk -no NAME,TYPE,MOUNTPOINT | grep "disk" | awk '{print $1}'`
          do
            ROOTFOUND=0
            echo "Checking /dev/$DEVICE..."
            echo "Number of partitions on /dev/$DEVICE: $(expr $(lsblk -n /dev/$DEVICE | awk '{print $7}' | wc -l) - 1)"
            for MOUNTS in `lsblk -n /dev/$DEVICE | awk '{print $7}'`
            do
              if [ "$MOUNTS" = "/" ]
              then
                ROOTFOUND=1
              fi
            done
            if [ $ROOTFOUND = 0 ]
            then
              echo "Root not found in /dev/${DEVICE}"
              echo "Wiping disk /dev/${DEVICE}"
              sgdisk -Z /dev/${DEVICE}
              sgdisk -g /dev/${DEVICE}
            else
              echo "Root found in /dev/${DEVICE}"
            fi
          done
        fi
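
If needed, the effect of the wipe can be checked on a Ceph Storage node after first boot; each targeted disk should report a GPT partition table, freshly written by sgdisk -Z and sgdisk -g. A sketch, using /dev/sdb and one NVMe device as examples:

# Confirm the disks carry GPT labels (output should include "Partition Table: gpt")
sudo parted /dev/sdb print | grep "Partition Table"
sudo parted /dev/nvme0n1 print | grep "Partition Table"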

Things to note about the Ceph storage deployment:

  • SSDs should be used for Ceph journals. This is done for speed, to support small writes as well as bursty workloads; the journals provide both speed and consistency.
  • Ceph monitors run as daemons on the controller nodes (see the check after this list).
  • The OSDs are made up of six 1 TB disks per Ceph Storage node.

    • No RAID is used here (“just a bunch of disks”, or JBOD).
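
The daemon placement can be confirmed on the running nodes; a short sketch (node names are deployment-specific):

# On a controller node: the Ceph monitor daemon should be running
pgrep -a ceph-mon
# On a Ceph Storage node: one ceph-osd process per data disk (six in this layout)
pgrep -a ceph-osd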