Chapter 6. The ceph-volume utility

As a storage administrator, you can prepare, create, and activate Ceph OSDs using the ceph-volume utility. The ceph-volume utility is a single-purpose command-line tool to deploy logical volumes as OSDs. It uses a plugin-type framework to deploy OSDs with different device technologies. The ceph-volume utility follows a workflow similar to that of the ceph-disk utility for deploying OSDs, with a predictable and robust way of preparing, activating, and starting OSDs. Currently, the ceph-volume utility only supports the lvm plugin, with the plan to support other technologies in the future.

Important

The ceph-disk command is deprecated.

6.1. Prerequisites

  • A running Red Hat Ceph Storage cluster.

6.2. Ceph volume lvm plugin

By making use of LVM tags, the lvm subcommand is able to store, and later re-discover by querying, the devices associated with OSDs so that they can be activated. This includes support for LVM-based technologies such as dm-cache.

When using ceph-volume, the use of dm-cache is transparent; ceph-volume treats a dm-cache device like any other logical volume. The performance gains and losses when using dm-cache depend on the specific workload. Generally, random and sequential reads see an increase in performance at smaller block sizes, while random and sequential writes see a decrease in performance at larger block sizes.
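
Because the metadata is stored in LVM tags, you can inspect what ceph-volume has recorded for a prepared OSD with standard LVM tooling, or view it in a friendlier format with the lvm list subcommand. A minimal check, assuming at least one OSD logical volume already exists on the node:

[root@osd ~]# lvs -o lv_name,vg_name,lv_tags
[root@osd ~]# ceph-volume lvm list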

To use the LVM plugin, add lvm as a subcommand to the ceph-volume command:

[root@osd ~]# ceph-volume lvm

There are three subcommands to the lvm subcommand, as follows:

  • prepare
  • activate
  • create

Note

Using the create subcommand combines the prepare and activate subcommands into one subcommand.

Additional Resources

  • See the create subcommand section for more details.

6.3. Why does ceph-volume replace ceph-disk?

Previous versions of Red Hat Ceph Storage used the ceph-disk utility to prepare, activate, and create OSDs. Starting with Red Hat Ceph Storage 4, ceph-disk is replaced by the ceph-volume utility, which aims to be a single-purpose command-line tool to deploy logical volumes as OSDs, while maintaining a similar API to ceph-disk when preparing, activating, and creating OSDs.

How does ceph-volume work?

The ceph-volume utility is a modular tool that currently supports two ways of provisioning hardware devices: legacy ceph-disk devices and LVM (Logical Volume Manager) devices. The ceph-volume lvm command uses LVM tags to store information about devices specific to Ceph and their relationship with OSDs. It uses these tags to later re-discover and query devices associated with OSDs so that it can activate them. It also supports LVM-based technologies such as dm-cache.

The ceph-volume utility uses dm-cache transparently and treats it as a logical volume. The performance gains and losses when using dm-cache depend on the specific workload you are handling. Generally, the performance of random and sequential read operations increases at smaller block sizes, while the performance of random and sequential write operations decreases at larger block sizes. Using ceph-volume does not introduce any significant performance penalties.
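
A minimal sketch of preparing a dm-cache backed logical volume as an OSD, assuming a volume group named example_vg with a data logical volume data_lv and a spare fast device at /dev/nvme0n1; the names are placeholders and the exact lvconvert options depend on your LVM version:

[root@osd ~]# lvcreate -L 10G -n cache_lv example_vg /dev/nvme0n1
[root@osd ~]# lvconvert --type cache --cachevol cache_lv example_vg/data_lv
[root@osd ~]# ceph-volume lvm prepare --bluestore --data example_vg/data_lv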

Important

The ceph-disk utility is deprecated.

Note

The ceph-volume simple command can handle legacy ceph-disk devices, if these devices are still in use.

How does ceph-disk work?

The ceph-disk utility was required to support many different types of init systems, such as upstart or sysvinit, while being able to discover devices. For this reason, ceph-disk concentrates only on GUID Partition Table (GPT) partitions, specifically on GPT GUIDs that label devices in a unique way to answer questions like:

  • Is this device a journal?
  • Is this device an encrypted data partition?
  • Was the device left partially prepared?

To answer these questions, ceph-disk uses UDEV rules to match the GUIDs.

What are the disadvantages of using ceph-disk?

Using UDEV rules to call ceph-disk can lead to a back-and-forth between the ceph-disk systemd unit and the ceph-disk executable. The process is very unreliable and time-consuming, and can cause OSDs to not come up at all during the boot process of a node. Moreover, it is hard to debug, or even replicate, these problems given the asynchronous behavior of UDEV.

Because ceph-disk works with GPT partitions exclusively, it cannot support other technologies, such as Logical Volume Manager (LVM) volumes, or similar device mapper devices.

To ensure the GPT partitions work correctly with the device discovery workflow, ceph-disk requires a large number of special flags to be used. In addition, these partitions require devices to be exclusively owned by Ceph.

6.4. Preparing Ceph OSDs using ceph-volume

The prepare subcommand prepares an OSD back-end object store and consumes logical volumes (LV) for both the OSD data and journal. It does not modify the logical volumes, except for adding some extra metadata tags using LVM. These tags make volumes easier to discover, and they also identify the volumes as part of the Ceph Storage Cluster and the roles of those volumes in the storage cluster.

The BlueStore OSD backend supports the following configurations:

  • A block device, a block.wal device, and a block.db device
  • A block device and a block.wal device
  • A block device and a block.db device
  • A single block device

The prepare subcommand accepts a whole device or partition, or a logical volume for block.

Prerequisites

  • Root-level access to the OSD nodes.
  • Optionally, create logical volumes. If you provide a path to a physical device, the subcommand turns the device into a logical volume. This approach is simpler, but you cannot configure or change the way the logical volume is created. One way to create the logical volumes manually is shown in the sketch after this list.
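
For example, a minimal sketch of creating a dedicated volume group and logical volume on a spare device, using the same example names as the procedure below; the device path /dev/sdb is a placeholder for your data device:

[root@osd ~]# pvcreate /dev/sdb
[root@osd ~]# vgcreate example_vg /dev/sdb
[root@osd ~]# lvcreate -l 100%FREE -n data_lv example_vg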

Procedure

  1. Prepare the LVM volumes:

    Syntax

    ceph-volume lvm prepare --bluestore --data VOLUME_GROUP/LOGICAL_VOLUME

    Example

    [root@osd ~]# ceph-volume lvm prepare --bluestore --data example_vg/data_lv

    1. Optionally, if you want to use separate devices for the RocksDB database (block.db) and write-ahead log (block.wal), specify the --block.db and --block.wal options together with the device, partition, or logical volume to use for each:

      Syntax

      ceph-volume lvm prepare --bluestore --block.db BLOCK_DB_DEVICE --block.wal BLOCK_WAL_DEVICE --data VOLUME_GROUP/LOGICAL_VOLUME

      Example

      [root@osd ~]# ceph-volume lvm prepare --bluestore --block.db example_vg/db_lv --block.wal example_vg/wal_lv --data example_vg/data_lv

    2. Optionally, to encrypt data, use the --dmcrypt flag:

      Syntax

      ceph-volume lvm prepare --bluestore --dmcrypt --data VOLUME_GROUP/LOGICAL_VOLUME

      Example

      [root@osd ~]# ceph-volume lvm prepare --bluestore --dmcrypt --data example_vg/data_lv

6.5. Activating Ceph OSDs using ceph-volume

The activation process enables a systemd unit at boot time, which ensures that the OSD with the correct identifier and UUID is enabled and mounted.

Prerequisites

  • A running Red Hat Ceph Storage cluster.
  • Root-level access to the Ceph OSD node.
  • Ceph OSDs prepared by the ceph-volume utility.

Procedure

  1. Get the OSD ID and UUID from an OSD node:

    [root@osd ~]# ceph-volume lvm list
  2. Activate the OSD:

    Syntax

    ceph-volume lvm activate --bluestore OSD_ID OSD_UUID

    Example

    [root@osd ~]# ceph-volume lvm activate --bluestore 0 0263644D-0BF1-4D6D-BC34-28BD98AE3BC8

    To activate all OSDs that are prepared for activation, use the --all option:

    Example

    [root@osd ~]# ceph-volume lvm activate --all
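
    To verify that an activated OSD has started, you can check its systemd unit. A minimal check, using the OSD ID 0 from the example above:

    Example

    [root@osd ~]# systemctl status ceph-osd@0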

6.6. Creating Ceph OSDs using ceph-volume

The create subcommand calls the prepare subcommand, and then calls the activate subcommand.

Prerequisites

  • A running Red Hat Ceph Storage cluster.
  • Root-level access to the Ceph OSD nodes.

Note

If you prefer to have more control over the creation process, you can use the prepare and activate subcommands separately to create the OSD, instead of using create. You can use the two subcommands to gradually introduce new OSDs into a storage cluster, while avoiding having to rebalance large amounts of data. Both approaches work the same way, except that using the create subcommand causes the OSD to become up and in immediately after completion.

Procedure

  1. To create a new OSD:

    Syntax

    ceph-volume lvm create --bluestore --data VOLUME_GROUP/LOGICAL_VOLUME

    Example

    [root@osd ~]# ceph-volume lvm create --bluestore --data example_vg/data_lv

6.7. Using batch mode with ceph-volume

The batch subcommand automates the creation of multiple OSDs when single devices are provided.

The ceph-volume command decides the best method to use to create the OSDs, based on drive type. Ceph OSD optimization depends on the available devices:

  • If all devices are traditional hard drives, batch creates one OSD per device.
  • If all devices are solid state drives, batch creates two OSDs per device.
  • If there is a mix of traditional hard drives and solid state drives, batch uses the traditional hard drives for data, and creates the largest possible journal (block.db) on the solid state drive.

Note

The batch subcommand does not support the creation of a separate logical volume for the write-ahead-log (block.wal) device.

Prerequisites

  • A running Red Hat Ceph Storage cluster.
  • Root-level access to the Ceph OSD nodes.

Procedure

  1. To create OSDs on several drives:

    Syntax

    ceph-volume lvm batch --bluestore PATH_TO_DEVICE [PATH_TO_DEVICE]

    Example

    [root@osd ~]# ceph-volume lvm batch --bluestore /dev/sda /dev/sdb /dev/nvme0n1
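
    If you want to preview how the batch subcommand lays out the OSDs before creating anything, the --report option prints the proposed layout without making any changes. A minimal check, using the same example devices:

    Example

    [root@osd ~]# ceph-volume lvm batch --bluestore --report /dev/sda /dev/sdb /dev/nvme0n1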
