Chapter 3. New features

This section lists all major updates, enhancements, and new features introduced in this release of Red Hat Ceph Storage.

The main features added by this release are:

  • Containerized Cluster

    Red Hat Ceph Storage 5 supports only containerized daemons. It does not support non-containerized storage clusters. If you are upgrading a non-containerized storage cluster from Red Hat Ceph Storage 4 to Red Hat Ceph Storage 5, the upgrade process includes the conversion to a containerized deployment.

    For more information, see the Upgrading a Red Hat Ceph Storage cluster from RHCS 4 to RHCS 5 section in the Red Hat Ceph Storage Installation Guide.

  • Cephadm

    Cephadm is a new containerized deployment tool that deploys and manages a Red Hat Ceph Storage 5.0 cluster by connecting to hosts from the manager daemon. The cephadm utility replaces ceph-ansible for Red Hat Ceph Storage deployment. The goal of Cephadm is to provide a fully-featured, robust, and well-maintained installation and management layer for running Red Hat Ceph Storage.

    The cephadm command manages the full lifecycle of a Red Hat Ceph Storage cluster.

    The cephadm command can perform the following operations:

  • Bootstrap a new Ceph storage cluster.
  • Launch a containerized shell that works with the Ceph command-line interface (CLI).
  • Aid in debugging containerized daemons.

    The cephadm command uses SSH to communicate with the nodes in the storage cluster and add, remove, or update Ceph daemon containers. This allows you to add, remove, or update Red Hat Ceph Storage containers without using external tools.

    The cephadm command has two main components:

  • The cephadm shell launches a bash shell within a container. This enables you to run storage cluster installation and setup tasks, as well as to run ceph commands in the container.
  • The cephadm orchestrator commands enable you to provision Ceph daemons and services, and to expand the storage cluster.
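    A minimal workflow, assuming a hypothetical monitor IP address and host names, might look like this:

```shell
# Bootstrap a new storage cluster on the first node.
cephadm bootstrap --mon-ip 192.168.0.10

# Launch a containerized shell with the Ceph CLI available.
cephadm shell

# From within the shell, use orchestrator commands to expand the cluster.
ceph orch host add host02 192.168.0.11
ceph orch apply osd --all-available-devices
```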

    For more information, see the Red Hat Ceph Storage Installation Guide.

  • Management API

    The management API is versioned so that management scripts written for Red Hat Ceph Storage 5.0 continue to operate unchanged for the lifecycle of that version. Incompatible changes to the API happen only across major release lines.

    For more information, see the Red Hat Ceph Storage Developer Guide.

  • Disconnected installation of Red Hat Ceph Storage

    Red Hat Ceph Storage 5.0 supports the disconnected installation and bootstrapping of storage clusters on private networks. A disconnected installation uses custom images, configuration files, and local hosts, instead of downloading files from the network.

    You can install container images that you have downloaded from a proxy host that has access to the Red Hat registry, or by copying a container image to your local registry. The bootstrapping process requires a specification file that identifies the hosts to be added by name and IP address. Once the initial monitor host has been bootstrapped, you can use Ceph Orchestrator commands to expand and configure the storage cluster.
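    A sketch of this process, assuming a hypothetical local registry and host addresses, could be:

```shell
# Hypothetical service specification listing the hosts to add by name and IP address.
cat > cluster-spec.yaml << 'EOF'
service_type: host
hostname: host02
addr: 192.168.0.11
---
service_type: host
hostname: host03
addr: 192.168.0.12
EOF

# Bootstrap using an image from the local registry and apply the specification.
cephadm --image local.registry:5000/rhceph/rhceph-5-rhel8:latest \
    bootstrap --mon-ip 192.168.0.10 --apply-spec cluster-spec.yaml
```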

    See the Red Hat Ceph Storage Installation Guide for more details.

  • Ceph File System geo-replication

    Starting with the Red Hat Ceph Storage 5 release, you can replicate Ceph File Systems (CephFS) across geographical locations or between different sites. The new cephfs-mirror daemon does asynchronous replication of snapshots to a remote CephFS.
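    A sketch of enabling mirroring, assuming example file system, peer, and directory names, might be:

```shell
# Enable the mirroring module and snapshot mirroring for a file system.
ceph mgr module enable mirroring
ceph fs snapshot mirror enable cephfs

# Add a remote peer and a directory to mirror; names are examples.
ceph fs snapshot mirror peer_add cephfs client.mirror_remote@remote-site cephfs
ceph fs snapshot mirror add cephfs /home
```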

    See the Ceph File System mirrors section in the Red Hat Ceph Storage File System Guide for more details.

  • A new Ceph File System client performance tool

    Starting with the Red Hat Ceph Storage 5 release, the Ceph File System (CephFS) provides a top-like utility to display metrics on Ceph File Systems in realtime. The cephfs-top utility is a curses-based Python script that uses the Ceph Manager stats module to fetch and display client performance metrics.
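    In practice this involves enabling the stats module first; the capabilities shown for the default client.fstop user are an illustrative assumption:

```shell
# The stats module must be enabled before running cephfs-top.
ceph mgr module enable stats

# Create the client.fstop user that cephfs-top uses by default, then launch the utility.
ceph auth get-or-create client.fstop mon 'allow r' mds 'allow r' osd 'allow r' mgr 'allow r'
cephfs-top
```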

    See the Using the cephfs-top utility section in the Red Hat Ceph Storage File System Guide for more details.

  • Monitoring the Ceph object gateway multisite using the Red Hat Ceph Storage Dashboard

The Red Hat Ceph Storage Dashboard can now be used to monitor a Ceph Object Gateway multisite configuration.

    After multiple zones are set up using the cephadm utility, the buckets of one zone are visible to the other zones and sites. You can also create, edit, and delete buckets on the dashboard.

    See the Management of buckets of a multisite object configuration on the Ceph dashboard chapter in the Red Hat Ceph Storage Dashboard Guide for more details.

  • Improved BlueStore space utilization

    The Ceph Object Gateway and the Ceph File System (CephFS) store small objects and files as individual objects in RADOS. With this release, the default value of BlueStore’s min_alloc_size for SSDs and HDDs is 4 KB. This enables better use of space with no impact on performance.
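    You can confirm the defaults in effect on your cluster; the values are reported in bytes:

```shell
# Inspect the default minimum allocation sizes for HDD and SSD devices.
ceph config get osd bluestore_min_alloc_size_hdd
ceph config get osd bluestore_min_alloc_size_ssd
```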

    See the OSD BlueStore chapter in the Red Hat Ceph Storage Administration Guide for more details.

3.1. Ceph Dashboard

A new Grafana Dashboard to display graphs for Ceph Object Gateway multi-site setup

With this release, a new Grafana dashboard is now available and displays graphs for Ceph Object Gateway multisite sync performance including two-way replication throughput, polling latency, and unsuccessful replications.

See the Monitoring Ceph object gateway daemons on the dashboard section in the Red Hat Ceph Storage Dashboard Guide for more information.

The Prometheus Alertmanager rule triggers an alert for different MTU settings on the Red Hat Ceph Storage Dashboard

Previously, a mismatch in MTU settings, which is a well-known cause of networking issues, had to be identified and managed using the command-line interface. With this release, when a node, or a minority of nodes, has an MTU setting that differs from the majority of nodes, an alert is triggered on the Red Hat Ceph Storage Dashboard. The user can either mute the alert or fix the mismatched MTU settings.

See the Management of Alerts on the Ceph dashboard section in the Red Hat Ceph Storage Dashboard Guide for more information.

User and role management on the Red Hat Ceph Storage Dashboard

With this release, user and role management is now available. It allows administrators to define fine-grained role-based access control (RBAC) policies for users to create, update, list, and remove OSDs in a Ceph cluster.

See the Management of roles on the Ceph dashboard in the Red Hat Ceph Storage Dashboard Guide for more information.

The Red Hat Ceph Storage Dashboard now supports RBD v1 images

Previously, the Red Hat Ceph Storage Dashboard displayed and supported RBD v2 format images only.

With this release, users can now manage and migrate their v1 RBD images to v2 RBD images by setting the RBD_FORCE_ALLOW_V1 option to 1.

See the Management of block devices using the Ceph dashboard section in the Red Hat Ceph Storage Dashboard Guide for more information.

Users can replace the failed OSD on the Red Hat Ceph Storage Dashboard

With this release, users can identify and replace the failed OSD by preserving the OSD_ID of the OSDs on the Red Hat Ceph Storage Dashboard.

See Replacing the failed OSDs on the Ceph dashboard in the Red Hat Ceph Storage Dashboard Guide for more information.

Specify placement target when creating a Ceph Object Gateway bucket on the Red Hat Ceph Storage Dashboard

With this release, users can now specify a placement target when creating a Ceph Object Gateway bucket on the Red Hat Ceph Storage Dashboard.

See the Creating Ceph object gateway buckets on the dashboard section in the Red Hat Ceph Storage Dashboard Guide for more information.

The Multi-Factor Authentication deletes feature is enabled on the Red Hat Ceph Storage Dashboard

With this release, users can now enable Multi-Factor Authentication (MFA) deletes for a specific bucket from the Ceph cluster on the Red Hat Ceph Storage Dashboard.

See the Editing Ceph object gateway buckets on the dashboard section in the Red Hat Ceph Storage Dashboard Guide for more information.

The bucket versioning feature for a specific bucket is enabled on the Red Hat Ceph Storage Dashboard

With this release, users can now enable bucket versioning for a specific bucket on the Red Hat Ceph Storage Dashboard.

See the Editing Ceph object gateway buckets on the dashboard section in the Red Hat Ceph Storage Dashboard Guide for more information.

The object locking feature for Ceph Object Gateway buckets is enabled on the Red Hat Ceph Storage Dashboard

With this release, users can now enable object locking for Ceph Object Gateway buckets on the Red Hat Ceph Storage Dashboard.

See the Creating Ceph object gateway buckets on the dashboard section in the Red Hat Ceph Storage Dashboard Guide for more information.

The Red Hat Ceph Storage Dashboard has the vertical navigation bar

With this release, the vertical navigation bar is now available. The heartbeat icon on the Red Hat Ceph Storage Dashboard menu changes color based on the cluster status: green, yellow, or red. Other menus, for example Cluster>Monitoring and Block>Mirroring, display a colored numbered icon that shows the number of warnings in that specific component.

The "box" page of the Red Hat Ceph Storage dashboard displays detailed information

With this release, the "box" page of the Red Hat Ceph Storage Dashboard displays information about the Ceph version, the hostname where the ceph-mgr is running, the user name, roles, and the browser details.

Browser favicon displays the Red Hat logo with an icon for a change in the cluster health status

With this release, the browser favicon now displays the Red Hat logo with an icon that changes color based on the cluster health status: green, yellow, or red.

The error page of the Red Hat Ceph Storage Dashboard works as expected

With this release, the error page of the Red Hat Ceph Storage Dashboard is fixed and works as expected.

Users can view Cephadm workflows on the Red Hat Ceph Storage Dashboard

With this release, the Red Hat Ceph Storage Dashboard displays more information about the inventory, such as nodes defined in the Ceph Orchestrator, and about services, such as information on containers. The Red Hat Ceph Storage Dashboard also allows users to manage the hosts in the Ceph cluster.

See the Monitoring hosts of the Ceph cluster on the dashboard section in the Red Hat Ceph Storage Dashboard Guide for more information.

Users can modify the object count and size quota on the Red Hat Ceph Storage Dashboard

With this release, the users can now set and modify the object count and size quota for a given pool on the Red Hat Ceph Storage Dashboard.

See the Creating pools on the Ceph dashboard section in the Red Hat Ceph Storage Dashboard Guide for more information.

Users can manage Ceph File system snapshots on the Red Hat Ceph Storage Dashboard

With this release, the users can now create and delete Ceph File System (CephFS) snapshots, and set and modify per-directory quotas on the Red Hat Ceph Storage Dashboard.

Enhanced account and password policies for the Red Hat Ceph Storage Dashboard

With this release, to comply with the best security standards, strict password and account policies are implemented. The user passwords need to comply with some configurable rules. User accounts can also be set to expire after a given amount of time, or be locked out after a number of unsuccessful log-in attempts.

Users can manage users and buckets on any realm, zonegroup or zone

With this release, users can now manage users and buckets not only on the default zone but any realm, zone group, or zone that they configure.

To manage multiple daemons on the Red Hat Ceph Storage Dashboard, see the Management of buckets of a multi-site object gateway configuration on the Ceph dashboard in the Red Hat Ceph Storage Dashboard Guide.

Users can create a tenanted S3 user intuitively on the Red Hat Ceph Storage Dashboard

Previously, a tenanted S3 user could be created only by using the "tenant$user" syntax, instead of intuitive separate input fields for the tenant and the user.

With this release, users can now create a tenanted S3 user intuitively without using "tenant$user" on the Red Hat Ceph Storage Dashboard.

The Red Hat Ceph Storage Dashboard now supports host management

Previously, the command-line interface was used to manage hosts in a Red Hat Ceph Storage cluster.

With this release, users can enable or disable the hosts by using the maintenance mode feature on the Red Hat Ceph Storage Dashboard.

Nested tables can be expanded or collapsed on the Red Hat Ceph Storage Dashboard

With this release, rows that contain nested tables can be expanded or collapsed by clicking on the row on the Red Hat Ceph Storage Dashboard.

3.2. Ceph File System

CephFS clients can now reconnect after being blocklisted by Metadata Servers (MDS)

Previously, Ceph File System (CephFS) clients were blocklisted by MDS because of network partitions or other transient errors.

With this release, CephFS clients can reconnect to the mount automatically when the appropriate configuration options are turned on for each client; a manual remount is not needed.

Users can now use the ephemeral pinning policies for automated distribution of subtrees among MDS

With this release, the export pins are improved by introducing efficient strategies to pin subtrees, thereby enabling automated distribution of subtrees among Metadata Servers (MDS) and eliminating user intervention for manual pinning.

See the Ephemeral pinning policies section in the Red Hat Ceph Storage File System Guide for more information.

mount.ceph has an additional option of recover_session=clean

With this release, an additional option of recover_session=clean is added to mount.ceph. With this option, the client reconnects to the Red Hat Ceph Storage cluster automatically when it detects that it is blocklisted by Metadata servers (MDS) and the mounts are recovered automatically.
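    For example, a mount using this option, with a placeholder monitor address and credentials, might look like:

```shell
# Mount with automatic clean recovery after a blocklist event; the monitor
# address and client name are examples.
mount -t ceph 192.168.0.10:6789:/ /mnt/cephfs \
    -o name=admin,recover_session=clean
```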

See the Removing a Ceph File System client from the blocklist section in the Red Hat Ceph Storage File System Guide for more information.

Asynchronous creation and removal of metadata operations in the Ceph File System

With this release, Red Hat Enterprise Linux 8.4 kernel mounts now asynchronously execute file creation and removal on Red Hat Ceph Storage clusters. This improves performance of some workloads by avoiding round-trip latency for these system calls without impacting consistency. Use the new -o nowsync mount option to enable asynchronous file creation and deletion.
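    A sketch of such a mount, with a placeholder monitor address, could be:

```shell
# Enable asynchronous file creation and removal (Red Hat Enterprise Linux 8.4
# kernel or later); the monitor address and client name are examples.
mount -t ceph 192.168.0.10:6789:/ /mnt/cephfs -o name=admin,nowsync
```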

Ceph File System (CephFS) now provides a configuration option for MDS called mds_join_fs

With this release, when failing over metadata server (MDS) daemons, the cluster’s monitors prefer standby daemons with mds_join_fs equal to the file system name with the failed rank.

If no standby exists with mds_join_fs equal to the file system name, it chooses an unqualified standby for the replacement, or any other available standby, as a last resort.
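    For example, assuming a standby daemon named mds.a and a file system named cephfs:

```shell
# Pin the standby daemon mds.a to the cephfs file system, so the monitors
# prefer it when a rank of that file system fails.
ceph config set mds.a mds_join_fs cephfs
```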

See the File system affinity section in the Red Hat Ceph Storage File System Guide for more information.

Asynchronous replication of snapshots between Ceph Filesystems

With this release, the mirroring module, which is a Ceph Manager plugin, provides interfaces for managing directory snapshot mirroring. The mirroring module is responsible for assigning directories to the mirror daemons for synchronization. Currently, a single mirror daemon is supported and can be deployed using cephadm.

Ceph File System (CephFS) supports asynchronous replication of snapshots to a remote CephFS through the cephfs-mirror tool. A mirror daemon can handle snapshot synchronization for multiple file systems in a Red Hat Ceph Storage cluster. Snapshots are synchronized by mirroring snapshot data followed by creating a snapshot with the same name for a given directory on the remote file system, as the snapshot being synchronized.

See the Ceph File System mirrors section in the Red Hat Ceph Storage File System Guide for more information.

The cephfs-top tool is supported

With this release, the cephfs-top tool is introduced.

Ceph provides a top(1)-like utility to display the various Ceph File System (CephFS) metrics in realtime. The cephfs-top utility is a curses-based Python script that uses the stats plugin in the Ceph Manager to fetch and display the metrics.

CephFS clients periodically forward various metrics to the Ceph Metadata Servers (MDSs), which then forward these metrics to MDS rank zero for aggregation. These aggregated metrics are then forwarded to the Ceph Manager for consumption.

Metrics are divided into two categories: global and per-MDS. Global metrics represent a set of metrics for the file system as a whole, for example, client read latency, whereas per-MDS metrics are for a specific MDS rank, for example, the number of subtrees handled by an MDS.

Currently, global metrics are tracked and displayed. The cephfs-top command does not work reliably with multiple Ceph File Systems.

See the Using the cephfs-top utility section in the Red Hat Ceph Storage File System Guide for more information.

MDS daemons can be deployed with mds_autoscaler plugin

With this release, a new ceph-mgr plugin, mds_autoscaler is available which deploys metadata server (MDS) daemons in response to the Ceph File System (CephFS) requirements. Once enabled, mds_autoscaler automatically deploys the required standbys and actives according to the setting of max_mds.

For more information, see the Using the MDS autoscaler module section in Red Hat Ceph Storage File System Guide.

Ceph File System (CephFS) scrub now works with multiple active MDS

Previously, users had to set the parameter max_mds=1 and wait for only one active metadata server (MDS) to run Ceph File System (CephFS) scrub operations.

With this release, irrespective of the value of max_mds, users can execute scrub on rank 0 with multiple active MDS.

See the Configuring multiple active Metadata Server daemons section in the Red Hat Ceph Storage File System Guide for more information.

Ceph File System snapshots can now be scheduled with snap_schedule plugin

With this release, a new ceph-mgr plugin, snap_schedule is now available for scheduling snapshots of the Ceph File System (CephFS). The snapshots can be created, retained, and automatically garbage collected.
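    A sketch of scheduling snapshots, with example path, interval, and retention values:

```shell
# Enable the module, then schedule hourly snapshots of the root directory
# and retain 24 hourly snapshots.
ceph mgr module enable snap_schedule
ceph fs snap-schedule add / 1h
ceph fs snap-schedule retention add / h 24
ceph fs snap-schedule list /
```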

3.3. Containers

The cephfs-mirror package is included in the ceph-container ubi8 image

With this release, the cephfs-mirror package is now included in the ceph-container ubi8 image to support mirroring of Ceph File System (CephFS) snapshots to a remote CephFS. The commands to configure CephFS mirroring are now available.

See the Ceph File System mirrors section in the Red Hat Ceph Storage File System Guide for more information.

3.4. Ceph Object Gateway

Bucket name or ID is supported in the radosgw-admin bucket stats command

With this release, the bucket name or ID can be used as an argument in the radosgw-admin bucket stats command. Bucket stats reports the non-current bucket instances, which can be used when debugging a class of large OMAP object warnings in the Ceph OSD log.
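For example, assuming a hypothetical bucket name and bucket ID:

```shell
# Query bucket statistics by name or by ID; both values are examples.
radosgw-admin bucket stats --bucket=mybucket
radosgw-admin bucket stats --bucket-id=5a3b15e8-d231-4c12-b8a0-ee0a3a68cbe7.4513.1
```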

Six new performance counters added to the Ceph Object Gateway’s perfcounters

With this release, six performance counters are now available in the Ceph Object Gateway. These counters report on the object expiration and lifecycle transition activity through the foreground and background processing of the Ceph Object Gateway lifecycle system. The lc_abort_mpu, lc_expire_current, lc_expire_noncurrent, and lc_expire_dm counters permit the estimation of object expiration. The lc_transition_current and lc_transition_noncurrent counters provide information for lifecycle transitions.

Users can now use object lock to implement WORM-like functionality in S3 object storage

The S3 Object lock is the key mechanism supporting write-once-read-many (WORM) functionality in S3 Object storage. With this release, Red Hat Ceph Storage 5 supports Amazon Web Services (AWS) S3 Object lock data management API and the users can use Object lock concepts like retention period, legal hold, and bucket configuration to implement WORM-like functionality as part of the custom workflow overriding data deletion permissions.

3.5. RADOS

The Red Hat Ceph Storage recovers with fewer OSDs available in an erasure coded (EC) pool

Previously, erasure coded (EC) pools of size k+m required at least k+1 copies for recovery to function. If only k copies were available, recovery would be incomplete.

With this release, Red Hat Ceph Storage cluster now recovers with k or more copies available in an EC pool.

For more information on erasure coded pools, see the Erasure coded pools chapter in the Red Hat Ceph Storage Storage Strategies Guide.

Sharding of RocksDB database using column families is supported

The goal of sharding is to achieve less read and write amplification, decrease database (DB) expansion during compaction, and improve IOPS performance.

With this release, you can reshard the database with the BlueStore admin tool. The data in the RocksDB (DB) database is split into multiple Column Families (CF). Each CF has its own options, and the split is performed according to the type of data, such as omap, object data, delayed cached writes, and PGlog.
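A sketch of resharding a stopped OSD follows; the OSD path is an example, and the sharding definition shown is the documented default layout:

```shell
# With the OSD stopped, inspect the current sharding layout, then reshard.
ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-0 show-sharding
ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-0 \
    --sharding="m(3) p(3,0-12) O(3,0-13)=block_cache={type=binned_lru} L P" \
    reshard
```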

For more information on resharding, see the Resharding the RocksDB database using the BlueStore admin tool section in the Red Hat Ceph Storage Administration Guide.

The mon_allow_pool_size_one configuration option can be enabled for Ceph monitors

With this release, users can now enable the configuration option mon_allow_pool_size_one. Once enabled, users must also pass the --yes-i-really-mean-it flag to the ceph osd pool set command if they want to configure the pool size to 1.
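For example, with a hypothetical pool name:

```shell
# Allow size-1 pools cluster-wide, then explicitly acknowledge the risk
# when setting a pool's size to 1.
ceph config set global mon_allow_pool_size_one true
ceph osd pool set mypool size 1 --yes-i-really-mean-it
```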

The osd_client_message_cap option has been added back

Previously, the osd_client_message_cap option was removed. With this release, the osd_client_message_cap option has been re-introduced. This option helps control the maximum number of in-flight client requests by throttling those requests. Doing this can be helpful when a Ceph OSD flaps due to an overwhelming amount of client-based traffic.

Ceph messenger protocol is now updated to msgr v2.1

With this release, a new version of the Ceph messenger protocol, msgr v2.1, is implemented, which addresses several security, integrity, and potential performance issues with the previous version, msgr v2.0. All Ceph entities, both daemons and clients, now default to msgr v2.1.

3.6. RADOS Block Devices (RBD)

Improved librbd small I/O performance

Previously, in an NVMe based Ceph cluster, there were limitations in the internal threading architecture resulting in a single librbd client struggling to achieve more than 20K 4KiB IOPS.

With this release, librbd is switched to an asynchronous reactor model on top of the new ASIO-based neorados API, potentially increasing the small I/O throughput severalfold and reducing latency.

Built in schedule for purging expired RBD images

Previously, the storage administrator could set up a cron-like job for the rbd trash purge command.

With this release, the built-in schedule is now available for purging expired RBD images. The rbd trash purge schedule add and the related commands can be used to configure the RBD trash to automatically purge expired images based on a defined schedule.
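For example, with a hypothetical pool name and a daily interval:

```shell
# Purge expired images in the pool every day, and list the schedule.
rbd trash purge schedule add --pool mypool 1d
rbd trash purge schedule ls --pool mypool
```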

See the Defining an automatic trash purge schedule section in the Red Hat Ceph Storage Block Device Guide for more information.

Servicing reads of immutable objects with the new ceph-immutable-object-cache daemon

With this release, the new ceph-immutable-object-cache daemon can be deployed on a hypervisor node to service the reads of immutable objects, for example a parent image snapshot. The new parent_cache librbd plugin coordinates with the daemon on every read from the parent image, adding the result to the cache wherever necessary. This reduces latency in scenarios where multiple virtual machines are concurrently sharing a golden image.

For more information, see the Management of `ceph-immutable-object-cache` daemons chapter in the Red Hat Ceph Storage Block Device Guide.

Support for sending compressible or incompressible hints in librbd-based clients

Previously, there was no way to hint to the underlying OSD object store backend whether data is compressible or incompressible.

With this release, the rbd_compression_hint configuration option can be used to hint whether data is compressible or incompressible, to the underlying OSD object store backend. This can be done per-image, per-pool or globally.
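For example, with hypothetical pool and image names; valid values are none, compressible, and incompressible:

```shell
# Hint that data written to one image is compressible, and that an
# entire pool is incompressible.
rbd config image set mypool/myimage rbd_compression_hint compressible
rbd config pool set mypool rbd_compression_hint incompressible
```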

See the Block device input and output options section in the Red Hat Ceph Storage Block Device Guide for more information.

Overriding read-from-replica policy in librbd clients is supported

Previously, there was no way to limit the inter-DC/AZ network traffic, as when a cluster is stretched across data centers, the primary OSD may be on a higher latency and cost link in comparison with other OSDs in the PG.

With this release, the rbd_read_from_replica_policy configuration option is now available and can be used to send reads to a random OSD or to the closest OSD in the PG, as defined by the CRUSH map and the client location in the CRUSH hierarchy. This can be done per-image, per-pool or globally.
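For example, with a hypothetical pool name; valid values are default, balance, and localize:

```shell
# Direct reads to the closest OSD for all clients, or balance reads
# across random OSDs for a single pool.
ceph config set client rbd_read_from_replica_policy localize
rbd config pool set mypool rbd_read_from_replica_policy balance
```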

See the Block device input and output options section in the Red Hat Ceph Storage Block Device Guide for more information.

Online re-sparsification of RBD images

Previously, reclaiming space for image extents that are zeroed and yet fully allocated in the underlying OSD object store was highly cumbersome and error-prone. With this release, the new rbd sparsify command can now be used to scan the image for chunks of zero data and deallocate the corresponding ranges in the underlying OSD object store.

ocf:ceph:rbd cluster resource agent supports namespaces

Previously, it was not possible to use ocf:ceph:rbd cluster resource agent for images that exist within a namespace.

With this release, the new pool_namespace resource agent parameter can be used to handle images within the namespace.

RBD images can be imported instantaneously

Previously, with the rbd import command, the new image became available for use only after it was fully populated.

With this release, the image live-migration feature is extended to support external data sources and can be used as an alternative to rbd import. The new image can be linked to local files, remote files served over HTTP(S), or remote Amazon S3-compatible buckets in raw, qcow, or qcow2 formats, and becomes available for use immediately. The image is populated as a background operation, which can run while the image is in active use.
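A sketch of an import-only migration from a local raw file; the source specification format and all names are examples:

```shell
# Hypothetical source specification describing a local raw image file.
cat > source-spec.json << 'EOF'
{
  "type": "raw",
  "stream": {
    "type": "file",
    "file_path": "/mnt/image.raw"
  }
}
EOF

# Prepare an import-only live migration, then run the background copy
# and commit when it completes; the image is usable immediately.
rbd migration prepare --import-only --source-spec-path source-spec.json mypool/newimage
rbd migration execute mypool/newimage
rbd migration commit mypool/newimage
```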

LUKS encryption inside librbd is supported

Layering QEMU LUKS encryption or the dm-crypt kernel module on top of librbd suffers from a major limitation: a copy-on-write clone image must use the same encryption key as its parent image. With this release, support for LUKS encryption has been incorporated within librbd. The new "rbd encryption format" command can now be used to format an image to the luks1 or luks2 encrypted format.
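For example, with hypothetical image and passphrase-file names:

```shell
# Format an image with LUKS2 encryption using a passphrase stored in a file.
rbd encryption format mypool/myimage luks2 passphrase.txt
```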

3.7. RBD Mirroring

Snapshot-based mirroring of RBD images

The journal-based mirroring provides fine-grained crash-consistent replication at the cost of a double-write penalty, where every update to the image is first recorded to the associated journal before modifying the actual image.

With this release, in addition to journal-based mirroring, snapshot-based mirroring is supported. It provides coarse-grained crash-consistent replication where the image is mirrored using the mirror snapshots which can be created manually or periodically with a defined schedule. This is supported by all clients and requires a less stringent recovery point objective (RPO).
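A sketch of enabling snapshot-based mirroring, with example pool and image names and an hourly schedule:

```shell
# Enable image-mode mirroring on a pool, switch one image to snapshot-based
# mirroring, and take mirror snapshots every hour.
rbd mirror pool enable mypool image
rbd mirror image enable mypool/myimage snapshot
rbd mirror snapshot schedule add --pool mypool --image myimage 1h
```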

3.8. iSCSI Gateway

Improved tcmu-runner section in the ceph status output

Previously, each iSCSI LUN was listed individually resulting in cluttering the ceph status output.

With this release, the ceph status command summarizes the report and shows only the number of active portals and the number of hosts.

3.9. The Ceph Ansible utility

The cephadm-adopt.yml playbook is idempotent

With this release, the cephadm-adopt.yml playbook is idempotent; that is, the playbook can be run multiple times. If the playbook fails for any reason in the first attempt, you can rerun the playbook and it works as expected.

For more information, see the Upgrading from Red Hat Ceph Storage 4 to Red Hat Ceph Storage 5 using `ceph-ansible` section in the Red Hat Ceph Storage Installation Guide.