Chapter 3. New features

This section lists all major updates, enhancements, and new features introduced in this release of Red Hat Ceph Storage.

The main features added by this release are:

3.1. The ceph-ansible Utility

Ansible now configures firewalld by default

The ceph-ansible utility now configures the firewalld service by default when creating a new cluster. Previously, it only checked whether the required ports were open or closed, and it did not configure any firewall rules.

Pool size can now be customized when deploying clusters with ceph-ansible

Previously, the ceph-ansible utility set the pool size to 3 by default and did not allow the user to change it. However, in Red Hat OpenStack deployments, setting the size of each pool is sometimes required. With this update, the pool size can be customized. To do so, change the size setting in the all.yml file. Each time the value of size is changed, the new size is applied.

Ansible now validates CHAP settings before running playbooks

Previously, when the Challenge Handshake Authentication Protocol (CHAP) settings were set incorrectly, the ceph-ansible utility returned an unclear error message when deploying the Ceph iSCSI gateway. With this update, ceph-ansible validates the CHAP settings before deploying Ceph iSCSI gateways.

The noup flag is now set before creating OSDs to distribute PGs properly

The ceph-ansible utility now sets the noup flag before creating OSDs to prevent them from changing their status to up before all OSDs are created. Previously, if the flag was not set, placement groups (PGs) were created on only one OSD and got stuck in creation or activation. With this update, the noup flag is set before creating OSDs and unset after the creation is complete. As a result, PGs are distributed properly among all OSDs.
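The ordering described above (set the flag, create every OSD, then unset the flag) can be sketched in Python as a context manager. This is only an illustration and not how ceph-ansible implements it, since ceph-ansible uses Ansible tasks. The noup_flag helper and its run parameter are hypothetical names, and the example records the commands instead of calling a real cluster.

```python
import subprocess
from contextlib import contextmanager


@contextmanager
def noup_flag(run=subprocess.check_call):
    """Set the noup flag on entry and always unset it on exit.

    `run` is a hypothetical hook so the sketch can be exercised without
    a cluster; by default it would execute the ceph CLI.
    """
    run(["ceph", "osd", "set", "noup"])
    try:
        yield
    finally:
        # Unset the flag even if OSD creation fails part-way through.
        run(["ceph", "osd", "unset", "noup"])


# Dry run: record the commands instead of executing them.
commands = []
with noup_flag(run=commands.append):
    commands.append(["<create all OSDs here>"])

for command in commands:
    print(" ".join(command))
```

Using try/finally means the flag is removed even when OSD creation raises an error, so the OSDs that were created are not left unable to come up.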

Variables are now validated at the beginning of an invocation of ceph-ansible playbooks

The ceph-ansible utility now validates variables specified in configuration files located in the group_vars or host_vars directories at the beginning of a playbook invocation. This change makes it easier to discover misconfigured variables.

Ceph Ansible supports a multi-site Ceph Object Gateway configuration

With previous versions of ceph-ansible, only one Object Gateway endpoint was configurable. With this release, ceph-ansible supports a multi-site Ceph Object Gateway configuration with multiple endpoints. Zones can be configured with multiple Object Gateways, and Object Gateways can be added to a zone automatically by appending their endpoint information to a list. Set the rgw_multisite_proto option to http or https, depending on whether the endpoint is configured to use SSL.

When more than one Ceph Object Gateway is in the master zone or in the secondary zone, the rgw_multisite_endpoints option must be set. The rgw_multisite_endpoints option is a comma-separated list, with no spaces. For example:

rgw_multisite_endpoints: http://foo.example.com:8080,http://bar.example.com:8080,http://baz.example.com:8080

When adding a new Object Gateway, append the endpoint URL of the new Object Gateway to the end of the rgw_multisite_endpoints list before running the Ansible playbook.
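As a rough illustration of the list format, the following Python sketch checks a value for the "comma-separated, no spaces" rule and appends a new endpoint at the end. It is not part of ceph-ansible; the parse_endpoints and append_endpoint helpers are hypothetical, and the host names are the placeholder names used in the example above.

```python
from urllib.parse import urlparse


def parse_endpoints(value):
    """Split a comma-separated endpoint string into a list of URLs."""
    if " " in value:
        raise ValueError("endpoint list must not contain spaces")
    endpoints = [item for item in value.split(",") if item]
    for url in endpoints:
        parsed = urlparse(url)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            raise ValueError(f"not an http(s) endpoint URL: {url!r}")
    return endpoints


def append_endpoint(value, new_url):
    """Return the list string with new_url appended at the end."""
    endpoints = parse_endpoints(value)
    endpoints += parse_endpoints(new_url)
    return ",".join(endpoints)


current = "http://foo.example.com:8080,http://bar.example.com:8080"
updated = append_endpoint(current, "http://baz.example.com:8080")
print(updated)
```

The resulting string is what would be assigned to rgw_multisite_endpoints before the playbook is run again.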

Ansible now has the ability to start OSD containers using numactl

With this update, the ceph-ansible utility can start OSD containers by using the numactl utility. The numactl --preferred option sets a preferred NUMA socket for memory allocation but still allows the program to allocate memory outside of that socket, so running out of memory on the socket causes fewer problems.

3.2. Ceph File System

A new subcommand: drop_cache

The ceph tell command now supports the drop_cache subcommand. Use this subcommand to drop Metadata Server (MDS) cache without restarting, trim its journal, and ask clients to drop all capabilities that are not in use.

New option: mds_cap_revoke_eviction_timeout

This update adds a new configurable timeout for evicting clients that have not responded to capability revoke requests from the Metadata Server (MDS). The MDS can request clients to release their capabilities under certain conditions, such as another client requesting a capability that is currently held by a client. The client then releases its capabilities and acknowledges the MDS, which can then hand over the capability to other clients. However, a misbehaving client might not acknowledge, or could totally ignore, the capability revoke request from the MDS, causing other clients to wait and thereby stalling requested I/O operations. Now, the MDS can evict clients that have not responded to capability revoke requests within a configured timeout. This is disabled by default and can be enabled by setting the mds_cap_revoke_eviction_timeout configuration parameter.
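The eviction rule can be illustrated with a small Python sketch. It is a conceptual model only, not the MDS implementation: the clients_to_evict function, the shape of pending_revokes, the client names, and the treatment of a timeout of 0 as "disabled" are all assumptions made for this example.

```python
def clients_to_evict(pending_revokes, now, timeout):
    """Return the clients whose revoke request has gone unanswered too long.

    pending_revokes maps a client name to the time (in seconds) at which
    the oldest unacknowledged revoke request was sent to that client.
    """
    if timeout <= 0:  # assumption: 0 means the feature is disabled
        return []
    return sorted(
        client
        for client, sent_at in pending_revokes.items()
        if now - sent_at >= timeout
    )


# Hypothetical clients: one has ignored a revoke for 300 s, one for 110 s.
pending = {"client.4101": 100.0, "client.4102": 290.0}
print(clients_to_evict(pending, now=400.0, timeout=300))
print(clients_to_evict(pending, now=400.0, timeout=0))
```

With a 300-second timeout only the first client qualifies for eviction; with the timeout disabled, no client is evicted regardless of how long it has been unresponsive.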

SELinux support for CephFS

This update adds the SELinux policy for the Metadata Server (MDS) and ceph-fuse daemons so that users can use Ceph File System (CephFS) with SELinux in enforcing mode.

MDS now records the IP address and source port for evicted clients

The Red Hat Ceph Storage Metadata Server (MDS) now logs the IP address and source port for evicted clients. If you want to correlate client evictions with machines, review the cluster log for this information.

Improved logging for Ceph MDS

The Ceph Metadata Server (MDS) now outputs more metrics concerning client sessions to the debug log by default. This includes the creation of the client session and other metadata. This information is useful for storage administrators to see when a new client session is created and how long it took to establish a connection.

session_timeout and session_autoclose are now configurable by ceph fs set

You can now configure the session_timeout and session_autoclose options by using the ceph fs set command instead of setting them in the Ceph configuration file.

3.3. The ceph-volume Utility

Specifying more than one OSD per device is now possible

With this version, a new batch subcommand has been added. The batch subcommand includes the --osds-per-device option that allows specifying multiple OSDs per device. This is especially useful when using high-speed devices, such as Non-volatile Memory Express (NVMe).

New subcommand: ceph-volume lvm batch

This update adds the ceph-volume lvm batch subcommand that allows creation of volume groups and logical volumes for OSD provisioning from raw disks. The batch subcommand makes creating logical volumes easier for users who are not familiar with the Logical Volume Manager (LVM). With batch, one or many OSDs can be created by passing an array of devices and an OSD count per device to the ceph-volume lvm batch command.
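A minimal Python sketch of how such a command line might be assembled from a list of devices and an OSD count per device. The batch_command helper is hypothetical and the device paths are placeholders; only the ceph-volume lvm batch subcommand and the --osds-per-device option come from the text above. The sketch only prints the command and does not run it.

```python
import shlex


def batch_command(devices, osds_per_device=1):
    """Build the argument list for a ceph-volume lvm batch invocation."""
    if not devices:
        raise ValueError("at least one device is required")
    if osds_per_device < 1:
        raise ValueError("osds_per_device must be at least 1")
    return [
        "ceph-volume", "lvm", "batch",
        "--osds-per-device", str(osds_per_device),
        *devices,
    ]


# Placeholder NVMe devices, two OSDs on each.
command = batch_command(["/dev/nvme0n1", "/dev/nvme1n1"], osds_per_device=2)
print(shlex.join(command))
```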

3.4. Containers

Support the iSCSI gateway in containers

Previously, the iSCSI gateway could not be run in a container. With this update to Red Hat Ceph Storage, a containerized version of the Ceph iSCSI gateway can be deployed with a containerized Ceph cluster.

3.5. Distribution

nfs-ganesha rebased to 2.7

The nfs-ganesha package has been upgraded to upstream version 2.7, which provides a number of bug fixes and enhancements over the previous version.

3.6. iSCSI Gateway

Target-level control parameters can now be overridden

If instructed to do so by Red Hat Support, you can now override the following configuration settings by using the gwcli reconfigure subcommand:

  • cmdsn_depth
  • immediate_data
  • initial_r2t
  • max_outstanding_r2t
  • first_burst_length
  • max_burst_length
  • max_recv_data_segment_length
  • max_xmit_data_segment_length

Tuning these variables might be useful for high IOPS/throughput environments. Only use these variables if instructed to do so by Red Hat Support.

Automatic rotation of iSCSI logs

This update implements automatic log rotation for the rbd-target-gw, rbd-target-api, and tcmu-runner daemons that are used by Ceph iSCSI gateways.

3.7. Object Gateway

Changed the reshard_status output

Previously, the radosgw-admin reshard status --bucket <bucket_name> command displayed a numerical value for the reshard_status output. These numerical values corresponded with an actual status, as follows:

CLS_RGW_RESHARD_NONE        = 0
CLS_RGW_RESHARD_IN_PROGRESS = 1
CLS_RGW_RESHARD_DONE        = 2

In this release, these numerical values were replaced by the actual status.
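Scripts that parse reshard_status may need to handle output from both older and newer releases. The following Python sketch normalizes either form to a status name. It assumes that the status text printed by this release matches the constant names listed above, which the text does not confirm; the ReshardStatus enum and the status_name helper are hypothetical.

```python
from enum import IntEnum


class ReshardStatus(IntEnum):
    """Numerical values previously shown in the reshard_status output."""
    CLS_RGW_RESHARD_NONE = 0
    CLS_RGW_RESHARD_IN_PROGRESS = 1
    CLS_RGW_RESHARD_DONE = 2


def status_name(value):
    """Accept an old numerical value or a status name; return the name."""
    if isinstance(value, int):
        return ReshardStatus(value).name
    # Assumption: newer output uses the constant names shown above.
    return ReshardStatus[value].name


print(status_name(1))
print(status_name("CLS_RGW_RESHARD_DONE"))
```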

3.8. Object Gateway Multisite

New performance counters added

This update adds the following performance counters to multi-site configuration of the Ceph Object Gateway to measure data sync:

  • poll_latency measures the latency of requests for remote replication logs.
  • fetch_bytes measures the number of objects and bytes fetched by data sync.

3.9. Packages

ceph rebased to 12.2.8

The ceph package has been upgraded to upstream version 12.2.8, which provides a number of bug fixes and enhancements over the previous version.

3.10. RADOS

OSD BlueStore is now fully supported

BlueStore is a new back end for the OSD daemons that allows for storing objects directly on the block devices. Because BlueStore does not need any file system interface, it improves performance of Ceph Storage Clusters.

To learn more about the BlueStore OSD back end, see the OSD BlueStore chapter in the Administration Guide for Red Hat Ceph Storage 3.

New option: osd_scrub_max_preemptions

With this release, a new osd_scrub_max_preemptions option has been added. This option sets the maximum number of times Ceph preempts a deep scrub due to a client operation before blocking the client I/O to complete the scrubbing process. The option is set to 5 by default.
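The behavior can be pictured with a toy Python model. This is a conceptual sketch based only on the description above, not the OSD's scrub code; the DeepScrub class and the decision strings are hypothetical.

```python
class DeepScrub:
    """Toy model of a deep scrub that client operations can preempt."""

    def __init__(self, max_preemptions=5):
        self.max_preemptions = max_preemptions
        self.preemptions = 0

    def on_client_operation(self):
        """Decide what happens when a client operation hits the scrub."""
        if self.preemptions < self.max_preemptions:
            self.preemptions += 1
            return "preempt scrub"
        # Preemption budget used up: the scrub now runs to completion.
        return "block client I/O"


scrub = DeepScrub()  # uses the documented default of 5
decisions = [scrub.on_client_operation() for _ in range(7)]
print(decisions)
```

In this model the first five conflicting client operations preempt the scrub, and later ones wait until the scrub finishes.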

Offline splitting FileStore directories to a target hash level is now supported

The ceph-objectstore-tool utility now supports splitting FileStore directories to a target hash level.

New option: osd_memory_target

A new option, osd_memory_target, has been added with this release. This option sets a target memory size for OSDs. The BlueStore back end adjusts its cache size and attempts to stay close to this target. The ceph-ansible utility automatically adjusts osd_memory_target based on host memory. The default value is 4 GiB. The osd_memory_target option is set differently for hyper-converged infrastructure (HCI) and non-HCI setups. To differentiate between them, use the is_hci configuration parameter. This parameter is set to false by default. To change the default values of osd_memory_target and is_hci, set them in the all.yml file.

New options: osd_delete_sleep, osd_remove_threads, and osd_recovery_threads

This update adds a new configuration option, osd_delete_sleep, to throttle object delete operations. In addition, the osd_disk_threads option has been replaced with the osd_remove_threads and osd_recovery_threads options so that users can separately configure the threads for these tasks. These changes help to throttle the rate of object delete operations to reduce the impact on client operations. This is especially important when migrating placement groups (PGs). When using these options, every removal thread sleeps for the specified number of seconds between small batches of removal operations.
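The throttling pattern (remove a small batch, sleep, repeat) can be sketched in Python as follows. This is an illustration of the general idea and not the OSD implementation; the remove_in_batches function, the batch_size parameter, and the choice not to sleep after the final batch are assumptions of the sketch.

```python
import time


def remove_in_batches(objects, batch_size, delete_sleep, remove, sleep=time.sleep):
    """Remove objects in small batches, sleeping between batches."""
    for start in range(0, len(objects), batch_size):
        for obj in objects[start:start + batch_size]:
            remove(obj)
        # Pause only if more batches remain (a choice made for this sketch).
        if start + batch_size < len(objects):
            sleep(delete_sleep)


# Dry run: record removals and sleeps instead of performing them.
removed, slept = [], []
remove_in_batches(list(range(5)), batch_size=2, delete_sleep=0.5,
                  remove=removed.append, sleep=slept.append)
print(removed)
print(slept)
```

A larger sleep value lowers the delete rate and leaves more I/O capacity for client operations, at the cost of slower space reclamation.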

Upgrading to the latest version no longer causes cluster data movement

Previously, when upgrading a Red Hat Ceph Storage cluster to the latest version with CRUSH device classes enabled, the crushtool utility rebalanced data in the cluster because of changes in the CRUSH map. This data movement should not have occurred. With this update, a reclassify functionality is available to help transition from older CRUSH maps that maintain parallel hierarchies for OSDs of different types to a modern CRUSH map that makes use of the device class feature without triggering data movement.

3.11. Block Devices (RBD)

Support for RBD mirroring to multiple secondary clusters

Mirroring RADOS Block Devices (RBD) from one primary cluster to multiple secondary clusters is now fully supported.

rbd ls now uses IEC units

The rbd ls command now uses International Electrotechnical Commission (IEC) units to display image sizes.
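IEC units are binary multiples, so each step is a factor of 1024 (KiB, MiB, GiB, and so on) rather than 1000. The following Python sketch shows the conversion; it is not the formatter used by rbd, and the format_iec helper and its output format are assumptions made for illustration.

```python
def format_iec(size_bytes):
    """Format a byte count using IEC binary units (powers of 1024)."""
    units = ["B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB"]
    value = float(size_bytes)
    for unit in units:
        # Stop at the first unit that keeps the value below 1024.
        if value < 1024 or unit == units[-1]:
            break
        value /= 1024
    return f"{value:g} {unit}"


print(format_iec(1536))
print(format_iec(10 * 1024**3))
```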