Chapter 4. Bug fixes

This section describes bugs with significant impact on users that were fixed in this release of Red Hat Ceph Storage. In addition, the section includes descriptions of fixed known issues found in previous versions.

4.1. The ceph-ansible Utility

It is now possible to use Ansible playbooks without copying them to the root ceph-ansible directory

Due to a missing library variable in the Ansible configuration, custom Ansible modules were not detected when the executed playbooks were located in the infrastructure-playbooks directory. Consequently, it was not possible to run the infrastructure playbooks without copying them into the root ceph-ansible directory. This update adds the library variable to the Ansible configuration. As a result, it is possible to use playbooks in the infrastructure-playbooks directory without copying them, for example:

# ansible-playbook infrastructure-playbooks/purge-cluster.yml -i inventory_file

(BZ#1668478)

The purge-cluster.yml playbook no longer fails when initiated a second time

The purge-cluster.yml playbook would fail if the ceph-volume binary was not present. Now the presence of the ceph-volume binary is checked, allowing for the purge-cluster.yml playbook to be initiated multiple times successfully.

(BZ#1722663)

An increase to the CPU allocation for containerized Ceph MDS deployments

Previously, for container-based deployments, the CPU allocation for the Ceph MDS daemons was set to 1 as the default. In some scenarios, this caused slow performance when compared to a bare-metal deployment. With this release, the Ceph MDS daemon CPU allocation default is 4.
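
The default can also be overridden explicitly. The following is a minimal sketch, assuming the ceph_mds_docker_cpu_limit variable in the group_vars/mdss.yml file is the relevant setting:

ceph_mds_docker_cpu_limit: 4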

(BZ#1695850)

Redeploying OSDs using the same device name works as expected

Previously, the shrink-osd.yml playbook did not remove containers generated as part of the prepare containers task that were launched during the initial deployment. As a consequence, an attempt to redeploy an OSD using the same device name failed, because the container was already present. The shrink-osd.yml playbook now properly removes containers generated as part of the prepare containers task, and redeploying OSDs using the same device name works as expected.

(BZ#1728132)

The BlueStore WAL and DB partitions are now only created when dedicated devices are specified for them

Previously, in containerized deployments using the non-collocated scenario, the BlueStore WAL partition was created by default on the same device as the BlueStore DB partition when it was not required. With this update, the bluestore_wal_devices variable is no longer set to dedicated_devices by default, and the BlueStore WAL partition is no longer created on the BlueStore DB device.
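
For illustration, a minimal non-collocated sketch with example device names; with this update, a separate WAL partition is created only when bluestore_wal_devices is set explicitly:

osd_scenario: non-collocated
osd_objectstore: bluestore
devices:
  - /dev/sda
dedicated_devices:
  - /dev/nvme0n1
# Only set this if a dedicated WAL device is really wanted:
# bluestore_wal_devices:
#   - /dev/nvme1n1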

(BZ#1685253)

Ceph Ansible can configure RBD mirroring as expected

Previously, the configuration of RADOS Block Device (RBD) mirroring was incomplete and only available for non-containerized deployments. Consequently, the ceph-ansible utility was unable to configure RBD mirroring properly. The RBD mirroring configuration has been improved and now supports containerized deployments. As a result, ceph-ansible can configure the mirror pool mode and add the remote peer as expected on both types of deployment.

(BZ#1665877)

The ceph-handler script no longer restarts all OSDs when the limit parameter is provided

Previously, the ceph-handler script executed on all OSD nodes even if the ceph-ansible limit parameter was provided. This meant all OSDs were restarted, ignoring the limit parameter. With this update, the ceph-handler script only targets the OSD nodes included by the limit parameter, and the OSDs are restarted properly according to the ceph-ansible limit parameter.
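
For example, to restrict a run to the OSDs on a single host, assuming a hypothetical inventory host name osd-node1:

# ansible-playbook site.yml -i inventory_file --limit osd-node1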

(BZ#1535960)

Ansible first completes configure_iscsi.yml tasks and then starts the daemons

Previously, during a rolling update, the ceph-ansible utility started the Ceph iSCSI daemons and ran the configure_iscsi.yml playbook in parallel. Consequently, the daemon operations could conflict with the configure_iscsi.yml tasks that set up objects, and the system could terminate unexpectedly due to the kernel being in an unsupported state. With this update, ceph-ansible first completes the configure_iscsi.yml tasks of creating iSCSI targets and then starts the daemons to avoid potential conflicts.

(BZ#1795806)

Using custom repositories to install Red Hat Ceph Storage

Previously, using custom software repositories to install Ceph was disabled. Having a custom software repository can be useful for environments where Internet access is not allowed. With this release, the ability to use custom software repositories is enabled for Red Hat signed packages only. Custom third-party software repositories are not supported.
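
A minimal sketch of such a configuration in the all.yml file, assuming the ceph_origin, ceph_repository, and ceph_custom_repo variables; the repository URL is hypothetical:

ceph_origin: repository
ceph_repository: custom
ceph_custom_repo: https://server.example.com/rhcs-custom-repo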

(BZ#1673254)

Ceph Ansible can now successfully activate OSDs that use NVMe devices

Due to an incorrect parsing of Non-volatile Memory Express (NVMe) drives, the ceph-ansible utility could not activate an OSD that used NVMe devices. This update fixes the parsing of the NVMe drives, and ceph-ansible can now successfully activate OSDs that use NVMe devices.

(BZ#1523464)

Rolling update works as expected

Previously, when using rolling_update.yml to update the Red Hat Ceph Storage cluster, it could fail due to a Python module import failure. The error printed was ERROR! Unexpected Exception, this is probably a bug: cannot import name to_bytes. With this update, the correct import is used and no error occurs.

(BZ#1598763)

ceph-ansible now reports an error if an unsupported Ansible version is used

The ceph-ansible utility supports only Ansible versions 2.3.x to 2.4.x. Previously, when the Ansible version was higher than 2.4.x, the installation process failed with an error. With this update, ceph-ansible checks the Ansible version up front and reports an error if an unsupported Ansible version is used.

(BZ#1631563)

ceph_release is no longer automatically reset to ceph_stable_release when ceph_repository is set to rhcs

Previously, ceph_release was automatically reset to ceph_stable_release even when ceph_repository was set to rhcs in the all.yml file. ceph_stable_release is not needed when using the rhcs repository and was set to the automatic default value dummy. This caused the allow multi mds task to fail with a has no attribute error, because ceph_release_num has no key dummy. With this update, ceph_release is no longer reset when ceph_repository is set to rhcs, and the allow multi mds task executes properly.

(BZ#1765230)

The shrink-osd.yml playbook removes partitions from NVMe disks in all situations

Previously, the Ansible playbook infrastructure-playbooks/shrink-osd.yml did not properly remove partitions on NVMe devices when used with the osd_scenario: non-collocated option in containerized environments. This bug has been fixed with this update, and the playbook removes the partitions as expected.

(BZ#1572933)

The ceph -w process is no longer running after canceling the command

The ceph aliases were not using the interactive session options for the docker commands in Red Hat Ceph Storage container environments. This left a running ceph process, which was waiting for the user’s input. With this release, the interactive session options, -it, have been added to the docker commands referenced by the ceph aliases.
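
As an illustration only, an alias with the interactive options might look like the following; the container name is hypothetical:

alias ceph='docker exec -it ceph-mon-${HOSTNAME} ceph'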

(BZ#1797874)

The shrink-osd.yml playbook stops OSD services as expected

A bug in the shrink-osd.yml playbook caused the stopping osd service task to attempt to connect to an incorrect node. Consequently, the task could not stop the OSD services properly. With this update, the bug has been fixed, and the playbook delegates the task to the correct node. As a result, OSD services are stopped properly.

(BZ#1686306)

Adding a new Ceph Manager node will no longer fail when using the Ansible limit option

Previously, adding a new Ceph Manager to an existing storage cluster when using the limit option would fail the Ansible playbook. With this release, you can use the limit option when adding a new Ceph Manager, and the newly generated keyring is copied successfully.

(BZ#1552210)

The radosgw_address variable can be set to 0.0.0.0

Previously, the default value for radosgw_address was 0.0.0.0. If you did not change the default value from 0.0.0.0, then ceph-ansible would fail validation. However, this is a valid value for RADOS Gateway. With this update to Red Hat Ceph Storage, the default value was changed to x.x.x.x, so you can change the value to 0.0.0.0 and it will pass validation.
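
For example, to bind the gateway to all interfaces, the value can be set as follows, assuming the all.yml group_vars file:

radosgw_address: 0.0.0.0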

(BZ#1782743)

The ceph-ansible playbooks are no longer missing certain tags

Previously, the ceph-ansible playbooks were missing some tags, so running ceph-ansible with those specific tags was failing. With this update, the Ceph roles are tagged correctly in the ceph-ansible playbooks, and running ceph-ansible with those specific tags works as expected.

(BZ#1754432)

The group_vars files now correctly refer to RHCS 3.x instead of 2.x

Previously, the Red Hat Ceph Storage (RHCS) documentation URL and default value referred to RHCS 2.x instead of 3.x. This meant deploying with the default value on bare metal using the CDN repositories would configure RHCS 2.x repositories instead of 3.x. The documentation in the configuration files was also referring to 2.x. With this update, the default RHCS version value and URL refer to RHCS 3, and there are no 2.x references.

(BZ#1702732)

The shrink-osd.yml playbook can now shrink OSDs when FQDNs are used in the inventory

Previously, when fully qualified domain names (FQDNs) were used in the inventory, some tasks used the short name returned by the Ceph OSD tree to add data to hostvars[]. This rendered the playbook unable to shrink the OSDs. With this update, the tasks use the inventory_hostname instead of the short name, and the playbook can shrink the OSDs as expected.

(BZ#1779021)

Ansible disables the chronyd service after Ceph installation

The chronyd service is another implementation of the Network Time Protocol (NTP) and was enabled after rebooting from the initial installation. With this release, the chronyd service is disabled and the default NTP service is enabled.

(BZ#1651875)

Virtual IPv6 addresses are no longer configured for MON and RGW daemons

Previously, virtual IPv6 addresses could be configured in the Ceph configuration file for MON and RGW daemons, because a virtual IPv6 address is the first value present in the Ansible IPv6 address fact. The underlying code has been changed to use the last value in the Ansible IPv6 address fact, and the MON and RGW IPv6 configurations are now set to the right value.

(BZ#1680155)

Faster OSD creation when deploying on containers

Previously, when creating an OSD in a container using the lvm OSD scenario, the container was allowed to set the number of open files to a value higher than the default host value. This behavior caused slower ceph-volume performance when compared to running ceph-volume on bare metal. With this release, the maximum number of open files is set to a lower value (1024) on the container during OSD creation. This results in faster OSD creation in container-based deployment.

(BZ#1702285)

The rolling_update.yml playbook now restarts tcmu-runner and rbd-target-api

Previously, the iSCSI gateway infrastructure playbooks, specifically rolling_update.yml, only restarted the rbd-target-gw daemon. With this update, the playbook also restarts the tcmu-runner and rbd-target-api daemons so the updated versions of those daemons are used.

(BZ#1659611)

Ceph Ansible can now successfully update and restart the NFS Ganesha container when a custom suffix is used for the container name

Previously, the value set for the ceph_nfs_service_suffix variable was not considered when checking the status and version of the Ceph NFS Ganesha (ceph-nfs) container for restart or update. Consequently, the ceph-nfs container was not updated or restarted because the ceph-ansible utility could not determine that the container was running. With this update, ceph-ansible uses the value of ceph_nfs_service_suffix to determine the status of the ceph-nfs container. As a result, the ceph-nfs container is successfully updated or restarted as expected.

(BZ#1750005)

The purge-cluster.yml playbook no longer causes issues with redeploying a cluster

Previously, the purge-cluster.yml Ansible playbook did not clean all Red Hat Ceph Storage kernel threads as it should and could leave CephFS mount points mounted and Ceph Block Devices mapped. This could prevent redeploying a cluster. With this update, the purge-cluster.yml Ansible playbook cleans all Ceph kernel threads, unmounts all Ceph-related mount points on client nodes, and unmaps Ceph Block Devices so the cluster can be redeployed.

(BZ#1337915)

Upgrading OSDs is no longer unresponsive for a long period of time

Previously, when using the rolling_update.yml playbook to upgrade an OSD, the playbook waited for the active+clean state. When data and no of retry count was large, the upgrading process became unresponsive for a long period of time because the playbook set the noout and norebalance flags instead of the nodeep-scrub flag. With this update, the playbook sets the correct flag, and the upgrading process is no longer unresponsive for a long period of time.

(BZ#1740463)

Ansible now enables the fragmentation flag when upgrading from Red Hat Ceph Storage 2 to 3

Previously, when upgrading from Red Hat Ceph Storage 2 to 3, the Ceph File System (CephFS) directories were not fragmented. With this update, the ceph-ansible utility enables the allow_dirfrags flag, which allows directory fragmentation during the upgrade.

(BZ#1776233)

Deploying NFS Ganesha gateway on Ubuntu IPv6 systems works as expected

When deploying NFS Ganesha gateway on Ubuntu IPv6 systems, the ceph-ansible utility failed to start the nfs-ganesha service. As a consequence, the installation process failed as well. This bug has been fixed, and the installation process proceeds as expected.

(BZ#1656908)

The ceph-volume execution time has been adjusted

In containerized deployments, the ceph-volume commands that were executed inside the OSD containers were taking more time than expected. Consequently, the OSD daemon could take several minutes to start because ceph-volume was executed before the ceph-osd process. The value of the ulimit nofile variable has been adjusted on the OSD container process to reduce the execution time of the ceph-volume commands. As a result, the OSD daemon starts faster.
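
As a rough illustration of the mechanism rather than the exact playbook change, a container can be started with a lower open-file limit as follows; the image name and limit values are examples only:

# docker run --rm --ulimit nofile=1024:4096 registry.access.redhat.com/rhel7 sh -c 'ulimit -n'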

(BZ#1744390)

The value of osd_memory_target for HCI deployment is calculated properly

Previously, the calculation of the number of OSDs was not implemented for containerized deployment; the default value was 0. Consequently, the calculation of the value of the BlueStore osd_memory_target option for Hyper-converged infrastructure (HCI) deployment was not correct. With this update, the number of OSDs is reported correctly for containerized deployment, and the value of osd_memory_target for the HCI configuration is calculated properly.

(BZ#1664112)

4.2. Ceph Management Dashboard

Alerts are sent to the Dashboard when the cluster status changes from HEALTH_WARN to HEALTH_ERR

Previously, when the cluster status changed from HEALTH_WARN to HEALTH_ERR, no alert was sent to the Dashboard Alert tab. With this update, sending alerts works as expected in the described scenario.

(BZ#1609381)

The Prometheus exporter port is now opened on all ceph-mgr nodes

Previously, the ceph-mgr playbook was not run on each ceph-mgr node, which meant the ceph-mgr Prometheus exporter port was not being opened on each node. With this update, the ceph-mgr playbook runs on all the ceph-mgr nodes, and the Prometheus exporter port is opened on all ceph-mgr nodes.

(BZ#1744549)

The dashboard can now be configured in a containerized cluster

Previously, in a containerized Ceph environment, the Red Hat Ceph Storage dashboard failed because the cephmetrics-ansible playbook failed to populate the container name. With this update, the playbook populates the container name, and the dashboard can be configured as expected.

(BZ#1731919)

The MDS Performance dashboard now displays the correct number of CephFS clients

The MDS Performance dashboard displayed an incorrect value for Clients after increasing and decreasing the number of active Metadata Servers (MDS) and clients multiple times. This bug has been fixed, and the MDS Performance dashboard now displays the correct number of Ceph File System (CephFS) clients as expected.

(BZ#1652896)

The TCP port for the Ceph exporter is opened during the Ansible deployment of the Ceph Dashboard

Previously, the TCP port for the Ceph exporter was not opened by the Ansible deployment scripts on all the nodes in the storage cluster. Opening TCP port 9283 had to be done manually on all nodes for the metrics to be available to the Ceph Dashboard. With this release, the TCP port is now being opened by the Ansible deployment scripts for Ceph Dashboard.
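
To verify that the port is reachable after deployment, a quick check such as the following can be used; the host name is hypothetical and assumes the exporter serves metrics on port 9283:

# curl http://mgr-node1.example.com:9283/metrics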

(BZ#1677269)

The Red Hat Ceph Storage Dashboard includes information for Disk IOPS and Disk Throughput as expected

The Red Hat Ceph Storage Dashboard did not show any data for Disk IOPS and Disk Throughput. This bug has been fixed, and the Dashboard includes information for Disk IOPS and Disk Throughput as expected.

(BZ#1753942)

No data alerts are no longer generated

The Red Hat Ceph Storage Dashboard generated a No data alert when a query returned no data. Previously, this alert sent an email to the administrator whenever there was a network outage or a node was down for maintenance. With this update, these No data alerts are no longer generated.

(BZ#1663289)

4.3. Ceph File System

The drop cache command completes as expected

Previously, when executing the administrative drop cache command, the Metadata Server (MDS) did not detect that the clients could not return more capabilities, and the command would not complete. With this update, the MDS now detects the clients cannot return any more capabilities, and the command completes.

(BZ#1685734)

The MDS no longer tries to trim many log segments after restart

Previously, the Ceph Metadata Server (MDS) would sometimes try to trim many log segments after restart. The MDS would then send too many OSD requests in a short period of time, which could harm the Ceph cluster. This update limits the number of log segments to trim, and the cluster is no longer harmed.

(BZ#1714814)

An issue with the _lookup_parent() function no longer causes nfs-ganesha to fail

Under certain circumstances, the _lookup_parent() function in the Red Hat Ceph Storage userland client libraries could return 0, but not zero out the parent return pointer, which would remain uninitialized. Later, an assertion that the parent pointer be NULL would trip, and cause nfs-ganesha to fail. With this update, the error checking and return of _lookup_parent() has been refactored, and the situation is avoided.

(BZ#1715086)

A new ASOK command prevents server outages caused by client eviction

This release introduces a new ceph daemon mds.x session config <client_id> timeout <seconds> ASOK command. Use this command to configure a timeout for individual clients to prevent or delay the client from being evicted. This is especially useful to prevent server outages caused by client evictions.
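
For example, to give a single client a longer timeout; the MDS name, client ID, and timeout value shown are hypothetical:

# ceph daemon mds.node1 session config 4305 timeout 300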

(BZ#1729353)

A partially flushed ESessions log event no longer causes the MDS to fail

Previously, when a Ceph Metadata Server (MDS) had more than 1024 client sessions, sessions in the ESessions log event could get flushed partially. The journal replay code expects sessions in the ESessions log event to either be all flushed or not flushed at all, so this would cause the MDS to fail. With this update, the journal replay code can handle a partially flushed ESessions log event.

(BZ#1718135)

Heartbeat packets are reset as expected

Previously, the Ceph Metadata Server (MDS) did not reset heartbeat packets when it was busy in a large loop. This prevented the MDS from sending a beacon to the Monitor, and the Monitor replaced the busy MDS. With this update, the heartbeat packets are reset when the MDS is busy in a large loop.

(BZ#1714810)

4.4. Ceph Manager Plugins

The RESTful API /osd endpoint returns the full list of OSDs

Previously, the OSD traversal algorithm incorrectly handled data structures. As a consequence, an internal server error was returned when listing OSDs by using the RESTful API /osd endpoint. With this update, the algorithm properly traverses the OSD map, and the /osd endpoint returns the full list of OSDs as expected.
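
For example, the full OSD list can be retrieved with a request like the following; the host name is hypothetical, and the RESTful module's default port 8003 and a previously created API key are assumed:

# curl -k -u admin:API_KEY https://mgr-node1.example.com:8003/osd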

(BZ#1764919)

Using several ceph-mgr modules at the same time no longer causes random segmentation faults

Previously, random segmentation faults of the ceph-mgr daemon occurred. This was because the shared memory in ceph-mgr Python modules was being accessed without proper locks, and the memory was not being dereferenced properly. The locking mechanisms in ceph-mgr have been improved for these cases, and random segmentation faults no longer occur when using several ceph-mgr modules at the same time.

(BZ#1717199)

Ceph Balancer status requests respond immediately

Previously, status requests could become unresponsive due to CPU-bound balance calculations. With this update, locks are released when they are not needed, and the CPU-bound balance calculation has been fixed. As a result, status requests respond immediately.
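
The status can be checked at any time, for example:

# ceph balancer status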

(BZ#1761839)

4.5. The ceph-volume Utility

ceph-volume now returns a more accurate error message when deploying OSDs on devices with GPT headers

The ceph-volume utility does not support deploying OSDs on devices with GUID Partition Table (GPT) headers. Previously, after attempting to do so, an error similar to the following one was returned:

Device /dev/sdb excluded by a filter

With this update, the ceph-volume utility returns a more accurate error message instructing the users to remove GPT headers:

GPT headers found, they must be removed on: $device_name
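
The headers can then be removed before retrying the deployment, for example with the sgdisk utility; the device name is an example only:

# sgdisk --zap-all /dev/sdb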

(BZ#1644321)

ceph-volume can determine if a device is rotational or not even if the device is not in the /sys/block/ directory

If the device name did not exist in the /sys/block/ directory, the ceph-volume utility could not determine whether a device was rotational. This was the case, for example, for loopback devices or devices listed in the /dev/disk/by-path/ directory. Consequently, the lvm batch subcommand failed. With this update, ceph-volume uses the lsblk command to determine whether a device is rotational if no information is found in /sys/block/ for the given device. As a result, lvm batch works as expected in this case.
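
The same check can be performed manually; a sketch using the lsblk rotational column, with an example device path:

# lsblk --nodeps -o NAME,ROTA /dev/loop0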

(BZ#1666822)

An error is now returned when the WAL and DB partitions are defined but not present

Due to a race condition, after restarting a Non-volatile Memory Express (NVMe) device containing the WAL and DB devices, the symbolic links for WAL and DB were missing. Consequently, the NVMe node could not be mounted. The underlying source code has been modified to return an error if WAL or DB devices are defined but the symbolic links are missing on the system. The check is retried up to 30 times at 5-second intervals, increasing the chances of finding the devices as the system boots.

(BZ#1719971)

4.6. iSCSI Gateway

The Ceph iSCSI gateway no longer fails to start when an RBD image cannot be found in a pool

During initialization, the rbd-target-gw daemon configures RBD images for use with the Ceph iSCSI gateway. The rbd-target-gw daemon did only a partial pool name match, potentially causing the incorrect pool to be used when opening an RBD image. As a consequence, the rbd-target-gw daemon failed to start. With this release, the rbd-target-gw daemon does a full pool name match, and the rbd-target-gw daemon starts as expected.

(BZ#1719772)

The rbd-target-gw service no longer fails to start when there are expired blacklist entries

When the rbd-target-gw service starts, it removes blacklist entries for the node. Previously, if a blacklist entry expired at the same time the daemon was removing it, the rbd-target-gw service would fail to detect the race and fail to start up. With this update, the rbd-target-gw service now checks for the error code indicating the blacklist entry no longer exists, ignores the error, and starts as expected.

(BZ#1732393)

Synchronization between ceph-ansible and the ceph-iscsi daemon

Prior to this update, when using the ceph-ansible utility, the python-rtslib back-end device cache used by the ceph-iscsi daemons and the kernel could become out of sync. Consequently, the ceph-ansible and ceph-iscsi daemon operations failed, and the daemons terminated unexpectedly. With this update, the ceph-iscsi operations executed by ceph-ansible and the daemons that access the cache force the cache to be updated. As a result, the daemons no longer fail in the described scenario.

(BZ#1785288)

The rbd-target-api service is started and stopped with respect to the rbd-target-gw service status

Previously, the rbd-target-api service did not start after starting the rbd-target-gw service. Consequently, the rolling_update.yml playbook stopped at TASK [stop ceph iscsi services], and the updating process did not continue. With this update, the rbd-target-api service is started and stopped with respect to the rbd-target-gw service status, and the updating process works as expected.

(BZ#1670785)

4.7. Object Gateway

A performance decrease when listing buckets with large object counts due to a regression was resolved

RADOS Gateway introduced a performance regression as a byproduct of changes in Red Hat Ceph Storage 3.2z2 that added support for multi-character delimiters. This could cause S3 clients to time out. The regression has been fixed, restoring the original performance when listing buckets with large object counts. S3 clients no longer time out due to this issue.

(BZ#1717135)

Removing non-existent buckets from the reshard queue works as expected

When a bucket was added to the reshard queue and then it was deleted, an attempt to remove the bucket from the queue failed because the removal process tried to modify the bucket record, which did not exist. Additionally, during reshard processing, when a non-existent bucket was encountered on the queue, the reshard process stopped early and possibly never got to other buckets on the queue. This behavior kept happening because the reshard process is scheduled to run at a specified time interval. The underlying source code has been modified, and removing non-existent buckets from the reshard queue works as expected.

(BZ#1749124)

Ability to cancel resharding of tenanted buckets

Previously, it was not possible to cancel the resharding process of a tenanted bucket because the radosgw-admin reshard cancel command did not support this scenario. With this update, a new --tenant option has been added, and it is now possible to cancel resharding of tenanted buckets as expected.
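
For example, with hypothetical tenant and bucket names:

# radosgw-admin reshard cancel --bucket mybucket --tenant mytenant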

(BZ#1756149)

The S3 client no longer times out when listing buckets with millions of objects

Previously, a change to the behavior of ordered bucket listing allowed support for multi-character delimiter searching, but this change did not include important listing optimizations. This caused a large performance loss. With this release, the logic controlling delimiter handling has been optimized, resulting in better performance.

(BZ#1718328)

Multi-character delimiter searches now take an expected amount of time to complete

Sometimes multi-character delimiter searches took an excessive amount of time. The logic has been corrected and now searches take an expected amount of time.

(BZ#1720741)

Getting the versioning state on a nonexistent bucket now returns an error

Previously, when getting the versioning state on a nonexistent bucket, the HTTP response was successful, for example:

'HTTPStatusCode': 200

Because the bucket does not exist, the correct HTTP response must be an error. With this release, when getting the versioning state on a nonexistent bucket, the Ceph Object Gateway code returns the following error:

ERR_NO_SUCH_BUCKET
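
For example, querying the versioning state of a nonexistent bucket with the AWS CLI now fails instead of returning HTTP 200; the endpoint and bucket name are hypothetical:

# aws s3api get-bucket-versioning --bucket no-such-bucket --endpoint-url http://rgw-node1.example.com:8080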

(BZ#1705922)

The RADOS configuration URL is now able to read objects larger than 1000 bytes

The RADOS configuration URL was unable to read configuration objects greater than 1000 bytes because they were truncated. This behavior has been fixed and now larger objects are read properly.

(BZ#1725521)

All visible bucket index entries are listed as expected

Previously, the interaction of legacy filtering rules with new sharded listing optimizations in the Object Gateway bucket listing code was incorrect. As a consequence, bucket listings could skip some or even all of the entries in a bucket with a sharded index when multiple filtered entries, such as uncompleted multipart uploads, were present in the index. The iteration and filtering logic has been fixed, and all visible bucket index entries are listed as expected.

(BZ#1778217)

Swift object expiration is no longer affected by resharding

The Swift object expiration code was not compatible with bucket index resharding. This behavior could stall object expiration for the buckets. The Swift object expiration code has been updated to identify buckets using a tenant and bucket name. This update allows the removal of expired objects from an already resharded and stalled bucket. As a result, object expiration is no longer affected by bucket index resharding.

(BZ#1703557)

Large or changed directories are now handled properly

Due to several underlying problems in the Ceph Object Gateway, the listing of very large directories could fail, and changed directories could become stale. With this update, the underlying problems have been fixed, allowing listing of large directories without failures, and reliable expiration of cached directory contents. Additionally, for the RADOS Gateway NFS interface, further changes were made allowing large directories to be listed at least 10 times faster than in Red Hat Ceph Storage 2.x.

(BZ#1708587)

Dynamic bucket index resharding no longer uses unnecessarily high system resources

Previously, during bucket index sharding, the code built a large JSON object even if it was not needed. During bucket listing, the Ceph Object Gateway requested too many entries from each bucket index. This behavior caused high CPU, memory, and network usage, and together this caused resharding to take unnecessarily long to complete. With this release, the large JSON object is only built if required, and dynamic bucket index resharding only shards up to 2000 entries at a time. The default maximum can be overridden using a configuration option. With these changes, Red Hat Ceph Storage uses less memory during resharding, and ordered bucket listing is more efficient, so it takes less time.

(BZ#1753588)

A new bucket life-cycle policy now overwrites the existing life-cycle policy

Because of an encoding error in the Ceph Object Gateway, storing a new bucket life-cycle policy on a bucket that already had an existing one would fail. Previously, the workaround was to delete the old policy before storing the new one. With this release, the encoding error has been fixed.

(BZ#1708650)

Enabling the rgw_enable_ops_log option no longer results in unbounded memory growth

Previously, there was no process for consuming log entries, which led to unbounded memory growth for the Ceph Object Gateway. With this release, the process discards new messages when the number of outstanding messages in the data buffer exceeds a threshold, resulting in a smaller memory footprint.

(BZ#1708346)

Enabling the enable_experimental_unrecoverable_data_corrupting_features flag is no longer required when using the Beast web server

To use the Beast web server, it was required to enable the enable_experimental_unrecoverable_data_corrupting_features flag even though Beast was fully supported and not a Technology Preview anymore. With this update, enabling enable_experimental_unrecoverable_data_corrupting_features is no longer required to use Beast.
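
For example, a minimal sketch of enabling Beast in the Ceph configuration file; the instance name and port are examples only:

[client.rgw.gateway-node1]
rgw_frontends = beast port=8080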

(BZ#1749754)

Space is no longer leaked when deleting objects via NFS

Previously, the Ceph Object Gateway NFS implementation incorrectly set a value used to construct a key subsequently used to set garbage collection (GC) on shadow objects. Deleting an object via NFS, as opposed to S3 or Swift, could cause space to be leaked. With this update, the GC tag is now set correctly and space is not leaked when deleting objects via NFS.

(BZ#1715946)

Ceph Object Gateway daemons no longer crash after upgrading to the latest version

A previous update to Red Hat Ceph Storage introduced a bug that caused Ceph Object Gateway daemons to terminate unexpectedly with a segmentation fault after upgrading to the latest version. The underlying source code has been fixed, and the Ceph Object Gateway daemons work as expected after the upgrade.

(BZ#1766448)

Entries are now placed on the correct bucket index shard

Previously, certain objects in the sharded bucket index were in an incorrect shard because their hash source was set incorrectly. Consequently, entries could not be found when the correct shard was consulted in the sharded bucket index. The hash source has been set correctly for such objects, and entries are now placed on the correct bucket index shard as expected.

(BZ#1766731)

Different life-cycle rules for different objects no longer display the same rule applied to all objects

The S3 life-cycle expiration tags are key-value pairs, so a valid match must match both the key and the value. However, the Ceph Object Gateway only matched the key when computing the x-amz-expiration headers, causing tag rules with a common key but different values to match incorrectly. With this release, both the key and the value are checked when matching tag rules in the expiration header computation. As a result, objects are displayed with the correct tag rules.

(BZ#1731486)

Swift requests no longer cause the "HTTP/1.1 401 Unauthorized" error

Certain Swift requests with headers that contained a non-strictly-compliant HTTP 1.1 line termination character in the "X-Auth-Token:" line were rejected with the "HTTP/1.1 401 Unauthorized" error. On Red Hat Ceph Storage version 2.5, those requests were processed despite their non-compliance. After upgrading to version 3.3, those requests began to return an error. With this update, the non-compliant line termination characters have been removed from the HTTP headers, and the aforementioned Swift requests no longer cause errors.

(BZ#1742993)

Bucket creation no longer fails with a non-default location constraint

Previously, the default value was not set for the zone api_name option. This caused the default zone group name to not be added properly, even when the zone group name was explicitly defined. As a consequence, buckets could not be created with a non-default location constraint when referencing a non-default placement target. With this release, buckets can be created with a non-default location constraint when referencing a non-default placement target.

(BZ#1744766)

The clean-up process no longer fails after an aborted upload

When a multipart upload was aborted part way through, the clean-up process assumed some artifacts were present. If they were not present, it caused an error and the clean-up process stopped. The logic has been updated so if the artifacts are not present, the clean-up process still continues until it finishes.

(BZ#1722664)

Ceph Object Gateway no longer terminates when there are many open file descriptors

Previously, the Ceph Object Gateway with Beast front end terminated with an uncaught exception if there were many open file descriptors. With this update, the Ceph Object Gateway no longer terminates.

(BZ#1740668)

The Ceph Object Gateway returns the correct error code when accessing an S3 bucket

The Ceph Object Gateway authorization subsystem was changed in a previous release, but the LDAP error code for failed authentication was not updated. Because of this, the incorrect error code of AccessDenied was returned instead of InvalidAccessKeyId when trying to access an S3 bucket with non-existing credentials. With this release, the correct error code is returned when trying to access an S3 bucket with non-existing credentials.

(BZ#1721033)

Removing entries from the reshard log that refer to tenanted buckets is possible

Previously, when using radosgw-admin to remove a bucket from the reshard log, the tenant information was not passed down to the corresponding code, which limited the ability to remove entries from the reshard log that referenced tenanted buckets. With this update to Red Hat Ceph Storage, entries that reference tenanted buckets can be removed as expected.

(BZ#1794429)

Bucket resharding status is now displayed in plain language

Previously, the radosgw-admin reshard status --bucket bucket_name command used identifier-like tokens as follows to display the resharding status of a bucket:

  • CLS_RGW_RESHARD_NONE
  • CLS_RGW_RESHARD_IN_PROGRESS
  • CLS_RGW_RESHARD_DONE

With this update, the command uses plain language to display the status:

  • not-resharding
  • in-progress
  • done

(BZ#1639712)

4.8. Object Gateway Multisite

The radosgw-admin bucket sync status command works for single-direction sync set up

Previously, if a zone was set up with the sync_from_all parameter set to false, the radosgw-admin bucket sync status command reported “not in sync_from” because the underlying function expected a zone name instead of the zone ID that was provided. The underlying source code has been modified, and radosgw-admin bucket sync status works as expected in the described situation.

(BZ#1713798)

Ceph Object Gateway multisite data sync issue is fixed by removing the filtering step from datalog processing

Previously, multisite data sync could be reported as behind on one or more shards during a change because certain duplicate bucket names were filtered incorrectly. With this fix, Ceph Object Gateway removes the filtering step from datalog processing that detects duplicate bucket names, and the data sync proceeds as expected in the described situation.

(BZ#1782508)

Bucket creation time remains consistent between zones in a multisite environment

Previously, a metadata sync in a multisite environment did not always update bucket creation time, and bucket creation times could become inconsistent between zones. With this update, the metadata sync now updates creation time even if the bucket already exists, and bucket creation time remains consistent between zones.

(BZ#1702288)

radosgw-admin bucket rm --bypass-gc now stores timestamps for deletions

Previously, objects deleted with radosgw-admin bucket rm --bypass-gc did not store a timestamp for the deletion. Because of this, data sync did not apply these object deletions on other zones. With this update, proper timestamps are stored for deletions, and bucket rm with --bypass-gc correctly deletes objects on all zones.

(BZ#1599852)

The radosgw-admin bilog trim command now fully trims the bucket index log

Previously, the radosgw-admin bilog trim command only trimmed 1000 entries from the log, because only one OSD request was sent. With this release, the radosgw-admin bilog trim command now sends OSD requests in a loop until the bucket index log is completely trimmed.

(BZ#1713779)

Enhanced log trimming

Previously, the radosgw-admin datalog trim and radosgw-admin mdlog trim commands trimmed only 1000 entries. This was inconvenient when doing extended log trimming. With this update, the aforementioned commands loop until no log records are available to trim.

(BZ#1732101)

Object versions are now ordered consistently across Ceph Object Gateway multisite zones

Previously, versions of the same object written from different zones of a Ceph Object Gateway multisite configuration could appear in different orders after a sync. With this correction, the versions are sorted, and all zones show them in the same order as expected.

(BZ#1779334)

4.9. RADOS

The Ceph Balancer now works with erasure-coded pools

The maybe_remove_pg_upmaps method is meant to cancel invalid placement group items done by the upmap balancer, but this method incorrectly canceled valid placement group items when using erasure-coded pools. This caused a utilization imbalance on the OSDs. With this release, the maybe_remove_pg_upmaps method is less aggressive and does not invalidate valid placement group items, and as a result, the upmap balancer works with erasure-coded pools.
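
For example, the upmap balancer can now be enabled on clusters that contain erasure-coded pools; a minimal sketch:

# ceph balancer mode upmap
# ceph balancer on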

(BZ#1715577)

ceph osd in any no longer marks permanently removed OSDs as in

Previously, running the ceph osd in any command on a Red Hat Ceph Storage cluster marked all historic OSDs that were once part of the cluster as in. With this update, ceph osd in any no longer marks permanently removed OSDs as in.

(BZ#1696691)

4.10. Block Devices (RBD)

Operations against the RBD object map now utilize significantly less OSD CPU and I/O resources

The RADOS Block Device (RBD) object map support logic within the OSD daemons inefficiently handled object updates for multi-TiB RBD images. As a consequence, for such images, updating the RBD object map led to high CPU usage and unnecessary I/O within the OSDs. With this update, OSDs no longer pre-initialize the in-memory object map prior to reading the object map from disk. Additionally, OSDs now only perform read-modify-write operations on portions of the object map Cyclic Redundancy Check (CRC) that are potentially affected by the updated state. As a result, operations against the RBD object map now utilize significantly less OSD CPU and I/O resources.

(BZ#1683751)

The Ceph v12 (luminous) client no longer takes significantly longer than the v13 (mimic) or v14 (nautilus) clients to export an image from the same Ceph cluster

The rbd export command in the v12 client had a read queue depth of 1, which meant that the command issued only one read request at a time to the cluster when exporting to STDOUT. The v12 client now supports up to 10 concurrent read requests to the cluster, resulting in a significant increase in speed.
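
For example, exporting an image to STDOUT, which now benefits from the concurrent reads; the pool and image names are hypothetical:

# rbd export mypool/myimage - > myimage.img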

(BZ#1794704)