Chapter 4. Bug fixes

This section describes bugs with significant impact on users that were fixed in this release of Red Hat Ceph Storage. In addition, the section includes descriptions of fixed known issues found in previous versions.

4.1. The ceph-ansible Utility

osd_scenario: lvm now works when deploying Ceph in containers

Previously, the lvm installation scenario did not work when deploying a Ceph cluster in containers. With this update, the osd_scenario: lvm installation method is supported as expected in this situation.

(BZ#1509230)
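
As a sketch, the lvm scenario is selected in the Ansible group_vars/osds.yml file; the volume group and logical volume names below are illustrative, not defaults:

```yaml
# group_vars/osds.yml -- illustrative values only
osd_scenario: lvm
lvm_volumes:
  - data: data-lv1        # logical volume to use for OSD data (example name)
    data_vg: data-vg1     # volume group containing that logical volume (example name)
```

The same variables apply to both bare-metal and containerized deployments; only the playbook used to deploy differs.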

The --limit mdss option now creates CephFS pools as expected

Previously, when deploying the Metadata Server (MDS) nodes by using Ansible and the --limit mdss option, Ansible did not create the Ceph File System (CephFS) pools. This bug has been fixed, and Ansible creates the CephFS pools as expected.

(BZ#1518696)
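
For reference, a deployment limited to the MDS nodes is invoked as follows; the playbook name assumes the standard site.yml (use site-docker.yml for containerized deployments):

```shell
ansible-playbook site.yml --limit mdss
```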

Ceph Ansible no longer fails if network interface names include dashes

When ceph-ansible takes an inventory of network interfaces, any dashes (-) in the interface names must be converted to underscores (_) before the names can be used in facts. In some cases this conversion did not occur, and Ceph installation failed. With this update to Red Hat Ceph Storage, all dashes in the names of network interfaces are converted in the facts, and installation completes successfully.

(BZ#1540881)
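
The conversion can be illustrated with a short Python sketch; this mirrors the fact-name mangling described above, not ceph-ansible's actual code:

```python
def interface_to_fact_name(interface: str) -> str:
    """Mimic how an interface name is mangled for use in an Ansible fact
    name: dashes are not valid there, so they become underscores."""
    return interface.replace("-", "_")

# An interface such as "br-ex" is referenced through a fact named
# "ansible_br_ex" rather than "ansible_br-ex".
print(interface_to_fact_name("br-ex"))      # br_ex
print(interface_to_fact_name("eth0-100"))   # eth0_100
```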

Ansible now sets container and service names that correspond with OSD numbers

When containerized Ceph OSDs were deployed with the ceph-ansible utility, the resulting container names and service names of the OSDs did not correspond in any way to the OSD number and were thus difficult to find and use. With this update, ceph-ansible has been improved to set container and service names that correspond with OSD numbers. Note that this change does not affect existing deployed OSDs.

(BZ#1544836)

Expanding clusters deployed with osd_scenario: lvm works

Previously, the ceph-ansible utility could not expand a cluster that was deployed by using the osd_scenario: lvm option. The underlying source code has been modified, and clusters deployed with osd_scenario: lvm can be expanded as expected.

(BZ#1564214)

Ansible now stops and disables the iSCSI gateway services when purging the Ceph iSCSI gateway

Previously, the ceph-ansible utility did not stop and disable the Ceph iSCSI gateway services when using the purge-iscsi-gateways.yml playbook. Consequently, the services had to be stopped manually. The playbook has been improved, and the iSCSI services are now stopped and disabled as expected when purging the iSCSI gateway.

(BZ#1621255)

The values passed into devices in osds.yml are now validated

Previously in the osds.yml of the Ansible playbook, the values passed into the devices parameter were not validated. This caused errors when ceph-disk, parted, or other device preparation tools failed to operate on devices that did not exist. It also caused errors if the number of values passed into the dedicated_devices parameter was not equal to the number of values passed into devices. With this update, the values are validated as expected, and none of the above mentioned errors occur.

(BZ#1648168)
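
For example, with a non-collocated scenario the two lists are now checked so that every entry in devices has a matching entry in dedicated_devices; the device paths below are illustrative:

```yaml
# group_vars/osds.yml -- illustrative device paths
osd_scenario: non-collocated
devices:
  - /dev/sdb
  - /dev/sdc
dedicated_devices:     # must contain the same number of entries as devices
  - /dev/nvme0n1
  - /dev/nvme0n1
```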

Purging clusters using ceph-ansible deletes logical volumes as expected

When using the ceph-ansible utility to purge a cluster that deployed OSDs with the ceph-volume utility, the logical volumes were not deleted. This behavior caused logical volumes to remain in the system after the purge process completed. This bug has been fixed, and purging clusters using ceph-ansible deletes logical volumes as expected.

(BZ#1653307)

The --limit osds option now works as expected

Previously, an attempt to add OSDs by using the --limit osds option failed on container setup. The underlying source code has been modified, and adding OSDs with --limit osds works as expected.

(BZ#1670663)

Increased CPU CGroup limit for containerized Ceph Object Gateway

The default CPU CGroup limit for containerized Ceph Object Gateway (RGW) was very low and has been increased with this update to be more reasonable for typical Hard Disk Drive (HDD) production environments. However, consider evaluating what limit to set for the site’s configuration and workload. To customize the limit, adjust the ceph_rgw_docker_cpu_limit parameter in the Ansible group_vars/rgws.yml file.

(BZ#1680171)
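
For example, to allow the Object Gateway container two CPUs (the value is illustrative; evaluate your own workload before choosing one):

```yaml
# group_vars/rgws.yml
ceph_rgw_docker_cpu_limit: 2   # CPUs available to the RGW container (example value)
```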

SSL works as expected with containerized Ceph Object Gateways

Previously, the SSL configuration in containerized Ceph Object Gateways did not work because the Certificate Authority (CA) certificate was only added to the TLS bundle on the hypervisor and was not propagated to the Ceph Object Gateway container, due to missing container bind mounts for the /etc/pki/ca-trust/ directory. This bug has been fixed, and SSL works as expected with containerized Ceph Object Gateways.

(BZ#1684283)

The rolling-upgrade.yml playbook now restarts all OSDs as expected

Due to a bug in a regular expression, the rolling-upgrade.yml playbook did not restart OSDs that used Non-volatile Memory Express devices. The regular expression has been fixed, and rolling-upgrade.yml now restarts all OSDs as expected.

(BZ#1687828)

4.2. Ceph Management Dashboard

The OSD node details are now displayed in the Host OSD Breakdown panel as expected

Previously, in the Red Hat Ceph Storage Dashboard, the Host OSD Breakdown information was not displayed on the OSD Node Detail panel under the All OSD Overview section. With this update, the underlying issue has been fixed, and the OSD node details are displayed as expected.

(BZ#1610876)

4.3. Ceph File System

The Ceph Metadata Server no longer allows recursive stat rctime to go backwards

Previously, the Ceph Metadata Server used the client’s time to update rctime. But because client time may not be synchronized with the MDS, the inode rctime could go backwards. The underlying source code has been modified, and the Ceph Metadata Server no longer allows recursive stat rctime to go backwards.

(BZ#1632506)

The ceph-fuse client no longer indicates incorrect recursive change time

Previously, the ceph-fuse client did not update change time when file content was modified. Consequently, incorrect recursive change time was indicated. With this update, the bug has been fixed, and the client now indicates the correct change time.

(BZ#1632509)

The Ceph MDS no longer allows dumping of cache larger than 1 GB

Previously, if you attempted to dump a Ceph Metadata Server (MDS) cache with a size of around 1 GB or larger, the MDS could terminate unexpectedly. With this update, the MDS no longer allows dumping a cache of that size, and therefore no longer terminates in the described situation.

(BZ#1636037)

When Monitors cannot reach an MDS, they no longer incorrectly mark its rank as damaged

Previously, when Monitors evicted and fenced an unreachable Metadata Server (MDS), improper handling of blacklist errors caused the MDS to signal that its rank was damaged. Consequently, the Monitors incorrectly marked the rank as damaged, and the file system became unavailable because of one or more damaged ranks. With this release, blacklist errors are handled correctly, and the Monitors no longer mark the rank of an unreachable MDS as damaged.

(BZ#1652464)

The reconnect timeout for MDS clients has been extended

When the Metadata Server (MDS) daemon was handling a large number of reconnecting clients with a huge number of capabilities to aggregate, the reconnect timeout was reached. Consequently, the MDS rejected clients that attempted to reconnect. With this update, the reconnect timeout has been extended, and MDS now handles reconnecting clients as expected in the described situation.

(BZ#1656969)

Shrinking large MDS cache no longer causes the MDS daemon to appear to hang

Previously, an attempt to shrink a large Metadata Server (MDS) cache caused the primary MDS daemon to become unresponsive. Consequently, Monitors removed the unresponsive MDS and a standby MDS became the primary MDS. With this update, shrinking large MDS cache no longer causes the primary MDS daemon to hang.

(BZ#1664468)

4.4. Ceph Manager Plugins

HDD and SSD devices can now be mixed when accessing the /osd endpoint

Previously, the Red Hat Ceph Storage RESTful API returned an error when HDD and SSD devices were mixed and the /osd endpoint was accessed. With this update, the OSD traversal algorithm has been improved to handle this scenario as expected.

(BZ#1594746)

4.5. The ceph-volume Utility

ceph-volume does not break custom named clusters

When using a custom storage cluster name other than ceph, the OSDs could not start after a reboot. With this update, ceph-volume provisions OSDs in a way that allows them to boot properly when a custom name is used.

Important

Despite this fix, Red Hat does not support clusters with custom names. This is because the upstream Ceph project removed support for custom names in the Ceph OSD, Monitor, Manager, and Metadata server daemons. The Ceph project removed this support because it added complexities to systemd unit files. This fix was created before the decision to remove support for custom cluster names was made.

(BZ#1621901)

4.6. Containers

Deploying encrypted OSDs in containers by using ceph-disk works as expected

When attempting to deploy a containerized OSD by using the ceph-disk and dmcrypt utilities, the container process failed to start because the OSD ID could not be found by the mounts table. With this update, the OSD ID is correctly determined, and the container process no longer fails.

(BZ#1695852)

4.7. Object Gateway

CivetWeb was rebased to upstream version 1.10 and the enable_keep_alive CivetWeb option works as expected

When using the Ceph Object Gateway with the CivetWeb front end, the CivetWeb connections timed out despite the enable_keep_alive option being enabled. Consequently, S3 clients that did not reconnect or retry were not reliable. With this update, CivetWeb has been updated, and the enable_keep_alive option works as expected. As a result, CivetWeb connections no longer time out in this case.

In addition, the new CivetWeb version introduces stricter header checks. This new behavior can cause certain return codes to change because invalid requests are detected sooner. For example, in previous versions, CivetWeb returned the 403 Forbidden error on an invalid HTTP request, but the new version returns the 400 Bad Request error instead.

(BZ#1670321)

Red Hat Ceph Storage passes the Swift Tempest test in the RefStack 15.0 toolset

Various improvements have been made to the Ceph Object Gateway Swift service. As a result, when configured correctly, Red Hat Ceph Storage 3.2, which includes the ceph-12.2.8 package, passes the Swift Tempest tempest.api.object_storage test suite with the exception of the test_container_synchronization test case. Red Hat Ceph Storage includes a different synchronization model, multisite operations, for users who require that feature.

(BZ#1436386)

Mounting the NFS Ganesha file server in a containerized IPv6 cluster no longer fails

When a containerized IPv6 Red Hat Ceph Storage cluster with an nfs-ganesha-rgw daemon was deployed by using the ceph-ansible utility, an attempt to mount the NFS Ganesha file server on a client failed with the Connection Refused error. Consequently, I/O requests were unable to run. This update fixes the default configuration for IPv6 connections, and mounting the NFS Ganesha server works as expected in this case.

(BZ#1566082)

Stale lifecycle configuration data of deleted buckets no longer persists in OMAP consuming space

Previously, in the Ceph Object Gateway (RGW), incorrect key formatting in the RGWDeleteLC::execute() function caused bucket lifecycle configuration metadata to persist after the deletion of the corresponding bucket. This caused stale lifecycle configuration data to persist in OMAP, consuming space. With this update, the correct name for the lifecycle object is now used in RGWDeleteLC::execute(), and the lifecycle configuration is removed as expected on removal of the corresponding bucket.

(BZ#1588731)

The Keystone credentials were moved to an external file

When using the Keystone identity service to authenticate a Ceph Object Gateway user, the Keystone credentials were set as plain text in the Ceph configuration file. With this update, the Keystone credentials are configured in an external file that only the Ceph user can read.

(BZ#1637529)
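
A sketch of the resulting style of configuration follows; Ceph provides the rgw_keystone_admin_password_path option for this purpose, though the section name and file paths below are examples, not the exact layout this update produces:

```ini
# /etc/ceph/ceph.conf (Object Gateway section) -- sketch, paths are examples
[client.rgw.gateway-node1]
rgw_keystone_url = http://keystone-host:5000
# The credential is read from a file readable only by the Ceph user,
# instead of being stored in plain text in this configuration file:
rgw_keystone_admin_password_path = /etc/ceph/keystone-admin-password
```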

Wildcard policies match objects with colons in the name

Previously, a colon in an object name caused an error in a matching function, which prevented wildcards from matching beyond the colon. With this release, wildcard policies match object names that contain colons as expected.

(BZ#1650674)

Lifecycle rules with multiple tag filters are no longer rejected

Due to a bug in lifecycle rule processing, an attempt to install the lifecycle rules with multiple tag filters was rejected and the InvalidRequest error message was returned. With this update, other rule forms are used, and lifecycle rules with multiple tag filters are no longer rejected.

(BZ#1654588)
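
A rule of the kind that was previously rejected combines multiple tag filters under an And element, following the standard S3 lifecycle schema; the keys, values, and expiration period below are illustrative:

```xml
<LifecycleConfiguration>
  <Rule>
    <ID>expire-tagged-logs</ID>
    <Filter>
      <And>
        <Tag><Key>type</Key><Value>log</Value></Tag>
        <Tag><Key>retention</Key><Value>short</Value></Tag>
      </And>
    </Filter>
    <Status>Enabled</Status>
    <Expiration><Days>30</Days></Expiration>
  </Rule>
</LifecycleConfiguration>
```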

An object can no longer be deleted when a bucket or user policy with DENY s3:DeleteObject exists

Previously, a method that evaluates policies returned an incorrect value, which allowed objects to be deleted despite a DENY s3:DeleteObject statement. With this release, the correct value is returned, and such delete requests are denied as expected.

(BZ#1654694)

The Ubuntu nfs_ganesha package did not install the systemd unit file properly

When running systemctl enable nfs-ganesha, the following error was printed: Failed to execute operation: No such file or directory. This was because the nfs-ganesha-lock.service file was not created properly. With this release, the file is created properly, and the nfs-ganesha service can be enabled successfully.

(BZ#1660063)

The Ceph Object Gateway supports a string as a delimiter

Invalid logic was used to find and project a delimiter sequence longer than one character. This caused the Ceph Object Gateway to fail any request with a string delimiter, returning an invalid utf-8 character message. The delimiter-handling logic has been replaced by an 8-bit shift-carry equivalent. As a result, string delimiters work correctly. Note that Red Hat has tested this only against the US-ASCII character set.

(BZ#1660962)

Mapping NFS exports to Object Gateway tenant user IDs works as expected

Previously, the NFS server for the Ceph Object Gateway (nfs-ganesha) did not correctly map Object Gateway tenants into their correct namespace. As a consequence, an attempt to map an NFS export onto Ceph Object Gateway with a tenanted user ID silently failed; the account could authenticate and NFS mounts could succeed, but the namespace did not contain buckets and objects. This bug has been fixed, and tenanted mappings are now set correctly. As a result, NFS exports can now be mapped to Object Gateway tenant user IDs and buckets and objects are visible as expected in the described situation.

(BZ#1661882)

An attempt to get bucket ACL for non-existing bucket returns an error as expected

Previously, an attempt to get bucket Access Control Lists (ACL) for a non-existent bucket by calling the GetBucketAcl() function returned a result instead of returning a NoSuchBucket error. This bug has been fixed, and the NoSuchBucket error is returned in the aforementioned situation.

(BZ#1667142)

The log level for gc_iterate_entries has been changed to 10

Previously, the log level for the gc_iterate_entries log message was set to 0. As a consequence, OSD log files included unnecessary information and could grow significantly. With this update, the log level for gc_iterate_entries has been changed to 10.

(BZ#1671169)

Garbage collection no longer consumes bandwidth without making forward progress

Previously, some underlying bugs prevented garbage collection (GC) from making forward progress. Specifically, the marker was not always being advanced, GC was unable to process entries with zero-length chains, and the truncated flag was not always being set correctly. This caused GC to consume bandwidth without making any forward progress, thereby not freeing up disk space, slowing down other cluster work, and allowing OMAP entries related to GC to continue to increase. With this update, the underlying bugs have been fixed, and GC is able to make progress as expected freeing up disk space and OMAP entries.

(BZ#1674436)

The radosgw-admin utility no longer gets stuck and creates high read operations when creating greater than 999 buckets per user

An issue with a limit check caused the radosgw-admin utility to never finish when creating 1,000 or more buckets per user. This problem has been fixed and radosgw-admin no longer gets stuck or creates high read operations.

(BZ#1679263)

LDAP authentication is available again

Previously, a logic error caused LDAP authentication checks to be skipped. Consequently, the LDAP authentication was not available. With this update, the checks for a valid LDAP authentication setup and credentials have been fixed, and LDAP authentication is available again.

(BZ#1687800)

NFS Ganesha no longer aborts when an S3 object name contains a // sequence

Previously, the NFS server for the Ceph Object Gateway (RGW NFS) would abort when an S3 object name contained a // sequence. With this update, RGW NFS ignores such sequences as expected and no longer aborts.

(BZ#1687970)

Expiration time is calculated the same as S3

Previously, the Ceph Object Gateway computed an object's relative lifecycle expiration time from the time of creation rather than rounding it to midnight UTC as AWS S3 does. This could cause the following error: botocore.exceptions.ClientError: An error occurred (InvalidArgument) when calling the PutBucketLifecycleConfiguration operation: 'Date' must be at midnight GMT. With this update, expiration is rounded to midnight UTC for greater AWS compatibility.

(BZ#1688330)
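
The rounding behavior can be sketched in Python; this is an illustration of the semantics, not the Object Gateway's actual code:

```python
from datetime import datetime, timedelta, timezone

def s3_expiration(created: datetime, days: int) -> datetime:
    """Compute a relative lifecycle expiration the way AWS S3 does:
    add the rule's day count, then round up to the next midnight UTC."""
    expires = created + timedelta(days=days)
    midnight = expires.replace(hour=0, minute=0, second=0, microsecond=0)
    if expires > midnight:
        midnight += timedelta(days=1)
    return midnight

created = datetime(2019, 3, 14, 15, 9, tzinfo=timezone.utc)
print(s3_expiration(created, 1))  # 2019-03-16 00:00:00+00:00
```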

Operations waiting for resharding to complete are able to complete after resharding

Previously, when using dynamic resharding, some operations that were waiting to complete after resharding failed to complete. This was due to code changes to the Ceph Object Gateway when automatically cleaning up no longer used bucket index shards. While this reduced storage demands and eliminated the need for manual clean up, the process removed one source of an identifier needed for operations to complete after resharding. The code has been updated so that identifier is retrieved from a different source after resharding and operations requiring it can now complete.

(BZ#1688378)

radosgw-admin bi put now sets the correct mtime time stamp

Previously, the radosgw-admin bi put command did not set the mtime time stamp correctly. This bug has been fixed.

(BZ#1688541)

Ceph Object Gateway lifecycle works properly after a bucket is resharded

Previously, after a bucket was resharded using the dynamic resharding feature, if a lifecycle policy was applied to the bucket, it did not complete and the policy failed to update the bucket. With this update to Red Hat Ceph Storage, a lifecycle policy is properly applied after resharding of a bucket.

(BZ#1688869)

The RGW server no longer returns an incorrect S3 error code NoSuchKey when asked to return non-existent CORS rules

Previously, the Ceph Object Gateway (RGW) server would return an incorrect S3 error code NoSuchKey when asked to return non-existent CORS rules. This caused the s3cmd tool and other programs to misbehave. With this update, the RGW server now returns NoSuchCORSConfiguration for this case, and the s3cmd tool and other programs that expect this error behave correctly.

(BZ#1689410)

Decrypting multipart uploads was corrupting data

When doing multipart uploads with SSE-C, the part size was not a multiple of the 4k encryption block size. While the multipart uploads were encrypted correctly, the decryption process failed to account for the part boundaries and was returning corrupted data. With this release, the decryption process correctly handles the part boundaries when using SSE-C. As a result, all encrypted multipart uploads can be successfully decrypted.

(BZ#1690941)

4.8. Object Gateway Multisite

Redundant multi-site replication sync errors were moved to debug level 10

A few multi-site replication sync errors were logged multiple times at log level 0 and consumed extra space in logs. This update moves the redundant messages to debug level 10 to hide them from the log.

(BZ#1635381)

Buckets with false entries can now be deleted as expected

Previously, bucket indices could include "false entries" that did not represent actual objects and that resulted from a prior bug. Consequently, during the process of deleting such buckets, encountering a false entry caused the process to stop and return an error code. With this update, when a false entry is encountered, Ceph ignores it, and deleting buckets with false entries works as expected.

(BZ#1658308)

Datalogs are now trimmed regularly as expected

Due to a regression in decoding of the JSON format of data sync status objects, automated datalog trimming logic was unable to query the sync status of its peer zones. Consequently, the datalog trimming process did not progress. This update fixes the JSON decoding and adds more regression test coverage for log trimming. As a result, datalogs are now trimmed regularly as expected.

(BZ#1662353)

Objects are now synced correctly in versioning-suspended buckets

Due to a bug in multi-site sync of versioning-suspended buckets, certain object versioning attributes were overwritten with incorrect values. Consequently, the objects failed to sync and attempted to retry endlessly, blocking further sync progress. With this update, the sync process no longer overwrites versioning attributes. In addition, any broken attributes are now detected and repaired. As a result, objects are synced correctly in versioning-suspended buckets.

(BZ#1663570)

Objects are now synced correctly in versioning-suspended buckets

Due to a bug in multi-site sync of versioning-suspended buckets, certain object versioning attributes were overwritten with incorrect values. Consequently, the objects failed to sync and attempted to retry endlessly, blocking further sync progress. With this update, the sync process no longer overwrites versioning attributes. In addition, any broken attributes are now detected and repaired. As a result, objects are synced correctly in versioning-suspended buckets.

(BZ#1690927)

Buckets with false entries can now be deleted as expected

Previously, bucket indices could include "false entries" that did not represent actual objects and that resulted from a prior bug. Consequently, during the process of deleting such buckets, encountering a false entry caused the process to stop and return an error code. With this update, when a false entry is encountered, Ceph ignores it, and deleting buckets with false entries works as expected.

(BZ#1690930)

radosgw-admin sync status now shows timestamps for master zone

Previously in Ceph Object Gateway multisite, running radosgw-admin sync status on the master zone did not show timestamps, which made it difficult to tell if data sync was making progress. This bug has been fixed, and timestamps are shown as expected.

(BZ#1692555)

Synchronizing a multi-site Ceph Object Gateway was getting stuck

When recovering versioned objects, other operations were unable to finish. These stuck operations were caused by removing all of the expired user.rgw.olh.pending extended attributes (xattrs) at once from those versioned objects. Another bug caused too many user.rgw.olh.pending xattrs to be written to the recovering versioned objects. With this release, expired xattrs are removed in batches instead of all at once. As a result, versioned objects recover correctly, and other operations can proceed normally.

(BZ#1693445)

A multi-site Ceph Object Gateway was not trimming the data and bucket index logs

Configuring zones for a multi-site Ceph Object Gateway without setting the sync_from_all option caused the data and bucket index logs not to be trimmed. With this release, the automated trimming process only consults the synchronization status of peer zones that are configured to synchronize. As a result, the data and bucket index logs are trimmed properly.

(BZ#1699478)
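
For reference, a zone that synchronizes from only selected peers is configured with radosgw-admin; the zone names below are illustrative:

```shell
radosgw-admin zone modify --rgw-zone=us-east-archive \
    --sync-from-all=false --sync-from=us-east
radosgw-admin period update --commit
```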

4.9. RADOS

A PG repair no longer sets the storage cluster to a warning state

When doing a repair of a placement group (PG), the PG was considered damaged, which placed the storage cluster into a warning state. With this release, repairing a PG no longer places the storage cluster into a warning state.

(BZ#1506782)

The ceph-mgr daemon no longer crashes after starting balancer module in automatic mode

Previously, due to a CRUSH bug, invalid mappings were created. When an invalid mapping was encountered in the _apply_upmap function, the code caused a segmentation fault. With this release, the code has been updated to check that the values are within an expected range. If not, the invalid values are ignored.

(BZ#1593110)

RocksDB compaction no longer exhausts free space of BlueFS

Previously, the balancing of free space between main storage and storage for RocksDB, managed by BlueFS, happened only when write operations were underway. This caused an ENOSPC error for BlueFS to be returned when RocksDB compaction was triggered right before a long interval without write operations. With this update, the code has been modified to periodically check the free space balance even if no write operations are ongoing, so that compaction no longer exhausts the free space of BlueFS.

(BZ#1600138)

PGs per OSD limits have been increased

In some situations, such as widely varying disk sizes, the default limit on placement groups (PGs) per OSD could prevent PGs from going active. These limits have been increased by default to make this situation less likely.

(BZ#1633426)
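
The limits in question can also be tuned explicitly; the options below exist in this Ceph release, but the values shown are only illustrative overrides, not the new defaults:

```ini
# ceph.conf -- illustrative override; this release raises the defaults
[global]
mon_max_pg_per_osd = 300              # example value, not a recommendation
osd_max_pg_per_osd_hard_ratio = 3.0   # hard limit = ratio * mon_max_pg_per_osd (example)
```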

Ceph installation no longer fails when FIPS mode is enabled

Previously, installing Red Hat Ceph Storage using the ceph-ansible utility failed at TASK [ceph-mon : create monitor initial keyring] when FIPS mode was enabled. To resolve this bug, the symmetric cipher cryptographic key is now wrapped with a one-shot wrapping key before it is used to instantiate the cipher. This allows Red Hat Ceph Storage to install normally when FIPS mode is enabled.

(BZ#1636251)

Slow request messages have been re-added to the OSD logs

Previously, slow request messages were removed from the OSD logs, which made debugging harder. This update re-adds these warnings to the OSD logs.

(BZ#1659156)

Force backfill and recovery preempt a lower priority backfill or recovery

Previously, force backfill or force recovery did not preempt an already running recovery or backfill process. As a consequence, although force backfill or recovery set the priority to the maximum value, the recovery process for placement groups (PGs) already running at a lower priority finished first. With this update, force backfill and recovery preempt lower-priority backfill or recovery processes.

(BZ#1668362)

Ceph Manager no longer crashes when two or more Ceph Object Gateway daemons use the same name

Previously, when two or more Ceph Object Gateway daemons used the same name in a cluster, Ceph Manager terminated unexpectedly. The underlying source code has been modified, and Ceph Manager no longer crashes in the described scenario.

(BZ#1670781, BZ#1634964)

A race condition was causing threads to deadlock with the standby ceph-mgr daemon

Previously, some threads could hit a race condition when acquiring a local lock and the Python global interpreter lock, which caused the threads to deadlock: each thread held one of the locks while waiting to acquire the other. With this release, the code closes the window for this race condition by changing where the locks are acquired and releasing the appropriate locks. As a result, the threads no longer deadlock, and the standby ceph-mgr daemon can make progress.

(BZ#1674549)

An OSD daemon no longer crashes when a block device has read errors

Previously, an OSD daemon would crash when a block device had read errors, because the daemon expected only a general EIO error code, not the more specific errors the kernel generates. With this release, low-level errors are mapped to EIO, resulting in an OSD daemon not crashing because of an unrecognized error code.

(BZ#1678470)

Read retries no longer cause the client to hang after a failed sync read

Previously, when an OSD daemon failed to sync read an object, the length of the object to be read was set to 0. This caused the read retry to incorrectly read the entire object. The underlying code has been fixed, and the read retry uses the correct length and does not cause the client to hang.

(BZ#1682966)

4.10. Block Devices (RBD)

The python-rbd list_snaps() method no longer segfaults after an error

This issue was discovered with OpenStack Cinder Backup when rados_connect_timeout was set. Normally, the timeout is not enabled. If the cluster was highly loaded, the timeout could be reached, causing the segfault. With this update to Red Hat Ceph Storage, a segfault no longer occurs if the timeout is reached.

(BZ#1655681)