Chapter 7. Notable Bug Fixes

This section describes bugs fixed in this release of Red Hat Ceph Storage that have significant impact on users. In addition, it includes descriptions fixed known issues from previous versions.

Improvements in handling of full OSDs

When an OSD disk became so full that the OSD could not function, the OSD terminated unexpectedly with a confusing assert message. With this update:

  • The error message has been improved.
  • By default, no more than 25% of OSDs are automatically marked as out.
  • The statfs calculation in FileStore or BlueStore back ends have been improved to better reflect the disk usage.

As a result, OSDs are less likely to become full and if they do, a more informative error message is added to the log. (BZ#1332083)

Split threshold is now randomized

Previously, the split threshold was not randomized, so that many OSDs reached it at the same time. As a consequence, such OSDs incurred high latency because they all split directories at once. With this update, the split threshold is randomized which ensures that OSDs split directories over a large period of time. (BZ#1337018)

Mirroring image metadata is supported

Image metadata are now replicated to a peer cluster as expected. (BZ#1344212)

Dynamic feature updates are now replicated

When a feature was disabled or enabled on an already existing image and the image was mirrored to a peer cluster, the feature was not disabled or enabled on the replicated image. With this update, dynamic features updates are replicated as expected. (BZ#1344262)

Disabling image features is no longer incorrectly allowed on non-primary images

With RADOS Block Device (RBD) mirroring enabled, non-primary images are expected to be read-only. Previously, an attempt to disable image features on non-primary images could cause an indefinite wait. This operation is now properly disallowed on non-primary images. As a result, an attempt to disable image features on such images fails with an appropriate error message. (BZ#1353877)

The rbd bench write command no longer fails when --io-size is equal to the image size

Previously, the rbd bench-write --io-size <size> <image> command failed with a segmentation fault if the size specified by the --io-size option was greater than 4 GB. With this update, the option is restricted from being too large. (BZ#1362014)

Creating a new pool after manually modifying the CRUSH map and removing a CRUSH ruleset no longer causes issues

Previously, creating a new pool after manually modifying the CRUSH map and removing a CRUSH ruleset caused the newly created pool to use rule_id rather than the specified ruleset. This lead to other issues in the cluster, such as the inability to unprotect snapshots because the newly created pool was in an incorrect state. The underlying issue has been fixed, and the newly created pools have the correct specified CRUSH ruleset and behave as expected. (BZ#1369586)

AWS SDK for Golang applications work as expected with the Ceph Object Gateway

A bug in the URL processing in the Civetweb HTTP server caused certain kinds of Simple Storage Service (S3) requests to fail. The affected requests included for example a number of requests generated by clients of the Amazon Web Services (AWS) Software Development Kit (SDK) for Golang. Consequently, S3 applications written for AWS SDK for Golang did not interact correctly with the Ceph Object Gateway. This update fixes the handling of absolute URIs is Civetweb, and the AWS SDK for Golang applications work as expected with the Ceph Object Gateway. (BZ#1387437)

The --rbd-concurrent-management-ops option works with the rbd export command

The --rbd-concurrent-management-ops option ensures that image export or import work in parallel. Previously, when --rbd-concurrent-management-ops was used with the rbd export command, it had no effect on the command performance. The underlying source code has been modified, and --rbd-concurrent-management-ops works as expected when exporting images by using rbd export. (BZ#1410923)

rolling_update no longer sets and unsets flags in between each OSD upgrade

The rolling_update playbook of the ceph-ansible utility set and unset the noout, noscrub, and nodeep-scrub flags in between each OSD upgrade. If a scrubbing process was scheduled to start shortly or was in progress, setting these flags did not stop scrubbing immediately, and rolling_update waited until scrubbing was finished. This process was repeated on each OSD with scheduled scrubbing or scrubbing in progress. This behavior caused the upgrade process to take considerable time to finish. This update ensures that the flags are set before upgrading all OSDs, and are unset after all OSDs are upgraded. (BZ#1450754)

Using IPv6 addressing is now supported with containerized Ceph clusters

Previously, an attempt to deploy a Ceph cluster as a container image failed if IPv6 addressing was used. With this update, IPv6 addressing is supported. (BZ#1451786)

Delete operations are handled during recovery, not peering

When a large number of delete operations were in a client workload, a disk could be easily saturated during peering, which caused very high latency, because the delete operations did not go through the operations queue or do any batching. With this update the delete operations are handled during recovery, instead of peering. (BZ#1451936)

A heartbeat message for Jumbo frames has been added

Previously, if a network included jumbo frames and the maximum transmission unit (MTU) was not configured properly on all network parts, a lot of problems, such as slow requests, and stuck peering and backfilling processes occurred. In addition, the OSD logs did not include any heartbeat timeout messages because the heartbeat message packet size is below 1500 bytes. This update adds a heartbeat message for Jumbo frames. (BZ#1455711)

Upgrading a containerized Ceph cluster by using rolling_update.yml is supported

Previously, after upgrading a containerized Ceph cluster by using the rolling_update.yml playbook, the ceph-mon daemons were not restarted. As a consequence, they were unable to join the quorum after the upgrade. With this update, upgrading containerized Ceph clusters with rolling_update.yml works as expected. For details, see the Upgrading a Red Hat Ceph Storage Cluster That Runs in Containers section in the Container Guide for Red Hat Ceph Storage 3. (BZ#1458024)

OSD activation no longer fails when running the script in the Ceph container when a cluster name contains numbers

Previously, in the Ceph container image the script considered all numbers included in a cluster name as an OSD ID. As a consequence, OSD activation failed when running the script because the script was seeking a keyring on a path based on an OSD ID that did not exist. The underlying issue has been fixed, and OSD activation no longer fails when the name of a cluster in a container contains numbers. (BZ#1458512)

Unsupported playbooks are no longer available

The /usr/share/ceph-ansible/infrastructure-playbooks/ directory no longer includes unsupported playbooks. (BZ#1461551)

New health checks with more structure

Previously, during the installation of a Red Hat Ceph Storage cluster, Ceph raised spurious health warnings. The health checks have been improved to be more structured and no longer trigger health warnings on healthy clusters. (BZ#1464964)

Ceph no longer creates pools by default

Previously, rbd pools were created by default upon Ceph cluster creation. This caused several problems, including unnecessary health warnings. Pools are now created only by the user based on their needs rather than by default. (BZ#1464966)

Deleting objects no longer leaves stale bucket index entries

Previously, when objects were removed from the Ceph Object Gateway, the radosgw daemon could fail to remove the entries of the deleted objects due to a time scaling error. This bug has been fixed, and radosgw removes the bucket index entries as expected. (BZ#1472874)

Large objects are no longer truncated

When creating large objects on large clusters, some of the objects were truncated at 512 KB size. Consequently, an attempt to read such objects failed with Error 404. This bug has been fixed, and large objects are no longer truncated. As a result, reading such objects works as expected. (BZ#1473405)

The --inconsistent-index option has been restricted

Using the --inconsistent-index option with the radosgw-admin bucket rm command could cause corruption of the bucket index if the command failed or was stopped. With this update, usage of --inconsistent-index requires a confirmation from users (the --yes-i-really-mean-it option), and a warning is printed when attempting to use this option. (BZ#1477311)

Restarting rbd-mirror is no longer required after a non-orderly shutdown

In RBD mirroring configuration, the local non-primary images could not be force promoted after a non-orderly shutdown of the remote cluster. Consequently, if this happened, and the rbd-mirror daemon was not restarted on the local cluster, it was not possible to promote the image because the rbd-mirror did not release the exclusive lock. This bug has been fixed, and restarting rbd-mirror is no longer required in this case. (BZ#1479673)

Using the site.yml playbook with the --limit option works as expected

When using the site.yml playbook with the --limit option set to osd, clients, or rgws to deploy a cluster, the playbook created an incorrect configuration file with missing values. The playbook now uses the delegate_facts option that allows the playbook to instruct hosts to get information from other hosts that are not part of the current play, in this case Monitor hosts. As a result, the playbook creates a proper configuration file in the described scenario. (BZ#1482067)

The number of PGs per OSD is now limited

Previously, it was possible to create pools that included a large number of placement groups (PGs) which could overload the cluster. This update introduces a new configuration option, mon_max_pg_per_osd, that limits the number of PGs per OSD to 200. Creating pools or adjusting the pg_num parameter now fails if the change would make the number of PGs per OSD exceed the configured limit. You can adjust this option in the Ceph configuration file. In addition, the mon_pg_warn_max_per_osd option has been removed. (BZ#1489064)

Slow OSD startup after upgrading to Red Hat Ceph Storage 3.0

Ceph Storage Clusters that have large omap databases experience slow OSD startup due to scanning and repairing during the upgrade from Red Hat Ceph Storage 2.x to 3.0. The rolling update may take longer than the specified time out of 5 minutes. Before running the Ansible rolling_update.yml playbook, set the handler_health_osd_check_delay option to 180 in the group_vars/all.yml file. (BZ#1549293)