Chapter 6. Known issues

This section documents known issues found in this release of Red Hat Ceph Storage.

6.1. The Cephadm utility

Cephadm fails to upgrade Red Hat Ceph Storage if the host OS is unsupported

Currently, Cephadm does not manage the host operating system (OS). Therefore, during an upgrade, it does not verify whether the Red Hat Ceph Storage version that you are upgrading to is supported on the OS of the Ceph cluster nodes.

As a workaround, manually verify that the target Red Hat Ceph Storage version is supported on the OS of the cluster nodes, and follow the recommended upgrade path for the OS and Red Hat Ceph Storage versions. Cephadm upgrades the cluster without raising any warning or error even when the host OS of the nodes is unsupported for that Red Hat Ceph Storage release.
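
For example, you can check the OS release on each node and the versions that the Ceph daemons are running before you start the upgrade. The following is a minimal sketch; host01 is a placeholder host name, and the OS check must be repeated on every node in the cluster.

Example

[root@host01 ~]# cat /etc/os-release
[ceph: root@host01 /]# ceph versions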

(BZ#2161325)

6.2. Ceph Object Gateway

Resharding a bucket removes the bucket’s metadata

Resharding a bucket removes the bucket’s metadata if the bucket was created with bucket_index_max_shards set to 0. You can recover the affected buckets by restoring the bucket index.

The recovery can be done in two ways:

  • By executing the radosgw-admin object reindex --bucket BUCKET_NAME --object OBJECT_NAME command (see the example after this list).
  • By executing the rgw-restore-bucket-index [--proceed] BUCKET_NAME [DATA_POOL_NAME] script, which in turn invokes the radosgw-admin object reindex command.
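
For example, the following is a minimal sketch of the first method; host01, mybucket, and myobject are placeholder names.

Example

[ceph: root@host01 /]# radosgw-admin object reindex --bucket mybucket --object myobject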

After performing the above steps, run either the radosgw-admin bucket list or the radosgw-admin bucket radoslist command on the bucket so that the bucket stats correctly reflect the number of objects in the bucket.
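
For example, the following is a minimal sketch of the verification step, using the same placeholder bucket name.

Example

[ceph: root@host01 /]# radosgw-admin bucket list --bucket=mybucket
[ceph: root@host01 /]# radosgw-admin bucket stats --bucket=mybucket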

Note

Before running the script, install jq by running microdnf install jq inside the Cephadm shell. The tool does not work for versioned buckets, as shown in the following example:

[root@argo031 ~]# time rgw-restore-bucket-index  --proceed serp-bu-ver-1 default.rgw.buckets.data

NOTICE: This tool is currently considered EXPERIMENTAL.
`marker` is e871fb65-b87f-4c16-a7c3-064b66feb1c4.25076.5.
`bucket_id` is e871fb65-b87f-4c16-a7c3-064b66feb1c4.25076.5.

Error: this bucket appears to be versioned, and this tool cannot work with versioned buckets.

The tool’s scope is limited to a single site; it does not operate across a multi-site configuration. If you execute the rgw-restore-bucket-index tool at site-1, it does not recover objects on site-2, and vice versa. In a multi-site configuration, run both the recovery tool and the object reindex command at both sites for a bucket.

(BZ#2178991)

6.3. Ceph Dashboard

The Red Hat Ceph Storage Dashboard shows NaN undefined in some fields of the host table

Currently, when a new host is added, it takes some time to load its daemons, devices, and other statistics. During this delay, data may not be available for some fields of the host table. As a result, during expansion, the host table shows NaN undefined for those fields.

When data is not available for a field in the host table, it shows N/A. There is currently no workaround for this issue.

(BZ#2046214)

The “Throughput-optimized” option is incorrectly recommended for clusters containing SSD and NVMe devices

Currently, whenever the cluster has either only SSD devices or both SSD and NVMe devices, the “Throughput-optimized” option is recommended, even though it should not be. However, this has no impact on either the user or the cluster.

As a workaround, users can use the “Advanced” mode to deploy OSDs according to their desired specifications. Apart from this UI issue, all the options in the “Simple” mode remain usable.

(BZ#2101680)

6.4. Multi-site Ceph Object Gateway

Multi-site replication may stop during upgrade

Multi-site replication may stop if the clusters are on different versions during an upgrade. As a workaround, suspend sync until both clusters are upgraded to the same version.
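
After both clusters are upgraded, you can confirm that replication has resumed by checking the sync status on each site. The following is a minimal sketch; host01 is a placeholder host name.

Example

[ceph: root@host01 /]# radosgw-admin sync status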

(BZ#2178909)

6.5. RADOS

The mclock_scheduler has performance issues with small object workloads and OSDs created on HDD devices

The mclock_scheduler has performance issues with small object workloads and with OSDs created on HDD devices. As a result, with small object workloads, client throughput is impacted by ongoing recovery operations.
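
You can check whether the mclock scheduler is active on an OSD by inspecting the osd_op_queue option. The following is a minimal sketch; host01 and osd.0 are placeholders for a host name and an OSD daemon in your cluster.

Example

[ceph: root@host01 /]# ceph config show osd.0 osd_op_queue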

(BZ#2174467)

The Ceph OSD benchmark test might get skipped

Currently, the Ceph OSD benchmark test might sometimes not run at OSD boot-up, even with the osd_mclock_force_run_benchmark_on_init parameter set to true. As a consequence, the osd_mclock_max_capacity_iops_[hdd,ssd] parameter value is not overridden with the default values.

As a workaround, perform the following steps:

  1. Set osd_mclock_force_run_benchmark_on_init to true:

    Example

    [ceph: root@host01 /]# ceph config set osd osd_mclock_force_run_benchmark_on_init true

  2. Remove the value on the respective OSD:

    Syntax

    ceph config rm osd.OSD_ID osd_mclock_max_capacity_iops_[hdd,ssd]

    Example

    [ceph: root@host01 /]# ceph config rm osd.0 osd_mclock_max_capacity_iops_hdd

  3. Restart the OSD, as shown in the example below.
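
    The following is a minimal sketch that assumes the OSD is osd.0 and is managed by the Ceph Orchestrator; adjust the daemon name to match your cluster.

    Example

    [ceph: root@host01 /]# ceph orch daemon restart osd.0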

This results in the osd_mclock_max_capacity_iops_[hdd,ssd] parameter being set either to the default value or to the newly measured value, if it is within the threshold setting.
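
After the restart, you can confirm the value that was applied. The following is a minimal sketch; host01 and osd.0 are placeholders, and the option name depends on the device type of the OSD.

Example

[ceph: root@host01 /]# ceph config show osd.0 osd_mclock_max_capacity_iops_hdd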

(BZ#2126559)