Chapter 6. Known issues
This section documents known issues found in this release of Red Hat Ceph Storage.
6.1. The Cephadm utility
Cephadm does not verify OS support when upgrading Red Hat Ceph Storage
Currently, Cephadm does not manage the host operating system (OS). Therefore, during an upgrade, it does not verify whether the Red Hat Ceph Storage version being upgraded to is supported on the OS of the Ceph cluster nodes, and it upgrades the cluster without raising any warning or error even when the host OS of the nodes is unsupported for that Red Hat Ceph Storage release.
As a workaround, manually check that the target Red Hat Ceph Storage version is supported on the host OS, and follow the recommended upgrade path for the OS and Red Hat Ceph Storage versions.
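Since Cephadm does not perform this check itself, the manual check can be scripted on each node. The following sketch reads /etc/os-release; the supported-OS table inside it is a hypothetical example only, not the authoritative Red Hat compatibility matrix, which you should consult for the real mapping.

```shell
# Sketch of a manual pre-upgrade OS check. The table in os_supported is a
# hypothetical example; verify against the Red Hat compatibility matrix.
os_supported() {
  # $1 is "<ID>-<major VERSION_ID>" as read from /etc/os-release
  case "$1" in
    rhel-9) return 0 ;;  # example entry only
    *)      return 1 ;;
  esac
}

[ -r /etc/os-release ] && . /etc/os-release
osid="${ID:-unknown}-${VERSION_ID%%.*}"
if os_supported "$osid"; then
  echo "Host OS $osid is in the example supported table; proceed with the upgrade"
else
  echo "Host OS $osid is not verified; check the compatibility matrix before upgrading"
fi
```

Run this on every cluster node before starting the upgrade, since Cephadm will not stop you if a node's OS is unsupported.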
6.2. Ceph Object Gateway
Resharding a bucket removes the bucket’s metadata
Resharding a bucket removes the bucket’s metadata if the bucket was created with bucket_index_max_shards set to 0. You can recover the affected buckets by restoring the bucket index.
The recovery can be done in two ways:
- By running the radosgw-admin object reindex --bucket BUCKET_NAME --object OBJECT_NAME command.
- By running the rgw-restore-bucket-index [--proceed] BUCKET_NAME [DATA_POOL_NAME] script, which in turn invokes the radosgw-admin object reindex command.
After performing the above steps, run either the radosgw-admin bucket list or the radosgw-admin radoslist command on the bucket so that the bucket stats correctly reflect the number of objects in the bucket.
Before executing the script, run microdnf install jq inside the cephadm shell. The tool does not work for versioned buckets.
[root@argo031 ~]# time rgw-restore-bucket-index --proceed serp-bu-ver-1 default.rgw.buckets.data
NOTICE: This tool is currently considered EXPERIMENTAL.
`marker` is e871fb65-b87f-4c16-a7c3-064b66feb1c4.25076.5.
`bucket_id` is e871fb65-b87f-4c16-a7c3-064b66feb1c4.25076.5.
Error: this bucket appears to be versioned, and this tool cannot work with versioned buckets.
The tool’s scope is limited to a single site; it does not operate across a multi-site configuration. If you execute the rgw-restore-bucket-index tool at site-1, it does not recover objects on site-2, and vice versa. In a multi-site configuration, run the recovery tool and the object reindex command at both sites for each affected bucket.
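Taken together, the single-site recovery flow can be sketched as follows. The bucket name is hypothetical, and the commands are shown as a dry run; drop the echo prefixes to execute them against a real cluster.

```shell
# Dry-run sketch of the single-site bucket index recovery flow.
# "mybucket" is a hypothetical bucket name; drop "echo" to run for real.
BUCKET=mybucket
echo microdnf install jq                                   # prerequisite inside the cephadm shell
echo rgw-restore-bucket-index --proceed "$BUCKET" default.rgw.buckets.data
echo radosgw-admin bucket list --bucket="$BUCKET"          # verify the bucket stats afterwards
```

In a multi-site configuration, the same sequence would need to be repeated at each site, as noted above.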
6.3. Ceph Dashboard
The Red Hat Ceph Storage Dashboard shows NaN undefined in some fields in the host table
Currently, when a new host is added, it takes some time to load its daemons, devices, and other statistics. During this delay, data might not be available for some fields in the host table; as a result, during expansion, the host table shows NaN undefined for those fields instead of the expected N/A shown when data for a field is unavailable.
Presently, there is no workaround for this issue.
”Throughput-optimized” option is recommended for clusters containing SSD and NVMe devices
Currently, whenever the cluster has either only SSD devices or both SSD and NVMe devices, the “Throughput-optimized” option is recommended, even though it should not be. The recommendation has no impact on either the user or the cluster.
As a workaround, use the “Advanced” mode to deploy OSDs according to the desired specifications. Apart from this UI issue, all the options in the “Simple” mode remain usable.
6.4. Multi-site Ceph Object Gateway
Multi-site replication may stop during upgrade
Multi-site replication may stop if the clusters are on different versions during the upgrade process. Suspend sync until both clusters are upgraded to the same version.
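One way to sanity-check the state around such an upgrade is sketched below as a dry run; drop the echo prefixes to execute, and run the commands on each site. Treat this as an illustration rather than a prescribed procedure.

```shell
# Dry-run sketch: confirm versions match and inspect replication state.
echo ceph versions              # both clusters should report the same version
echo radosgw-admin sync status  # reports metadata/data sync lag, if any
```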
6.5. RADOS
The mclock_scheduler has performance issues with small object workloads and OSDs created on HDD devices
The mclock_scheduler has performance issues with small object workloads and with OSDs created on HDD devices. As a result, with small object workloads, client throughput is impacted by ongoing recovery operations.
The Ceph OSD benchmark test might get skipped
Currently, the Ceph OSD benchmark test at boot-up might sometimes not run, even with the osd_mclock_force_run_benchmark_on_init parameter set to true. As a consequence, the osd_mclock_max_capacity_iops_[hdd,ssd] parameter value is not overridden with the default values.
As a workaround, perform the following steps:
Set osd_mclock_force_run_benchmark_on_init to true:
Example
[ceph: root@host01 /]# ceph config set osd osd_mclock_force_run_benchmark_on_init true
Remove the value on the respective OSD:
Syntax
ceph config rm OSD.OSD_ID osd_mclock_max_capacity_iops_[hdd,ssd]
Example
[ceph: root@host01 /]# ceph config rm osd.0 osd_mclock_max_capacity_iops_hdd
Restart the OSD.
This results in the osd_mclock_max_capacity_iops_[ssd,hdd] parameter being set either to the default value, or to the new value if it is within the threshold setting.
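The workaround above can be sketched end to end as a dry run; osd.0 and the hdd device class are assumptions to adjust for your cluster, and the restart command shows one way to restart an OSD daemon under cephadm. Drop the echo prefixes to execute the commands for real.

```shell
# Dry-run sketch of the benchmark workaround for a single OSD.
# osd.0 and the hdd class are assumptions; adjust for your cluster.
OSD_ID=0
echo ceph config set osd osd_mclock_force_run_benchmark_on_init true
echo ceph config rm osd.${OSD_ID} osd_mclock_max_capacity_iops_hdd
echo ceph orch daemon restart osd.${OSD_ID}   # restart so the benchmark reruns at init
```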