Chapter 6. Known Issues

This section documents known issues found in this release of Red Hat Ceph Storage.

Ansible does not properly handle unresponsive tasks

Certain tasks, for example adding monitors with the same host name, cause the ceph-ansible utility to become unresponsive. Currently, there is no timeout after which an unresponsive task is marked as failed. (BZ#1313935)

Certain image features are not supported with the RBD kernel module

The following image features are not supported with the current version of the RADOS Block Device (RBD) kernel module (krbd) that is included in Red Hat Enterprise Linux 7.4:

  • object-map
  • deep-flatten
  • journaling
  • fast-diff

RBDs may be created with these features enabled. As a consequence, an attempt to map the kernel RBDs by running the rbd map command fails.

To work around this issue, disable the unsupported features by setting the rbd_default_features = 1 option in the Ceph configuration file for kernel RBDs or dynamically disable them by running the following command:

rbd feature disable <image> <feature>

This issue is a limitation only in kernel RBDs, and the features work as expected with user-space RBDs.
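As a sketch, both forms of the workaround look like this (the image name test-image is a placeholder):

```shell
# Disable the unsupported features on an existing image, then map it:
rbd feature disable test-image object-map fast-diff deep-flatten journaling
rbd map test-image

# Alternatively, set the default for newly created images in
# /etc/ceph/ceph.conf so that only the layering feature (value 1) is enabled:
#   [client]
#   rbd_default_features = 1
```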

NFS Ganesha does not show bucket size or number of blocks

NFS Ganesha, the NFS interface of the Ceph Object Gateway, lists buckets as directories. However, the interface always shows that the directory size and the number of blocks are 0, even if some data is written to the buckets. (BZ#1359408)

An LDAP user can access buckets created by a local RGW user with the same name

The RADOS Object Gateway (RGW) does not differentiate between a local RGW user and an LDAP user with the same name. As a consequence, the LDAP user can access the buckets created by the local RGW user.

To work around this issue, use different names for RGW and LDAP users. (BZ#1361754)

The GNU tar utility currently cannot extract archives directly into the Ceph Object Gateway NFS mounted file systems

The current version of the GNU tar utility makes overlapping write operations when extracting files. This behavior breaks the strict sequential write restriction in the current version of the Ceph Object Gateway NFS interface. GNU tar reports these errors in the usual way, but by default it continues extracting files after reporting them. As a result, the extracted files can contain incorrect data.

To work around this problem, use alternate programs to copy file hierarchies into the Ceph Object Gateway NFS. Recursive copying by using the cp -r command works correctly. Non-GNU archive utilities might be able to correctly extract the tar archives, but none have been verified. (BZ#1418606)

Old zone group name is sometimes displayed alongside the new one

In a multi-site configuration when a zone group is renamed, other zones can in some cases continue to display the old zone group name in the output of the radosgw-admin zonegroup list command.

To work around this issue:

  1. Verify that the new zone group name is present on each cluster.
  2. Remove the old zone group name:

    $ rados -p .rgw.root rm zonegroups_names.<old-name>

    (BZ#1423402)
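
Before removing the old entry, you can list the stored name objects to confirm which one is stale (a sketch based on the object naming used in step 2):

```shell
# Zone group names are stored as "zonegroups_names.<name>" objects in
# the .rgw.root pool:
rados -p .rgw.root ls | grep '^zonegroups_names\.'
```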

Failover and failback cause data sync issues in multi-site environments

In environments using the Ceph Object Gateway multi-site feature, failover and failback cause data sync to stall. The radosgw-admin sync status command reports that data sync is behind for an extended period of time.

To work around this issue, run the radosgw-admin data sync init command and restart the Ceph Object Gateways. (BZ#1459967)
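
A sketch of the workaround, run on the zone whose data sync has stalled (the service unit name assumes a systemd-managed gateway):

```shell
# Re-initialize the data sync state against the source zone:
radosgw-admin data sync init --source-zone=<source-zone>

# Restart the Ceph Object Gateway daemons so a full sync restarts:
systemctl restart ceph-radosgw.target
```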

It is not possible to remove directories stored on S3 versioned buckets by using rm

The mechanism that is used to check for non-empty directories prior to unlinking them works incorrectly in combination with the Ceph Object Gateway Simple Storage Service (S3) versioned buckets. As a consequence, directory trees on versioned buckets cannot be recursively removed with a command such as rm -rf. To work around this problem, remove any objects in versioned buckets by using the S3 interface. (BZ#1489301)

Deleting directories that contain hard links is slow

An attempt to delete directories and subdirectories on a Ceph File System that include a number of hard links by using the rm -rf command is significantly slower than deleting directories that do not contain any hard links. (BZ#1491246)

Resized LUNs are not immediately visible to initiators when using the iSCSI gateway

When using the iSCSI gateway, resized logical unit numbers (LUNs) are not immediately visible to initiators. This means the initiators are not able to see the additional space allocated to a LUN. To work around this issue, restart the iSCSI gateway after resizing a LUN to expose the additional space to the initiators, or always add new LUNs when increasing storage capacity. Update all targets before the initiators use the new space. (BZ#1492342)

The Ceph Object Gateway requires applications to write sequentially

The Ceph Object Gateway requires applications to write sequentially from offset 0 to the end of a file. Attempting to write out of order causes the upload operation to fail. To work around this issue, use utilities like cp, cat, or rsync when copying files into NFS space. Always mount with the sync option. (BZ#1492589)
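
For example, a mount with the sync option and a sequential copy might look like this (the gateway host and mount point are placeholders):

```shell
# Placeholder NFS Ganesha gateway host and local mount point; the sync
# option forces writes to be sent to the server in order:
mount -t nfs -o sync,nfsvers=4.1,proto=tcp ganesha-gw:/ /mnt/rgw-nfs

# Copy sequentially with cp; avoid tools that write out of order:
cp /tmp/object.dat /mnt/rgw-nfs/bucket1/
```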

The Expiration, Days S3 Lifecycle parameter cannot be set to 0

The Ceph Object Gateway does not accept the value 0 for the Expiration, Days Lifecycle configuration parameter. Consequently, setting the expiration to 0 cannot be used to trigger the background delete operation of objects.

To work around this problem, delete objects directly. (BZ#1493476)

Load on MDS daemons is not always balanced fairly or evenly in multiple active MDS configurations

In certain cases, the MDS balancers offload too much metadata to another active daemon or none at all. (BZ#1494256)

User space issues make df calculations less accurate for kernel client users

User space improvements in df calculations have been accepted in the upstream kernel, but have not yet been packaged downstream. The df command reports more accurate free space data when a Ceph File System is mounted with the ceph-fuse utility. When mounted with the kernel client, df reports the same, less accurate data as in previous versions. To work around this problem, kernel client users can use the ceph df command and examine the relevant data pools to determine free space more accurately. (BZ#1494987)
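
As a sketch of the workaround (the pool name cephfs_data is a placeholder for your file system's data pool):

```shell
# Cluster-wide and per-pool usage; the POOLS section shows MAX AVAIL for
# each pool, including the CephFS data pool:
ceph df

# Optionally restrict attention to the relevant pool in the JSON output:
ceph df --format=json-pretty | grep -A 6 '"cephfs_data"'
```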

An iSCSI initiator can send more than max_data_area_mb worth of data when a Ceph cluster is under heavy load causing a temporary performance drop

When a Ceph cluster is under heavy load, an iSCSI initiator might send more data than specified by the max_data_area_mb parameter. Once the max_data_area_mb limit has been reached, the target_core_user module returns queue full statuses for commands. The initiators might not fairly retry these commands, and they can hit initiator-side timeouts and be failed in the multipath layer. The multipath layer will retry the commands on another path while other commands are still being executed on the original path. This causes a temporary performance drop, and in some extreme cases in Linux environments the multipathd daemon can terminate unexpectedly.

If the multipathd daemon crashes, restart it manually:

# systemctl restart multipathd

(BZ#1500757)

The Ceph iSCSI gateway only supports clusters named "ceph"

The Ceph iSCSI gateway expects the default cluster name, "ceph". If a cluster uses a different name, the Ceph iSCSI gateway cannot connect to the cluster properly. To work around this problem, use the default cluster name, or manually copy the contents of the /etc/ceph/<cluster-name>.conf file to the /etc/ceph/ceph.conf file, in addition to the associated keyrings. (BZ#1502021)
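
For example, if the cluster was deployed with the hypothetical name mycluster, copy its configuration and keyring to the default names on each iSCSI gateway node:

```shell
# Hypothetical non-default cluster name "mycluster":
cp /etc/ceph/mycluster.conf /etc/ceph/ceph.conf
cp /etc/ceph/mycluster.client.admin.keyring /etc/ceph/ceph.client.admin.keyring
```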

The stat command returns ID: 0 for CephFS FUSE clients

When a Ceph File System (CephFS) is mounted as a File System in User Space (FUSE) client, the stat command outputs ID: 0 instead of a proper ID. (BZ#1502384)

Having more than one path from an initiator to an iSCSI gateway is not supported

In the iSCSI gateway, tcmu-runner might return the same inquiry and Asymmetric logical unit access (ALUA) info for all iSCSI sessions to a target port group. This can cause the initiator or multipath layer to use the incorrect port info to reference the internal structures for paths and devices, which can result in failures, failover and failback failing, or incorrect multipath and SCSI log or tool output. Therefore, having more than one iSCSI session from an initiator to an iSCSI gateway is not supported. (BZ#1502740)

Incorrect number of tcmu-runner daemons reported after iSCSI target LUNs fail and recover

After iSCSI target Logical Unit Numbers (LUNs) recover from a failure, the ceph -s command in certain cases outputs an incorrect number of tcmu-runner daemons. (BZ#1503411)

The tcmu-runner daemon does not clean up its blacklisted entries upon recovery

When the path fails over from the Active/Optimized to Active/Non-Optimized path or vice-versa on a failback, the old target is blacklisted to prevent stale writes from occurring. These blacklist entries are not cleaned up after the tcmu-runner daemon recovers from being blacklisted, resulting in extraneous blacklisted clients until the entries expire after one hour. (BZ#1503692)

delete_website_configuration cannot be enabled by setting the bucket policy DeleteBucketWebsite

In the Ceph Object Gateway, a user cannot enable delete_website_configuration on a bucket even when a bucket policy has been written granting them S3:DeleteBucketWebsite permission.

To work around this issue, use other methods of granting permission, for example admin operations, bucket ownership, or ACLs. (BZ#1505400)

During a data rebalance of a Ceph cluster, the system might report degraded objects

Under certain circumstances, such as when an OSD is marked out, the number of degraded objects reported during a data rebalance of a Ceph cluster can be too high, in some cases implying a problem where none exists. (BZ#1505457)

The iSCSI gateway can fail to scan or set up LUNs

When using the iSCSI gateway, Linux initiators can return kzalloc failures due to buffers being too large. In addition, VMware ESX initiators can return READ_CAP failures due to not being able to copy the data. As a consequence, the iSCSI gateway fails to scan or set up Logical Unit Numbers (LUNs), find or rediscover devices, and add the devices back after path failures. (BZ#1505942)

The RESTful API commands do not work as expected

The RESTful plug-in provides an API to interact with a Ceph cluster. Currently, the API fails to change the pgp_num parameter. In addition, it indicates a failure when changing the pg_num parameter, even though pg_num is changed as expected. (BZ#1506102)

Adding LVM-based OSDs fails on clusters with names other than "ceph"

An attempt to install a new Ceph cluster or add OSDs by using the osd_scenario: lvm parameter fails on clusters that use a name other than the default "ceph". To work around this problem on new clusters, use the default cluster name ("ceph"). (BZ#1507943)

The iSCSI gwcli utility does not support hyphens in pool or image names

It is not possible to create a disk using a pool or image name that includes hyphens ("-") by using the iSCSI gwcli utility. (BZ#1508451)

Ansible creates unused systemd unit files

When installing the Ceph Object Gateway by using the ceph-ansible utility, ceph-ansible creates systemd unit files for the Ceph Object Gateway host corresponding to all Object Gateway instances located on other hosts. However, only the unit file that corresponds to the hostname of the Ceph Object Gateway host is active. The rest of the unit files appear inactive, but this does not have any impact on the Ceph Object Gateways. (BZ#1508460)

The nfs-server must be disabled on the NFS Ganesha node

When the nfs-server service is running on the NFS Ganesha node, an attempt to start the NFS Ganesha instance after its installation fails. To work around this issue, ensure that nfs-server is stopped and disabled on the NFS Ganesha node before installing NFS Ganesha. To do so:

# systemctl disable nfs-server
# systemctl stop nfs-server

(BZ#1508506)

Assigning LUNs and hosts to a hostgroup using the iSCSI gwcli utility prevents access to the LUNs upon reboot of the iSCSI gateway host

After assigning Logical Unit Numbers (LUNs) and hosts to a hostgroup by using the iSCSI gwcli utility, if the iSCSI gateway host is rebooted, the LUN mappings are not properly restored for the hosts. This issue prevents access to the LUNs. (BZ#1508695)

nfs-ganesha.service fails to start after a crash or a process kill of NFS Ganesha

When the NFS Ganesha process terminates unexpectedly or it is killed, the nfs-ganesha.service daemon fails to start as expected. (BZ#1508876)

The ms_async_affinity_cores option does not work

The ms_async_affinity_cores option is not implemented. Specifying it in the Ceph configuration file does not have any effect. (BZ#1509130)

Ansible fails to install clusters that use custom group names in the Ansible inventory file

When the default values of the mon_group_name and osd_group_name parameters are changed in the all.yml file, Ansible fails to install a Ceph cluster. To avoid this issue, do not use custom group names in the Ansible inventory file by changing mon_group_name and osd_group_name. (BZ#1509201)

lvm installation scenario does not work when deploying Ceph in containers

It is not possible to use the osd_scenario: lvm installation method to install a Ceph cluster in containers. (BZ#1509230)

Compression ratio might not be the same on the destination site as on the source site

When data synced from the source to destination site is compressed, the compression ratio on the destination site might not be the same as on the source site. (BZ#1509266)

ceph log last does not display the exact number of specified lines

The ceph log last <number> command shows the specified number of lines from the cluster log and cluster audit log, by default located at /var/log/ceph/<cluster-name>.log and /var/log/ceph/<cluster-name>.audit.log. Currently, the command does not display the exact number of specified lines. To work around this problem, use the tail -<number> <log-file> command. (BZ#1509374)
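
As a demonstration of the workaround on a scratch file (the real log path is /var/log/ceph/<cluster-name>.log), tail -<number> returns exactly the requested number of lines:

```shell
# Create a scratch file standing in for the cluster log:
log=$(mktemp)
seq 1 100 > "$log"

# The workaround: tail -<number> prints exactly the last <number> lines.
tail -50 "$log" | wc -l    # 50 lines
tail -50 "$log" | head -1  # first of the last 50 lines: 51

rm -f "$log"
```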

ceph-ansible does not properly check for running containers

In an environment where the Docker application is not preinstalled, the ceph-ansible utility fails to deploy a Ceph Storage Cluster because it tries to restart ceph-mgr containers when deploying the ceph-mon role. This attempt fails because the ceph-mgr container is not deployed yet. In addition, the docker ps command returns the following error:

either you don't have docker-client or docker-client-common installed

Because ceph-ansible only checks whether the output of docker ps exists, and not its content, ceph-ansible misinterprets this result as a running container. When the ceph-ansible handler runs later during Monitor deployment, the script it executes fails because no ceph-mgr container is found.

To work around this problem, make sure that Docker is installed before using ceph-ansible. For details, see the Getting Docker in RHEL 7 section in the Getting Started with Containers guide for Red Hat Enterprise Linux Atomic Host 7. (BZ#1510555)

Object leaking can occur after using radosgw-admin bucket rm --purge-objects

In the Ceph Object Gateway, the radosgw-admin bucket rm --purge-objects command is supposed to remove all objects from a bucket. However, in some cases, some of the objects are left in the bucket. This is caused by the RGWRados::gc_aio_operate() operation being abandoned on shutdown. To work around this problem, remove the objects by using the rados rm command. (BZ#1514007)

The Red Hat Ceph Storage Dashboard cannot monitor iSCSI gateway nodes

The cephmetrics-ansible playbook does not install required Red Hat Ceph Storage Dashboard packages on iSCSI gateway nodes. As a consequence, the Red Hat Ceph Storage Dashboard cannot monitor the iSCSI gateways, and the "iSCSI Overview" dashboard is empty. (BZ#1515153)

Ansible fails to upgrade NFS Ganesha nodes

Ansible fails to upgrade NFS Ganesha nodes because the rolling-update.yml playbook searches for the /var/log/ganesha/ directory that does not exist. Consequently, the upgrading process terminates with the following error message:

"msg": "file (/var/log/ganesha) is absent, cannot continue"

To work around this problem, create /var/log/ganesha/ manually. (BZ#1518666)
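
A minimal sketch of the workaround, run as root on each NFS Ganesha node before launching the rolling-update.yml playbook:

```shell
# mkdir -p succeeds even if the directory already exists:
mkdir -p /var/log/ganesha
```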

The --limit mdss option does not create CephFS pools

When deploying the Metadata Server nodes by using Ansible with the --limit mdss option, Ansible does not create the Ceph File System (CephFS) pools. To work around this problem, do not use --limit mdss. (BZ#1518696)

Manual and dynamic resharding sometimes hangs

In the Ceph Object Gateway (RGW), manual and dynamic resharding hangs on a bucket that has versioning enabled. (BZ#1535474)

Resharding a bucket that has ACLs set alters the bucket ACL

In the Ceph Object Gateway (RGW), resharding a bucket with access control list (ACL) set alters the bucket ACL. (BZ#1536795)

Rebooting all Ceph nodes simultaneously will cause an authentication error

When all Ceph nodes in the storage cluster are rebooted simultaneously, a client.admin authentication error occurs when issuing any Ceph-related command from the command-line interface. To work around this issue, avoid rebooting all Ceph nodes simultaneously. (BZ#1544808)

Purging a containerized Ceph installation using NVMe disks fails

When attempting to purge a containerized Ceph installation that uses NVMe disks, the purge fails because NVMe disk naming is not taken into account in several places. (BZ#1547999)

When using the rolling_update.yml playbook to upgrade to Red Hat Ceph Storage 3.0 and from version 3.0 to other zStream releases of 3.0, users who use CephFS must manually upgrade the MDS cluster

Currently, the Metadata Server (MDS) cluster does not have built-in versioning or file system flags to support seamless upgrades of the MDS nodes without potentially causing assertions or other faults due to incompatible messages or other functional differences. For this reason, during any cluster upgrade it is first necessary to reduce the number of active MDS nodes for a file system to one, so that two active MDS nodes do not communicate with different versions. Further, it is also necessary to take standbys offline, because any new CompatSet flags will propagate via the MDSMap to all MDS nodes and cause older MDS nodes to suicide.

To upgrade the MDS cluster:

  1. Reduce the number of ranks to 1:

    ceph fs set <fs_name> max_mds 1
  2. Deactivate all non-zero ranks, from the highest rank to the lowest, while waiting for each MDS to finish stopping:

    ceph mds deactivate <fs_name>:<n>
    ceph status # wait for MDS to finish stopping
  3. Take all standbys offline using systemctl:

    systemctl stop ceph-mds.target
    ceph status # confirm only one MDS is online and is active
  4. Upgrade the single active MDS and restart daemon using systemctl:

    systemctl restart ceph-mds.target
  5. Upgrade and start the standby daemons.
  6. Restore the previous max_mds for your cluster:

    ceph fs set <fs_name> max_mds <old_max_mds>

For steps on how to upgrade the MDS cluster in a container, refer to the Updating Red Hat Ceph Storage deployed as a Container Image Knowledgebase article. (BZ#1550026)

Adding a new Ceph Manager node will fail when using the Ansible limit option

Adding a new Ceph Manager to an existing storage cluster by using the Ansible limit option tries to copy the Ceph Manager's keyring without generating it first. This causes the Ansible playbook to fail, and the new Ceph Manager node is not configured properly. To work around this issue, do not use the limit option while running the Ansible playbook. This results in a newly generated keyring being copied successfully. (BZ#1552210)

For Red Hat Ceph Storage deployments running within containers, adding a new OSD will cause the new OSD daemon to continuously restart

Adding a new OSD to an existing Ceph Storage Cluster running within a container restarts the new OSD daemon every 5 minutes. As a result, the storage cluster does not achieve a HEALTH_OK state. Currently, there is no workaround for this issue. This does not affect already running OSD daemons. (BZ#1552699)

Reducing the number of active MDS daemons on CephFS can cause kernel client I/O to hang

Reducing the number of active Metadata Server (MDS) daemons on a Ceph File System (CephFS) can cause kernel client I/O to hang. If this happens, kernel clients are unable to connect to MDS ranks greater than or equal to max_mds. To work around this issue, raise max_mds to be greater than the highest rank. (BZ#1559749)

Adding iSCSI gateways using the gwcli tool returns an error

Attempting to add an iSCSI gateway using the gwcli tool returns the error:

package validation checks - OS version is unsupported

To work around this issue, add iSCSI gateways with the parameter skipchecks=true. (BZ#1561415)
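
For example, inside the gwcli shell the gateway can be created with the check disabled (the gateway name, IP address, and target IQN are placeholders):

```shell
# From within gwcli, navigate to the gateways node of your target:
# cd /iscsi-target/<target_iqn>/gateways
create <gateway-name> <ip-address> skipchecks=true
```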

Initiating the ceph-ansible playbook to expand the cluster sometimes fails on nodes with NVMe disks

When osd_auto_discovery is set to true, initiating the ceph-ansible playbook to expand the cluster causes the playbook to fail on nodes with NVMe disks because it tries to reconfigure disks that are already being used by existing OSDs. This makes it impossible to add a new daemon collocated with an existing OSD that uses NVMe disks when osd_auto_discovery is set to true. To work around this issue, configure the new daemon on a new node for which osd_auto_discovery is not set to true, and use the --limit parameter when initiating the playbook to expand the cluster. (BZ#1561438)

shrink-osd playbook cannot shrink some OSDs

The shrink-osd Ansible playbook does not support shrinking OSDs backed by an NVMe drive. (BZ#1561456)

tcmu-runner sometimes logs error messages

The tcmu-runner daemon might sporadically log messages such as Async lock drop or Could not break lock. These messages can be ignored if they do not repeat more than once per hour. If the messages occur more often, this can indicate a network path issue between one or more iSCSI initiators and the iSCSI targets and should be investigated. (BZ#1564084)

Sometimes the shrink-mon Ansible playbook fails to remove a monitor from the monmap

The shrink-mon Ansible playbook sometimes fails to remove a monitor from the monmap even though the playbook completes its run successfully. The cluster status shows the monitor intended for deletion as down. To work around this issue, launch the shrink-mon playbook again to remove the same monitor, or remove the monitor from the monmap manually. (BZ#1564117)

It is not possible to expand a cluster when using the osd_scenario: lvm option

ceph-ansible is not idempotent when deploying OSDs using ceph-volume and the lvm_volumes configuration option. Therefore, if you deploy a cluster using the osd_scenario: lvm option, you cannot expand the cluster. To work around this issue, remove existing OSDs from the lvm_volumes configuration option so that Ansible does not try to recreate them when deploying new OSDs. Cluster expansion then succeeds as expected and creates the new OSDs. (BZ#1564214)

Upgrading a node in a Ceph cluster installed with ceph-test packages must have ceph_test = true in /etc/ansible/hosts file

When using the ceph-ansible rolling_update.yml playbook to upgrade a Ceph node in a RHEL cluster that was installed with ceph-test packages, set ceph_test = true in the /etc/ansible/hosts file for each node that has ceph-test package installed:

[mons]
mon_node1 ceph_test=true

[osds]
osd_node1 ceph_test=true

This is not applicable to client and MDS nodes. (BZ#1564232)

The shrink-osd.yml playbook currently has no support for removing OSDs created by ceph-volume

The shrink-osd.yml playbook assumes all OSDs are created by ceph-disk. As a result, OSDs deployed using ceph-volume cannot be shrunk. (BZ#1564444)

Increasing max_mds from 1 to 2 sometimes causes CephFS to be in degraded state

When increasing max_mds from 1 to 2, if the Metadata Server (MDS) daemon is in the starting/resolve state for a long period of time, restarting the MDS daemon leads to an assert. This causes the Ceph File System (CephFS) to be in a degraded state. (BZ#1566016)

Mounting an NFS Ganesha file server on a client sometimes fails

Mounting an NFS Ganesha file server on a client fails with a Connection Refused error when a containerized IPv6 Red Hat Ceph Storage cluster with an nfs-ganesha-rgw daemon is deployed using the ceph-ansible playbook. I/O is then unable to run. (BZ#1566082)

Client I/O sometimes fails for CephFS FUSE clients

Client I/O sometimes fails for Ceph File System (CephFS) File System in User Space (FUSE) clients with the transport endpoint shutdown error due to an assert in the FUSE service. To work around this issue, unmount and then remount CephFS FUSE, and then restart the client I/O. (BZ#1567030)

The DataDog monitoring utility returns "HEALTH_WARN" even though the cluster is healthy

The DataDog monitoring utility uses the overall_status field to determine the health of a cluster. However, overall_status is deprecated in Red Hat Ceph Storage 3.0 in favor of the status field, and querying it always returns the HEALTH_WARN message. Consequently, DataDog reports HEALTH_WARN even when the cluster is healthy.