Chapter 4. Bug fixes

This section describes bugs with significant impact on users that were fixed in this release of Red Hat Ceph Storage. In addition, the section includes descriptions of fixed known issues found in previous versions.

4.1. The Cephadm utility

The PID limit is removed and workloads in the container no longer crash

Previously, in Red Hat Enterprise Linux 9 deployments, a PID limit was enforced that restricted the number of processes that could run inside a container. Due to this, certain operations, such as Ceph Object Gateway sync, would crash.

With this fix, the pid limit is set to unlimited on all Ceph containers, preventing the workloads in the container from crashing.
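
As a hedged check, the effective PID limit of a running Ceph container can be inspected with podman; the container name is a placeholder to be taken from the podman ps output:

    # List the running Ceph containers to find the container name.
    podman ps --format '{{.Names}}'

    # Inspect the PID limit of one container; depending on the podman version,
    # a value of 0 or -1 indicates that no limit is enforced.
    podman inspect --format '{{.HostConfig.PidsLimit}}' <ceph_container_name>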

(BZ#2165644)

Cephadm no longer randomly temporarily removes config and keyring files

Previously, due to incorrect timing when calculating which client configuration and keyring files to place, cephadm would conclude that no configuration or keyring files should be placed on any host and would subsequently remove all of them.

With this fix, the timing of the calculation is changed to guarantee up-to-date information for the calculation. Cephadm no longer randomly, temporarily removes config and keyring files it is managing.

(BZ#2125002)

The Ceph Object Gateway daemons now bind to loopback addresses correctly

Previously, cephadm excluded loopback interfaces when looking for a valid IP address on a host to bind the Ceph Object Gateway daemon to, and therefore the daemons could not bind to loopback addresses.

With this fix, cephadm performs an explicit check for loopback interfaces. If a loopback interface is detected, the Ceph Object Gateway daemon binds to the loopback address, 127.0.0.1 for IPv4 or ::1 for IPv6.

(BZ#2018245)

Cephadm now splits the device information into multiple entries after exceeding the Ceph monitor store size limit

Previously, cephadm was unable to refresh the hosts and complete most operations when the device information exceeded the monitor store default maximum size limit of 64K. This caused an entry size error. As a result, users had to raise the default limit if they had hosts with a large number of disks.

With this fix, cephadm now splits the device information into multiple entries if it takes more space than the size limit. Users no longer have to raise the monitor store entry size limit if they have hosts with a large number of disks.

(BZ#2053276)

Crash daemon now correctly records crash events and reports them to the storage cluster

Previously, the crash daemon would not authenticate properly when sending crash reports to the storage cluster, which prevented it from correctly recording crash events to send to the cluster.

With this fix, the crash daemon properly uses its authentication information when sending crash reports. It now correctly records crash events and reports them to the cluster.

(BZ#2062989)

Log rotation of the cephadm.log should no longer cause issues

Previously, the logrotate command would cause issues if the /var/log/ceph directory was created by something other than cephadm, for example ceph-common or ceph-ansible. As a consequence, the cephadm.log could not be rotated.

With this fix, su root root was added to the logrotate configuration so that rotation runs as the root user. The logrotate command no longer has an issue with the ownership of the /var/log/ceph directory, and the cephadm.log is rotated as expected.
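
The resulting logrotate configuration is roughly sketched below; the directive of interest is su root root, and the exact paths and rotation settings in the shipped file may differ:

    # Sketch of /etc/logrotate.d/cephadm; settings are illustrative.
    /var/log/ceph/cephadm.log {
        su root root
        rotate 7
        compress
        missingok
    }

    # Dry-run the configuration to verify it parses and would rotate correctly.
    logrotate --debug /etc/logrotate.d/cephadm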

(BZ#2099670)

Cephadm logging configurations are updated

Previously, the cephadm script logged all output to stderr. As a result, cephadm bootstrap messages signifying a successful deployment were also sent to stderr instead of stdout.

With this fix, the cephadm script uses different logging configurations for certain commands, and the configuration used for bootstrap logs only errors to stderr.

(BZ#2103707)

The network check no longer causes the hosts to be excluded from the monitor network

Previously, the network check would fail because cephadm looked for an exact match between the host network and the configured public networks. This caused hosts with a valid network configuration, that is, hosts with an interface belonging to a public_network subnet, to be excluded from the monitor network.

With this fix, instead of looking for an exact match, cephadm checks whether the host network overlaps with any of the configured public networks, and valid hosts are no longer excluded from the monitor network.

(BZ#2104947)

cephadm no longer removes osd_memory_target config settings at host level

Previously, if osd_memory_target_autotune was turned off globally, cephadm would remove the values that the user set for osd_memory_target at the host level. Additionally, for hosts added with an FQDN, cephadm would set the configuration option using the FQDN even though the CRUSH map uses the short host name. Due to this, users could not manually set osd_memory_target at the host level, and osd_memory_target autotuning did not work for FQDN hosts.

With this fix, cephadm no longer removes the host-level osd_memory_target configuration settings if osd_memory_target_autotune is set to false, and it always uses the short host name when setting the host-level osd_memory_target. If osd_memory_target_autotune is set to false at the host level, users can manually set osd_memory_target without cephadm removing it. Additionally, autotuning now works for hosts added to cephadm with FQDN names.
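
For example, assuming a host named host01 (the host name and value are illustrative), the host-level settings can be managed with the ceph config host mask:

    # Disable autotuning for the OSDs on one host (use the short host name).
    ceph config set osd/host:host01 osd_memory_target_autotune false

    # Manually set a host-level memory target; the value shown is illustrative.
    ceph config set osd/host:host01 osd_memory_target 4294967296

    # Verify that the settings remain in place.
    ceph config dump | grep osd_memory_target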

(BZ#2107849)

cephadm rewrites Ceph OSD configuration files

Previously, while redeploying OSDs, cephadm would not rewrite the configuration files used by the Ceph OSDs, and therefore the OSDs would not receive the updated monitor locations in their configuration files when Ceph Monitor daemons were added or removed.

With this fix, cephadm automatically rewrites the configuration files when redeploying OSDs, and the OSD configuration files are updated with the new monitor locations when monitors are added or removed, without user intervention.

(BZ#2111525)

Users can now drain hosts that are listed in explicit placements

Previously, draining a host that was listed as part of an explicit placement would not properly drain the host, and tracebacks were logged until the drain was stopped or the host was removed from all explicit placements.

With this fix, the handling of explicit placements is implemented internally and cephadm is able to determine if it needs to remove daemons from the hosts. Consequently, users can now drain hosts that are listed as part of an explicit placement without having to first remove the host from the placement.

However, users still need to remove the host from any explicit placement before fully removing the host; otherwise, specifications that explicitly list the host cannot be applied. An example sequence is shown below.
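
The drain-and-remove sequence might look like the following; the host name is illustrative:

    # Drain all daemons from the host; explicit placements no longer block the drain.
    ceph orch host drain host03

    # Watch the daemons being removed from the host.
    ceph orch ps host03

    # Before fully removing the host, remove it from any specification that
    # explicitly lists it, then remove the host.
    ceph orch host rm host03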

(BZ#2112768)

cephadm returns non-zero code when --apply-spec option fails during bootstrap

Previously, cephadm bootstrap always returned code 0 when the bootstrap itself completed. Failures in a deployment applied with the --apply-spec option were not reflected in the return code.

With this fix, cephadm returns a non-zero value when applying the specification fails during bootstrap.
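
Because the return code now reflects --apply-spec failures, scripts can check it directly; a minimal sketch, with an illustrative IP address and specification path:

    cephadm bootstrap --mon-ip 10.0.0.10 --apply-spec /root/cluster-spec.yaml
    if [ $? -ne 0 ]; then
        echo "cephadm bootstrap or specification application failed" >&2
    fi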

(BZ#2116689)

Complex OSD deployment or replacement with shared DB devices now does not need to be done all at once

Previously, when cephadm created OSDs, devices already used as DB devices for existing OSDs were filtered out as unavailable. As a result, complex OSD deployments in which several OSDs were meant to share a DB device, but were not all deployed at once, did not work: the DB device was filtered out when creating the subsequent OSDs even though the OSD specification allowed it.

With this fix, complex OSD deployment with shared DB devices no longer needs to be done all at once. If users update an OSD specification to include additional data devices paired with the DB devices already listed in the specification, cephadm can create these new OSDs, as shown in the example below.
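
An updated OSD specification that adds a data device alongside an already used DB device might look roughly like the following sketch; the host name and device paths are illustrative:

    # osd-spec.yaml (illustrative):
    service_type: osd
    service_id: osd_shared_db
    placement:
      hosts:
        - host01
    spec:
      data_devices:
        paths:
          - /dev/sdb
          - /dev/sdc       # newly added data device
      db_devices:
        paths:
          - /dev/nvme0n1   # DB device already used by the existing OSDs

    # Apply the updated specification.
    ceph orch apply -i osd-spec.yaml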

(BZ#2119715)

Proper errors are raised if invalid tuned-profile specification is detected by cephadm

Previously, cephadm did not validate the YAML specification for tuned profiles, and consequently returned no error or warning when applying a specification with invalid or missing data.

With this fix, several checks are added to validate the tuned-profile specifications. Proper errors are now raised when cephadm detects an invalid tuned-profile specification (a valid specification is sketched after this list), for example when:

  • An invalid tunable is specified under “settings” in the YAML specification.
  • The “settings” section in the YAML specification is empty.
  • An invalid placement is detected.
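
A minimal specification that passes these checks might look like the following; the profile name, hosts, and sysctl values are illustrative:

    # tuned-profile.yaml (illustrative):
    profile_name: osd-host-profile
    placement:
      hosts:
        - host01
        - host02
    settings:
      fs.aio-max-nr: '1048576'
      vm.swappiness: '10'

    # Apply the profile with the orchestrator.
    ceph orch tuned-profile apply -i tuned-profile.yaml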

(BZ#2123609)

4.2. Ceph Dashboard

The host device Life Expectancy column now shows the correct value on the Ceph Dashboard

Previously, the host device Life Expectancy column would show an empty value because the column had no default value.

With this fix, the default value is assigned to the host device Life Expectancy column, and the column now shows the correct value.

(BZ#2021762)

Users are now able to change the Ceph Object Gateway subuser permissions

Previously, users could not change the Ceph Object Gateway subuser permissions because the request was not implemented properly.

With this fix, the request to edit Ceph Object Gateway subuser permissions is implemented properly, therefore the user can now change the subuser permissions.

(BZ#2042888)

The overall performance graphs of the pools show correct values on the Ceph Dashboard

Previously, an issue in the query behind the Ceph Dashboard pools’ overall performance graphs caused multiple entries for the same pool to be shown in the pool overview.

With this fix, the related query is fixed and the overall performance graphs of the pools show correct values.

(BZ#2062085)

The Ceph Dashboard is now aligned with the command line interface’s (CLI) way of NFS exports creation

Previously, the squash field for NFS export creation was shown as a mandatory field in the edit form. Additionally, if exports were created from the back end and specified a different form of the squash name, the form would show an empty field.

With this fix, the required condition is removed from the squash field and the issue with the squash field coming up as empty in the edit form is also resolved.

(BZ#2062456)

Pool count shows correct values on the Ceph Dashboard

Previously, there was an issue in the query related to the pool count. In the Overall Performance tab, the Ceph Dashboard showed the pool count as a fractional value.

With this fix, the related query is fixed and pool count shows correct values on the Ceph Dashboard.

(BZ#2062590)

Validation is required when creating a new service name

Previously, there was no validation when creating a new service on the Ceph Dashboard. As a result, users could create a new service with an existing name, which would overwrite the existing service and cause the loss of a running service on the hosts.

With this fix, the service name is validated before a new service is created on the dashboard, and using an existing service name is no longer possible.

(BZ#2064850)

External snapshot creation in Grafana is now disabled by default

Previously, creating external Grafana snapshots would generate broken links. This made the infrastructure vulnerable to DDoS attacks, as someone could gain insight into the environment by looking at the metric patterns.

With this fix, external Grafana snapshots are disabled and removed from the dashboard share options.

(BZ#2079847)

Services can now be safely put into unmanaged mode on the Ceph Dashboard

Previously, when a user tried to create or modify services, such as ingress or SNMP, in unmanaged mode, the form would return a 500 error message and fail to create the services. This happened because the form did not show some fields that needed to be filled in, even when the service was going directly to unmanaged mode.

With this fix, the form shows the necessary fields and the validation is improved, so all services can now be safely put into unmanaged mode.

(BZ#2080316)

The Ceph Dashboard now securely connects with the hostname instead of IP address

Previously, an attempt to access the Object Gateway section of the Ceph Dashboard would throw a 500 - internal server error. This error was the result of the Ceph Dashboard trying to establish the HTTPS connection to the Ceph Object Gateway daemons by IP address instead of by hostname; TLS requires that the hostname of the server match the hostname in the certificate.

With this fix, the Ceph Dashboard correctly establishes an HTTPS connection and successfully connects to the Object Gateways using the hostname.

(BZ#2080485)

ceph-dashboard now prompts users when creating an ingress service

Previously, ingress service creation would fail with a 500 internal server error when the form was submitted without specifying the frontend and monitor port values. As a result, an ingress service could not be created from the Ceph Dashboard.

With this fix, ceph-dashboard prompts users to fill in all the mandatory fields when creating an ingress service on the Ceph Dashboard.

(BZ#2080916)

The service instance column of the host table on the Ceph Dashboard now shows all the services deployed on the particular host

Previously, the service instance column of the Hosts table showed only Ceph services and not cephadm services, because the frontend lacked subscriptions to some of the services.

With this fix, the service instance column now shows all the services deployed on the particular host in the host table on the Ceph Dashboard.

(BZ#2101771)

The Ceph Dashboard now raises an appropriate error message if a user tries to create a snapshot with an existing name

Previously, Ceph Dashboard would not validate Ceph File System snapshot creation with an existing name and would throw a 500 - internal server error.

With this fix, an appropriate error message is returned when a user tries to create a snapshot with an existing name.

(BZ#2111650)

Ceph node "network packet" drop alerts are shown appropriately on the dashboard

Previously, there was an issue in the query related to Ceph node "network packet" drop alerts. As a consequence, those alerts would be seen frequently on the Ceph Dashboard.

With this fix, the related query no longer causes the issue and Ceph node network packet drop alerts are shown appropriately.

(BZ#2125433)

4.3. Ceph File System

MDS daemon now resets the heartbeat in each thread after each queued work

Previously, a thread would hold the mds_lock for a long time if it had a lot of work to do. This caused other threads to be starved of resources and stuck for a long time. As a result, the MDS daemon would fail to report its heartbeat to the monitor in time and be kicked out of the cluster.

With this fix, the MDS daemon resets the heartbeat in each thread after each queued work.

(BZ#2060989)

Ceph Metadata Server no longer crashes during concurrent lookup and unlink operations

Previously, an assertion in the code made an incorrect assumption that was violated by concurrent lookup and unlink operations from a Ceph client, causing the Ceph Metadata Server to crash.

With this fix, the assertion is moved to a place where its assumption holds during concurrent lookup and unlink operations, and the Ceph Metadata Server continues serving Ceph client operations without crashing.

(BZ#2074162)

A replica MDS no longer gets stuck if a client sends a getattr request just after the replica is created

Previously, if a client sent a getattr client request just after the replica MDS was created, the client would build a path in the form #INODE-NUMBER because the CInode was not linked yet. The replica MDS would keep retrying until the auth MDS flushed the mdlog and C_MDS_openc_finish and link_primary_inode were called, up to 5 seconds later.

With this fix, if the replica MDS cannot find the CInode from the auth MDS, it manually triggers an mdlog flush.

(BZ#2091491)

Ceph File System subvolume groups created by the user are now displayed when listing subvolume groups

Previously, the Ceph File System (CephFS) subvolume groups listing included CephFS internal groups instead of CephFS subvolume groups created by users.

With this fix, the internal groups are filtered from CephFS subvolume group list. As a result, CephFS subvolume groups created by the user are displayed now.

(BZ#2093258)

Saved snap-schedules are reloaded from Ceph storage

Previously, restarting Ceph Managers caused retention policy specifications to be lost because they were not saved to the Ceph storage. As a consequence, retention would stop working.

With this fix, all changes to snap-schedules are now persisted to the Ceph storage, therefore when Ceph Managers are restarted, the saved snap-schedule is reloaded from the Ceph storage and restarted with specified retention policy specifications.
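
For example, a snap-schedule with a retention policy can be created and then verified after a Ceph Manager restart; the path and values are illustrative:

    # Create an hourly snapshot schedule for a CephFS path.
    ceph fs snap-schedule add /volumes/group1/subvol1 1h

    # Add a retention policy: keep 24 hourly and 7 daily snapshots.
    ceph fs snap-schedule retention add /volumes/group1/subvol1 24h7d

    # After a Ceph Manager restart, the schedule and retention are still reported.
    ceph fs snap-schedule status /volumes/group1/subvol1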

(BZ#2125773)

The API for deleting RADOS objects is updated

Previously, deleting a RADOS object would crash the program and create tracebacks in the logs.

With this fix, the API is updated to correctly remove the RADOS object after an upgrade, and no stack traces are dumped in the logs.

(BZ#2126269)

MDS now stores all damaged dentries

Previously, the Metadata Server (MDS) would store dentry damage for a dirfrag only if no dentry damage already existed in that dirfrag. As a result, only the first damaged dentry was stored in the damage table and subsequent damage in the dirfrag was forgotten.

With this fix, MDS can now properly store all the damaged dentries.

(BZ#2129968)

The ceph-mds daemon no longer crashes during the upgrade

Previously, the Ceph Metadata Server daemons (ceph-mds) would crash during an upgrade due to an incorrect assumption in the Metadata Servers when recovering inodes. It caused ceph-mds to hit an assert during an upgrade.

With this fix, the ceph-mds makes correct assumptions during inode recovery and the ceph-mds no longer crashes during an upgrade.

(BZ#2130081)

The standby-replay Metadata Server daemon is no longer unexpectedly removed

Previously, the Ceph Monitor would remove a standby-replay Metadata Server (MDS) daemon from the MDS map under certain conditions. This would cause the standby-replay MDS daemon to get removed from the Metadata Server cluster, which generated cluster warnings.

With this fix, the logic used in Ceph Monitors during the consideration of removal of an MDS daemon from the MDS map now includes information about the standby-replay MDS daemons holding a rank. As a consequence, the standby-replay MDS daemons are no longer unexpectedly removed from the MDS cluster.

(BZ#2130118)

The subvolume snapshot info command no longer has the size field in the output

Previously, the output of the subvolume snapshot info command would return an incorrect snapshot size. This was because the snapshot info command relies on rstats to track the snapshot size, and rstats tracks the size of the corresponding subvolume rather than the size of the snapshot itself.

With this fix, the size field is removed from the output of the snapshot info command until rstats is fixed.

(BZ#2130422)

A full disk no longer corrupts the configuration file

Previously, configuration files were written directly to disk without using temporary files: the existing configuration file was truncated and the new configuration data was then written. When the disk was full, the truncate succeeded but writing the new configuration data failed with a no-space error, leaving an empty configuration file. This also caused all operations on the corresponding subvolumes to fail.

With this fix, the configuration data is written to a temporary file that is then renamed to the original configuration file, which prevents the original configuration file from being truncated. The general pattern is sketched below.
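
A rough shell sketch of the write-to-temporary-file-then-rename pattern; the path and variable are illustrative, and the actual implementation is internal to the CephFS volumes code:

    config=/path/to/subvolume/.meta            # illustrative configuration file path
    tmp=$(mktemp "${config}.XXXXXX")           # write the new data to a temporary file first

    if printf '%s\n' "$NEW_CONFIG_DATA" > "$tmp"; then
        mv -f "$tmp" "$config"                 # rename over the original file
    else
        rm -f "$tmp"                           # on failure (for example, disk full) the original is intact
    fi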

(BZ#2130450)

The MDS no longer aborts when it receives unknown messages

Previously, the Metadata Server (MDS) would abort if it received a message that it did not understand. As a result, a malicious client could crash the server simply by sending a message of a new type. Besides malicious clients, this also meant that any protocol issue, such as a new client erroneously sending new message types to the server, would crash the whole system instead of just the new client.

With this fix, the MDS no longer aborts if it receives an unknown request from a client; instead it closes the session, blocklists, and evicts the client. This protects the MDS and the whole system from intentional attacks, such as denial of service from malicious clients.

(BZ#2130984)

Directory listing from a NFS client now works as expected for NFS-Ganesha exports

Previously, the Ceph File System (CephFS) Metadata Server (MDS) would not increment the change attribute (change_attr) of a directory inode during CephFS operations that only changed the directory inode’s ctime. Therefore, an NFS kernel client would not invalidate its readdir cache when it was supposed to, because the NFS Ganesha server backed by CephFS would sometimes report an incorrect change attribute value for the directory inode. As a result, the NFS client would list stale directory contents for NFS Ganesha exports backed by CephFS.

With this fix, CephFS MDS now increments the change attribute of the directory inode during operations and the directory listing from the NFS client now works as expected for NFS Ganesha server exports backed by CephFS.

(BZ#2135573)

The CephFS now has the correct directory access

Previously, directory access was denied even to the UID of 0 due to incorrect Discretionary Access Control (DAC) management.

With this fix, directory access is allowed to UID 0 even if the actual permissions for the directory user, group, and others are not permissible for UID 0. This results in the correct Ceph File System (CephFS) behavior for directory access to UID 0 by effectively granting superuser privileges.

(BZ#2147460)

4.4. The Ceph Volume utility

The ceph-volume inventory command no longer fails

Previously, when a physical volume was not a member of any volume group, ceph-volume would not ignore the volume, and instead would try to process it, which caused the ceph-volume inventory command to fail.

With this fix, the physical volumes that are not a member of a volume group are filtered and the ceph-volume inventory command no longer fails.

(BZ#2140000)

4.5. Ceph Object Gateway

Users can now use MD5 for non-cryptographic purposes in a FIPS environment

Previously, in a FIPS enabled environment, the usage of MD5 digest was not allowed by default, unless explicitly excluded for non-cryptographic purposes. Due to this, a segfault occurred during the S3 complete multipart upload operation.

With this fix, the usage of MD5 for non-cryptographic purposes in a FIPS environment for S3 complete multipart PUT operations is explicitly allowed and the S3 multipart operations can be completed.

(BZ#2088571)

The Ceph Object Gateway no longer crashes on accesses

Previously, due to the change from in-place to allocated buckets, a malformed bucket URL could cause a dereference of a bucket pointer that was not always initialized, which crashed the Ceph Object Gateway on some accesses.

With this fix, the Ceph Object Gateway properly checks that the pointer is non-null before performing permission checks and returns an error if it is not initialized.

(BZ#2118423)

The code that parses dates in x-amz-date format is changed

Previously, the standard format for x-amz-date changed, which caused issues because new software uses the new date format. New software built with the latest Go libraries would not talk to the Ceph Object Gateway.

With this fix, the code in the Ceph Object Gateway that parses dates in x-amz-date format is changed to also accept the new date format.

(BZ#2121564)

Ceph Object Gateway’s Swift implicit tenant behavior is restored

Previously, a change to Swift tenant parsing caused the failure of Ceph Object Gateway’s Swift implicit tenant processing.

With this fix, Swift tenant parsing logic is corrected and the Swift implicit tenant behavior is restored.

(BZ#2123177)

The Ceph Object Gateway no longer crashes after running continuously for an extended period

Previously, an index into a table would become negative after running continuously for an extended period, resulting in a Ceph Object Gateway crash.

With this fix, the index is not allowed to become negative and the Ceph Object Gateway no longer crashes.

(BZ#2155894)

Variable access no longer causes undefined program behavior

Previously, a Coverity scan identified two cases where variables could be used after a move, potentially causing undefined program behavior.

With this fix, variable access is fixed and the potential fault can no longer occur.

(BZ#2155916)

4.6. Multi-site Ceph Object Gateway

Roles and role policy now transparently replicate when multi-site is configured

Previously, the logic to replicate S3 roles and role policies was not implemented in the Ceph Object Gateway, and therefore roles and role policies created in any zone of a multi-site replicated setup were not transparently replicated to other zones.

With this fix, role and role policy replication are implemented and they transparently replicate when multi-site is configured.

(BZ#2136771)

4.7. RADOS

Slow progress and high CPU utilization during backfill is resolved

Previously, the worker thread with the smallest index in an OSD shard would return to the main worker loop instead of waiting until an item could be scheduled from the mClock queue or until it was notified. This resulted in a busy loop and high CPU utilization.

With this fix, the worker thread with the smallest thread index reacquires the appropriate lock and waits until it is notified or until the time period indicated by the mClock scheduler elapses. The worker thread now waits until an item can be scheduled from the mClock queue, or until it is notified, before returning to the main worker loop, which eliminates the busy loop and solves the high CPU utilization issue.

(BZ#2114612)

Renaming large objects no longer fails when using temporary credentials returned by STS

Previously, due to incorrect permission evaluation of IAM policies, renaming large objects would fail when temporary credentials returned by STS were used.

With this fix, IAM policies are correctly evaluated when temporary credentials returned by STS are used to rename large objects.

(BZ#2166572)

Small writes are now deferred

Previously, Ceph decided whether to defer writes based on whole allocation units. When the allocation unit was large, such as 64 K, no small write was eligible for deferring.

With this update, deferred writes operate on disk blocks, so small writes are deferred even when large allocation units are in use.

(BZ#2107406)

The Ceph Monitor no longer crashes after reducing the number of monitors

Previously, when the user reduced the number of monitors in the quorum using the ceph orch apply mon NUMBER command, cephadm would remove a monitor before shutting it down. This triggered an assertion because Ceph assumes that a monitor shuts down before it is removed.

With this fix, a sanity check is added to handle the case where the current rank of the monitor is greater than or equal to the quorum rank. Because the monitor no longer exists in the monitor map, its peers do not ping it, as its address no longer exists. As a result, the assertion is not triggered if the monitor is removed before shutdown.

(BZ#1945266)

The Ceph Manager check that deals with the initial service map is now relaxed

Previously, when upgrading a cluster, the Ceph Manager would receive several service_map versions from the previously active Ceph manager. This caused the manager daemon to crash, due to an incorrect check in the code, when the newly activated manager received a map with a higher version sent by the previously active manager.

With this fix, the check in the Ceph Manager that deals with the initial service map is relaxed to correctly check service maps and no assertion occurs during the Ceph Manager failover.

(BZ#1984881)

The ceph --help command now shows that yaml formatters are only valid for ceph orch commands

Previously, the lack of specification in the ceph --help output implied that the yaml formatter option was valid for any ceph command, including the ceph config dump command.

With this fix, the output of the ceph --help command shows that yaml formatters are only valid for the ceph orch commands.
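
For example, the yaml format applies to the orchestrator commands, while other commands use formats such as json; the commands below are illustrative:

    # yaml output is supported for the orchestrator commands.
    ceph orch ls --format yaml
    ceph orch ps --format yaml

    # Other commands, such as config dump, use formats such as json instead.
    ceph config dump --format json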

(BZ#2040709)

Corrupted dups entries of a PG Log can be removed by off-line and on-line trimming

Previously, trimming of PG log dups entries could be prevented during the low-level PG split operation, which is used by the PG autoscaler with far higher frequency than by a human operator. Stalling the trimming of dups resulted in significant memory growth of the PG log, leading to OSD crashes as the OSD ran out of memory. Restarting an OSD did not solve the problem because the PG log is stored on disk and reloaded into RAM on startup.

With this fix, both off-line trimming, using the ceph-objectstore-tool command, and on-line trimming, within the OSD, can remove the corrupted dups entries of a PG log that jammed the on-line trimming machinery and were responsible for the memory growth. A debug improvement is also implemented that prints the number of dups entries to the OSD log to help future investigations.

(BZ#2093106)

The starts message is added to notify that scrub or deep-scrub process has begun

Previously, users were not able to determine when the scrubbing process started for a placement group (PG) because the starts message was missing from the cluster log. This made it difficult to calculate the time taken to scrub or deep-scrub a PG.

With this fix, the scrub starts or the deep-scrub starts message appears to notify the user that scrub or deep-scrub process has begun for the PG.

(BZ#2094955)

The autoscale-status command no longer displays the NEW PG_NUM value if PG-autoscaling is disabled

Previously, the autoscale-status command would show the NEW PG_NUM value even though PG autoscaling was not enabled. This could mislead the end user into thinking that the PG autoscaler had applied the NEW PG_NUM value to the pool, which was not the case.

With this fix, if the noautoscale flag is set, the NEW PG_NUM value is not shown in the autoscale-status command output.
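
For example, with the global noautoscale flag set, the NEW PG_NUM column is no longer populated; the flag controls shown below are the standard commands:

    # Disable PG autoscaling cluster-wide.
    ceph osd pool set noautoscale

    # The NEW PG_NUM column is no longer shown for the pools.
    ceph osd pool autoscale-status

    # Re-enable autoscaling when needed.
    ceph osd pool unset noautoscale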

(BZ#2099193)

Users can remove cloned objects after upgrading a cluster

Previously, after upgrading a cluster from Red Hat Ceph Storage 4 to Red Hat Ceph Storage 5, removing snapshots of objects created in earlier versions would leave clones that could not be removed. This was because the SnapMapper keys were wrongly converted.

With this fix, the SnapMapper legacy conversion is updated to match the new key format, and cloned objects from earlier versions of Ceph can now be removed after an upgrade.

(BZ#2107404)

The ceph daemon heap stats command now returns required usage details for daemon

Previously, the ceph daemon osd.x heap stats command would return empty output instead of the current heap usage for a Ceph daemon. Consequently, users were compelled to use the ceph tell heap stats command to get the desired heap usage.

With this fix, the ceph daemon heap stats command returns heap usage information for a Ceph daemon, similar to the output of the ceph tell command.
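
For example, both of the following now return the same kind of heap usage information for an OSD; the daemon ID is illustrative:

    # Query through the daemon's local admin socket on the host running osd.0.
    ceph daemon osd.0 heap stats

    # Query the same information through the cluster.
    ceph tell osd.0 heap stats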

(BZ#2119101)

The Prometheus metrics now reflect the correct Ceph version for all Ceph Monitors whenever requested

Previously, the Prometheus metrics reported mismatched Ceph versions for Ceph Monitors when a monitor was upgraded. As a result, the active Ceph Manager daemon needed to be restarted to resolve the inconsistency. Additionally, the Ceph Manager updated the monitor metadata only through handle_mon_map, which is triggered when monitors are added to or removed from the cluster, or when the active Ceph Manager restarts or fails over.

With this fix, instead of relying on fetching monitor metadata with the ceph mon metadata command, the monitor explicitly sends metadata update requests, including its metadata, to the Ceph Manager.

(BZ#2121265)

The correct set of replicas are used for remapped placement groups

Previously, for remapped placement groups, the wrong set of replicas would be queried for the scrub information, causing the scrub process to fail after identifying mismatches that did not exist.

With this fix, the correct set of replicas are now queried.

(BZ#2130666)

The targeted rank_removed no longer gets stuck in live_pinging and dead_pinging states

Previously, in some cases, the paxos_size of the monitor map would be updated before the rank of the monitor was changed. For example, paxos_size would be reduced from 5 to 4 while the highest rank of the monitors was still 4, so the old code would skip deleting the rank from the dead_pinging set. This caused the targeted rank to remain in dead_pinging forever, which in turn caused strange peer_tracker scores with election strategy 3.

With this fix, the case where rank_removed == paxos_size() is handled by erasing the targeted rank_removed from both the live_pinging and dead_pinging sets, and the rank no longer gets stuck forever in either of these sets.

(BZ#2142143)

Ceph Monitors are not stuck during failover of a site

Previously, the removed_ranks variable did not discard its content on every update of the monitor map. Therefore, replacing monitors in a 2-site stretch cluster and failing over one of the sites would cause connection scores, including the ranks associated with the scores, to become inconsistent.

Inconsistent connection scores would cause a deadlock during the monitor election period, which resulted in Ceph becoming unresponsive. Once this happened, there was no way for the monitor rank associated with the connection score to correct itself.

With this fix, the removed_ranks variable gets cleared with every update of the monitor map. Monitors are no longer stuck in the election period and Ceph no longer becomes unresponsive when replacing monitors and failing over a site. Moreover, there is a way to manually force the connection scores to correct themselves with the ceph daemon mon.NAME connection scores reset command.

(BZ#2142983)

Users are now able to set the replica size to 1

Previously, users were unable to set the pool size to 1. The check_pg_num() function would incorrectly calculate the projected placement group number of the pool, which resulted in an underflow. Because of the false result, it appeared that the pg_num was larger than the maximum limit.

With this fix, the recent check_pg_num() function edits are reverted, the calculation now works properly without resulting in an underflow, and users are able to set the replica size to 1.

(BZ#2153654)

Ceph cluster issues a health warning if the require-osd-release flag is not set to the appropriate release after a cluster upgrade

Previously, the logic in the code that detects a require-osd-release flag mismatch after an upgrade was inadvertently removed during a code refactoring effort. Because the warning was not raised in the ceph -s output after an upgrade, any change made to the cluster without setting the flag to the appropriate release resulted in issues, such as placement groups (PGs) stuck in certain states, excessive Ceph process memory consumption, and slow requests, among many others.

With this fix, the Ceph cluster issues a health warning if the require-osd-release flag is not set to the appropriate release after a cluster upgrade.
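
After an upgrade, the flag can be checked and, if needed, set manually; the release name below is illustrative and depends on the version the cluster was upgraded to:

    # Check the currently required OSD release.
    ceph osd dump | grep require_osd_release

    # Set the flag to the release that the cluster was upgraded to.
    ceph osd require-osd-release quincy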

(BZ#1988773)

4.8. NFS Ganesha

The NFS-Ganesha package is based on 4.0.8 version

With this release, the nfs-ganesha package is now based on the upstream version 4.0.8, which provides a number of bug fixes and enhancements from the previous version.

(BZ#2121236)