Chapter 3. Bug fixes

This section describes notable bug fixes introduced in Red Hat OpenShift Container Storage 4.6.

MGR pod restarts even if the MONs are down

Previously, when the nodes restarted, the MGR pod could get stuck in the pod initialization state, which made it impossible to create new persistent volumes (PVs). With this update, the MGR pod restarts even if the MONs are down.

(BZ#1990031)

Old OSD pods no longer stay in Terminating state after disk replacement

Previously, an old OSD pod sometimes remained in the Terminating state after the disk replacement procedure. With this update, if the rook-ceph-osd pod is in the Terminating state, you can use the force option to delete the pod.
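
As an illustration of the force option, the following Python sketch builds (and can optionally run) the delete command. The helper names, the example pod name, and the default openshift-storage namespace are assumptions made for illustration; `--force` and `--grace-period=0` are standard `oc delete` flags.

```python
import subprocess


def force_delete_command(pod_name, namespace="openshift-storage"):
    """Return the oc command that force-deletes a pod stuck in Terminating."""
    return [
        "oc", "delete", "pod", pod_name,
        "-n", namespace,
        "--grace-period=0", "--force",
    ]


def force_delete_pod(pod_name, namespace="openshift-storage"):
    """Run the command; needs the oc CLI and a logged-in cluster session."""
    subprocess.run(force_delete_command(pod_name, namespace), check=True)


if __name__ == "__main__":
    # Hypothetical pod name: substitute the name of the stuck rook-ceph-osd pod.
    print(" ".join(force_delete_command("rook-ceph-osd-0-example")))
```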

MAX HPA value exceeding 1 no longer triggers an alert

In previous versions of Red Hat OpenShift Container Storage, the autoscaling feature for pods was not available. Therefore, an alert was triggered if the MAX HPA value was greater than 1. With this update, the autoscaling feature is enabled and the alert is no longer triggered.

(BZ#1836299)

ceph-mgr no longer causes errors during requests

Previously, certain ceph-mgr modules (fs) always connected to the MONs that were passed in when the ceph-mgr pod was first created. When the MON endpoints changed, these modules failed to connect to the Red Hat Ceph Storage cluster for requests such as provisioning and staging CephFS volumes, which caused errors. With this update, ceph-mgr keeps its MON endpoints updated as they change instead of relying only on the initial MON addresses passed during pod creation, so ceph-mgr operations continue to work as expected.

(BZ#1858195)
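
The difference between caching the initial MON addresses and following them as they change can be sketched in a few lines of Python. This is a toy model, not ceph-mgr code: the `MonClient` class, the `get_endpoints` callable, and the `is_reachable` probe are all hypothetical names introduced for illustration.

```python
class MonClient:
    """Toy client that resolves MON endpoints at connect time."""

    def __init__(self, get_endpoints):
        # get_endpoints is a callable that returns the *current* MON addresses.
        # Storing the callable (not its first result) avoids a stale cache.
        self._get_endpoints = get_endpoints

    def connect(self, is_reachable):
        """Return the first reachable endpoint from the current list."""
        for endpoint in self._get_endpoints():
            if is_reachable(endpoint):
                return endpoint
        raise ConnectionError("no reachable MON endpoint")
```

Because the endpoint list is read on every `connect` call, a client created before the MON endpoints changed still finds the new addresses afterwards.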

RGW endpoints no longer run in http mode

Previously, the external Python script always checked whether the given RGW endpoint was reachable, but it did not support TLS-enabled https URLs. This compromised security because users were forced to run RGW endpoints in http mode. The external script now includes a reachability check for https URLs.

(BZ#1878853)
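
A reachability check that accepts both http and https URLs might look like the following minimal sketch. It is not the external script itself: the function name, the `timeout` default, and the optional `ca_file` parameter are assumptions, and only the Python standard library is used.

```python
import ssl
import urllib.error
import urllib.request
from urllib.parse import urlparse


def rgw_endpoint_reachable(url, timeout=5.0, ca_file=None):
    """Return True if the endpoint answers over http or https."""
    scheme = urlparse(url).scheme
    if scheme not in ("http", "https"):
        raise ValueError(f"unsupported scheme: {scheme!r}")

    # For https, verify the server certificate; ca_file can point to a
    # custom CA bundle (None means the system default trust store).
    context = ssl.create_default_context(cafile=ca_file) if scheme == "https" else None

    try:
        with urllib.request.urlopen(url, timeout=timeout, context=context):
            return True
    except urllib.error.HTTPError:
        # The server responded with an HTTP error status, so it is reachable.
        return True
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, timeout, or TLS handshake failure.
        return False
```

Treating an HTTP error status as reachable is a design choice in this sketch: an unauthenticated request to an object gateway can be rejected while the endpoint itself is up.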

Broken links removed from UI

Previously, there were broken links in the UI caused by outdated Multicloud Object Gateway documentation. These links have been removed, and all information is now covered in the OpenShift Container Storage documentation.

(BZ#1881398)

OpenShift Container Storage pods are no longer being scheduled on nodes not labeled for OpenShift Container Storage

Previously, some OpenShift Container Storage pods that belonged on labeled nodes were scheduled on nodes not labeled for OpenShift Container Storage. This fix adds the proper NodeAffinity to those pods, which resolves the issue.

(BZ#1883828)
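
To illustrate how a required node affinity restricts scheduling, the sketch below models a node affinity rule as a Python dictionary and evaluates it against node labels. The label key is an assumption (the text above does not name it), and the matcher is deliberately simplified: it handles only the `Exists` and `In` operators.

```python
# Assumed label key for OpenShift Container Storage nodes.
OCS_NODE_LABEL = "cluster.ocs.openshift.io/openshift-storage"

NODE_AFFINITY = {
    "requiredDuringSchedulingIgnoredDuringExecution": {
        "nodeSelectorTerms": [
            {"matchExpressions": [{"key": OCS_NODE_LABEL, "operator": "Exists"}]}
        ]
    }
}


def _expression_matches(expr, labels):
    key, operator = expr["key"], expr["operator"]
    if operator == "Exists":
        return key in labels
    if operator == "In":
        return labels.get(key) in expr.get("values", [])
    raise NotImplementedError(operator)


def node_matches(node_labels, affinity=NODE_AFFINITY):
    """Return True if a node with these labels satisfies the required affinity."""
    required = affinity["requiredDuringSchedulingIgnoredDuringExecution"]
    # Selector terms are ORed; expressions within one term are ANDed.
    return any(
        all(_expression_matches(e, node_labels) for e in term["matchExpressions"])
        for term in required["nodeSelectorTerms"]
    )
```

With this rule, `node_matches({OCS_NODE_LABEL: ""})` is true for a labeled node, while a node that carries only other labels does not match.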

CSI driver and other resources no longer disappear unexpectedly

Previously, there was an invalid owner reference in Rook on the CSI driver. Because of this, OpenShift Container Platform periodically and incorrectly garbage collected the CSI driver and other resources in the openshift-storage namespace, which caused resources to disappear unexpectedly. The invalid owner reference in Rook to the CSI driver has been removed, and the CSI driver and other resources no longer disappear.

(BZ#1884318)
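
The text above does not say what made the owner reference invalid. As general background, Kubernetes only accepts owner references where a cluster-scoped dependent has a cluster-scoped owner, and a namespaced owner lives in the same namespace as its dependent. The following sketch encodes that rule; the function name and the convention of using `None` for cluster-scoped objects are assumptions made for illustration.

```python
def owner_reference_valid(dependent_namespace, owner_namespace):
    """Check the Kubernetes scoping rule for owner references.

    A namespace of None means the object is cluster-scoped.
    """
    if owner_namespace is None:
        # Cluster-scoped owners are allowed for any dependent.
        return True
    # A namespaced owner requires a dependent in the same namespace;
    # a cluster-scoped dependent (None) can never satisfy this.
    return dependent_namespace == owner_namespace
```

For example, a cluster-scoped object that names an owner in the openshift-storage namespace fails this check, which is the kind of reference the garbage collector cannot resolve correctly.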

MON PDBs are now reconciled allowing node drains

Previously, the reconciler for the MON PodDisruptionBudget (PDB) was static. It created the PDB only once, based on the MON count, and did not update it if the MON count changed. In OpenShift Container Storage versions 4.3 and 4.4, the default MON count was increased to 5 when the cluster had 5 nodes. In OpenShift Container Storage versions 4.5 and later, the MON count was kept at 3, regardless of the number of nodes. After an upgrade from OpenShift Container Storage version 4.3 or 4.4 to 4.5 or later, ALLOWED DISRUPTIONS became 0, which prevented nodes from draining. With this update, the MON PDB reconciler creates a new PDB for MONs if the MON count changes. Therefore, ALLOWED DISRUPTIONS is always 1, which allows node drains.

(BZ#1888713)
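
The arithmetic behind the stale PDB can be sketched as follows. Kubernetes computes the allowed disruptions of a budget with a minimum-available setting as the number of healthy pods minus that minimum, floored at 0. The sizing rule in `min_available_for` is an assumption chosen to match the statement that ALLOWED DISRUPTIONS is always 1; the actual Rook values are not given in the text above.

```python
def allowed_disruptions(healthy_mons, min_available):
    """Allowed disruptions for a PDB that specifies a minimum available count."""
    return max(0, healthy_mons - min_available)


def min_available_for(mon_count):
    # Assumption: the budget is sized to permit exactly one MON disruption.
    return mon_count - 1


# Stale PDB: created for 5 MONs, never updated after the count dropped to 3.
stale_min_available = min_available_for(5)
print(allowed_disruptions(3, stale_min_available))   # 0, so node drains are blocked

# Reconciled PDB: recreated for the current MON count of 3.
print(allowed_disruptions(3, min_available_for(3)))  # 1, so node drains proceed
```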

No issues when setting public access policy to a bucket

Previously, a translation issue prevented the desired policy from being set correctly when setting the public access policy on a bucket. This translation issue has been fixed, and the desired policy is now set correctly so that public access can be configured.

(BZ#1889683)
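
For reference, a public read policy in the standard S3 bucket policy format can be built as in the sketch below. The function name and the bucket name are hypothetical, and the exact principal format that a given object gateway expects is an assumption here; check it against the product documentation before use.

```python
import json


def public_read_policy(bucket_name):
    """Return an S3-style bucket policy that allows anonymous object reads."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "PublicRead",
                "Effect": "Allow",
                "Principal": "*",  # anyone, including unauthenticated callers
                "Action": ["s3:GetObject"],
                "Resource": [f"arn:aws:s3:::{bucket_name}/*"],
            }
        ],
    }


if __name__ == "__main__":
    # Hypothetical bucket name, used only to show the resulting document.
    print(json.dumps(public_read_policy("example-bucket"), indent=2))
```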