Monitoring gets stuck (and/or duplicates PVCs) upgrading to 4.4 when using local storage

Solution Verified

Issue

  • When local-storage-operator is configured and the cluster is upgraded from 4.3.x to 4.4.x (<= 4.4.8), the upgrade is blocked at the monitoring cluster operator and the newly created PVCs stay in Pending, for example:
$ oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.3.23    True        True          31m     Unable to apply 4.4.6: the cluster operator monitoring has not yet successfully rolled out

$ oc get pvc -n openshift-monitoring
NAME                                       STATUS    VOLUME              CAPACITY   ACCESS MODES   STORAGECLASS    AGE
alertmanager-main-db-alertmanager-main-0   Pending                                                 local-storage   15m
alertmanager-main-db-alertmanager-main-1   Pending                                                 local-storage   15m
alertmanager-main-db-alertmanager-main-2   Pending                                                 local-storage   15m
localpvc-alertmanager-main-0               Bound     local-pv-68cdb92    100Gi      RWO            local-storage   8h
localpvc-alertmanager-main-1               Bound     local-pv-efbad35c   100Gi      RWO            local-storage   8h
localpvc-alertmanager-main-2               Bound     local-pv-98fe334e   100Gi      RWO            local-storage   8h
localpvc-prometheus-k8s-0                  Bound     local-pv-f06c680f   100Gi      RWO            local-storage   8h
localpvc-prometheus-k8s-1                  Bound     local-pv-c8e63d2b   100Gi      RWO            local-storage   8h
prometheus-k8s-db-prometheus-k8s-0         Pending                                                 local-storage   15m
prometheus-k8s-db-prometheus-k8s-1         Pending                                                 local-storage   15m

See the Upgrade stuck section for this case.
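
To spot this symptom quickly, the PVC listing can be filtered for claims that never bind. The following is a minimal sketch, not part of the original solution: `pending_pvcs` is a hypothetical helper name, and it assumes the default tabular output of `oc get pvc`, where the second column is STATUS.

```shell
# Print the names of PVCs whose STATUS column is "Pending".
# Reads the default tabular output of `oc get pvc` on stdin.
# (pending_pvcs is an illustrative helper, not an oc subcommand.)
pending_pvcs() {
  # NR > 1 skips the header row; $1 is NAME, $2 is STATUS.
  awk 'NR > 1 && $2 == "Pending" { print $1 }'
}

# Usage against a live cluster (requires a logged-in oc client):
#   oc get pvc -n openshift-monitoring | pending_pvcs
```

In the example listing above, this would print the five `*-db-*` claims and none of the `localpvc-*` claims, which remain Bound.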

  • When local-storage-operator is configured and the cluster is upgraded from 4.4.x to 4.4.y (<= 4.4.8), the monitoring PVCs are duplicated and the previous data is no longer available, for example:
$ oc get pvc -n openshift-monitoring
NAME                                         STATUS   VOLUME             CAPACITY   ACCESS MODES   STORAGECLASS    AGE
alertmanager-main-db-alertmanager-main-0     Bound    pvc-1cb2c2a5-xxx   20Gi       RWO            ocs-ceph-test   14d
alertmanager-main-db-alertmanager-main-1     Bound    pvc-4c8429fe-xxx   20Gi       RWO            ocs-ceph-test   14d
alertmanager-main-db-alertmanager-main-2     Bound    pvc-5c0f0c9d-xxx   20Gi       RWO            ocs-ceph-test   14d
ocs-alertmanager-claim-alertmanager-main-0   Bound    pvc-0ea759d4-xxx   20Gi       RWO            ocs-ceph-test   5d6h
ocs-alertmanager-claim-alertmanager-main-1   Bound    pvc-29a18e3f-xxx   20Gi       RWO            ocs-ceph-test   5d6h
ocs-alertmanager-claim-alertmanager-main-2   Bound    pvc-d8c70ddc-xxx   20Gi       RWO            ocs-ceph-test   5d6h
ocs-prometheus-claim-prometheus-k8s-0        Bound    pvc-a91ae6f1-xxx   100Gi      RWO            ocs-ceph-test   5d6h
ocs-prometheus-claim-prometheus-k8s-1        Bound    pvc-31bf1991-xxx   100Gi      RWO            ocs-ceph-test   5d6h
prometheus-k8s-db-prometheus-k8s-0           Bound    pvc-04f59982-xxx   100Gi      RWO            ocs-ceph-test   14d
prometheus-k8s-db-prometheus-k8s-1           Bound    pvc-a20075bb-xxx   100Gi      RWO            ocs-ceph-test   14d

See the Duplicated PVCs section for this case.
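
Duplicates can be hard to see in a long listing. The sketch below counts PVCs per monitoring pod; it is an illustration under assumptions, not the documented procedure. `duplicated_monitoring_pvcs` is a hypothetical helper name, and it assumes that each PVC name ends with the pod name (`alertmanager-main-N` or `prometheus-k8s-N`), as in the listings shown here. More than one PVC for the same pod is only a hint that a duplicate was created.

```shell
# List monitoring pods that have more than one PVC, with the PVC count.
# Reads the default tabular output of `oc get pvc` on stdin.
# Assumption: PVC names end with the pod name, as in the listings above.
duplicated_monitoring_pvcs() {
  awk '
    # Skip the header row; extract the trailing pod name from the PVC name.
    NR > 1 && match($1, /(alertmanager-main|prometheus-k8s)-[0-9]+$/) {
      count[substr($1, RSTART, RLENGTH)]++
    }
    END {
      for (pod in count)
        if (count[pod] > 1)
          print pod, count[pod]
    }
  ' | sort
}

# Usage against a live cluster (requires a logged-in oc client):
#   oc get pvc -n openshift-monitoring | duplicated_monitoring_pvcs
```

For the example listing above, every `alertmanager-main-N` and `prometheus-k8s-N` pod would be reported with two PVCs.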


NOTE: See the Root Cause section for more details.

Environment

  • OpenShift Container Platform
    • 4.3.x -> 4.4.x (<= 4.4.8)
    • 4.4.x -> 4.4.y (<= 4.4.8)
