CephFS MDSs are in CrashLoopBackOff (CLBO) due to rook-ceph-mgr-a being unable to perform delete operations - OpenShift Data Foundation (ODF)
Issue
Both MDS pods enter an Error/CrashLoopBackOff state because rook-ceph-mgr-a fails to complete a delete operation. The most likely trigger is a database workload.
NAME                                                              READY   STATUS    RESTARTS   AGE
rook-ceph-mds-ocs-storagecluster-cephfilesystem-a-9b68f854zbw9z   1/2     Running   5          5m30s
rook-ceph-mds-ocs-storagecluster-cephfilesystem-b-c7c87476b7gq6   1/2     Running   5          5m30s
name: mds
ready: false
restartCount: 4
started: false
state:
  waiting:
    message: back-off 1m20s restarting failed container=mds pod=rook-ceph-mds-ocs-storagecluster-cephfilesystem-a-9b68f854zbw9z_openshift-storage(10c20d44-014b-4b62-b9e2-83df4a6d2177)
    reason: CrashLoopBackOff
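The symptom shown above (MDS pods that report fewer ready containers than expected and a growing restart count) can be detected from a saved pod listing. The following is a minimal sketch, not part of the original solution: the function name `unhealthy_mds_pods` is hypothetical, and the input is assumed to be the text output of a pod listing for the openshift-storage namespace (for example from `oc get pods -n openshift-storage`). The sample data is copied from the listing in this article.

```python
# Sample listing copied from the Issue section; in practice this text is
# assumed to come from a pod listing of the openshift-storage namespace.
SAMPLE = """\
NAME READY STATUS RESTARTS AGE
rook-ceph-mds-ocs-storagecluster-cephfilesystem-a-9b68f854zbw9z 1/2 Running 5 5m30s
rook-ceph-mds-ocs-storagecluster-cephfilesystem-b-c7c87476b7gq6 1/2 Running 5 5m30s
"""


def unhealthy_mds_pods(listing: str):
    """Return (name, ready, restarts) for MDS pods that are not fully ready."""
    flagged = []
    for line in listing.splitlines()[1:]:  # skip the header row
        fields = line.split()
        # Ignore short lines and pods that are not MDS pods.
        if len(fields) < 5 or not fields[0].startswith("rook-ceph-mds-"):
            continue
        name, ready, _status, restarts = fields[:4]
        ready_count, total = (int(x) for x in ready.split("/"))
        if ready_count < total:
            flagged.append((name, ready, int(restarts)))
    return flagged


if __name__ == "__main__":
    for name, ready, restarts in unhealthy_mds_pods(SAMPLE):
        print(f"{name}: ready {ready}, restarts {restarts}")
```

If both MDS pods are flagged, as they are for the sample, the cluster matches the symptom described in this article. The per-container status shown above (`reason: CrashLoopBackOff`) then confirms that the `mds` container is the one restarting.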
Environment
Red Hat OpenShift Data Foundation (RHODF) v4.x
Red Hat OpenShift Container Storage (RHOCS) v4.x