Cluster upgrade stuck as monitoring operator is in degraded state with prometheus-k8s pod in CrashLoopBackOff state
Issue
-
Upgrade is stuck due to the monitoring cluster operator being degraded with the below message:
Failed to rollout the stack. Error: updating prometheus-k8s: waiting for Prometheus object changes failed: waiting for Prometheus openshift-monitoring/k8s: expected 2 replicas, got 0 updated replicas -
Prometheus-k8s pod is in CrashLoopBackOff state with the below log message:
$ oc logs prometheus-k8s-1 -c prometheus caller=main.go:1081 level=error err="opening storage failed: repair corrupted WAL: cannot handle error: open WAL segment: 16033: open /prometheus/wal/00016033: disk quota exceeded"
Environment
- Red Hat OpenShift Container Platform 4.x
Subscriber exclusive content
A Red Hat subscription provides unlimited access to our knowledgebase, tools, and much more.