Cluster upgrade failed as monitoring operator is in degraded state with alertmanager-main-1 and alertmanager-main-2 pods crashlooing.

Solution Verified - Updated -

Issue

  • Cluster upgrade failed as monitoring operator is in degraded state with alertmanager-main-1 and alertmanager-main-2 pods crashlooing.
# oc get pods -n openshift-monitoring
NAME                                           READY   STATUS             RESTARTS   AGE
alertmanager-main-0                            3/3     Running            0          50d
alertmanager-main-1                            2/3     CrashLoopBackOff   13         46m
alertmanager-main-2                            2/3     CrashLoopBackOff   16         58m
...
  • monitoring operator is degraded with below error:
Failed to rollout the stack. Error: running task Updating Alertmanager failed: waiting for Alertmanager object changes failed: waiting for Alertmanager: expected 3 replicas, updated 0 and available 1
  • alertmanager-main-1 pod is not starting with below error:
# oc logs alertmanager-main-1  -c alertmanager-proxy
2020/07/25 07:58:52 reverseproxy.go:437: http: proxy error: dial tcp [::1]:9093: connect: connection refused
2020/07/25 07:58:56 reverseproxy.go:437: http: proxy error: dial tcp [::1]:9093: connect: connection refused
2020/07/25 07:58:56 reverseproxy.go:437: http: proxy error: dial tcp [::1]:9093: connect: connection refused
...

Environment

  • Red Hat OpenShift Container Platform
    • Upgrade from 4.4.6 to 4.4.8

Subscriber exclusive content

A Red Hat subscription provides unlimited access to our knowledgebase of over 48,000 articles and solutions.

Current Customers and Partners

Log in for full access

Log In