Alertmanager shows all targets are down in the openshift-monitoring project

Solution Verified

Issue

  • Alertmanager and Prometheus show all targets as down in the openshift-monitoring project even though all pods are running.
Alerts: 
ClusterMonitoringOperatorDown (1 active)
AlertmanagerDown (1 active)
KubeAPIDown (1 active)
KubeControllerManagerDown (1 active)
KubeSchedulerDown (1 active)
KubeStateMetricsDown (1 active)
NodeExporterDown (1 active)
KubeletDown (1 active)
PrometheusDown (1 active)
PrometheusOperatorDown (1 active)

Pod status:
$ oc get pods -n openshift-monitoring
NAME                                           READY     STATUS    RESTARTS   AGE
alertmanager-main-0                            3/3       Running   0          21d
alertmanager-main-1                            3/3       Running   0          21d
alertmanager-main-2                            3/3       Running   0          21d
cluster-monitoring-operator-54f68d68db-sqrxr   1/1       Running   0          21d
grafana-859cf9cc4-v97q2                        2/2       Running   0          8d
kube-state-metrics-6785bdd8cb-cszk5            3/3       Running   0          21d
node-exporter-h8xgd                            2/2       Running   0          21d
node-exporter-l96z8                            2/2       Running   0          21d
node-exporter-lwz2z                            2/2       Running   0          21d
node-exporter-mcl8p                            2/2       Running   0          21d
node-exporter-nf8k8                            2/2       Running   0          21d
node-exporter-rhxlt                            2/2       Running   0          21d
node-exporter-s7h2c                            2/2       Running   0          21d
node-exporter-xt46n                            2/2       Running   0          21d
node-exporter-xzxvf                            2/2       Running   0          21d
prometheus-k8s-0                               4/4       Running   0          8d
prometheus-k8s-1                               4/4       Running   0          8d
prometheus-operator-dbb697d4f-l6tm2            1/1       Running   0          21d
  • The Prometheus log shows the following error: WAL log samples: log series: write /prometheus/wal/003772: transport endpoint is not connected.
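
To confirm which Prometheus pods report the error quoted above, the pod logs can be filtered for the message text. The following is a minimal sketch, not part of the original solution: the helper name find_wal_disconnects is hypothetical, and the container name "prometheus" used in the commented oc command is an assumption about the prometheus-k8s pods.

```shell
#!/bin/sh
# Hypothetical helper: reads log text on stdin and prints only the lines
# that contain the error message quoted in the Issue section.
find_wal_disconnects() {
    # -F matches the message as a fixed string, not a regular expression.
    grep -F 'transport endpoint is not connected'
}

# Example usage (requires a logged-in oc session; the container name
# "prometheus" is an assumption):
#   for pod in prometheus-k8s-0 prometheus-k8s-1; do
#       echo "== $pod"
#       oc logs "$pod" -c prometheus -n openshift-monitoring | find_wal_disconnects
#   done
```

If a pod prints matching lines, it is writing to /prometheus/wal while hitting this error; a pod that prints nothing did not log the message in the retrieved log window.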

Environment

  • Red Hat OpenShift Container Platform 3.x
    • 3.11.232
  • Red Hat Gluster Storage
