Alertmanager shows all targets are down in the openshift-monitoring project
Issue
- Alertmanager and Prometheus show all targets as down in the openshift-monitoring project even though all pods are running.
Alerts:
ClusterMonitoringOperatorDown (1 active)
AlertmanagerDown (1 active)
KubeAPIDown (1 active)
KubeControllerManagerDown (1 active)
KubeSchedulerDown (1 active)
KubeStateMetricsDown (1 active)
NodeExporterDown (1 active)
KubeletDown (1 active)
PrometheusDown (1 active)
PrometheusOperatorDown (1 active)
Pod status:
$ oc get pods -n openshift-monitoring
NAME                                           READY   STATUS    RESTARTS   AGE
alertmanager-main-0                            3/3     Running   0          21d
alertmanager-main-1                            3/3     Running   0          21d
alertmanager-main-2                            3/3     Running   0          21d
cluster-monitoring-operator-54f68d68db-sqrxr   1/1     Running   0          21d
grafana-859cf9cc4-v97q2                        2/2     Running   0          8d
kube-state-metrics-6785bdd8cb-cszk5            3/3     Running   0          21d
node-exporter-h8xgd                            2/2     Running   0          21d
node-exporter-l96z8                            2/2     Running   0          21d
node-exporter-lwz2z                            2/2     Running   0          21d
node-exporter-mcl8p                            2/2     Running   0          21d
node-exporter-nf8k8                            2/2     Running   0          21d
node-exporter-rhxlt                            2/2     Running   0          21d
node-exporter-s7h2c                            2/2     Running   0          21d
node-exporter-xt46n                            2/2     Running   0          21d
node-exporter-xzxvf                            2/2     Running   0          21d
prometheus-k8s-0                               4/4     Running   0          8d
prometheus-k8s-1                               4/4     Running   0          8d
prometheus-operator-dbb697d4f-l6tm2            1/1     Running   0          21d
- The Prometheus log shows the following error: WAL log samples: log series: write /prometheus/wal/003772: transport endpoint is not connected
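To confirm that a cluster is hitting the same symptom, the Prometheus log can be searched for this error. The following is a minimal sketch, not a command taken from this article: the function name has_wal_transport_error is hypothetical, and the commented usage line assumes the Prometheus container inside the prometheus-k8s pods is named "prometheus".

```shell
#!/bin/sh
# Hypothetical helper (illustration only): reads log text on stdin and
# returns 0 if any line reports "transport endpoint is not connected"
# for a file under /prometheus/wal/, otherwise returns 1.
has_wal_transport_error() {
    grep -q -E '/prometheus/wal/.*transport endpoint is not connected'
}

# Usage against a live cluster (assumption: container name "prometheus"):
#   oc logs prometheus-k8s-0 -c prometheus -n openshift-monitoring \
#       | has_wal_transport_error && echo "WAL write error found"
```

A match suggests that Prometheus can no longer write to the volume mounted at /prometheus, which would be consistent with every target being reported as down while the pods themselves stay in the Running state.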
Environment
- Red Hat OpenShift Container Platform 3.x
  - 3.11.232
- Red Hat Gluster Storage