kube-state-metrics and node-exporter TargetDown alerts are shown
Issue
Prometheus collects metrics to monitor the cluster's state. When it detects a problem, Prometheus sends an alert to Alertmanager, which then notifies the user.
In this case, the following alerts are visible in Alertmanager:
alertname = TargetDown
cluster = my-cluster
job = kube-state-metrics
prometheus = openshift-monitoring/k8s
severity = warning
description = 100% of kube-state-metrics targets are down.
summary = Targets are down

alertname = KubeStateMetricsDown
cluster = my-cluster
prometheus = openshift-monitoring/k8s
severity = critical
message = KubeStateMetrics has disappeared from Prometheus target discovery.

alertname = TargetDown
cluster = my-cluster
job = node-exporter
prometheus = openshift-monitoring/k8s
severity = warning
description = 100% of node-exporter targets are down.
summary = Targets are down

alertname = NodeExporterDown
cluster = my-cluster
prometheus = openshift-monitoring/k8s
severity = critical
message = NodeExporter has disappeared from Prometheus target discovery.
These alerts suggest that the node-exporter and kube-state-metrics pods are down in the cluster. However, the output of oc get pods shows every one of these pods in the Running state with all of its containers ready, which indicates that the pods themselves are up.
$ oc get pods -n openshift-monitoring -l 'app in (node-exporter,kube-state-metrics)'
NAME READY STATUS RESTARTS AGE
kube-state-metrics-5fb768889-sj7g2 3/3 Running 0 1d
node-exporter-4j9w5 2/2 Running 0 1d
node-exporter-56px8 2/2 Running 0 1d
node-exporter-wgvbj 2/2 Running 0 1d
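As a quick cross-check of the listing above, the following is a minimal sketch (not part of the original solution) that filters oc get pods output for any pod that is not Running or whose READY count is incomplete. The function name not_ready_pods is hypothetical, and the sketch assumes the default column layout shown above (NAME READY STATUS RESTARTS AGE).

```shell
# Hypothetical helper: reads `oc get pods` output on stdin and prints the name
# of every pod that is not Running or has fewer ready containers than expected.
not_ready_pods() {
  awk 'NR > 1 {                      # skip the header row
    split($2, ready, "/")            # READY column, e.g. "2/2"
    if ($3 != "Running" || ready[1] != ready[2]) print $1
  }'
}

# Usage against a live cluster, with the same selector as above:
# oc get pods -n openshift-monitoring \
#   -l 'app in (node-exporter,kube-state-metrics)' | not_ready_pods
```

Empty output means every listed pod reports Running with all containers ready, which matches the situation described here: the alerts fire even though the pods look healthy.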
Environment
- OpenShift Container Platform 3.11