Alertmanager Pod keeps recreating itself and never deploys in OpenShift 4
Issue
- Among multiple alertmanager-main pods, alertmanager-main-1 keeps recreating itself and never deploys, while alertmanager-main-0 has deployed.
$ oc get sts -n openshift-monitoring alertmanager-main
NAME READY AGE
alertmanager-main 1/2 4d
$ oc get pods -n openshift-monitoring -l alertmanager=main
NAME READY STATUS RESTARTS AGE
alertmanager-main-0 6/6 Running 0 4d
alertmanager-main-1 0/6 Terminating 0 0s
- Re-creating the Alertmanager pod or StatefulSet manually does not improve the situation.
- The kube-controller-manager log shows that the StatefulSet is starting a rolling update of alertmanager-main-1, but the pod is terminated immediately after it is created.
$ omc -n openshift-kube-controller-manager logs kube-controller-manager-master-0.fd-sandbox.mano.local -c kube-controller-manager | tail -n 2
2022-XX-XX 1 stateful_set_control.go:571] "Pod of StatefulSet is terminating for update" statefulSet="openshift-monitoring/alertmanager-main" pod="openshift-monitoring/alertmanager-main-1"
2022-XX-XX 1 event.go:294] "Event occurred" object="openshift-monitoring/alertmanager-main" fieldPath="" kind="StatefulSet" apiVersion="apps/v1" type="Normal" reason="SuccessfulDelete" message="delete Pod alertmanager-main-1 in StatefulSet alertmanager-main successful"
- Also, the prometheus-k8s StatefulSet has not been deployed.
$ oc get sts -n openshift-monitoring prometheus-k8s
No resources found in openshift-monitoring namespace.
- How do I get the Alertmanager and Prometheus-k8s pods to deploy and collect metrics successfully?
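The symptoms listed above can be checked with a small script. The following is a minimal sketch, not an official tool: `sts_not_ready` is a hypothetical helper name, and it assumes the `NAME READY AGE` column layout shown in the `oc get sts` output above.

```shell
#!/bin/sh
# Hypothetical helper (not part of oc): reads `oc get sts` table output on
# stdin and prints the name of every StatefulSet whose READY column (x/y)
# reports fewer ready replicas than desired.
# Assumes the column layout NAME READY AGE, as in the output above.
sts_not_ready() {
  awk 'NR > 1 {                      # skip the header row
    split($2, r, "/")                # r[1] = ready, r[2] = desired
    if (r[1] + 0 < r[2] + 0) print $1
  }'
}

# Usage against a live cluster (requires a logged-in oc client):
#   oc get sts -n openshift-monitoring | sts_not_ready
#   oc get pods -n openshift-monitoring -l alertmanager=main -w
#   oc get events -n openshift-monitoring --sort-by=.lastTimestamp | tail -n 20
```

With the StatefulSet state shown above (`alertmanager-main 1/2`), the helper would print `alertmanager-main`. A StatefulSet that does not exist at all, such as prometheus-k8s here, does not appear in the input and therefore has to be checked separately.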
Environment
- Red Hat OpenShift Container Platform (RHOCP) 4