AlertmanagerConfig with missing options causes Alertmanager to crash
Environment
- Red Hat OpenShift Container Platform (RHOCP)
  - 4.10+
Issue
- An AlertmanagerConfig object created in user-defined projects can cause Alertmanager to crash when restarted.
- An incomplete AlertmanagerConfig configuration is allowed to be created without validation.
- Alertmanager pods will be in CrashLoopBackOff state:
  $ oc get pods -n openshift-user-workload-monitoring | grep alert
  NAME                           READY   STATUS             RESTARTS     AGE
  alertmanager-user-workload-0   5/6     CrashLoopBackOff   1 (3s ago)   23s
- Alertmanager pods show an error like the following:
  ts=2023-09-05T12:07:33.449Z caller=coordinator.go:118 level=error component=configuration msg="Loading configuration file failed" file=/etc/alertmanager/config_out/alertmanager.env.yaml err="no global SMTP smarthost set"
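The pod status check above can be scripted. The following is a minimal sketch that assumes the default `oc get pods` column layout (NAME, READY, STATUS, ...); `crashlooping_pods` is a hypothetical helper name, not an `oc` feature.

```shell
# Hypothetical helper: reads `oc get pods` output on stdin and prints the
# names of pods whose STATUS column (third field) is CrashLoopBackOff.
# Assumes the default column layout, i.e. no -A or custom-columns output.
crashlooping_pods() {
    awk '$3 == "CrashLoopBackOff" { print $1 }'
}

# Example usage against a cluster:
#   oc get pods -n openshift-user-workload-monitoring | crashlooping_pods
```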
Resolution
This issue has been reported to Red Hat engineering. It is being tracked in OCPBUGS-18656.
For more information, please open a new support case with Red Hat Support.
Workaround [1]: Add smtp_from and smtp_smarthost to the global section of the Alertmanager configuration, as shown below:
- Print the currently active Alertmanager configuration into the file alertmanager.yaml:
  $ oc -n openshift-user-workload-monitoring get secret alertmanager-user-workload --template='{{ index .data "alertmanager.yaml" }}' | base64 --decode > alertmanager.yaml
- Edit the configuration in alertmanager.yaml:
  "global":
    smtp_from: noreply_uwm@example.com
    smtp_smarthost: smtp.example.com:25
- Apply the new configuration in the file:
  $ oc -n openshift-user-workload-monitoring create secret generic alertmanager-user-workload --from-file=alertmanager.yaml --dry-run=client -o=yaml | oc -n openshift-user-workload-monitoring replace secret --filename=-
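Before applying the edited file, it can help to confirm that both settings actually ended up in the global section. The sketch below is a rough, text-based check and not a YAML parser: it assumes that `global` is a top-level key and that its settings are indented beneath it, one per line. `has_global_smtp` is a hypothetical helper name.

```shell
# Hypothetical helper: succeeds only if the given file defines both
# smtp_from and smtp_smarthost inside its top-level global section.
# Assumption: block-style YAML, one key per line, keys optionally quoted.
has_global_smtp() {
    awk '
        # A non-indented, non-comment line starts a new top-level section.
        /^[^ \t#]/ { in_global = ($0 ~ /^"?global"?:/) }
        # Inside global, record the two indented keys of interest.
        in_global && /^[ \t]+"?smtp_from"?:/      { from = 1 }
        in_global && /^[ \t]+"?smtp_smarthost"?:/ { host = 1 }
        # Exit status 0 only when both keys were seen.
        END { exit !(from && host) }
    ' "$1"
}

# Example usage:
#   has_global_smtp alertmanager.yaml || echo "global SMTP settings missing"
```

A check like this only guards against applying an unedited file; it does not validate the SMTP values themselves.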
Workaround [2]: Add from and smarthost to the emailConfigs section of the AlertmanagerConfig object, as shown below:
$ oc edit AlertmanagerConfig <alertmanagerconfig-name>
spec:
  receivers:
  - name: 'email_receiver'
    emailConfigs:
    - to: 'your-email@example.com'
      from: 'alertmanager@example.com'
      smarthost: 'smtp.example.com:587'
      authUsername: 'your-email@example.com'
      authPassword:
        name: 'smtp-password-secret'
        key: 'password'
Note: smarthost and from should be defined either in the global section of the Alertmanager configuration or in the AlertmanagerConfig object.
Root Cause
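The error appears when an AlertmanagerConfig object is created without the sender address and SMTP smarthost options while smtp_from and smtp_smarthost are also missing from the global section of the Alertmanager configuration.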
Diagnostic Steps
- The following error appears in the Alertmanager pods:
  $ oc logs alertmanager-user-workload-0 -c alertmanager -n openshift-user-workload-monitoring
  ts=2023-09-12T16:42:52.626Z caller=coordinator.go:118 level=error component=configuration msg="Loading configuration file failed" file=/etc/alertmanager/config_out/alertmanager.env.yaml err="no global SMTP smarthost set"
  ts=2023-09-12T16:42:52.626Z caller=cluster.go:690 level=info component=cluster msg="gossip not settled but continuing anyway" polls=0 elapsed=20.800007ms
- The global section doesn't contain smtp_from and smtp_smarthost:
  $ oc -n openshift-user-workload-monitoring get secret alertmanager-user-workload --template='{{ index .data "alertmanager.yaml" }}' | base64 --decode > alertmanager.yaml
  global:
    resolve_timeout: 5m
    http_config:
      follow_redirects: true
    smtp_hello: localhost
    smtp_require_tls: true
  route:
    receiver: Default
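The log check in the first diagnostic step can be reduced to a single match on the error string. This is a minimal sketch; `has_smarthost_error` is a hypothetical helper name, and the pattern is the exact `err=` text from the log lines quoted in this article.

```shell
# Hypothetical helper: reads Alertmanager log lines on stdin and succeeds
# if any of them carries the configuration error described in this article.
has_smarthost_error() {
    grep -q 'err="no global SMTP smarthost set"'
}

# Example usage against a cluster:
#   oc logs alertmanager-user-workload-0 -c alertmanager \
#       -n openshift-user-workload-monitoring | has_smarthost_error \
#       && echo "Alertmanager is failing on a missing SMTP smarthost"
```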