A faulty ServiceMonitor definition in a user-defined project causes Prometheus to fail to load its configuration in OpenShift Container Platform 4
Issue
- A misconfigured ServiceMonitor custom resource (CR) causes the user-workload monitoring Prometheus to fail to reload its configuration.
- The following ServiceMonitor causes a configuration reload error in the user-defined projects monitoring stack:

    apiVersion: monitoring.coreos.com/v1
    kind: ServiceMonitor
    metadata:
      name: python-metrics
      namespace: project-101
    spec:
      endpoints:
      - interval: 60s
        port: http
        scrapeTimeout: 120s
      jobLabel: app.kubernetes.io/name
      selector:
        matchLabels:
          app: httpd

- The user-defined projects Prometheus reports the following error when an invalid ServiceMonitor definition is created:

    level=info ts=2022-01-06T11:00:07.928Z caller=main.go:986 msg="Loading configuration file" filename=/etc/prometheus/config_out/prometheus.env.yaml
    level=error ts=2022-01-06T11:00:07.929Z caller=main.go:763 msg="Error reloading config" err="couldn't load configuration (--config.file=\"/etc/prometheus/config_out/prometheus.env.yaml\"): parsing YAML file /etc/prometheus/config_out/prometheus.env.yaml: scrape timeout greater than scrape interval for scrape config with job name \"serviceMonitor/project-101/python-metrics/0\""
    level=info ts=2022-01-06T11:00:12.928Z caller=main.go:986 msg="Loading configuration file" filename=/etc/prometheus/config_out/prometheus.env.yaml
    level=error ts=2022-01-06T11:00:12.929Z caller=main.go:763 msg="Error reloading config" err="couldn't load configuration (--config.file=\"/etc/prometheus/config_out/prometheus.env.yaml\"): parsing YAML file /etc/prometheus/config_out/prometheus.env.yaml: scrape timeout greater than scrape interval for scrape config with job name \"serviceMonitor/project-101/python-metrics/0\""

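The error message points at the cause: in the ServiceMonitor shown above, scrapeTimeout (120s) is greater than interval (60s), and Prometheus rejects any scrape configuration where the timeout exceeds the interval. The following is a minimal sketch of a check that could flag such endpoints before a ServiceMonitor is applied. It is an illustration, not part of OpenShift or the Prometheus Operator: the function names duration_ms and invalid_endpoints are hypothetical, the ServiceMonitor is assumed to be already loaded into a Python dict, and endpoints that omit either field are skipped because the effective values would then come from defaults this sketch does not know about.

```python
import re

# Duration units accepted by Prometheus, expressed in milliseconds.
_UNIT_MS = {
    "ms": 1,
    "s": 1_000,
    "m": 60_000,
    "h": 3_600_000,
    "d": 86_400_000,
    "w": 604_800_000,
    "y": 31_536_000_000,  # Prometheus treats a year as 365 days
}
_TOKEN = re.compile(r"(\d+)(ms|s|m|h|d|w|y)")


def duration_ms(text):
    """Convert a duration such as '60s' or '1m30s' to milliseconds."""
    position, total = 0, 0
    for match in _TOKEN.finditer(text):
        if match.start() != position:
            raise ValueError(f"invalid duration: {text!r}")
        total += int(match.group(1)) * _UNIT_MS[match.group(2)]
        position = match.end()
    if not text or position != len(text):
        raise ValueError(f"invalid duration: {text!r}")
    return total


def invalid_endpoints(service_monitor):
    """Return the indexes of endpoints whose scrapeTimeout exceeds interval.

    Hypothetical helper: endpoints missing either field are skipped.
    """
    bad = []
    endpoints = service_monitor.get("spec", {}).get("endpoints", [])
    for index, endpoint in enumerate(endpoints):
        interval = endpoint.get("interval")
        timeout = endpoint.get("scrapeTimeout")
        if interval and timeout and duration_ms(timeout) > duration_ms(interval):
            bad.append(index)
    return bad


# The ServiceMonitor from this article, as a plain dict.
monitor = {
    "apiVersion": "monitoring.coreos.com/v1",
    "kind": "ServiceMonitor",
    "metadata": {"name": "python-metrics", "namespace": "project-101"},
    "spec": {
        "endpoints": [
            {"interval": "60s", "port": "http", "scrapeTimeout": "120s"}
        ],
        "jobLabel": "app.kubernetes.io/name",
        "selector": {"matchLabels": {"app": "httpd"}},
    },
}

print(invalid_endpoints(monitor))  # [0]
```

An empty list means that no endpoint in the checked ServiceMonitor sets a scrapeTimeout larger than its interval.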
Environment
- Red Hat OpenShift Container Platform (RHOCP) 4