fluentd pods cannot be scraped by cluster monitoring after upgrade and node failure
Issue
- The OCP console alerting screen shows that none of the fluentd pods can be scraped. The fluentd target is shown as down in the Prometheus UI, with the following error:
"Get \"https://10.x.x.x:00000/metrics\": dial tcp 10.x.x.x:00000: connect: connection refused"
fluentd metrics are not available and alerts are firing because the target is down, whereas the expected behaviour is that Prometheus can scrape the fluentd pod metrics.
The fluentd pods themselves are running:
fluentd-2dbd7 1/1 Running 0
fluentd-8rtxj 1/1 Running 0
fluentd-bn9s5 1/1 Running 0
fluentd-fxhsd 1/1 Running 0
fluentd-kjn29 1/1 Running 0
fluentd-x8cz6 1/1 Running 0
Environment
- OpenShift Container Platform (OCP) 4.6