Kubelet health check failing with "no such host" messages in machine-config-daemon
Issue
- The cluster is running as expected with all cluster operators available, but KubeletHealthState alerts are firing with the following message:
alertname=KubeletHealthState
endpoint=metrics
err=Get http://localhost:10248/healthz: dial tcp: lookup localhost on 10.4.14.67:53: no such host
instance=10.4.14.67:9001
job=machine-config-daemon
namespace=openshift-machine-config-operator
pod=machine-config-daemon-xxxxx
service=machine-config-daemon
severity=warning
- The following messages are observed in some machine-config-daemon pods:
W0917 11:21:46.203938 1688435 daemon.go:662] Failed kubelet health check: Get http://localhost:10248/healthz: dial tcp: lookup localhost on <node-ip>:53: no such host
W0917 11:21:46.204070 1688435 daemon.go:596] Got an error from auxiliary tools: kubelet health failure threshold reached
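The error text in the messages above follows the pattern "lookup <host> on <server>:<port>: no such host", which shows that the name localhost was sent to a DNS server on port 53. The sketch below is one possible way to pull those fields out of a log line and to check whether localhost resolves on the machine where the script runs. The function names parse_lookup_failure and resolve_locally are hypothetical helpers introduced for illustration, not part of machine-config-daemon, and the regular expression assumes the exact message layout shown above. This is a diagnostic aid only and does not describe a fix.

```python
import re
import socket

# Assumed layout, taken from the messages above:
# "lookup <host> on <server>:<port>: no such host"
LOOKUP_ERROR = re.compile(
    r"lookup (?P<host>\S+) on (?P<server>\S+):(?P<port>\d+): no such host"
)


def parse_lookup_failure(message):
    """Return the host, DNS server and port from a failed-lookup message.

    Returns None when the message does not match the assumed layout.
    """
    match = LOOKUP_ERROR.search(message)
    if match is None:
        return None
    return {
        "host": match.group("host"),
        "server": match.group("server"),
        "port": int(match.group("port")),
    }


def resolve_locally(host="localhost"):
    """Resolve host with the system resolver of the current machine.

    Returns a sorted list of addresses, or an empty list if resolution fails.
    """
    try:
        infos = socket.getaddrinfo(host, None)
    except socket.gaierror:
        return []
    # Each entry is (family, type, proto, canonname, sockaddr);
    # the address is the first element of sockaddr.
    return sorted({info[4][0] for info in infos})


if __name__ == "__main__":
    sample = (
        "Get http://localhost:10248/healthz: dial tcp: "
        "lookup localhost on 10.4.14.67:53: no such host"
    )
    print(parse_lookup_failure(sample))
    print(resolve_locally())
```

An empty list from resolve_locally on an affected node would suggest that localhost is not being answered by the local resolver configuration there, which is consistent with the lookup reaching the DNS server named in the error.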
Environment
- Red Hat OpenShift Container Platform (RHOCP) 4.x