Worker node fails to rejoin Hosted Cluster after repair on a bare metal cluster with Machine Health Checks disabled
Issue
After a hardware failure and repair (e.g., NIC replacement) of a bare metal worker node in a Hosted Cluster, the re-provisioned node does not successfully join the cluster. The node is booted from its installation ISO and the corresponding host is approved in the management cluster, showing an "Available" status. However, it is never fully integrated as a worker node within the Hosted Cluster.
Environment
- OpenShift Container Platform
- Hosted Control Planes
- Bare Metal Infrastructure
- Machine Health Checks disabled (spec.management.autoRepair: false in the NodePool)
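
For reference, the NodePool setting named in the last item might look like the fragment below. This is a minimal sketch, not taken from the affected environment: the NodePool name, namespace, and cluster name are hypothetical placeholders, and the Agent platform type is an assumption based on the bare metal, ISO-booted hosts described above. Only the autoRepair field comes from this article.

```yaml
# Hypothetical NodePool fragment; metadata and clusterName are placeholders.
apiVersion: hypershift.openshift.io/v1beta1
kind: NodePool
metadata:
  name: example-nodepool        # placeholder name
  namespace: clusters           # placeholder namespace
spec:
  clusterName: example-cluster  # placeholder Hosted Cluster name
  management:
    autoRepair: false           # Machine Health Checks disabled, as in this environment
  platform:
    type: Agent                 # assumed platform type for bare metal hosts
```

With autoRepair set to false, an unhealthy node is not replaced automatically, so a repaired host has to be brought back into the Hosted Cluster manually.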