Fresh OpenShift 4.x cluster fails to recover after its nodes are stopped within 24 hours of deployment
Issue
- After a recently deployed OpenShift 4.19 cluster is powered down and then powered up again, the cluster does not work properly. Logs cannot be retrieved from pods, pods are stuck in the Terminating state, and no new pods are scheduled. However, the API server is up.
- Pending CSRs (certificate signing requests) are seen and need manual approval. However, the cluster still does not work after the certificates are approved.
- Although the Kube API server is up and the nodes are listed as Ready, any action that requires a connection to the kubelet fails, such as oc logs, oc rsh, or oc debug.
- Restarting pods or creating new pods does not work either.
- The nodes were stopped less than 24 hours after the cluster was deployed.
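
The pending CSR symptom described above can be confirmed from the command line. The following is a minimal sketch, not a resolution: it assumes the oc CLI is logged in with cluster-admin privileges and that the CONDITION field is the last column printed by oc get csr. The function names pending_csrs and approve_pending_csrs are hypothetical helpers introduced only for illustration. As noted above, approving the requests may not be enough to restore the cluster.

```shell
# Print the names of CSRs whose CONDITION column (assumed to be the last one) is Pending.
pending_csrs() {
  oc get csr --no-headers | awk '$NF == "Pending" { print $1 }'
}

# Approve each pending CSR by name; does nothing when no CSR is pending.
approve_pending_csrs() {
  pending_csrs | while read -r name; do
    oc adm certificate approve "$name"
  done
}
```

Running pending_csrs first lists what would be approved. New requests can appear after a first round of approvals, so the check may need to be repeated.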
Environment
- Red Hat OpenShift Container Platform 4.x