Fresh OpenShift 4.x cluster fails to recover when its nodes are stopped less than 24 hours after deployment

Solution Verified - Updated -

Issue

  • After a recently deployed OpenShift 4.19 cluster is powered down and then powered up again, the cluster does not work properly: pod logs cannot be retrieved, pods are stuck in the Terminating state, and no new pods are scheduled. The API server, however, is up.
  • Pending CSRs are present and require manual approval, but the cluster still does not work after the certificates are approved.
  • Although the Kube API server is up and the nodes are listed as Ready, any action that requires a connection to the kubelet fails, such as oc logs, oc rsh, or oc debug.
  • Restarting pods or creating new ones does not work either.
  • The nodes were stopped less than 24 hours after the cluster was deployed.
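
To check for the pending CSRs described above, the output of oc get csr can be filtered for requests still awaiting approval. The following is a minimal sketch, not a fix: as noted above, approving the CSRs did not restore the cluster in this scenario. The helper name pending_csr_names is hypothetical, and the sketch assumes the default table output of oc get csr, where the last column is CONDITION.

```shell
#!/usr/bin/env bash
# Sketch: list CSRs that are still waiting for approval.
# pending_csr_names is a hypothetical helper, not part of the oc CLI.
# It reads the table printed by `oc get csr` on stdin and prints the
# NAME of every row whose last column (CONDITION) is exactly "Pending".
# Assumption: default table output, header row starting with "NAME".
pending_csr_names() {
  awk '$1 != "NAME" && $NF == "Pending" { print $1 }'
}

# Example use against a live cluster (requires cluster-admin access):
#   oc get csr | pending_csr_names
#   oc get csr | pending_csr_names | xargs --no-run-if-empty oc adm certificate approve
```

The helper only parses text, so it can be tried on saved oc get csr output without touching the cluster. The approval command is left commented out so that nothing is approved without review.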

Environment

  • OpenShift Container Platform 4.x
