Pods are not rescheduled when node becomes NotReady
Issue
- During failover testing, bringing a node offline does not cause a pod that was scheduled on that node to reschedule on another node.
- I lost a node, and the pods on the node did not come back up on another node as they should have. This resulted in unavailable services.
- One node has become NotReady, and now there are pods stuck in Terminating state.
- Pods on a node in state NotReady are not rescheduled and stay in Running after 15 minutes.
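To confirm the symptoms listed above, it can help to list which pods are still bound to a NotReady node and what state they report. The following is a minimal sketch, not an official tool. It assumes its inputs are the parsed JSON output of `oc get nodes -o json` and `oc get pods --all-namespaces -o json`. The function names `not_ready_nodes` and `stranded_pods` are hypothetical and chosen only for illustration.

```python
def not_ready_nodes(nodes):
    """Return the names of nodes whose Ready condition is not "True".

    `nodes` is assumed to be the parsed output of `oc get nodes -o json`.
    """
    names = set()
    for node in nodes.get("items", []):
        conditions = node.get("status", {}).get("conditions", [])
        ready = next((c for c in conditions if c.get("type") == "Ready"), None)
        # A missing Ready condition, or a status of "False" or "Unknown",
        # is treated as NotReady here.
        if ready is None or ready.get("status") != "True":
            names.add(node["metadata"]["name"])
    return names


def stranded_pods(nodes, pods):
    """Return (namespace, name, node, state) for pods bound to NotReady nodes.

    `pods` is assumed to be the parsed output of
    `oc get pods --all-namespaces -o json`.
    """
    bad = not_ready_nodes(nodes)
    result = []
    for pod in pods.get("items", []):
        node_name = pod.get("spec", {}).get("nodeName")
        if node_name in bad:
            meta = pod["metadata"]
            # A pod with a deletionTimestamp set is the one typically
            # displayed as Terminating; otherwise report its phase.
            if meta.get("deletionTimestamp"):
                state = "Terminating"
            else:
                state = pod.get("status", {}).get("phase", "Unknown")
            result.append((meta.get("namespace"), meta["name"], node_name, state))
    return result
```

The functions take plain dictionaries, so the command output can be captured to files and loaded with `json.load` before being passed in. Pods reported as Running or Terminating by `stranded_pods` correspond to the last two symptoms in the list.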
Environment
- Red Hat OpenShift Container Platform (RHOCP)
  - 3
  - 4