NodeStatusUnknown or NotReady Node Status reported for 15 Minutes after network interruption between a Node and the API-Server in Red Hat OpenShift Container Platform
Issue
- After upgrading to 3.6 in an AWS environment, we replaced the AWS NetworkLoadBalancer with an ELB, which the nodes use to communicate with the masters. After the change, 49 nodes suddenly reported the status NodeStatusUnknown, and this lasted for 15 minutes.
- After applying the fix from "Several Nodes in NotReady state at once", we still have nodes in the NotReady state when the ELB changes its IP address.
Environment
Red Hat OpenShift Container Platform 3