Worker node goes into a not ready state in OpenShift 4
Issue
- Load average is very high (over 300) on worker nodes that have roughly 50 CPUs/threads each.
- Many "x process is being blocked for 600 seconds" messages are reported in dmesg, which suggests that the kernel is having trouble scheduling processes onto a CPU to run their tasks.
- Many processes are in the "D" (uninterruptible sleep) state, waiting on something that appears to be hung.
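The symptoms above can be checked directly from the node's /proc filesystem. The following is a minimal sketch, not an official diagnostic tool: it reads /proc/loadavg, compares the load to the CPU count, and lists processes whose state field in /proc/<pid>/stat is "D". The function names (parse_loadavg, parse_stat_state, list_d_state) are hypothetical and chosen only for this illustration. Because Linux counts uninterruptible tasks in the load average, a long list of "D" processes may by itself explain a load far above the CPU count.

```python
import os


def parse_loadavg(text):
    """Return the 1-, 5- and 15-minute load averages from /proc/loadavg content."""
    one, five, fifteen = text.split()[:3]
    return float(one), float(five), float(fifteen)


def parse_stat_state(text):
    """Return (comm, state) from /proc/<pid>/stat content.

    The command name is wrapped in parentheses and may itself contain
    spaces or parentheses, so split on the first '(' and the last ')'.
    """
    start = text.index("(")
    end = text.rindex(")")
    comm = text[start + 1:end]
    state = text[end + 1:].split()[0]
    return comm, state


def list_d_state(proc_root="/proc"):
    """Return [(pid, comm), ...] for processes in uninterruptible sleep."""
    blocked = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a PID directory
        try:
            with open(os.path.join(proc_root, entry, "stat")) as handle:
                comm, state = parse_stat_state(handle.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited or stat line was unreadable
        if state == "D":
            blocked.append((int(entry), comm))
    return blocked


def main():
    loadavg_path = "/proc/loadavg"
    if not os.path.exists(loadavg_path):
        print("No /proc/loadavg found; this sketch assumes a Linux host.")
        return
    with open(loadavg_path) as handle:
        one, five, fifteen = parse_loadavg(handle.read())
    cpus = os.cpu_count() or 1
    print(f"load average: {one} {five} {fifteen} ({one / cpus:.1f} per CPU, {cpus} CPUs)")
    blocked = list_d_state()
    print(f"processes in D state: {len(blocked)}")
    for pid, comm in sorted(blocked):
        print(f"  {pid}\t{comm}")


if __name__ == "__main__":
    main()
```

The script has to run somewhere with a Python 3 interpreter and a view of the affected node's /proc; whether that is available on a given worker is an assumption to verify first. The process names it prints can then be compared with the blocked-task messages in dmesg.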
Environment
- Red Hat OpenShift Container Platform (RHOCP)
- 4.8