OpenShift 4 OCS storage node runs out of memory and freezes
Issue
- The OCS storage node runs out of memory and freezes, with most of the memory consumed by the kubelet and Object Storage Device (OSD) processes. Marking the node unschedulable and rebooting it temporarily recovers the node but does not resolve the issue: once the node is marked schedulable again, memory usage grows until memory is exhausted, and the node becomes unresponsive and freezes.
- Events in the openshift-storage namespace show errors similar to the following:
# oc get events -n openshift-storage | grep -vw "Normal"
......
Error: fork/exec /usr/bin/conmon: resource temporarily unavailable
Error: fork/exec /usr/bin/conmon: resource temporarily unavailable
......
0/12 nodes are available: 1 Insufficient cpu ...
0/12 nodes are available: 1 Insufficient cpu ....
.....
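To gauge how often these two symptoms occur, the event output can be tallied instead of read line by line. The following is a minimal sketch, not part of the original solution: the function name `summarize_events` is hypothetical, and it only assumes that the event messages contain the same strings quoted above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the original article): counts the two
# symptom messages quoted above in event text read from stdin.
summarize_events() {
  awk '
    # Container runtime could not fork the conmon monitor process
    /fork\/exec \/usr\/bin\/conmon: resource temporarily unavailable/ { conmon++ }
    # Scheduler could not place a pod because of CPU requests
    /Insufficient cpu/ { cpu++ }
    END { printf "conmon_fork_errors=%d insufficient_cpu=%d\n", conmon, cpu }
  '
}

# Assumed usage on a cluster, piping in the same command shown above:
#   oc get events -n openshift-storage | summarize_events
```

A count that rises again after the node is rebooted and made schedulable is consistent with the recurring behavior described in this issue.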
Environment
- Red Hat OpenShift Container Platform (OCP) 4.8
- Red Hat OpenShift Container Storage (OCS) 4.8