OpenShift 4 OCS storage node goes out of memory and freezes

Solution Verified

Issue

  • The OCS storage node runs out of memory and freezes, with most of the memory consumed by the kubelet and the Ceph Object Storage Daemon (OSD) processes. Marking the node unschedulable and rebooting it recovers the node temporarily but does not resolve the issue: once the node is marked schedulable again, memory usage grows until memory is exhausted, and the node becomes unresponsive and freezes.

  • Events in the openshift-storage namespace show errors similar to the following:

    # oc get events -n openshift-storage | grep -vw "Normal"
    ......
    Error: fork/exec /usr/bin/conmon: resource temporarily unavailable
    Error: fork/exec /usr/bin/conmon: resource temporarily unavailable
    ......
    
    0/12 nodes are available: 1 Insufficient cpu ...
    0/12 nodes are available: 1 Insufficient cpu ....
    .....
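
To gauge how often each of the two error signatures above occurs, the filtered event output can be piped through a small counter. The following is only a minimal sketch: the function name summarize_events and its output labels are hypothetical, and it matches nothing beyond the two message patterns quoted in this article.

```shell
#!/bin/sh
# summarize_events (hypothetical helper, not part of oc): reads event text
# on stdin and counts the two error signatures quoted in this article.
summarize_events() {
    awk '
        # conmon could not be forked (container runtime failed to start a process)
        /fork\/exec \/usr\/bin\/conmon: resource temporarily unavailable/ { conmon++ }
        # the scheduler could not place a pod because of insufficient CPU
        /nodes are available:.*Insufficient cpu/ { cpu++ }
        END {
            printf "conmon_fork_errors=%d\ninsufficient_cpu=%d\n", conmon, cpu
        }
    '
}

# Usage against a live cluster (assumes a logged-in oc session):
#   oc get events -n openshift-storage | grep -vw "Normal" | summarize_events
```

A rising conmon_fork_errors count between runs suggests the node is still failing to start new container processes. This only summarizes the symptoms; it does not identify the cause.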
    

Environment

  • Red Hat OpenShift Container Platform (OCP) 4.8
  • Red Hat OpenShift Container Storage (OCS) 4.8
