Cluster Logging Installation Fails with Pending Pods in RHOCP 4

Solution Verified - Updated -

Environment

  • Red Hat OpenShift Container Platform (RHOCP)
    • 4

Issue

  • Cluster logging installation fails, with the Elasticsearch and fluentd pods stuck in a Pending state and the following error message:

    message: '0/8 nodes are available: 1 Insufficient cpu, 1 Insufficient memory,
      6 node(s) didn''t match node selector.'
    

Resolution

Add the following annotation to the openshift-logging namespace. An empty value overrides the cluster-wide default node selector for this project:

$ oc edit namespace openshift-logging
apiVersion: v1
kind: Namespace
metadata:
  annotations:
    openshift.io/node-selector: ""    # <-- add this annotation to the namespace
    openshift.io/sa.scc.mcs: s0:c26,c0
    openshift.io/sa.scc.supplemental-groups: 1000650000/10000
    openshift.io/sa.scc.uid-range: 1000650000/10000
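
The same annotation can probably be applied without opening an editor. The following is a minimal sketch, assuming the oc client is logged in with permission to modify namespaces; the function name clear_logging_node_selector is hypothetical and not part of the original solution.

```shell
# Hypothetical helper: set an empty openshift.io/node-selector annotation
# on a namespace (defaults to openshift-logging).
clear_logging_node_selector() {
  # --overwrite replaces the annotation if it already has a value
  oc annotate namespace "${1:-openshift-logging}" \
    openshift.io/node-selector="" --overwrite
}
```

Run it as clear_logging_node_selector, or pass another namespace name as the first argument. Pods created before the change may still carry the selector that was added when they were admitted, so they may need to be deleted and recreated; this is an assumption about general project node selector behavior, not something this solution states.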

Root Cause

The defaultNodeSelector setting from the cluster scheduler configuration was inherited by the logging stack, so its pods could be scheduled only on nodes matching that selector.
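
To see the inheritance at a glance, the cluster-wide default and the namespace-level annotation can be printed side by side. This is a sketch under the assumption that the default lives in spec.defaultNodeSelector of the scheduler/cluster resource; the function name show_node_selectors is hypothetical.

```shell
# Hypothetical helper: print the cluster default node selector and the
# namespace annotation (namespace defaults to openshift-logging).
show_node_selectors() {
  printf 'cluster default: %s\n' \
    "$(oc get scheduler cluster -o jsonpath='{.spec.defaultNodeSelector}')"
  # Dots inside the annotation key are escaped for jsonpath
  printf 'namespace annotation: %s\n' \
    "$(oc get namespace "${1:-openshift-logging}" \
        -o jsonpath='{.metadata.annotations.openshift\.io/node-selector}')"
}
```

Note that the second line is empty both when the annotation is absent and when it is set to "", so confirm with oc get ns openshift-logging -o yaml if the distinction matters.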

Diagnostic Steps

  • Check the events in the openshift-logging project:

    $ oc get events -n openshift-logging
    
    LAST SEEN   TYPE      REASON             OBJECT                                              MESSAGE
    .
    . 
    .
    22m         Warning   FailedScheduling   pod/fluentd-9bgd4                                   0/6 nodes are available: 6 node(s) didn't match node selector.
    154m        Warning   FailedScheduling   pod/fluentd-h8dr2                                   0/6 nodes are available: 6 node(s) didn't match node selector.
    154m        Warning   FailedScheduling   pod/fluentd-h8dr2                                   0/6 nodes are available: 1 Insufficient cpu, 5 node(s) didn't match node select
    
  • Describe the pending pod and check whether it has a node selector:

    $ oc describe pod fluentd-9bgd4 -n openshift-logging | grep -i "Node-Selectors" -A2
    Node-Selectors:  kubernetes.io/os=linux
                    node-role.kubernetes.io/infra=                                <--- node selector label               
    
  • Check whether a node selector annotation is present on the namespace:

    $ oc get ns openshift-logging -o yaml
    
  • Check whether a defaultNodeSelector is set in the scheduler configuration:

    $ oc get scheduler/cluster -o yaml
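
The first two checks above can be narrowed so that only the relevant data is returned. The sketch below assumes a logged-in oc client; the function names failed_scheduling_events and pod_node_selector are hypothetical, and the pod name in the usage note is taken from the event output above.

```shell
# Hypothetical helper: list only FailedScheduling events in a namespace.
failed_scheduling_events() {
  oc get events -n "${1:-openshift-logging}" \
    --field-selector reason=FailedScheduling
}

# Hypothetical helper: print only the nodeSelector of one pod.
# $1 = pod name, $2 = namespace (defaults to openshift-logging)
pod_node_selector() {
  oc get pod "$1" -n "${2:-openshift-logging}" \
    -o jsonpath='{.spec.nodeSelector}'
}
```

For example, failed_scheduling_events followed by pod_node_selector fluentd-9bgd4 should show whether the pending pod has a selector such as node-role.kubernetes.io/infra.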
    

