Network cluster operator stuck in progressing state after new cluster provisioning due to disk pressure in RHOCP4

Solution Verified - Updated -

Environment

  • Red Hat OpenShift Container Platform (RHOCP)
    • 4.16

Issue

  • The network cluster operator is stuck in the Progressing state during provisioning of a new 4.16 cluster, with the following error messages:

    message: 
     'DaemonSet "/openshift-multus/network-metrics-daemon" is not available (awaiting 1 nodes)
      DaemonSet "/openshift-network-operator/iptables-alerter" is not available (awaiting 1 nodes)'
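
The message above is reported in the Progressing condition of the network cluster operator. As a minimal sketch, it could be retrieved directly with a JSONPath query. The function name network_progressing_message is hypothetical and used only for illustration:

```shell
# Hypothetical helper: print the message of the Progressing condition
# reported by the network cluster operator.
network_progressing_message() {
  oc get clusteroperator network \
    -o jsonpath='{.status.conditions[?(@.type=="Progressing")].message}{"\n"}'
}
```

Calling network_progressing_message while logged in with sufficient privileges should print the same text that appears in the operator status.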
    

Resolution

Once the affected master node has enough free disk space and no longer reports disk pressure, restart all the pods in the openshift-network-operator, openshift-network-diagnostics and openshift-multus namespaces:

$ oc delete pods --all -n openshift-network-operator


$ oc delete pods --all -n openshift-network-diagnostics


$ oc delete pods --all -n openshift-multus
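
After the pods are recreated, the operator should leave the Progressing state. The following is a sketch of one way to wait for that, not a required step. The function name wait_for_network_operator and the default timeout of 600s are assumptions chosen for illustration:

```shell
# Hypothetical helper: block until the network cluster operator reports
# Progressing=False, or until the timeout expires.
# $1 is an optional timeout (assumed default: 600s).
wait_for_network_operator() {
  oc wait clusteroperator/network \
    --for=condition=Progressing=False \
    --timeout="${1:-600s}"
}
```

For example, wait_for_network_operator 300s returns a non-zero exit code if the operator is still progressing after five minutes, which suggests that the disk pressure on the node has not been fully resolved.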

Root Cause

One of the master nodes was reporting the "kubelet has disk pressure" condition. Because of the disk pressure, pods belonging to the network operator failed to schedule on that node, so the network operator remained in the Progressing state. Make sure the master node's disk has enough free space and that utilisation stays below the defined pressure threshold.
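
To see how much space the kubelet itself reports for the node, one option is to query the kubelet summary statistics through the API server proxy. This is only a sketch: the function name node_fs_summary is hypothetical, the call requires permission to access the nodes/proxy resource, and the filtering with jq mentioned in the comment assumes that jq is installed on the workstation:

```shell
# Hypothetical helper: print the kubelet summary statistics (JSON) for one node.
# $1 is the node name. The node filesystem figures are under .node.fs and the
# image filesystem figures under .node.runtime.imageFs, for example:
#   node_fs_summary <node_name> | jq '.node.fs, .node.runtime.imageFs'
node_fs_summary() {
  oc get --raw "/api/v1/nodes/$1/proxy/stats/summary"
}
```

Comparing availableBytes with capacityBytes in that output gives a rough view of how close the node is to the pressure threshold.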

Diagnostic Steps

  • Verify the status of the network cluster operator:

    $ oc get co | grep -i "network"
    
    NAME                                       VERSION   AVAILABLE   PROGRESSING   DEGRADED   SINCE
    network                                    4.16.21   True        True          False      26m
    
  • Verify if any node has disk pressure:

    $ oc describe node <node_name> | grep -i "kubelet has disk pressure"
    
    DiskPressure     True    Tue, 17 Jun 2025 15:54:57 +0000   Tue, 17 Jun 2025 15:54:57 +0000       KubeletHasDiskPressure       kubelet has disk pressure
    
    
  • Verify the pod status in the openshift-network-operator, openshift-network-diagnostics and openshift-multus namespaces:

    $ oc get pods -A | grep -i "Failed"
    
    
    openshift-multus                  network-metrics-daemon-n59zh          0/2     Failed            0          10m
    openshift-multus                  network-metrics-daemon-zb8q7          0/2     Failed            0          1m32s
    openshift-network-diagnostics     network-check-target-d466f            0/1     Failed            0          6m
    openshift-network-diagnostics     network-check-target-n627q            0/1     Failed            0          6m
    openshift-network-operator        iptables-alerter-srhzt                0/1     Failed            0          16m
    openshift-network-operator        iptables-alerter-tktpg                0/1     Failed            0          7m
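
The node and pod checks above can also be run without knowing the node name in advance and without scanning every namespace. The following is a minimal sketch under two assumptions: the function names are hypothetical, and the Failed status shown above is assumed to correspond to the pod phase Failed:

```shell
# Hypothetical helper: list every node with the status (True/False) of its
# DiskPressure condition, one node per line.
nodes_disk_pressure() {
  oc get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.conditions[?(@.type=="DiskPressure")].status}{"\n"}{end}'
}

# Hypothetical helper: list pods in the Failed phase, restricted to the three
# networking namespaces. -o wide adds the node each pod was assigned to.
failed_network_pods() {
  for ns in openshift-network-operator openshift-network-diagnostics openshift-multus; do
    oc get pods -n "$ns" --field-selector=status.phase=Failed -o wide
  done
}
```

A node that shows True in the output of nodes_disk_pressure and also appears in the NODE column of failed_network_pods is the likely source of the problem.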
    
    

