kube-controller-manager timeout is exceeded by validating webhook during CNI restart leading to degraded cluster state

Solution Verified - Updated -

Issue

  • Kube Controller Manager pods fail to complete leader election.
  • Networking is degraded and Kube Controller Manager is in CrashLoopBackOff state.
  • ConfigMaps take longer than 5s to create, but deletion completes in under a second, as expected.
  • The reported status of resources does not match reality:
    • After following guidance to restart OVNKube-Node pods as part of a reset of OVN databases, the DaemonSet reports all pods in READY state, but no pods are running on the corresponding nodes.
    • The nodes report a Ready status, but they are not actually ready.

Environment

  • Red Hat OpenShift Container Platform (RHOCP)
    • 4.11+
  • An operator is installed that defines validating webhook rules for ConfigMaps with a timeout greater than 5s.
    • In the examples below, the Aqua operator is used.
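
To check whether a cluster matches the environment described above, one option is to scan the validating webhook configurations for entries that cover ConfigMaps and declare a timeout above 5s. The following is a minimal sketch, not part of the original solution. It assumes the input has the shape of `oc get validatingwebhookconfigurations -o json` output. The function names and the sample data (including the `example-*` webhook names) are hypothetical and do not reflect the Aqua operator's actual configuration.

```python
import json

# Threshold taken from the Environment section; everything else is illustrative.
TIMEOUT_THRESHOLD_SECONDS = 5
# In admissionregistration.k8s.io/v1, timeoutSeconds defaults to 10 when unset.
DEFAULT_TIMEOUT_SECONDS = 10


def rule_matches_configmaps(rule):
    """Return True if a webhook rule applies to core-group ConfigMaps."""
    groups = rule.get("apiGroups", [])
    resources = rule.get("resources", [])
    group_ok = "" in groups or "*" in groups
    resource_ok = any(r in ("configmaps", "*", "*/*") for r in resources)
    return group_ok and resource_ok


def slow_configmap_webhooks(webhook_list, threshold=TIMEOUT_THRESHOLD_SECONDS):
    """List (configuration, webhook, timeout) tuples whose timeout exceeds threshold."""
    findings = []
    for config in webhook_list.get("items", []):
        config_name = config.get("metadata", {}).get("name", "<unnamed>")
        for webhook in config.get("webhooks", []):
            timeout = webhook.get("timeoutSeconds", DEFAULT_TIMEOUT_SECONDS)
            if timeout <= threshold:
                continue
            if any(rule_matches_configmaps(r) for r in webhook.get("rules", [])):
                findings.append((config_name, webhook.get("name"), timeout))
    return findings


# Hypothetical sample input, shaped like a ValidatingWebhookConfigurationList.
SAMPLE = {
    "items": [
        {
            "metadata": {"name": "example-slow-webhook"},
            "webhooks": [{
                "name": "validate.example.invalid",
                "timeoutSeconds": 30,
                "rules": [{"apiGroups": [""], "resources": ["configmaps"],
                           "operations": ["CREATE", "UPDATE"]}],
            }],
        },
        {
            "metadata": {"name": "example-fast-webhook"},
            "webhooks": [{
                "name": "fast.example.invalid",
                "timeoutSeconds": 3,
                "rules": [{"apiGroups": [""], "resources": ["configmaps"],
                           "operations": ["CREATE"]}],
            }],
        },
        {
            "metadata": {"name": "example-pods-webhook"},
            "webhooks": [{
                "name": "pods.example.invalid",
                "timeoutSeconds": 30,
                "rules": [{"apiGroups": [""], "resources": ["pods"],
                           "operations": ["CREATE"]}],
            }],
        },
    ]
}

if __name__ == "__main__":
    for config_name, webhook_name, timeout in slow_configmap_webhooks(SAMPLE):
        print(f"{config_name}: {webhook_name} timeoutSeconds={timeout}")
```

To run the same check against a live cluster, replace `SAMPLE` with `json.load(sys.stdin)` (after adding `import sys`) and pipe in the output of `oc get validatingwebhookconfigurations -o json`. The sketch only inspects `rules` and `timeoutSeconds`. It ignores namespace and object selectors, so a reported webhook may still not apply to the namespace in question.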
