kube-controller-manager timeout is exceeded by validating webhook during CNI restart leading to degraded cluster state

Solution Verified - Updated 2024-06-13T19:43:02+00:00

Issue

Kube Controller Manager pods fail to complete leader election.
Networking is degraded and Kube Controller Manager is in a CrashLoopBackOff state.
ConfigMaps take longer than 5s to create, while deletion completes in under a second as expected.
The reported status of resources does not match reality:
- After following guidance to restart the OVNKube-Node pods as part of a reset of the OVN databases, the DaemonSet reports all pods as READY, but no pods are running on the corresponding nodes.
- The nodes report a Ready status, but they are not actually ready.

Environment

Red Hat OpenShift Container Platform (RHOCP)
- 4.11+
An operator is installed that defines validating webhook rules for ConfigMaps with a timeout greater than 5s.
- In the examples below, the Aqua operator is used.
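
To check whether a cluster matches this environment, the following Python sketch may help. It scans the JSON form of the cluster's validating webhook configurations and lists webhooks whose rules cover ConfigMaps and whose timeout exceeds 5s. The script name `check_webhooks.py`, the function names, and the input file name are illustrative assumptions, not part of any Red Hat tooling. The matching is deliberately simplified: it looks only at `apiGroups` and `resources` and ignores `operations`, `namespaceSelector`, and `objectSelector`.

```python
import json
import sys

# Default applied by admissionregistration.k8s.io/v1 when timeoutSeconds is omitted.
DEFAULT_TIMEOUT_SECONDS = 10


def rule_covers_configmaps(rule):
    """Return True if a webhook rule applies to core-group ConfigMaps."""
    groups = rule.get("apiGroups", [])
    resources = rule.get("resources", [])
    in_core_group = "" in groups or "*" in groups
    names_configmaps = any(r in ("configmaps", "*", "*/*") for r in resources)
    return in_core_group and names_configmaps


def slow_configmap_webhooks(webhook_list, threshold_seconds=5):
    """List (configuration, webhook, timeout) for ConfigMap webhooks above the threshold."""
    findings = []
    for config in webhook_list.get("items", []):
        config_name = config.get("metadata", {}).get("name", "")
        for webhook in config.get("webhooks", []):
            timeout = webhook.get("timeoutSeconds", DEFAULT_TIMEOUT_SECONDS)
            covers = any(rule_covers_configmaps(r) for r in webhook.get("rules", []))
            if covers and timeout > threshold_seconds:
                findings.append((config_name, webhook.get("name", ""), timeout))
    return findings


if __name__ == "__main__" and len(sys.argv) > 1:
    # Expects a file produced by:
    #   oc get validatingwebhookconfigurations -o json > webhooks.json
    with open(sys.argv[1]) as handle:
        for config_name, webhook_name, timeout in slow_configmap_webhooks(json.load(handle)):
            print(f"{config_name}\t{webhook_name}\ttimeoutSeconds={timeout}")
```

Assuming the output of `oc get validatingwebhookconfigurations -o json` was saved to `webhooks.json`, running `python3 check_webhooks.py webhooks.json` prints one line per matching webhook. An empty result suggests that no ConfigMap webhook with a timeout above 5s is defined, within the limits of the simplified matching.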
