kube-controller-manager timeout is exceeded by validating webhook during CNI restart leading to degraded cluster state
Issue
- Kube Controller Manager pods fail to elect a leader.
- Networking is degraded and Kube Controller Manager is in CrashLoopBackOff state.
- Configmaps take longer than 5s to create, but deletion completes in under a second as expected.
- The status of the resources does not match reality:
  - After following guidance to restart the OVNKube-Node pods as part of a reset of the OVN databases, the daemonset reports all pods in READY state, but no pods are running on the corresponding nodes.
  - The node status is Ready, but the nodes are not actually ready.
Environment
- Red Hat OpenShift Container Platform (RHOCP)
  - 4.11+
- An operator that defines rules for configmaps with a timeout greater than 5s is installed.
  - In the examples below, the Aqua operator is used.