MachineConfigPool stuck in UPDATING state after upgrade to 4.7.31
Issue
- The MachineConfigPool for the worker was in an UPDATING state for a long time after upgrading OpenShift from v4.6.34 to v4.7.31.
- MCC was showing this error:
E1011 16:23:51.653195 1 render_controller.go:460] Error updating MachineConfigPool worker: Operation cannot be fulfilled on machineconfigpools.machineconfiguration.openshift.io "worker": the object has been modified; please apply your changes to the latest version and try again
- Machine config daemon was showing that the render was correctly applied to the node:
daemon.go:1085] Validating against pending config rendered-worker-bf941581bf35ab13e1e8656cf78d9a1f
daemon.go:1096] Validated on-disk state
daemon.go:1151] Completing pending config rendered-worker-bf941581bf35ab13e1e8656cf78d9a1f
update.go:199] cordon/uncordon failed with: cordon error: rpc error: code = Unavailable desc = transport is closing, retrying
update.go:1943] completed update for config rendered-worker-bf941581bf35ab13e1e8656cf78d9a1f
daemon.go:1167] In desired config rendered-worker-bf941581bf35ab13e1e8656cf78d9a1f
- MCO has completed the update from its end but the node is still marked as unschedulable.
- The worker MCP never successfully completed the update because the cordon was never successfully removed.
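
The symptom described above (a node that the machine config daemon reports as being in its desired config but that is still cordoned) can be checked for across all workers. The following is a minimal Python sketch, not part of the original article. It assumes the `oc` client is logged in to the cluster and that workers carry the usual `node-role.kubernetes.io/worker` label. The function names `load_worker_nodes` and `stuck_cordoned_nodes` are hypothetical helpers introduced for illustration.

```python
import json
import subprocess

# Annotation prefix the machine config daemon writes on each node.
MC_PREFIX = "machineconfiguration.openshift.io/"


def load_worker_nodes():
    """Return the parsed JSON node list for workers (assumes `oc` is logged in)."""
    result = subprocess.run(
        ["oc", "get", "nodes", "-l", "node-role.kubernetes.io/worker", "-o", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)


def stuck_cordoned_nodes(node_list):
    """Return names of nodes already in their desired config but still unschedulable."""
    stuck = []
    for node in node_list.get("items", []):
        metadata = node.get("metadata", {})
        annotations = metadata.get("annotations") or {}
        current = annotations.get(MC_PREFIX + "currentConfig")
        desired = annotations.get(MC_PREFIX + "desiredConfig")
        unschedulable = node.get("spec", {}).get("unschedulable", False)
        # The daemon has finished (current == desired) yet the cordon remains.
        if current and current == desired and unschedulable:
            stuck.append(metadata.get("name"))
    return stuck
```

On a live cluster, `stuck_cordoned_nodes(load_worker_nodes())` would list the candidate nodes. A node that an administrator cordoned on purpose also matches, so the result is a list to review rather than a diagnosis.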
Environment
- Red Hat OpenShift Container Platform [RHOCP]
  - 4.x