During migrations with MTC, the new registry pod dies and the migration never completes
Issue
The registry pod is frequently removed and recreated:
$ oc get pods
NAME                                                              READY   STATUS    RESTARTS   AGE
migration-log-reader-74bfbd6cb6-pldpm                             2/2     Running   1          27h
migration-operator-5fd68c7c65-jfpfd                               1/1     Running   1          27h
registry-668e4c06-8f6d-4c8f-b34d-e03845428e67-ng7fx-74b94cctd5h   1/1     Running   0          5m
restic-2rmlh                                                      1/1     Running   1          27h
...
This operation is also reflected in the project events:
$ oc get events
LAST SEEN TYPE REASON OBJECT MESSAGE
19m Normal SuccessfulCreate replicaset/registry-668e4c06-8f6d-4c8f-b34d-e03845428e67-bmqlv-6cb8c879b9 Created pod: registry-668e4c06-8f6d-4c8f-b34d-e03845428e67-bmqlv-6cb8c8nd6pf
19m Normal Scheduled pod/registry-668e4c06-8f6d-4c8f-b34d-e03845428e67-bmqlv-6cb8c8nd6pf Successfully assigned openshift-migration/registry-668e4c06-8f6d-4c8f-b34d-e03845428e67-bmqlv-6cb8c8nd6pf to worker4-lab
19m Normal Pulled pod/registry-668e4c06-8f6d-4c8f-b34d-e03845428e67-bmqlv-6cb8c8nd6pf Container image "registry.redhat.io/rhmtc/openshift-migration-registry-rhel8:v1.7.4-7" already present on machine
19m Normal Created pod/registry-668e4c06-8f6d-4c8f-b34d-e03845428e67-bmqlv-6cb8c8nd6pf Created container
19m Normal Started pod/registry-668e4c06-8f6d-4c8f-b34d-e03845428e67-bmqlv-6cb8c8nd6pf Started container
13m Normal Killing pod/registry-668e4c06-8f6d-4c8f-b34d-e03845428e67-bmqlv-6cb8c8nd6pf Killing container with id docker://registry:Need to kill Pod
19m Normal ScalingReplicaSet deployment/registry-668e4c06-8f6d-4c8f-b34d-e03845428e67-bmqlv Scaled up replica set registry-668e4c06-8f6d-4c8f-b34d-e03845428e67-bmqlv-6cb8c879b9 to 1
5m Normal SuccessfulCreate replicaset/registry-668e4c06-8f6d-4c8f-b34d-e03845428e67-ng7fx-74b94cbd55 Created pod: registry-668e4c06-8f6d-4c8f-b34d-e03845428e67-ng7fx-74b94cctd5h
...
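To gauge how often the registry pod is being recreated, the events output above can be filtered for `SuccessfulCreate` events on the registry replica sets. The following is a minimal sketch, not part of the original solution; the helper name `count_registry_creates` is hypothetical, and it assumes the default column layout of `oc get events` shown above.

```shell
# Hypothetical helper: reads `oc get events` output on stdin and prints
# how many registry replica sets reported a SuccessfulCreate event.
# Assumes the default columns (LAST SEEN, TYPE, REASON, OBJECT, MESSAGE),
# so on data rows REASON is field 3 and OBJECT is field 4.
count_registry_creates() {
  awk '$3 == "SuccessfulCreate" && $4 ~ /^replicaset\/registry-/ { n++ }
       END { print n + 0 }'
}
```

It could be used as `oc get events -n openshift-migration | count_registry_creates`, where `openshift-migration` is the namespace seen in the events above. A count that keeps growing between runs suggests the pod is still being replaced.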
The registry pod can end up in CrashLoopBackOff status, and the migration is blocked because the required conditions are not met:
Conditions:
Category: Critical
Last Transition Time: 2022-11-14T10:56:12Z
Message: Reconcile failed: [unable to retrieve the complete list of server APIs: metrics.k8s.io/v1beta1: the server is currently unable to handle the request]. See controller logs for details.
Status: True
Type: ReconcileFailed
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning Postponed 6m32s (x2 over 6m42s) miganalytic_controller Waiting 10 seconds for referenced MigPlan to become ready.
Warning ReconcileFailed 95s (x29 over 6m21s) miganalytic_controller Reconcile failed: []. See controller logs for details.
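The `ReconcileFailed` message above names the API group and version that the server could not serve (`metrics.k8s.io/v1beta1`). Aggregated APIs are registered as APIService objects named `<version>.<group>`, so one way to start investigating is to derive that name from the message and inspect the object. The sketch below is an illustration only, not the documented resolution; the helper name `apiservice_from_message` is hypothetical.

```shell
# Hypothetical helper: reads a condition message on stdin and prints the
# APIService name (<version>.<group>) for the first API it reports as
# unavailable. Prints nothing if the message has no such entry.
apiservice_from_message() {
  sed -n 's#.*server APIs: \([a-z0-9.]*\)/\([a-z0-9]*\):.*#\2.\1#p'
}
```

For the message shown above this yields `v1beta1.metrics.k8s.io`, which could then be checked with `oc get apiservice v1beta1.metrics.k8s.io` to see whether it reports as available.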
Environment
- Red Hat OpenShift Container Platform 4.10
- Migration Toolkit for Containers 1.7.4