The addon-operator-manager is in CrashLoopBackOff in OSD/ROSA
Environment
- Red Hat OpenShift Service on AWS (ROSA) 4
- Red Hat OpenShift Dedicated (OSD) 4
- Addon Operator (addon-operator)
Issue
- The addon-operator-manager is in CrashLoopBackOff in OSD/ROSA:

  addon-operator-manager-xxxxxxxxxx-xxxxx   1/2   CrashLoopBackOff   7 (5m11s ago)   33m

- The following errors are shown in the addon-operator-manager pod:

  ERROR controller-runtime.source if kind is a CRD, it should be installed before calling Start {"kind": "ClusterObjectTemplate.package-operator.run", "error": "no matches for kind \"ClusterObjectTemplate\" in version \"package-operator.run/v1alpha1\""} sigs.k8s.io/controller-runtime/pkg/source.(*Kind).Start.func1.1
Resolution
Red Hat is aware of this issue and the internal task MTSRE-1232 was opened to track it. This issue does not affect the rest of the cluster.
Root Cause
The package-operator is failing to install in the cluster, which leaves some CRDs missing and prevents the addon-operator from running. This issue is seen in OSD/ROSA clusters with the cluster-wide proxy configured.
Diagnostic Steps
Check the failing addon-operator-manager pod and check its logs for errors related to missing CRDs:
$ oc get pods -n openshift-addon-operator
NAME READY STATUS RESTARTS AGE
[...]
addon-operator-manager-xxxxxxxxxx-xxxxx 1/2 CrashLoopBackOff 7 (5m11s ago) 33m
[...]
$ oc logs -n openshift-addon-operator -c manager addon-operator-manager-xxxxxxxxxx-xxxxx | grep "CRD"
2023-01-01T00:00:00Z ERROR controller-runtime.source if kind is a CRD, it should be installed before calling Start {"kind": "ClusterObjectTemplate.package-operator.run", "error": "no matches for kind \"ClusterObjectTemplate\" in version \"package-operator.run/v1alpha1\""}
[...]
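To confirm that the CRDs are indeed missing, they can be queried directly. This is a sketch: the CRD name below (clusterobjecttemplates.package-operator.run) is assumed from the kind and group reported in the error message above.

```shell
# Query the CRD named in the addon-operator-manager error message.
# On an affected cluster this is expected to return NotFound.
oc get crd clusterobjecttemplates.package-operator.run
```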
Check that there are no pods in the openshift-package-operator namespace, and that the events show that the job package-operator-bootstrap reached the specified backoff limit:
$ oc get pods -n openshift-package-operator
No resources found in openshift-package-operator namespace.
$ oc get events -n openshift-package-operator --sort-by='{.lastTimestamp}'
LAST SEEN TYPE REASON OBJECT MESSAGE
71m Normal Scheduled pod/package-operator-bootstrap-xxxxx Successfully assigned openshift-package-operator/package-operator-bootstrap-xxxx to ip-10-xxx-xxx-xxx.ec2.internal by ip-10-xxx-xxx-xxx
71m Normal SuccessfulCreate job/package-operator-bootstrap Created pod: package-operator-bootstrap-xxxxx
71m Normal AddedInterface pod/package-operator-bootstrap-xxxxx Add eth0 [10.xxx.xxx.xxx/xx] from openshift-sdn
69m Normal Pulled pod/package-operator-bootstrap-xxxxx Container image "quay.io/app-sre/package-operator-manager:xxxxxxxx" already present on machine
69m Normal Created pod/package-operator-bootstrap-xxxxx Created container package-operator
69m Normal Started pod/package-operator-bootstrap-xxxxx Started container package-operator
66m Warning BackOff pod/package-operator-bootstrap-xxxxx Back-off restarting failed container
64m Normal SuccessfulDelete job/package-operator-bootstrap Deleted pod: package-operator-bootstrap-xxxxx
64m Warning BackoffLimitExceeded job/package-operator-bootstrap Job has reached the specified backoff limit
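The bootstrap job itself can also be inspected to confirm the failure condition once its pods have been deleted. A sketch, assuming the job still exists in the namespace:

```shell
# Inspect the full job definition, including its backoffLimit
oc get job package-operator-bootstrap -n openshift-package-operator -o yaml

# Or show only the failure conditions recorded in the job status
oc get job package-operator-bootstrap -n openshift-package-operator \
  -o jsonpath='{.status.conditions}'
```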
Check whether the cluster-wide proxy is configured in the affected cluster (it may be that only the trustedCA field is set):
$ oc get proxy cluster -o yaml
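To narrow the output to only the relevant proxy settings, a jsonpath query can be used. A sketch; the fields shown are the standard spec fields of the cluster Proxy resource:

```shell
# Show only the proxy spec: httpProxy, httpsProxy, noProxy, and trustedCA
oc get proxy cluster -o jsonpath='{.spec}'

# Or check just the trustedCA field, which may be the only one configured
oc get proxy cluster -o jsonpath='{.spec.trustedCA.name}'
```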