CustomDomain NotReady in AWS PrivateLink ROSA cluster
Environment
- Red Hat OpenShift Service on AWS (ROSA)
- AWS PrivateLink ROSA cluster
Issue
- Creating a CustomDomain in an AWS PrivateLink ROSA cluster never finishes.
- The CustomDomain CR is in NotReady status in the AWS PrivateLink ROSA cluster, with reason Creating:
status:
  conditions:
  - message: Creating Apps Custom Domain (apps.tc01686-dev.afs1-nprd.aws-za.sbgrp.cloud)
    reason: Creating
    status: "True"
    type: Creating
- The ingress ClusterOperator is degraded with the following message:
- message: 'Some ingresscontrollers are degraded: ingresscontroller "console-domain" is degraded: DegradedConditions: One or more other status conditions indicate a degraded state: LoadBalancerReady=False (LoadBalancerPending: The LoadBalancer service is pending)'
  reason: IngressControllersDegraded
  status: "True"
  type: Degraded
- The ingresscontroller generated by the CustomDomain CR shows the following messages:
message: 'One or more status conditions indicate unavailable: LoadBalancerReady=False (LoadBalancerPending: The LoadBalancer service is pending)'
message: 'One or more other status conditions indicate a degraded state: LoadBalancerReady=False (LoadBalancerPending: The LoadBalancer service is pending)'
- The ingress-operator shows the following errors:
ERROR operator.ingress_controller controller/controller.go:244 got retryable error; requeueing {"after": "1m0s", "error": "IngressController is degraded: LoadBalancerReady=False (SyncLoadBalancerFailed: The service-controller component is reporting SyncLoadBalancerFailed events like: Error syncing load balancer: failed to ensure load balancer: could not find any suitable subnets for creating the ELB\nThe kube-controller-manager logs may contain more details.)"}
ERROR operator.ingress_controller controller/controller.go:244 got retryable error; requeueing {"after": "1m0s", "error": "IngressController is degraded: LoadBalancerReady=False (LoadBalancerPending: The LoadBalancer service is pending)"}
Resolution
Disclaimer: Links contained herein to external website(s) are provided for convenience only. Red Hat has not reviewed the links and is not responsible for the content or its availability. The inclusion of any link to an external website does not imply endorsement by Red Hat of the website or their entities, products or services. You agree that Red Hat is not responsible or liable for any loss or expenses that may result due to your use of (or reliance on) the external site or content.
Custom Domains require a public-facing ELB, and PrivateLink ROSA clusters do not have a public-facing ELB by default. Internal Custom Domains can, however, be configured in PrivateLink ROSA clusters. Refer to the documentation for configuring custom domains for applications.
There is an example for adding a public Ingress endpoint to a ROSA Private-Link cluster, but please note that the steps shown in that article are not supported by Red Hat Support or the Red Hat SRE Team.
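As a sketch, an internal CustomDomain CR (which does not require a public-facing ELB) can look roughly like the following. The domain, secret name, and namespace below are placeholders, and the availability of the scope field depends on the custom-domains-operator version running on your cluster:
```yaml
apiVersion: managed.openshift.io/v1alpha1
kind: CustomDomain
metadata:
  name: internal-domain               # example name
spec:
  domain: apps.internal.example.com   # example domain; replace with yours
  scope: Internal                     # request an internal (private) load balancer
  certificate:
    name: internal-domain-tls         # example TLS secret name
    namespace: my-project             # example namespace holding the secret
```
With scope set to Internal, the resulting ingresscontroller asks the service controller for an internal ELB, which can be created on the private subnets of a PrivateLink cluster.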
Root Cause
Custom Domains require a public-facing ELB. If there are no public subnets in the cluster VPC, the service controller will not place a public ELB on the private subnets, so the LoadBalancer service stays pending.
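The subnet check behind this behavior can be illustrated with a minimal sketch: the AWS cloud provider treats a subnet as public only if its route table contains a route to an internet gateway (igw-*). The route data below is made up for illustration; on a live cluster the real route tables can be inspected with the AWS CLI (aws ec2 describe-route-tables):

```shell
# Sketch of the public-subnet test the AWS cloud provider applies.
# Sample route-table contents (destination / target), made up for illustration:
routes_subnet_a='10.0.0.0/16 local
0.0.0.0/0 igw-0abc123'        # default route via internet gateway -> public
routes_subnet_b='10.0.0.0/16 local
0.0.0.0/0 nat-0def456'        # default route via NAT gateway -> private

for name in a b; do
  eval "routes=\$routes_subnet_$name"
  if printf '%s\n' "$routes" | grep -q 'igw-'; then
    echo "subnet_$name: public"
  else
    echo "subnet_$name: private"
  fi
done
```

In a PrivateLink cluster every subnet looks like subnet_b here, so the controller finds no candidate subnet for a public ELB and reports "could not find any suitable subnets for creating the ELB".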
Diagnostic Steps
Check the CustomDomain status and messages:
$ oc get customdomain
NAME ENDPOINT DOMAIN STATUS
console-domain [my_cluster_domain] NotReady
$ oc get customdomain console-domain -o yaml
[...]
status:
conditions:
[...]
- message: Creating Apps Custom Domain (apps.tc01686-dev.afs1-nprd.aws-za.sbgrp.cloud)
reason: Creating
status: "True"
type: Creating
[...]
Check the messages in the ingresscontroller generated by the CustomDomain CR:
$ oc get ingresscontroller console-domain -n openshift-ingress-operator -o yaml
[...]
- message: 'One or more status conditions indicate unavailable: LoadBalancerReady=False (LoadBalancerPending: The LoadBalancer service is pending)'
reason: IngressControllerUnavailable
status: "False"
type: Available
[...]
- message: 'One or more other status conditions indicate a degraded state: LoadBalancerReady=False (LoadBalancerPending: The LoadBalancer service is pending)'
reason: DegradedConditions
status: "True"
type: Degraded
[...]
Check the ingress ClusterOperator status and messages:
$ oc get co
NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE
[...]
ingress 4.7.29 False True True 15h
[...]
$ oc get co ingress -o yaml
[...]
status:
conditions:
[...]
- message: 'Some ingresscontrollers are degraded: ingresscontroller "console-domain" is degraded: DegradedConditions: One or more other status conditions indicate a degraded state: LoadBalancerReady=False (LoadBalancerPending: The LoadBalancer service is pending)'
reason: IngressControllersDegraded
status: "True"
type: Degraded
[...]
Check the ingress-operator logs:
$ oc logs -n openshift-ingress-operator -c ingress-operator ingress-operator-7694d685cf-d6jkb
[...]
ERROR operator.ingress_controller controller/controller.go:244 got retryable error; requeueing {"after": "1m0s", "error": "IngressController is degraded: LoadBalancerReady=False (SyncLoadBalancerFailed: The service-controller component is reporting SyncLoadBalancerFailed events like: Error syncing load balancer: failed to ensure load balancer: could not find any suitable subnets for creating the ELB\nThe kube-controller-manager logs may contain more details.)"}
[...]
ERROR operator.ingress_controller controller/controller.go:244 got retryable error; requeueing {"after": "1m0s", "error": "IngressController is degraded: LoadBalancerReady=False (LoadBalancerPending: The LoadBalancer service is pending)"}
[...]
Check the kube-controller-manager logs:
$ oc get pods -n openshift-kube-controller-manager -l app=kube-controller-manager
NAME READY STATUS RESTARTS AGE
kube-controller-manager-[master-0_name] 4/4 Running 2 17d
kube-controller-manager-[master-1_name] 4/4 Running 3 17d
kube-controller-manager-[master-2_name] 4/4 Running 4 17d
$ oc logs -n openshift-kube-controller-manager kube-controller-manager-[master-0_name]
[...]
I1129 21:55:50.800103 1 controller.go:368] Ensuring load balancer for service openshift-ingress/router-console-domain
I1129 21:55:50.800182 1 aws.go:3788] EnsureLoadBalancer(rosapoc-6s9x2, openshift-ingress, router-console-domain, af-south-1, , [{http TCP <nil> 80 {1 0 http} 32458} {https TCP <nil> 443 {1 0 https} 31694}], map[service.beta.kubernetes.io/aws-load-balancer-healthcheck-healthy-threshold:2 service.beta.kubernetes.io/aws-load-balancer-healthcheck-interval:5 service.beta.kubernetes.io/aws-load-balancer-healthcheck-timeout:4 service.beta.kubernetes.io/aws-load-balancer-healthcheck-unhealthy-threshold:2 service.beta.kubernetes.io/aws-load-balancer-proxy-protocol:*])
I1129 21:55:50.800235 1 event.go:291] "Event occurred" object="openshift-ingress/router-console-domain" kind="Service" apiVersion="v1" type="Normal" reason="EnsuringLoadBalancer" message="Ensuring load balancer"
I1129 21:55:50.917073 1 aws.go:3440] Ignoring private subnet for public ELB "subnet-0768119a8539c210e"
I1129 21:55:50.917095 1 aws.go:3440] Ignoring private subnet for public ELB "subnet-07bd8f81179783c5a"
I1129 21:55:50.917101 1 aws.go:3440] Ignoring private subnet for public ELB "subnet-08c1489cee92c365a"
E1129 21:55:50.917139 1 controller.go:275] error processing service openshift-ingress/router-console-domain (will retry): failed to ensure load balancer: could not find any suitable subnets for creating the ELB
I1129 21:55:50.917223 1 event.go:291] "Event occurred" object="openshift-ingress/router-console-domain" kind="Service" apiVersion="v1" type="Warning" reason="SyncLoadBalancerFailed" message="Error syncing load balancer: failed to ensure load balancer: could not find any suitable subnets for creating the ELB"
[...]
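When every subnet in the log is reported with "Ignoring private subnet for public ELB", the cloud provider found no public subnet at all. A minimal sketch of filtering for this follows; the log lines are embedded samples taken from the output above, and on a live cluster the same grep can be applied to the oc logs output instead:

```shell
# Count how many subnets the cloud provider rejected as private.
# Sample log lines embedded for illustration; on a live cluster use:
#   oc logs -n openshift-kube-controller-manager <pod_name> | grep 'Ignoring private subnet'
log='I1129 21:55:50.917073 1 aws.go:3440] Ignoring private subnet for public ELB "subnet-0768119a8539c210e"
I1129 21:55:50.917095 1 aws.go:3440] Ignoring private subnet for public ELB "subnet-07bd8f81179783c5a"
E1129 21:55:50.917139 1 controller.go:275] error processing service openshift-ingress/router-console-domain'

printf '%s\n' "$log" | grep -c 'Ignoring private subnet for public ELB'
```

If the count matches the number of subnets attached to the cluster, no public ELB can be created and the symptoms above follow.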
Check if the cluster is an AWS PrivateLink ROSA cluster:
$ ocm describe cluster [cluster_id]
[...]
PrivateLink: true
[...]
This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.