Load Balancers failing to deploy due to issues finding suitable subnets
Environment
- Red Hat OpenShift Service on AWS (ROSA)
- Red Hat OpenShift Service on AWS with Hosted Control Plane (ROSA HCP)
Issue
In ROSA Classic / ROSA HCP the load balancers backing IngressControllers are failing to deploy with the following error message:
$ oc get co
NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE MESSAGE
...
ingress 4.20.3 False True True 102m The "default" ingress controller reports Available=False: IngressControllerUnavailable: One or more status conditions indicate unavailable: LoadBalancerReady=False (SyncLoadBalancerFailed: The service-controller component is reporting SyncLoadBalancerFailed events like: Error syncing load balancer: failed to ensure load balancer: could not find any suitable subnets for creating the ELB...
Additionally, other cluster operators, such as console or authentication, may report as degraded because they depend on the default IngressController:
$ oc get co
NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE MESSAGE
console 4.14.8 False False False 8d RouteHealthAvailable: failed to GET route (<route details>): Get "<route details>": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
Resolution
If the symptoms described in the Diagnostic Steps section apply, proceed with the following steps:
- Ensure that any kubernetes.io/cluster/.* tags on the subnets that reference other clusters have their values set to shared.
- Tag the subnets with the cluster's infrastructure ID:
  - Get the infrastructure ID of the cluster:
    - For ROSA Classic:
      ocm get cluster $CLUSTER_ID | jq -r '.infra_id'
    - For ROSA HCP:
      ocm get cluster $CLUSTER_ID | jq -r '.id'
  - Add the tag kubernetes.io/cluster/${INFRA_ID} with the value shared to the subnets of the VPC that hosts the ROSA cluster.
- Optionally, add the tag kubernetes.io/role/internal-elb with the value 1 to the private subnets and kubernetes.io/role/elb with the value 1 to the public subnets.
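The tagging steps above can be sketched with the AWS CLI. This is a minimal sketch: the infrastructure ID and subnet ID below are placeholders, the real value comes from the ocm/jq command shown above, and the aws calls require valid AWS credentials:

```shell
# Placeholder value -- in practice, obtain it with:
#   INFRA_ID=$(ocm get cluster $CLUSTER_ID | jq -r '.infra_id')   # ROSA Classic
INFRA_ID="mycluster-abc12"
TAG_KEY="kubernetes.io/cluster/${INFRA_ID}"
echo "${TAG_KEY}"

# Apply the tag to each subnet of the cluster VPC (subnet ID is a placeholder):
# aws ec2 create-tags --resources subnet-0123456789abcdef0 \
#   --tags "Key=${TAG_KEY},Value=shared"
```

The tag key must embed the infrastructure ID exactly; a tag built from the cluster's display name will not be matched by the service controller.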
Root Cause
The load balancer does not spin up because the cloud provider integration cannot find suitable subnets. The issue usually appears when subnets are shared with past or present EKS clusters.
The subnet selection process prefers subnets tagged for the current cluster and excludes subnets tagged as owned by other clusters.
Diagnostic Steps
- Check if the subnets are tagged with kubernetes.io/cluster/<cluster-identifier> for another cluster, or with that cluster's name.
- Check if the subnets are missing the kubernetes.io/cluster/<cluster-identifier> tags.
- Identify whether other clusters are running in the same VPC and/or sharing the subnets.
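The checks above can be sketched by filtering the subnet tags with jq. The JSON below is a hypothetical, abbreviated stand-in for real output; in practice you would feed in aws ec2 describe-subnets for the cluster VPC:

```shell
# Hypothetical sample data; in practice generate it with:
#   aws ec2 describe-subnets --filters "Name=vpc-id,Values=${VPC_ID}" > /tmp/subnets.json
cat > /tmp/subnets.json <<'EOF'
{"Subnets":[
  {"SubnetId":"subnet-aaa","Tags":[{"Key":"kubernetes.io/cluster/old-eks","Value":"owned"}]},
  {"SubnetId":"subnet-bbb","Tags":[{"Key":"kubernetes.io/cluster/my-rosa","Value":"shared"}]}
]}
EOF

# List subnets whose cluster tag is still "owned" by some cluster; these are
# the ones excluded from load balancer subnet selection:
jq -r '.Subnets[] | .SubnetId as $s | .Tags[]
       | select((.Key | startswith("kubernetes.io/cluster/")) and .Value == "owned")
       | "\($s) \(.Key)=\(.Value)"' /tmp/subnets.json
# -> subnet-aaa kubernetes.io/cluster/old-eks=owned
```

Any subnet that shows up here with another cluster's identifier is a candidate for the re-tagging described in the Resolution section.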
This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.