ROSA Hypershift - routes fail after swapping application router between public and private

Solution In Progress - Updated -

Environment

  • ROSA - Hosted Control Planes
    • version 4.12

Issue

  • How should I configure the Ingress Routers in OpenShift Cluster Manager to ensure all applications are private and specific applications can be exposed to the public internet?
  • How do I change my applications from private to public exposure?
  • I cannot access my OpenShift console after changing ingress from public to private, or vice versa.

Resolution

  • As a temporary workaround, delete the router-default pods in the openshift-ingress namespace.
  • OpenShift automatically recreates those pods, which resolves the issue.
  • The cluster then heals: the AWS LB sees the backend infra nodes in service again, and the cluster
    operators that depend on the *apps.CLUSTERNAME.XXXXXX.org domain recover on their own as well.
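
The workaround above could be scripted roughly as follows. This is a minimal sketch, not an official procedure. It assumes the oc client is installed and logged in with an account permitted to delete pods in openshift-ingress, and that the router-default pods carry the label shown in ROUTER_LABEL. The function names are hypothetical and chosen only for illustration.

```python
import subprocess

NAMESPACE = "openshift-ingress"
# Assumption: pods of the default IngressController are selected by this label.
ROUTER_LABEL = (
    "ingresscontroller.operator.openshift.io/deployment-ingresscontroller=default"
)


def delete_router_pods_cmd(namespace=NAMESPACE, selector=ROUTER_LABEL):
    """Build the oc command that deletes the router-default pods."""
    return ["oc", "delete", "pod", "-n", namespace, "-l", selector]


def list_router_pods_cmd(namespace=NAMESPACE, selector=ROUTER_LABEL):
    """Build the oc command that lists the router-default pods."""
    return ["oc", "get", "pod", "-n", namespace, "-l", selector, "-o", "wide"]


def restart_router_pods(runner=subprocess.run):
    """Delete the router pods, then list the pods OpenShift recreates.

    `runner` is injectable so the sequence can be dry-run without a cluster.
    """
    runner(delete_router_pods_cmd(), check=True)
    runner(list_router_pods_cmd(), check=True)
```

Calling restart_router_pods() from a logged-in session runs both commands in order. Passing a different runner (for example, one that only prints the command) gives a dry run before touching the cluster.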

Root Cause

  • When the "Edit Cluster Ingress" feature in the OCM console is used to change the default application router from public to private or vice versa, the cloud-ingress-operator removes and replaces the external AWS load balancer.
  • When this happens, the external load balancer health checks never receive a successful check from the backend nodes, and all nodes are marked out-of-service.
  • This is a known bug which was mentioned here.

Diagnostic Steps

  • If you have access to your AWS console you can go to:

    EC2 -> Load Balancers (choose the load balancer for your cluster) -> Instances tab

  • You should see the status of all of your instances listed as OutOfService.

    [Image: sample LB output with issues]
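
The same check can be sketched from the command line instead of the console. The example below assumes the load balancer is a Classic Load Balancer and that its health was saved as JSON from `aws elb describe-instance-health --load-balancer-name <your-lb-name>`, whose output contains an InstanceStates list. The helper names are hypothetical.

```python
import json


def out_of_service_instances(health_json):
    """Return the IDs of instances the load balancer reports as OutOfService."""
    states = json.loads(health_json).get("InstanceStates", [])
    return [s["InstanceId"] for s in states if s.get("State") == "OutOfService"]


def all_out_of_service(health_json):
    """True when at least one instance is registered and none is in service."""
    states = json.loads(health_json).get("InstanceStates", [])
    return bool(states) and all(s.get("State") == "OutOfService" for s in states)
```

If all_out_of_service() returns True for the router load balancer right after an ingress change, that is consistent with the symptom described in this article. It does not by itself confirm the root cause.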

This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.
