RHOCP 4 - Routes are unreachable after upgrading OpenShift to 4.13 (503 or timeouts) with OpenShift SDN

Solution Unverified - Updated -

Issue

  • Clients that curl routes receive intermittent 503 errors when the backend pods are in namespaces where the allow-from-openshift-ingress NetworkPolicy is applied.
  • Access logs from the router-default pods in the openshift-ingress namespace show that backend pods are repeatedly marked down/unavailable and then come back up.
  • Curling a target backend directly from a host node fails intermittently: packets are dropped and the TCP connection times out.
  • Pod-to-pod connectivity checks show that only pods on the same node can reach each other. In other words, the router pod cannot connect to pods running on a different node.
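
The intermittent TCP timeouts described above can be quantified with a repeated connection probe. The following is a minimal sketch, not part of the original solution: the function name `probe_tcp` and its parameters are hypothetical, and it assumes Python 3 is available wherever the probe is run (for example, in a debug pod or on a workstation that can reach the pod network).

```python
import socket
import time


def probe_tcp(host, port, attempts=20, timeout=2.0, pause=0.5):
    """Open a TCP connection `attempts` times and return (succeeded, failed).

    A mix of successes and failures against the same backend suggests the
    intermittent drops described in the Issue section.
    """
    succeeded = 0
    failed = 0
    for _ in range(attempts):
        try:
            # create_connection raises an OSError subclass on timeout,
            # connection refused, or host unreachable.
            with socket.create_connection((host, port), timeout=timeout):
                succeeded += 1
        except OSError:
            failed += 1
        time.sleep(pause)
    return succeeded, failed
```

For instance, `probe_tcp("10.128.2.15", 8080)` would probe a backend pod 20 times; the IP address and port here are placeholders and must be replaced with the real pod IP and container port. Running the probe once against a pod on the same node as the router and once against a pod on a different node helps confirm whether failures are limited to cross-node traffic.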

Environment

  • Red Hat OpenShift Container Platform (RHOCP)
    • 4.13
  • OpenShift SDN
  • OpenShift Routers
