RHOCP 4 - Routes are unreachable after upgrading OpenShift to 4.13 (503 or timeouts) with OpenShiftSDN

Solution Unverified - Updated 2025-01-21T07:44:43+00:00

Issue

Clients that curl routes receive intermittent 503 errors when the backend pods are in namespaces where the allow-from-openshift-ingress NetworkPolicy is applied.
The access logs of the router-default pods in the openshift-ingress namespace show the backend pods repeatedly being marked down/unavailable and then coming back up.
Curling a target backend directly from a host node shows that packets are intermittently dropped, resulting in TCP connection timeouts.
Pod-to-Pod connectivity checks show that only Pods on the same node can reach each other. In other words, the router Pod cannot connect to Pods that run on a different node.
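
The intermittent timeouts described above can be easier to quantify with a repeated TCP connect test than with single curl attempts. The following is a minimal sketch, not part of the original solution: the function name probe_tcp, its parameters, and the example IP and port are all hypothetical, and it assumes Python 3 is available wherever the test is run (for example, in a debug pod on a chosen node).

```python
import socket
import time


def probe_tcp(host, port, attempts=20, timeout=2.0, delay=0.5):
    """Open a TCP connection `attempts` times and tally the outcomes.

    Hypothetical helper for illustration only. Returns a dict with counts
    of successful connects, connect timeouts, and other socket errors
    (for example, connection refused or no route to host).
    """
    results = {"ok": 0, "timeout": 0, "error": 0}
    for _ in range(attempts):
        try:
            # create_connection performs the full TCP handshake.
            with socket.create_connection((host, port), timeout=timeout):
                results["ok"] += 1
        except socket.timeout:
            # No answer within `timeout` seconds: packets were likely dropped.
            results["timeout"] += 1
        except OSError:
            # Any other failure, such as an actively refused connection.
            results["error"] += 1
        time.sleep(delay)
    return results


# Example call (placeholder pod IP and port, replace with a real backend):
# print(probe_tcp("10.128.2.15", 8080))
```

Running the same probe against a Pod on the same node as the caller and against a Pod on a different node should make the cross-node failure pattern visible: a high "timeout" count only for the remote Pod would be consistent with the symptoms listed here.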
