Balancing incoming requests evenly across available pod replicas in OpenShift
Environment
- Red Hat OpenShift Container Platform (RHOCP) 4
- Routes
Issue
- Incoming requests are not getting distributed equally to all available pods.
- How to balance the load among available pod replicas?
- Is it possible to configure the load balancing algorithm on the router?
- Even traffic distribution is not observed after setting a route with the annotations haproxy.router.openshift.io/balance=roundrobin and haproxy.router.openshift.io/disable_cookies=true.
- Some pods get more traffic than others even with the balancing option set to roundrobin.
- Traffic is not distributed evenly in Round Robin fashion while accessing the application through routes in a cloud environment.
Resolution
When a request reaches a pod from outside the cluster, it goes directly from the router pods to the application pod, with the router selecting an endpoint from the haproxy.config file present in each of the router pods. The default behavior of the HAProxy router is to use a cookie to ensure "sticky" routing. This keeps sessions on the same pod, so if the number of different clients is small and cookies are in use, those clients may always be directed to the same pod(s).
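To confirm that cookie-based sticky routing is in effect, the response headers of the route can be inspected. A minimal sketch, where [route_url] is a placeholder and the cookie name in the output is an opaque hash chosen by the router:
$ # Print only the response headers and look for the router's sticky cookie
$ curl -sk -D - -o /dev/null https://[route_url] | grep -i 'set-cookie'
set-cookie: [cookie_hash]=[value]; path=/ [...]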
This behavior can be disabled by setting the haproxy.router.openshift.io/disable_cookies annotation on the route to true.
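For example, the annotation can be applied with oc annotate (the route and namespace names are placeholders):
$ oc annotate route [route_name] -n [namespace_name] haproxy.router.openshift.io/disable_cookies=true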
After disabling "sticky" routing with the annotation above, it is possible to select a load balancing algorithm for the incoming requests. Use another annotation with haproxy.router.openshift.io/balance as the key and one of roundrobin, leastconn, source, or random as the value (refer to route-specific annotations for additional information). The default value was changed to random in NE-825, except for TLS passthrough routes, for which it is source.
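A minimal sketch of setting and verifying both annotations together (bracketed names are placeholders; --overwrite replaces any previous value):
$ oc annotate route [route_name] -n [namespace_name] --overwrite \
    haproxy.router.openshift.io/balance=roundrobin \
    haproxy.router.openshift.io/disable_cookies=true
$ oc get route [route_name] -n [namespace_name] -o jsonpath='{.metadata.annotations}'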
Note: there could be other scenarios causing this same issue even if the above annotations are configured. Please refer to the "Root Cause" section for other scenarios and examples.
How to choose between load balancing algorithms?
The roundrobin algorithm distributes requests evenly across pods and works best when the pods/servers have roughly identical computing capabilities and storage capacity. On the other hand, leastconn is a dynamic load balancing algorithm where client requests are distributed to the application server with the least number of active connections at the time the request is received.
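Whichever algorithm is selected should appear as a balance directive in the route's backend inside each router pod's haproxy.config. An illustrative check (the pod name is a placeholder; the file path and the be_http backend prefix are assumptions, and the prefix varies by route type, e.g. be_edge_http, be_secure, or be_tcp):
$ oc -n openshift-ingress rsh [router-default-pod_name] \
    grep -A20 'backend be_http:[namespace_name]:[route_name]' \
    /var/lib/haproxy/conf/haproxy.config | grep balance
  balance roundrobin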
Root Cause
The default behavior of the HAProxy router is to use a cookie to ensure "sticky" routing. This keeps sessions on the same pod, resulting in some pods getting overloaded while others remain underutilized.
Additional scenarios that could cause this behavior even if the annotations are configured
The Round Robin state is not preserved across reloads, so whenever a reload happens, the route balancing state gets reset and the router starts balancing connections from the beginning.
- For example, suppose the router is balancing traffic across pods 1-8 and has just reached the fourth pod; if a reload happens at that point, the router will start balancing connections again from pod 1 rather than continuing from pod 4. This can result in uneven balancing in the short term, and in the long term as well if reloads are too frequent.
In many architectures there are several router instances, with multiple ingresscontroller pods sitting behind a front Load Balancer. None of these ingresscontroller instances share their Round Robin state with each other, so uneven traffic distribution can happen.
- For example: imagine that there are 2 ingress controller pods and 8 application pods. The first request goes to ingress controller pod 1, which sends it to application pod 1; the second request also goes to ingress controller pod 1, which sends it to pod 2; but the third request goes to ingress controller pod 2, which sends it to pod 1 (because it does not know that ingress controller pod 1 already sent requests to pods 1 and 2).
If the route is passthrough, the router cannot decrypt requests and just acts as an L4 Load Balancer, so if the client opens very few TCP connections and reuses them (i.e. it uses heavy HTTP keep-alive), each connection goes to one pod only, and the pods that received those few connections will be the ones getting all the requests.
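This effect can be illustrated with curl, which reuses a single TCP connection when several URLs are given to one invocation; a sketch with [route_url] as a placeholder:
$ # One curl process = one TCP connection: through a passthrough route,
$ # all three requests land on the same pod.
$ curl -sk https://[route_url]/ https://[route_url]/ https://[route_url]/
$ # Three separate invocations = three connections, which the router can
$ # spread across different pods.
$ for i in 1 2 3; do curl -sk https://[route_url]/; done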
Diagnostic Steps
- Check the router pod logs and verify whether the following reload line is continuously logged every 5 seconds or very frequently:
$ oc get pods -n openshift-ingress
[...]
$ oc logs -n openshift-ingress [router-default-pod_name]
[...]
I0101 00:00:00.000000       1 router.go:669] template "msg"="router reloaded" "output"=" - Checking http://localhost:80 ...\n - Health check ok : 0 retry attempt(s).\n"
This behavior can be reproduced easily by constantly deleting an endpoint (from a different namespace) while curl runs in a loop; see the sketch below. Deleting the endpoint forces a router reload, which resets the balancing state.
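A possible reproduction sketch, assuming a throwaway deployment in another namespace whose endpoints can be churned to force reloads (all bracketed names are placeholders; scaling the deployment up and down is used here to change its endpoints):
$ # Terminal 1: request the application in a loop
$ while true; do curl -sk https://[route_url]/; sleep 0.2; done
$ # Terminal 2: churn endpoints in a different namespace, forcing the
$ # router to reload and reset its balancing state
$ while true; do
    oc scale deployment/[other_deployment] -n [other_namespace] --replicas=0
    sleep 2
    oc scale deployment/[other_deployment] -n [other_namespace] --replicas=1
    sleep 2
  done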
- Check if there are several external Load Balancers in front of the OpenShift routers ([route_url] without http(s)):
$ dig [route_url]
[...]
;; ANSWER SECTION:
[route_url]. 296 IN A X.X.X.X
[route_url]. 296 IN A Y.Y.Y.Y
[route_url]. 296 IN A Z.Z.Z.Z
[...]
- Enable the HAProxy ingress router access log in OpenShift 4 and run the following loop with a curl to the route using the router-ip:
$ oc get routes -n [namespace_name]
[...]
$ oc get pods -n openshift-ingress -o wide
[...]
$ for i in {1..20}; do echo "-------- ${i} --------"; curl -kvvvv https://[route_url] --resolve [route_url]:443:[router-ip]; echo; done &> curl_output.txt
Note: after the loop finishes, disable the access log in HAProxy if it is no longer required.
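With access logging sent to a sidecar container (for example via spec.logging.access.destination.type: Container on the IngressController, which exposes the log through a container named logs in the router pods), the per-pod request distribution can be estimated by counting server names in the log. A sketch assuming the log lines contain the pod:[pod_name]:... server naming used in haproxy.config:
$ oc logs -n openshift-ingress [router-default-pod_name] -c logs \
    | grep -oE 'pod:[^ ]+' | sort | uniq -c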