HAProxy pods rapidly consuming memory, not closing haproxy threads
Issue
- HAProxy routers are consuming massive amounts of memory and failing to close haproxy threads. As a result, infra nodes run at maximum memory utilization and the OOM killer terminates other processes at random, including Prometheus.
- The following command shows infra nodes at 97%+ memory utilization:
  $ oc adm top nodes
- Prometheus/Grafana show the openshift-ingress namespace consuming all or most of the memory budget on infra nodes.
- An SOS report or node debug session indicates that haproxy is opening new processes and never closing them. Expanding node memory does not resolve the pressure; it only allows the problem to balloon further into the new memory space.
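The symptom checks described in this section can be sketched as two small shell filters. This is a minimal sketch, not part of the original article: the function names high_mem_nodes and count_haproxy_procs are hypothetical, the first assumes the usual five-column output of oc adm top nodes (NAME, CPU(cores), CPU%, MEMORY(bytes), MEMORY%), and the second assumes that ps is available wherever the process list is collected.

```shell
# Hypothetical helper: print nodes whose MEMORY% is at or above a threshold.
# Reads `oc adm top nodes` output on stdin; assumes the column layout
# NAME CPU(cores) CPU% MEMORY(bytes) MEMORY% with one header line.
high_mem_nodes() {
  threshold="${1:-97}"
  awk -v t="$threshold" 'NR > 1 {
    pct = $5
    sub(/%/, "", pct)              # strip the trailing percent sign
    if (pct + 0 >= t) print $1, $5
  }'
}

# Hypothetical helper: count haproxy processes in `ps -eo comm` output
# read from stdin (one command name per line).
count_haproxy_procs() {
  grep -c '^haproxy$'
}

# Possible usage against a live cluster (requires a logged-in oc session):
#   oc adm top nodes | high_mem_nodes 97
#   oc adm top pods -n openshift-ingress
# Assumption: ps exists in the router image; <router-pod> is a placeholder.
#   oc exec -n openshift-ingress <router-pod> -- ps -eo comm | count_haproxy_procs
```

A haproxy process count that keeps growing between runs, together with infra nodes listed by the first filter, would be consistent with the behavior described in this issue.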
Environment
Red Hat OpenShift Container Platform (OCP)
- OCP 4.8.0 - 4.8.18
- OCP 4.9.1 - 4.9.5
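
To check whether a cluster falls within the version ranges listed above, a comparison along the following lines could be used. This is a sketch under assumptions: the function name is_affected_version is hypothetical, and it only encodes the two ranges stated in this article (4.8.0 to 4.8.18 and 4.9.1 to 4.9.5).

```shell
# Hypothetical helper: exit status 0 if the given OCP version (for example
# 4.8.12) is inside the ranges listed in this article, non-zero otherwise.
is_affected_version() {
  case "$1" in
    *.*.*) ;;                      # require major.minor.patch
    *) return 1 ;;
  esac
  ver_major="${1%%.*}"
  ver_rest="${1#*.}"
  ver_minor="${ver_rest%%.*}"
  ver_patch="${ver_rest#*.}"
  ver_patch="${ver_patch%%[!0-9]*}"   # drop suffixes such as -rc.1
  [ -n "$ver_patch" ] || return 1
  [ "$ver_major" = "4" ] || return 1
  case "$ver_minor" in
    8) [ "$ver_patch" -ge 0 ] && [ "$ver_patch" -le 18 ] ;;
    9) [ "$ver_patch" -ge 1 ] && [ "$ver_patch" -le 5 ] ;;
    *) return 1 ;;
  esac
}

# Possible usage against a live cluster (requires a logged-in oc session):
#   v="$(oc get clusterversion version -o jsonpath='{.status.desired.version}')"
#   is_affected_version "$v" && echo "version $v is in the affected range"
```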