Prometheus is using an excessive amount of memory
Issue
- On a large bare-metal RHOCP cluster (~4750 pods), the memory consumption of Prometheus is excessive. The head series count on one replica is:
  max(prometheus_tsdb_head_series{pod='prometheus-k8s-1',namespace='openshift-monitoring'}) = 4561965
- When remote_write is turned off, the memory usage of the pods is as expected (~40 GB).
- When remote_write is turned on, the pods keep failing with an OOM kill at the 70 GB container limit.
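To compare the head series count with and without remote_write, the query above can be run against the Prometheus HTTP API. The following is a minimal sketch, not a supported procedure: it only builds the instant-query URL and parses a response in the documented `/api/v1/query` format. The base URL is a hypothetical placeholder, the sample response body is constructed from the figure quoted above rather than captured from a cluster, and authentication (for example a bearer token) is not covered.

```python
import json
from urllib.parse import urlencode

# PromQL expression quoted from the issue description.
HEAD_SERIES_QUERY = (
    "max(prometheus_tsdb_head_series"
    "{pod='prometheus-k8s-1',namespace='openshift-monitoring'})"
)


def build_query_url(base_url: str, promql: str) -> str:
    """Return the instant-query URL for the Prometheus HTTP API."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


def first_sample_value(response_body: str) -> float:
    """Extract the first sample value from an instant-query JSON response."""
    payload = json.loads(response_body)
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload.get('error', 'unknown error')}")
    result = payload["data"]["result"]
    if not result:
        raise ValueError("query returned no samples")
    # Each instant-vector sample is [unix_timestamp, "value-as-string"].
    return float(result[0]["value"][1])


# Hypothetical endpoint: replace with the real Prometheus route of the cluster.
url = build_query_url("https://prometheus.example.com", HEAD_SERIES_QUERY)

# Illustrative response body built from the value reported in this article.
sample_body = json.dumps(
    {
        "status": "success",
        "data": {
            "resultType": "vector",
            "result": [{"metric": {}, "value": [1700000000.0, "4561965"]}],
        },
    }
)

head_series = first_sample_value(sample_body)
print(f"{url}\nhead series: {head_series:.0f}")
```

Running the same query once with remote_write disabled and once with it enabled shows whether the series count itself changes, or whether the extra memory comes from remote_write alone.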
Environment
- Red Hat OpenShift Container Platform (RHOCP) 4.12
- OpenShift Sandboxed Containers