Prometheus is using an excessive amount of memory

Solution In Progress - Updated -

Issue

  • On a large bare-metal RHOCP cluster (~4750 pods), the memory consumption of Prometheus is excessive. The number of in-memory (head) series reported for one replica is:

    max(prometheus_tsdb_head_series{pod='prometheus-k8s-1',namespace='openshift-monitoring'}) = 4561965
    
  • When remote_write is turned off, the memory usage of the Prometheus pods is as expected (~40 GB).

  • When remote_write is turned on, the pods are repeatedly OOM-killed at the 70 GB container memory limit.
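
To put the figures above in perspective, the following is a rough back-of-the-envelope sketch that converts them into memory per head series. It assumes decimal gigabytes and attributes all container memory to head series, which is a simplification: the WAL, queries, and remote_write queues also consume memory. The function name bytes_per_series is illustrative only and is not part of any Prometheus tooling.

```python
# Figures taken from the issue description above.
HEAD_SERIES = 4_561_965          # prometheus_tsdb_head_series on prometheus-k8s-1
MEM_REMOTE_WRITE_OFF_GB = 40     # approximate usage with remote_write disabled
CONTAINER_LIMIT_GB = 70          # limit at which the pod is OOM-killed

BYTES_PER_GB = 10**9             # assumption: decimal GB, not GiB


def bytes_per_series(mem_gb: float, series: int) -> float:
    """Average memory per head series, assuming all memory belongs to series."""
    return mem_gb * BYTES_PER_GB / series


baseline = bytes_per_series(MEM_REMOTE_WRITE_OFF_GB, HEAD_SERIES)
at_limit = bytes_per_series(CONTAINER_LIMIT_GB, HEAD_SERIES)
growth = CONTAINER_LIMIT_GB / MEM_REMOTE_WRITE_OFF_GB

print(f"remote_write off: ~{baseline:,.0f} bytes per series")
print(f"at the OOM limit: ~{at_limit:,.0f} bytes per series")
print(f"memory grows by at least {growth:.2f}x before the OOM kill")
```

Under these assumptions, enabling remote_write pushes memory to at least 1.75 times the baseline for the same number of series, so the growth is not explained by the series count alone.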

Environment

  • Red Hat OpenShift Container Platform (RHOCP) 4.12
  • OpenShift Sandboxed Containers
