Thanos querier pod gets OOMKilled when loading the API Performance dashboard with large time ranges in RHOCP 4

Solution Verified - Updated -

Issue

  • When using the "API Performance" dashboard under "Observe --> Dashboards" and querying metrics over a time range of more than 1 week, the thanos-querier pod gets OOMKilled, or the dashboard returns the error "Error Loading Alert Gateway Time-out".
  • The thanos-querier pod status shows the container as terminated with reason OOMKilled:

        [...]
        lastState:
          terminated:
            containerID: cri-o://XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
            exitCode: 137
            [...]
            reason: OOMKilled
        [...]

  • The dashboard shows a warning similar to the images below:

    [Image: Warning]

    [Image: Warning]
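
The OOMKilled termination state described in the Issue list can also be checked programmatically. The following is a minimal sketch, not part of the original solution. It assumes the pod definition has been exported as JSON (for example with something like `oc -n openshift-monitoring get pod <pod-name> -o json`, where the namespace and pod name are assumptions to adapt to the cluster). The helper name `oom_killed_containers`, the sample data, and the container name `thanos-query` are illustrative.

```python
import json


def oom_killed_containers(pod: dict) -> list:
    """Return the names of containers whose last termination reason was OOMKilled."""
    names = []
    for status in pod.get("status", {}).get("containerStatuses", []):
        # lastState.terminated is absent when the container has never been restarted
        terminated = status.get("lastState", {}).get("terminated") or {}
        if terminated.get("reason") == "OOMKilled":
            names.append(status["name"])
    return names


# Illustrative sample, trimmed to the fields shown in the status excerpt.
# The container names are assumptions, not taken from the article.
sample = json.loads("""
{
  "status": {
    "containerStatuses": [
      {
        "name": "thanos-query",
        "lastState": {
          "terminated": {"exitCode": 137, "reason": "OOMKilled"}
        }
      },
      {
        "name": "oauth-proxy",
        "lastState": {}
      }
    ]
  }
}
""")

print(oom_killed_containers(sample))
```

Exit code 137 corresponds to the container being killed with SIGKILL (128 + 9), which is what the kernel OOM killer sends; the `reason` field is the more direct indicator, so the sketch checks that rather than the exit code.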

Environment

  • Red Hat OpenShift Container Platform (RHOCP)
    • 4
  • Thanos
