What are the Prometheus metrics equivalent to "oc adm top nodes" in RHOCP4
Environment
- Red Hat OpenShift Container Platform (RHOCP)
- 4.X to 4.15
- Prometheus-adapter
Issue
- How can Prometheus be queried to provide the same output as the oc adm top nodes command?
- How can Prometheus be queried to provide a historical view of the oc adm top nodes command output?
Resolution
The following Prometheus query can be used to provide a CPU metric that matches oc adm top nodes:
instance:node_cpu_utilisation:rate1m{job="node-exporter", cluster=""} * 100
Whilst the following can be used to provide a memory figure to match:
round(((sum by (instance) (node_memory_MemTotal_bytes) - sum by (instance) (node_memory_MemAvailable_bytes))) / sum(label_replace(kube_node_status_allocatable{resource="memory",unit="byte"}, "instance", "$1", "node", "(.*)")) by (instance) * 100)
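For the historical view asked about in the Issue section, either query can be evaluated over a time range instead of at a single instant. The following is a minimal sketch, not an official procedure, that builds a request URL for the standard Prometheus HTTP API range-query endpoint (`/api/v1/query_range`). The host name, timestamps, and the helper name `build_range_query_url` are assumptions made for illustration. Authentication (for example a bearer token) is deliberately left out and would be needed against a real cluster.

```python
from urllib.parse import urlencode

# CPU query copied from the Resolution section above.
CPU_QUERY = 'instance:node_cpu_utilisation:rate1m{job="node-exporter", cluster=""} * 100'


def build_range_query_url(base_url, query, start, end, step):
    """Return a URL for the Prometheus HTTP API range-query endpoint.

    start and end are Unix timestamps in seconds; step is the resolution
    of the returned series, for example "60s".
    """
    params = urlencode({"query": query, "start": start, "end": end, "step": step})
    return f"{base_url.rstrip('/')}/api/v1/query_range?{params}"


if __name__ == "__main__":
    # Hypothetical host: substitute the query endpoint of your own cluster.
    base = "https://prometheus.example.com"
    end = 1_700_003_600   # example Unix timestamp
    start = end - 3600    # one hour earlier
    print(build_range_query_url(base, CPU_QUERY, start, end, "60s"))
```

The same helper can be called with the memory query in place of `CPU_QUERY`. Each point in the returned series corresponds to what an instant query would have produced at that timestamp.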
Root Cause
Prometheus provides granular memory and CPU metrics. For a query to return the same result as an oc adm top command, the same queries that the command relies on must be used, while also taking into account the design of the Prometheus Operator in OpenShift.
The current integration of prometheus-adapter in OpenShift uses the platform Prometheus as a backend to get metrics. A limitation of this design is that queries against the adapter can return metrics from two different Prometheus instances, which do not replicate data between each other. Both instances use the same scrape interval, but because their pods started at slightly different moments, they scrape targets at slightly different moments. As a result, two queries sent at the same time to prometheus-adapter might yield different results, since the underlying PromQL queries executed by prometheus-adapter might run on different Prometheus servers.
These metrics are collected at each scrape interval and are not live profiling data, so it is expected that two scrapes at two different moments may return different figures. oc adm top is a point-in-time command, after all.
In comparison, the OpenShift console graphs that data over time.
So again, the output of oc adm top and the figures shown in the console are not expected to match exactly.
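The effect of two instances scraping at different moments can be illustrated with a small simulation. This is only a sketch under stated assumptions: the `utilisation` curve, the 30-second interval, and the offsets of 0 and 7 seconds are all hypothetical values chosen for illustration and are not taken from an OpenShift cluster.

```python
import math


def utilisation(t):
    """Hypothetical CPU utilisation (%) at time t in seconds; illustrative only."""
    return 50.0 + 10.0 * math.sin(t / 60.0)


def last_scrape_time(now, offset, interval):
    """Time of the most recent scrape at or before `now` for an instance whose
    scrapes occur at offset, offset + interval, offset + 2 * interval, ..."""
    return offset + math.floor((now - offset) / interval) * interval


def replica_value(now, offset, interval=30.0):
    """Value an instant query at `now` would see on an instance with this offset."""
    return utilisation(last_scrape_time(now, offset, interval))


if __name__ == "__main__":
    now = 100.0
    # Two instances with the same interval but different start offsets (assumed).
    for name, offset in (("instance A", 0.0), ("instance B", 7.0)):
        scraped_at = last_scrape_time(now, offset, 30.0)
        print(f"{name}: last scrape at t={scraped_at:.0f}s, value={replica_value(now, offset):.3f}")
```

Both instances answer the same query at the same moment, but each reports the sample from its own most recent scrape, so the two figures differ slightly.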
From OpenShift 4.12 there is an option, dedicatedServiceMonitors, that is switched off by default and that improves the consistency of the results returned by the Prometheus adapter.
Diagnostic Steps
It is possible to confirm the queries used by the oc adm top command by checking the following configmap:
$ oc get cm adapter-config -o yaml -n openshift-monitoring
In order to verify the output of the Prometheus queries, issue the following command:
# oc adm top nodes
It is also possible to make the same comparison for pods; see The Prometheus metrics equivalent to "oc adm top pods" command in RHOCP4.
This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.