Prometheus node exporter container image not set for CephStorage nodes
Issue
- After enabling the Ceph Dashboard, we found that host metrics are available only for controller nodes and not for CephStorage nodes: every host-related metric shows "no data points" and the storage nodes are not listed in the Grafana dashboard.
- After looking at the affected nodes, we found that the node-exporter container was not running on the storage nodes because podman was trying to pull its image from the Red Hat CDN instead of our internal Satellite installation. Digging a little deeper in /var/lib/mistral/overcloud/ceph-ansible/group_vars, we found that the node_exporter_container_image variable was set only for the grafana-server group instead of the all group, as stated by the Red Hat Ceph installation documentation. As a result, only the controller nodes had the right image, while the other nodes were picking up whatever image ceph-ansible sets by default.
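To confirm which groups actually define the variable, the group_vars directory can be scanned with a short script. The following is only a minimal sketch: it assumes the group files are named after the group with a .yml suffix (for example all.yml and grafana-server.yml), and it uses simple line matching on top-level keys rather than a real YAML parser. The function name groups_defining is hypothetical.

```python
from pathlib import Path


def groups_defining(group_vars_dir, key):
    """Return {group_name: value} for every group file that sets `key`.

    Assumption: each group has a file named <group>.yml and the key is
    defined at the top level on a single line (no YAML parser is used).
    """
    found = {}
    for path in sorted(Path(group_vars_dir).glob("*.yml")):
        for line in path.read_text().splitlines():
            # Only unindented "key: value" lines count as top-level settings;
            # commented-out lines start with "#" and therefore never match.
            if line.startswith(key + ":"):
                # Split on the first colon only, since image references
                # usually contain colons themselves (registry port, tag).
                _, _, value = line.partition(":")
                found[path.stem] = value.strip().strip("'\"")
    return found
```

Calling groups_defining("/var/lib/mistral/overcloud/ceph-ansible/group_vars", "node_exporter_container_image") on the undercloud would, in the situation described above, be expected to return an entry for grafana-server only and none for all.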
Environment
- Red Hat OpenStack Platform 16.1 (RHOSP)