Prometheus alerts show wrong links when their queries include numbers in scientific notation


Environment

  • Red Hat OpenShift Container Platform 4.x.

Issue

This problem can be reproduced with any PrometheusRule whose queries contain numbers in scientific notation, like the one below:

apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: trident-rules
  namespace: ns1
spec:
  groups:
    - name: trident-alert.rules
      rules:
        - alert: TridentPersistentVolumeUsage
          annotations:
            description: >-
              PVC {{ $labels.persistentvolumeclaim }} utilization is {{ $value }}%. Free up some space or expand the PVC.
            message: >-
              PVC {{ $labels.persistentvolumeclaim }} is nearing full. Data deletion or PVC expansion is required.
            severity_level: warning
            storage_type: trident
          expr: >
            (round(kubelet_volume_stats_used_bytes / (kubelet_volume_stats_capacity_bytes <= 2.147483648e10) * 100, 0.1) > 60) or
            (round(kubelet_volume_stats_used_bytes / (2.147483648e10 < kubelet_volume_stats_capacity_bytes <= 1.073741824e11) * 100, 0.1) > 80) or
            (round(kubelet_volume_stats_used_bytes / (1.073741824e11 < kubelet_volume_stats_capacity_bytes <= 5.49755813888e12) * 100, 0.1) > 90) or
            (round(kubelet_volume_stats_used_bytes / (5.49755813888e12 < kubelet_volume_stats_capacity_bytes <= 2.199023255552e13) * 100, 0.1) > 92) or
            (round(kubelet_volume_stats_used_bytes / (kubelet_volume_stats_capacity_bytes > 2.199023255552e13) * 100, 0.1) > 95)
          for: 5s
          labels:
            severity: warning
        - alert: TridentPersistentVolumeUsage
          annotations:
            description: >-
              PVC {{ $labels.persistentvolumeclaim }} utilization is {{ $value }}%. Free up some space or expand the PVC immediately.
            message: >-
              PVC {{ $labels.persistentvolumeclaim }} is critically full. Data deletion or PVC expansion is required.
            severity_level: error
            storage_type: trident
          expr: >
            (round(kubelet_volume_stats_used_bytes / (kubelet_volume_stats_capacity_bytes <= 2.147483648e10) * 100, 0.1) > 80) or
            (round(kubelet_volume_stats_used_bytes / (2.147483648e10 < kubelet_volume_stats_capacity_bytes <= 1.073741824e11) * 100, 0.1) > 90) or
            (round(kubelet_volume_stats_used_bytes / (1.073741824e11 < kubelet_volume_stats_capacity_bytes <= 5.49755813888e12) * 100, 0.1) > 95) or
            (round(kubelet_volume_stats_used_bytes / (5.49755813888e12 < kubelet_volume_stats_capacity_bytes <= 2.199023255552e13) * 100, 0.1) > 97) or
            (round(kubelet_volume_stats_used_bytes / (kubelet_volume_stats_capacity_bytes > 2.199023255552e13) * 100, 0.1) > 98)
          for: 5s
          labels:
            severity: critical

When an alert notification is sent, for example by e-mail, it contains a URL starting with https://console-openshift-console.<cluster_domain>/monitoring/graph?g0.expr that points to the OpenShift Prometheus console. The query shown there is wrong because blank spaces are inserted between the numbers and the character e of the scientific notation, as in the following screenshot:

Metrics screenshot

Resolution

The bug has been reported here: OCPBUGS-24327.

If the issue occurs on a version where the fix is not yet available, the workaround is to delete the blank spaces manually from the query in the Prometheus console.
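If many links need to be corrected, the manual workaround could be scripted. The following Python sketch is not part of the reported fix and only illustrates the idea. It assumes the broken query contains literals such as `2.147483648 e10` or `2.147483648e 10`; the exact position of the blank is an assumption, so the pattern accepts a blank on either side of the `e`. The names `repair_expr` and `repair_link` are hypothetical helpers, and the pattern would also rewrite matching text inside quoted label values, so the result should be reviewed before use.

```python
import re
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: the broken literal looks like "2.147483648 e10" or "2.147483648e 10".
# A digit, optional blanks, "e", optional blanks, then a signed or unsigned exponent.
_SPLIT_EXPONENT = re.compile(r"(\d)\s*([eE])\s*([+-]?\d+)\b")


def repair_expr(expr: str) -> str:
    """Remove blanks inside scientific-notation literals of a query string."""
    return _SPLIT_EXPONENT.sub(r"\1\2\3", expr)


def repair_link(url: str, param: str = "g0.expr") -> str:
    """Return the URL with the query parameter `param` repaired.

    All other query parameters are kept, but the query string is re-encoded.
    """
    parts = urlsplit(url)
    pairs = [
        (key, repair_expr(value) if key == param else value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
    ]
    return urlunsplit(parts._replace(query=urlencode(pairs)))


if __name__ == "__main__":
    broken = "kubelet_volume_stats_capacity_bytes <= 2.147483648 e10"
    print(repair_expr(broken))
```

Queries whose literals are already intact, such as the expressions in the PrometheusRule above, pass through `repair_expr` unchanged.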

