How to get more details about the cluster in an Alert sent from Alertmanager in OCP 3.11?

Solution In Progress - Updated -

Issue

  • Prometheus alerting rules are generic, so the resulting alerts lack detail about the affected component. The alerts should identify that component more precisely.

  • The two examples below illustrate the issue:

    • The alert is NodeDiskRunningFull and its description specifies the pod name. In a large cluster with hundreds or thousands of nodes, including the node name would make it easier to identify the node for which the alert was generated.
    Labels
    alertname = NodeDiskRunningFull
    cluster = <hostname>
    device = /dev/mapper/apvg01-ap1000
    namespace = openshift-monitoring
    pod = <node-exporter-pod-name>
    prometheus = openshift-monitoring/k8s
    severity = warning
    Annotations
    message = Device /dev/mapper/apvg01-ap1000 of node-exporter <openshift-monitoring/node-exporter-pod-name> is running full within the next 24 hours.
    
    • The alert is KubeAPIErrorsHigh and the error is specific to the API server, but the endpoint field shows only partial information.
    Labels
    alertname = KubeAPIErrorsHigh
    client = openshift/v1.11.0+d4cacc0 (linux/amd64) kubernetes/d4cacc0/system:serviceaccount:openshift-infra:image-trigger-controller
    cluster = <hostname>
    code = 500
    contentType = application/vnd.kubernetes.protobuf
    endpoint = https
    job = apiserver
    namespace = default
    prometheus = openshift-monitoring/k8s
    resource = buildconfigs
    scope = namespace
    service = kubernetes
    severity = critical
    subresource = instantiate
    verb = POST
    Annotations
    message = API server is erroring for 100% of requests.
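
The gap described in the first example can be checked mechanically. The following is a minimal sketch, not part of any OpenShift or Alertmanager tooling: the helper names parse_labels and missing_labels are hypothetical, and the choice of "node" and "instance" as the labels an operator would want is an assumption made for illustration. It parses the label lines exactly as quoted above and reports which of the wanted labels the alert does not carry.

```python
# Hypothetical helpers for illustration only; the label data is copied
# from the NodeDiskRunningFull example quoted in this article.

def parse_labels(text):
    """Parse 'key = value' lines, as shown in the alert view, into a dict."""
    labels = {}
    for line in text.strip().splitlines():
        key, sep, value = line.partition(" = ")
        if sep:  # skip lines that are not 'key = value' pairs
            labels[key.strip()] = value.strip()
    return labels

# Labels an operator might want in order to locate the affected node.
# This set is an assumption, not an Alertmanager or OpenShift convention.
WANTED_LABELS = ("node", "instance")

def missing_labels(labels, wanted=WANTED_LABELS):
    """Return the wanted label names that are absent from the alert."""
    return [name for name in wanted if name not in labels]

NODE_DISK_ALERT = """
alertname = NodeDiskRunningFull
cluster = <hostname>
device = /dev/mapper/apvg01-ap1000
namespace = openshift-monitoring
pod = <node-exporter-pod-name>
prometheus = openshift-monitoring/k8s
severity = warning
"""

labels = parse_labels(NODE_DISK_ALERT)
print(missing_labels(labels))  # ['node', 'instance']
```

For the quoted alert, the only component identifier present is the node-exporter pod name, which is why the node has to be looked up separately.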
    

Environment

  • Red Hat OpenShift Container Platform

    • 3.11
