tekton-resource-pruner job is failing and eventually causing KubeJobFailed alert to fire on OpenShift Container Platform 4

Solution Verified - Updated -

Issue

  • Every morning at about 08:15 UTC we are seeing the below Prometheus Alert firing because some tekton-resource-pruner job seems to fail.

    alertname = KubeJobFailed
    condition = true
    container = kube-rbac-proxy-main
    endpoint = https-main
    job = kube-state-metrics
    job_name = tekton-resource-pruner-lvjmq-27537600
    namespace = openshift-pipelines
    openshift_io_alert_source = platform
    prometheus = openshift-monitoring/k8s
    service = kube-state-metrics
    severity = warning
    Annotations
    description = Job openshift-pipelines/tekton-resource-pruner-lvjmq-12345678 failed to complete. Removing failed job after investigation should clear this alert.
    runbook_url = https://github.com/openshift/runbooks/blob/master/alerts/cluster-monitoring-operator/KubeJobFailed.md
    summary = Job failed to complete.
    
  • tekton-resource-pruner job is failing and eventually causing KubeJobFailed alert to fire on OpenShift Container Platform 4

  • Automatic pruning of task run and pipeline run is failing

Environment

  • Red Hat OpenShift Container Platform 4
  • Red Hat OpenShift Pipelines 1.6.2 up to 1.8

Subscriber exclusive content

A Red Hat subscription provides unlimited access to our knowledgebase, tools, and much more.

Current Customers and Partners

Log in for full access

Log In

New to Red Hat?

Learn more about Red Hat subscriptions

Using a Red Hat product through a public cloud?

How to access this content