Chapter 3. Monitoring Camel K operator

Red Hat Integration - Camel K monitoring is based on the OpenShift monitoring system. This chapter explains how to use the available options for monitoring the Red Hat Integration - Camel K operator at runtime. You can use the Prometheus Operator, which is already deployed as part of OpenShift Monitoring, to monitor your own applications.

3.1. Camel K Operator metrics

The Camel K operator monitoring endpoint exposes the following metrics:

Table 3.1. Camel K operator metrics

Name: camel_k_reconciliation_duration_seconds
Type: HistogramVec
Description: Reconciliation request duration
Buckets: 0.25s, 0.5s, 1s, 5s
Labels: namespace, group, version, kind, result: Reconciled|Errored|Requeued, tag: ""|PlatformError|UserError

Name: camel_k_build_duration_seconds
Type: HistogramVec
Description: Build duration
Buckets: 30s, 1m, 1.5m, 2m, 5m, 10m
Labels: result: Succeeded|Error

Name: camel_k_build_recovery_attempts
Type: Histogram
Description: Build recovery attempts
Buckets: 0, 1, 2, 3, 4, 5
Labels: result: Succeeded|Error

Name: camel_k_build_queue_duration_seconds
Type: Histogram
Description: Build queue duration
Buckets: 5s, 15s, 30s, 1m, 5m
Labels: N/A

Name: camel_k_integration_first_readiness_seconds
Type: Histogram
Description: Time to first integration readiness
Buckets: 5s, 10s, 30s, 1m, 2m
Labels: N/A
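
Because these metrics are histograms, you can use the bucket series to estimate latency quantiles in Prometheus. For example, the following PromQL query (a sketch, using the same 5-minute rate window as the alerting rules later in this chapter) estimates the 95th percentile of the reconciliation request duration:

    histogram_quantile(0.95,
      sum(rate(camel_k_reconciliation_duration_seconds_bucket[5m])) by (le)
    )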

3.2. Enabling Camel K Operator monitoring

OpenShift 4.3 and later include a Prometheus Operator that is already deployed as part of OpenShift Monitoring. This section explains how to enable monitoring of your own application services in OpenShift Monitoring.

Prerequisites

  * Monitoring for user-defined projects is enabled in your OpenShift cluster (see the example below).
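
If monitoring for user-defined projects is not yet enabled, a cluster administrator can enable it. On OpenShift 4.6 and later, this is done by setting enableUserWorkload: true in the cluster-monitoring-config ConfigMap in the openshift-monitoring namespace; earlier 4.x releases use a technology preview setting instead, so check the OpenShift documentation for your version. A minimal sketch:

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: cluster-monitoring-config
      namespace: openshift-monitoring
    data:
      config.yaml: |
        enableUserWorkload: true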

Procedure

  1. Create a PodMonitor resource targeting the operator metrics endpoint, so that the Prometheus server can scrape the metrics exposed by the operator.

    operator-pod-monitor.yaml

    apiVersion: monitoring.coreos.com/v1
    kind: PodMonitor
    metadata:
      name: camel-k-operator
      labels:
        app: "camel-k"
        camel.apache.org/component: operator
    spec:
      selector:
        matchLabels:
          app: "camel-k"
          camel.apache.org/component: operator
      podMetricsEndpoints:
        - port: metrics

  2. Create the PodMonitor resource:

    oc apply -f operator-pod-monitor.yaml
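
  3. Optionally, verify that the PodMonitor resource exists and that its label selector matches the operator pod. Example commands, assuming the default operator labels shown above:

    oc get podmonitor camel-k-operator
    oc get pods -l app=camel-k,camel.apache.org/component=operator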

3.3. Camel K operator alerts

You can create a PrometheusRule resource so that the Alertmanager instance in the OpenShift monitoring stack can trigger alerts based on the metrics exposed by the Camel K operator.

Example

You can define alerting rules based on the exposed metrics, as shown in the following PrometheusRule resource:

apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: camel-k-operator
spec:
  groups:
    - name: camel-k-operator
      rules:
        - alert: CamelKReconciliationDuration
          expr: |
            (
            1 - sum(rate(camel_k_reconciliation_duration_seconds_bucket{le="0.5"}[5m])) by (job)
            /
            sum(rate(camel_k_reconciliation_duration_seconds_count[5m])) by (job)
            )
            * 100
            > 10
          for: 1m
          labels:
            severity: warning
          annotations:
            message: |
              {{ printf "%0.0f" $value }}% of the reconciliation requests
              for {{ $labels.job }} have their duration above 0.5s.
        - alert: CamelKReconciliationFailure
          expr: |
            sum(rate(camel_k_reconciliation_duration_seconds_count{result="Errored"}[5m])) by (job)
            /
            sum(rate(camel_k_reconciliation_duration_seconds_count[5m])) by (job)
            * 100
            > 1
          for: 10m
          labels:
            severity: warning
          annotations:
            message: |
              {{ printf "%0.0f" $value }}% of the reconciliation requests
              for {{ $labels.job }} have failed.
        - alert: CamelKSuccessBuildDuration2m
          expr: |
            (
            1 - sum(rate(camel_k_build_duration_seconds_bucket{le="120",result="Succeeded"}[5m])) by (job)
            /
            sum(rate(camel_k_build_duration_seconds_count{result="Succeeded"}[5m])) by (job)
            )
            * 100
            > 10
          for: 1m
          labels:
            severity: warning
          annotations:
            message: |
              {{ printf "%0.0f" $value }}% of the successful builds
              for {{ $labels.job }} have their duration above 2m.
        - alert: CamelKSuccessBuildDuration5m
          expr: |
            (
            1 - sum(rate(camel_k_build_duration_seconds_bucket{le="300",result="Succeeded"}[5m])) by (job)
            /
            sum(rate(camel_k_build_duration_seconds_count{result="Succeeded"}[5m])) by (job)
            )
            * 100
            > 1
          for: 1m
          labels:
            severity: critical
          annotations:
            message: |
              {{ printf "%0.0f" $value }}% of the successful builds
              for {{ $labels.job }} have their duration above 5m.
        - alert: CamelKBuildFailure
          expr: |
            sum(rate(camel_k_build_duration_seconds_count{result="Failed"}[5m])) by (job)
            /
            sum(rate(camel_k_build_duration_seconds_count[5m])) by (job)
            * 100
            > 1
          for: 10m
          labels:
            severity: warning
          annotations:
            message: |
              {{ printf "%0.0f" $value }}% of the builds for {{ $labels.job }} have failed.
        - alert: CamelKBuildError
          expr: |
            sum(rate(camel_k_build_duration_seconds_count{result="Error"}[5m])) by (job)
            /
            sum(rate(camel_k_build_duration_seconds_count[5m])) by (job)
            * 100
            > 1
          for: 10m
          labels:
            severity: critical
          annotations:
            message: |
              {{ printf "%0.0f" $value }}% of the builds for {{ $labels.job }} have errored.
        - alert: CamelKBuildQueueDuration1m
          expr: |
            (
            1 - sum(rate(camel_k_build_queue_duration_seconds_bucket{le="60"}[5m])) by (job)
            /
            sum(rate(camel_k_build_queue_duration_seconds_count[5m])) by (job)
            )
            * 100
            > 1
          for: 1m
          labels:
            severity: warning
          annotations:
            message: |
              {{ printf "%0.0f" $value }}% of the builds for {{ $labels.job }}
              have been queued for more than 1m.
        - alert: CamelKBuildQueueDuration5m
          expr: |
            (
            1 - sum(rate(camel_k_build_queue_duration_seconds_bucket{le="300"}[5m])) by (job)
            /
            sum(rate(camel_k_build_queue_duration_seconds_count[5m])) by (job)
            )
            * 100
            > 1
          for: 1m
          labels:
            severity: critical
          annotations:
            message: |
              {{ printf "%0.0f" $value }}% of the builds for {{ $labels.job }}
              have been queued for more than 5m.
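
You can save the resource to a file and create it in the namespace where the Camel K operator is deployed. For example (the file name operator-prometheus-rule.yaml is only illustrative):

    oc apply -f operator-prometheus-rule.yaml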

Camel K operator alerts

The following table shows the alerting rules that are defined in the PrometheusRule resource.

Name: CamelKReconciliationDuration
Severity: warning
Description: More than 10% of the reconciliation requests have their duration above 0.5s over at least 1 min.

Name: CamelKReconciliationFailure
Severity: warning
Description: More than 1% of the reconciliation requests have failed over at least 10 min.

Name: CamelKSuccessBuildDuration2m
Severity: warning
Description: More than 10% of the successful builds have their duration above 2 min over at least 1 min.

Name: CamelKSuccessBuildDuration5m
Severity: critical
Description: More than 1% of the successful builds have their duration above 5 min over at least 1 min.

Name: CamelKBuildFailure
Severity: warning
Description: More than 1% of the builds have failed over at least 10 min.

Name: CamelKBuildError
Severity: critical
Description: More than 1% of the builds have errored over at least 10 min.

Name: CamelKBuildQueueDuration1m
Severity: warning
Description: More than 1% of the builds have been queued for more than 1 min over at least 1 min.

Name: CamelKBuildQueueDuration5m
Severity: critical
Description: More than 1% of the builds have been queued for more than 5 min over at least 1 min.

For more information about alerts, see Creating alerting rules in the OpenShift documentation.