Chapter 13. Performance and reliability tuning

13.1. Flow control mechanisms

If logs are produced faster than they can be collected, it can be difficult to predict or control the volume of logs being sent to an output. Not being able to predict or control the volume of logs being sent to an output can result in logs being lost. If there is a system outage and log buffers are accumulated without user control, this can also cause long recovery times and high latency when the connection is restored.

As an administrator, you can limit logging rates by configuring flow control mechanisms for your logging.

13.1.1. Benefits of flow control mechanisms

  • The cost and volume of logging can be predicted more accurately in advance.
  • Noisy containers cannot produce unbounded log traffic that drowns out other containers.
  • Ignoring low-value logs reduces the load on the logging infrastructure.
  • High-value logs can be preferred over low-value logs by assigning higher rate limits.

13.1.2. Configuring rate limits

Rate limits are configured per collector, which means that the maximum rate of log collection is the number of collector instances multiplied by the rate limit.

Because logs are collected from each node’s file system, a collector is deployed on each cluster node. For example, in a 3-node cluster, with a maximum rate limit of 10 records per second per collector, the maximum rate of log collection is 30 records per second.

Because the exact byte size of a record as written to an output can vary due to transformations, different encodings, or other factors, rate limits are set in number of records instead of bytes.

You can configure rate limits in the ClusterLogForwarder custom resource (CR) in two ways:

Output rate limit
Limit the rate of outbound logs to selected outputs, for example, to match the network or storage capacity of an output. The output rate limit controls the aggregated per-output rate.
Input rate limit
Limit the per-container rate of log collection for selected containers.

13.1.3. Configuring log forwarder output rate limits

You can limit the rate of outbound logs to a specified output by configuring the ClusterLogForwarder custom resource (CR).

Prerequisites

  • You have installed the Red Hat OpenShift Logging Operator.
  • You have administrator permissions.

Procedure

  1. Add a maxRecordsPerSecond limit value to the ClusterLogForwarder CR for a specified output.

    The following example shows how to configure a per collector output rate limit for a Kafka broker output named kafka-example:

    Example ClusterLogForwarder CR

    apiVersion: logging.openshift.io/v1
    kind: ClusterLogForwarder
    metadata:
    # ...
    spec:
    # ...
      outputs:
        - name: kafka-example 1
          type: kafka 2
          limit:
            maxRecordsPerSecond: 1000000 3
    # ...

    1
    The output name.
    2
    The type of output.
    3
    The log output rate limit. This value sets the maximum Quantity of logs that can be sent to the Kafka broker per second. This value is not set by default. The default behavior is best effort, and records are dropped if the log forwarder cannot keep up. If this value is 0, no logs are forwarded.
  2. Apply the ClusterLogForwarder CR:

    Example command

    $ oc apply -f <filename>.yaml

Additional resources

13.1.4. Configuring log forwarder input rate limits

You can limit the rate of incoming logs that are collected by configuring the ClusterLogForwarder custom resource (CR). You can set input limits on a per-container or per-namespace basis.

Prerequisites

  • You have installed the Red Hat OpenShift Logging Operator.
  • You have administrator permissions.

Procedure

  1. Add a maxRecordsPerSecond limit value to the ClusterLogForwarder CR for a specified input.

    The following examples show how to configure input rate limits for different scenarios:

    Example ClusterLogForwarder CR that sets a per-container limit for containers with certain labels

    apiVersion: logging.openshift.io/v1
    kind: ClusterLogForwarder
    metadata:
    # ...
    spec:
    # ...
      inputs:
        - name: <input_name> 1
          application:
            selector:
              matchLabels: { example: label } 2
          containerLimit:
              maxRecordsPerSecond: 0 3
    # ...

    1
    The input name.
    2
    A list of labels. If these labels match labels that are applied to a pod, the per-container limit specified in the maxRecordsPerSecond field is applied to those containers.
    3
    Configures the rate limit. Setting the maxRecordsPerSecond field to 0 means that no logs are collected for the container. Setting the maxRecordsPerSecond field to some other value means that a maximum of that number of records per second are collected for the container.

    Example ClusterLogForwarder CR that sets a per-container limit for containers in selected namespaces

    apiVersion: logging.openshift.io/v1
    kind: ClusterLogForwarder
    metadata:
    # ...
    spec:
    # ...
      inputs:
        - name: <input_name> 1
          application:
            namespaces: [ example-ns-1, example-ns-2 ] 2
          containerLimit:
            maxRecordsPerSecond: 10 3
        - name: <input_name>
          application:
            namespaces: [ test ]
          containerLimit:
              maxRecordsPerSecond: 1000
    # ...

    1
    The input name.
    2
    A list of namespaces. The per-container limit specified in the maxRecordsPerSecond field is applied to all containers in the namespaces listed.
    3
    Configures the rate limit. Setting the maxRecordsPerSecond field to 10 means that a maximum of 10 records per second are collected for each container in the namespaces listed.
  2. Apply the ClusterLogForwarder CR:

    Example command

    $ oc apply -f <filename>.yaml

13.2. Filtering logs by content

Collecting all logs from a cluster might produce a large amount of data, which can be expensive to transport and store.

You can reduce the volume of your log data by filtering out low priority data that does not need to be stored. Logging provides content filters that you can use to reduce the volume of log data.

Note

Content filters are distinct from input selectors. input selectors select or ignore entire log streams based on source metadata. Content filters edit log streams to remove and modify records based on the record content.

Log data volume can be reduced by using one of the following methods:

13.2.1. Configuring content filters to drop unwanted log records

When the drop filter is configured, the log collector evaluates log streams according to the filters before forwarding. The collector drops unwanted log records that match the specified configuration.

Prerequisites

  • You have installed the Red Hat OpenShift Logging Operator.
  • You have administrator permissions.
  • You have created a ClusterLogForwarder custom resource (CR).

Procedure

  1. Add a configuration for a filter to the filters spec in the ClusterLogForwarder CR.

    The following example shows how to configure the ClusterLogForwarder CR to drop log records based on regular expressions:

    Example ClusterLogForwarder CR

    apiVersion: logging.openshift.io/v1
    kind: ClusterLogForwarder
    metadata:
    # ...
    spec:
      filters:
      - name: <filter_name>
        type: drop 1
        drop: 2
          test: 3
          - field: .kubernetes.labels."foo-bar/baz" 4
            matches: .+ 5
          - field: .kubernetes.pod_name
            notMatches: "my-pod" 6
      pipelines:
      - name: <pipeline_name> 7
        filterRefs: ["<filter_name>"]
    # ...

    1
    Specifies the type of filter. The drop filter drops log records that match the filter configuration.
    2
    Specifies configuration options for applying the drop filter.
    3
    Specifies the configuration for tests that are used to evaluate whether a log record is dropped.
    • If all the conditions specified for a test are true, the test passes and the log record is dropped.
    • When multiple tests are specified for the drop filter configuration, if any of the tests pass, the record is dropped.
    • If there is an error evaluating a condition, for example, the field is missing from the log record being evaluated, that condition evaluates to false.
    4
    Specifies a dot-delimited field path, which is a path to a field in the log record. The path can contain alpha-numeric characters and underscores (a-zA-Z0-9_), for example, .kubernetes.namespace_name. If segments contain characters outside of this range, the segment must be in quotes, for example, .kubernetes.labels."foo.bar-bar/baz". You can include multiple field paths in a single test configuration, but they must all evaluate to true for the test to pass and the drop filter to be applied.
    5
    Specifies a regular expression. If log records match this regular expression, they are dropped. You can set either the matches or notMatches condition for a single field path, but not both.
    6
    Specifies a regular expression. If log records do not match this regular expression, they are dropped. You can set either the matches or notMatches condition for a single field path, but not both.
    7
    Specifies the pipeline that the drop filter is applied to.
  2. Apply the ClusterLogForwarder CR by running the following command:

    $ oc apply -f <filename>.yaml

Additional examples

The following additional example shows how you can configure the drop filter to only keep higher priority log records:

apiVersion: logging.openshift.io/v1
kind: ClusterLogForwarder
metadata:
# ...
spec:
  filters:
  - name: important
    type: drop
    drop:
      test:
      - field: .message
        notMatches: "(?i)critical|error"
      - field: .level
        matches: "info|warning"
# ...

In addition to including multiple field paths in a single test configuration, you can also include additional tests that are treated as OR checks. In the following example, records are dropped if either test configuration evaluates to true. However, for the second test configuration, both field specs must be true for it to be evaluated to true:

apiVersion: logging.openshift.io/v1
kind: ClusterLogForwarder
metadata:
# ...
spec:
  filters:
  - name: important
    type: drop
    drop:
      test:
      - field: .kubernetes.namespace_name
        matches: "^open"
      test:
      - field: .log_type
        matches: "application"
      - field: .kubernetes.pod_name
        notMatches: "my-pod"
# ...

13.2.2. Configuring content filters to prune log records

When the prune filter is configured, the log collector evaluates log streams according to the filters before forwarding. The collector prunes log records by removing low value fields such as pod annotations.

Prerequisites

  • You have installed the Red Hat OpenShift Logging Operator.
  • You have administrator permissions.
  • You have created a ClusterLogForwarder custom resource (CR).

Procedure

  1. Add a configuration for a filter to the prune spec in the ClusterLogForwarder CR.

    The following example shows how to configure the ClusterLogForwarder CR to prune log records based on field paths:

    Important

    If both are specified, records are pruned based on the notIn array first, which takes precedence over the in array. After records have been pruned by using the notIn array, they are then pruned by using the in array.

    Example ClusterLogForwarder CR

    apiVersion: logging.openshift.io/v1
    kind: ClusterLogForwarder
    metadata:
    # ...
    spec:
      filters:
      - name: <filter_name>
        type: prune 1
        prune: 2
          in: [.kubernetes.annotations, .kubernetes.namespace_id] 3
          notIn: [.kubernetes,.log_type,.message,."@timestamp"] 4
      pipelines:
      - name: <pipeline_name> 5
        filterRefs: ["<filter_name>"]
    # ...

    1
    Specify the type of filter. The prune filter prunes log records by configured fields.
    2
    Specify configuration options for applying the prune filter. The in and notIn fields are specified as arrays of dot-delimited field paths, which are paths to fields in log records. These paths can contain alpha-numeric characters and underscores (a-zA-Z0-9_), for example, .kubernetes.namespace_name. If segments contain characters outside of this range, the segment must be in quotes, for example, .kubernetes.labels."foo.bar-bar/baz".
    3
    Optional: Any fields that are specified in this array are removed from the log record.
    4
    Optional: Any fields that are not specified in this array are removed from the log record.
    5
    Specify the pipeline that the prune filter is applied to.
  2. Apply the ClusterLogForwarder CR by running the following command:

    $ oc apply -f <filename>.yaml

13.2.3. Additional resources

13.3. Filtering logs by metadata

You can filter logs in the ClusterLogForwarder CR to select or ignore an entire log stream based on the metadata by using the input selector. As an administrator or developer, you can include or exclude the log collection to reduce the memory and CPU load on the collector.

Important

You can use this feature only if the Vector collector is set up in your logging deployment.

Note

input spec filtering is different from content filtering. input selectors select or ignore entire log streams based on the source metadata. Content filters edit the log streams to remove and modify the records based on the record content.

13.3.1. Filtering application logs at input by including or excluding the namespace or container name

You can include or exclude the application logs based on the namespace and container name by using the input selector.

Prerequisites

  • You have installed the Red Hat OpenShift Logging Operator.
  • You have administrator permissions.
  • You have created a ClusterLogForwarder custom resource (CR).

Procedure

  1. Add a configuration to include or exclude the namespace and container names in the ClusterLogForwarder CR.

    The following example shows how to configure the ClusterLogForwarder CR to include or exclude namespaces and container names:

    Example ClusterLogForwarder CR

    apiVersion: "logging.openshift.io/v1"
    kind: ClusterLogForwarder
    # ...
    spec:
      inputs:
        - name: mylogs
          application:
            includes:
              - namespace: "my-project" 1
                container: "my-container" 2
            excludes:
              - container: "other-container*" 3
                namespace: "other-namespace" 4
    # ...

    1
    Specifies that the logs are only collected from these namespaces.
    2
    Specifies that the logs are only collected from these containers.
    3
    Specifies the pattern of namespaces to ignore when collecting the logs.
    4
    Specifies the set of containers to ignore when collecting the logs.
  2. Apply the ClusterLogForwarder CR by running the following command:

    $ oc apply -f <filename>.yaml

The excludes option takes precedence over includes.

13.3.2. Filtering application logs at input ny including the label expressions or a matching label key and values

You can include the application logs based on the label expressions or a matching label key and its values by using the input selector.

Prerequisites

  • You have installed the Red Hat OpenShift Logging Operator.
  • You have administrator permissions.
  • You have created a ClusterLogForwarder custom resource (CR).

Procedure

  1. Add a configuration for a filter to the input spec in the ClusterLogForwarder CR.

    The following example shows how to configure the ClusterLogForwarder CR to include logs based on label expressions or matched label key/values:

    Example ClusterLogForwarder CR

    apiVersion: "logging.openshift.io/v1"
    kind: ClusterLogForwarder
    # ...
    spec:
      inputs:
        - name: mylogs
          application:
            selector:
              matchExpressions:
              - key: env 1
                operator: In 2
                values: [“prod”, “qa”] 3
              - key: zone
                operator: NotIn
                values: [“east”, “west”]
              matchLabels: 4
                app: one
                name: app1
    # ...

    1
    Specifies the label key to match.
    2
    Specifies the operator. Valid values include: In, NotIn, Exists, and DoesNotExist.
    3
    Specifies an array of string values. If the operator value is either Exists or DoesNotExist, the value array must be empty.
    4
    Specifies an exact key or value mapping.
  2. Apply the ClusterLogForwarder CR by running the following command:

    $ oc apply -f <filename>.yaml

13.3.3. Filtering the audit and infrastructure log inputs by source

You can define the list of audit and infrastructure sources to collect the logs by using the input selector.

Prerequisites

  • You have installed the Red Hat OpenShift Logging Operator.
  • You have administrator permissions.
  • You have created a ClusterLogForwarder custom resource (CR).

Procedure

  1. Add a configuration to define the audit and infrastructure sources in the ClusterLogForwarder CR.

    The following example shows how to configure the ClusterLogForwarder CR to define aduit and infrastructure sources:

    Example ClusterLogForwarder CR

    apiVersion: "logging.openshift.io/v1"
    kind: ClusterLogForwarder
    # ...
    spec:
      inputs:
        - name: mylogs1
          infrastructure:
            sources: 1
              - node
        - name: mylogs2
          audit:
            sources: 2
              - kubeAPI
              - openshiftAPI
              - ovn
    # ...

    1
    Specifies the list of infrastructure sources to collect. The valid sources include:
    • node: Journal log from the node
    • container: Logs from the workloads deployed in the namespaces
    2
    Specifies the list of audit sources to collect. The valid sources include:
    • kubeAPI: Logs from the Kubernetes API servers
    • openshiftAPI: Logs from the OpenShift API servers
    • auditd: Logs from a node auditd service
    • ovn: Logs from an open virtual network service
  2. Apply the ClusterLogForwarder CR by running the following command:

    $ oc apply -f <filename>.yaml