Streams for Apache Kafka minimum sizing guide for an OpenShift development environment

Environment

  • Streams for Apache Kafka (Streams)

Issue

  • What is a good starting point for right-sizing my environments?
  • How can I set CPU, memory, and storage resources to keep a development environment stable?

Resolution

Proper resource sizing is crucial for maintaining stability and optimizing performance in Apache Kafka. Since resource allocation depends on specific requirements and use cases, no default CPU or memory values are set in Streams for Apache Kafka. When configuring resources and storage, consider factors such as message throughput and size, number of topics and partitions, replication factor, number of producers and consumer groups, network threads, and data retention settings.

For stateful components like Kafka brokers, we recommend setting a CPU request without a limit to leverage excess CPU capacity, and setting memory requests equal to limits to reserve the necessary memory upfront. In OpenShift, resource quotas (ResourceQuota) require that each container explicitly specifies resource limits, while limit ranges (LimitRange) can apply default values if resources are not set. For large Kafka clusters, we also suggest increasing the terminationGracePeriodSeconds value (default: 30 seconds) to allow brokers enough time for a clean shutdown.
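
For example, a LimitRange can supply default memory values for containers that omit them. The following is a minimal sketch; the name, namespace, and values are illustrative assumptions:

apiVersion: v1
kind: LimitRange
metadata:
  name: default-memory-limits    # illustrative name
  namespace: my-kafka-namespace  # illustrative namespace
spec:
  limits:
    - type: Container
      default:            # applied as the memory limit when a container sets none
        memory: 1Gi
      defaultRequest:     # applied as the memory request when a container sets none
        memory: 512Mi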

Example configuration

The following examples are NOT suitable for production or shared test environments, but they serve as a helpful starting point for development sizing.

LTS release

The LTS release example configures broker replicas, resources, and storage directly in the Kafka resource and uses ZooKeeper for cluster metadata.

apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
  name: my-cluster
spec:
  kafka:
    replicas: 3
    config:
      # Defaults for automatically created topics
      num.partitions: 3
      default.replication.factor: 3
      # With 3 replicas, min.insync.replicas of 2 tolerates one broker outage for acks=all producers
      min.insync.replicas: 2
      offsets.topic.replication.factor: 3
      transaction.state.log.replication.factor: 3
      transaction.state.log.min.isr: 2
    listeners:
      - name: plain
        port: 9092
        type: internal
        tls: false
      - name: tls
        port: 9093
        type: internal
        tls: true
    resources:
      limits:
        memory: 2Gi
      requests:
        cpu: 1000m
        memory: 2Gi
    storage:
      size: 10Gi
      type: persistent-claim
      deleteClaim: false
    template:
      pod:
        # Extra time for a clean broker shutdown (default: 30)
        terminationGracePeriodSeconds: 60
  zookeeper:
    replicas: 3
    resources:
      limits:
        memory: 1Gi
      requests:
        cpu: 500m
        memory: 1Gi
    storage:
      size: 5Gi
      type: persistent-claim
      deleteClaim: false
  entityOperator:
    topicOperator:
      resources:
        limits:
          memory: 512Mi
        requests:
          cpu: 500m
          memory: 256Mi
    userOperator:
      resources:
        limits:
          memory: 512Mi
        requests:
          cpu: 500m
          memory: 256Mi

Latest release

In the latest release, broker replicas, resources, and storage move into a KafkaNodePool resource; the Kafka resource opts in through the strimzi.io/node-pools: enabled annotation, while ZooKeeper continues to manage cluster metadata.

apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaNodePool
metadata:
  name: kafka
  labels:
    strimzi.io/cluster: my-cluster
spec:
  replicas: 3
  roles:
    - broker
  resources:
    limits:
      memory: 2Gi
    requests:
      cpu: 1000m
      memory: 2Gi
  storage:
    size: 10Gi
    type: persistent-claim
    deleteClaim: false
---
apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
  name: my-cluster
  annotations:
    strimzi.io/node-pools: enabled
spec:
  kafka:
    config:
      num.partitions: 3
      default.replication.factor: 3
      min.insync.replicas: 2
      offsets.topic.replication.factor: 3
      transaction.state.log.replication.factor: 3
      transaction.state.log.min.isr: 2
    listeners:
      - name: plain
        port: 9092
        type: internal
        tls: false
      - name: tls
        port: 9093
        type: internal
        tls: true
    template:
      pod:
        terminationGracePeriodSeconds: 60
  zookeeper:
    replicas: 3
    resources:
      limits:
        memory: 1Gi
      requests:
        cpu: 500m
        memory: 1Gi
    storage:
      size: 5Gi
      type: persistent-claim
      deleteClaim: false
  entityOperator:
    topicOperator:
      resources:
        limits:
          cpu: 500m
          memory: 512Mi
        requests:
          cpu: 500m
          memory: 256Mi
    userOperator:
      resources:
        limits:
          cpu: 500m
          memory: 512Mi
        requests:
          cpu: 500m
          memory: 256Mi
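
Save the example that matches your release to a file and apply it with the OpenShift CLI; the file name and namespace below are placeholders:

oc apply -f kafka-dev.yaml -n my-kafka-namespace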

Diagnostic Steps

Insufficient resources can cause pods to enter a CrashLoopBackOff state or lead to broker performance degradation.
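
To confirm whether resource pressure is the cause, check pod status, recent events, and actual consumption with the OpenShift CLI. The namespace and pod names below are placeholders:

# List pods and look for restarts or a CrashLoopBackOff status
oc get pods -n my-kafka-namespace

# Check events, OOMKilled terminations, and restart reasons for a broker pod
oc describe pod my-cluster-kafka-0 -n my-kafka-namespace

# Compare actual CPU and memory usage against requests and limits
oc adm top pods -n my-kafka-namespace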
