Streams for Apache Kafka minimum sizing guide for an OpenShift development environment

Environment

  • Streams for Apache Kafka (Streams)

Issue

  • What is a good starting point for right-sizing my environments?
  • How can I set CPU, memory, and storage resources to get a stable development environment?

Resolution

Proper resource sizing is crucial for maintaining stability and optimizing performance in Apache Kafka. Since resource allocation depends on specific requirements and use cases, no default CPU or memory values are set in Streams for Apache Kafka. When configuring resources and storage, consider factors such as message throughput and size, number of topics and partitions, replication factor, number of producers and consumer groups, network threads, and data retention settings.
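
As an illustration, the retention settings that most directly drive storage sizing live in the broker configuration. The following fragment is only a sketch: the keys are standard Kafka broker options, and the values shown are the broker defaults, not recommendations.

apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
  name: my-cluster
spec:
  kafka:
    config:
      log.retention.hours: 168        # time-based retention: keep data for 7 days
      log.retention.bytes: -1         # size-based retention: -1 disables the per-partition cap
      log.segment.bytes: 1073741824   # 1 GiB segments; only closed segments become eligible for deletion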

For stateful components like Kafka brokers, we recommend setting a CPU request without a limit to leverage excess CPU capacity, and setting memory requests equal to limits to reserve the necessary memory upfront. In OpenShift, resource quotas (ResourceQuota) require that each container explicitly specifies resource limits, while limit ranges (LimitRange) can apply default values if resources are not set. For large Kafka clusters, we also suggest increasing the terminationGracePeriodSeconds value (default: 30 seconds) to allow brokers enough time for a clean shutdown.
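
As a rough sketch of that pattern for a broker (the CPU and memory values here are placeholders, not recommendations), the specification might look like this:

apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
  name: my-cluster
spec:
  kafka:
    resources:
      requests:
        cpu: "4"        # CPU request only; with no limit set, brokers can burst into spare capacity
        memory: 64Gi    # memory request...
      limits:
        memory: 64Gi    # ...set equal to the limit, so the memory is reserved upfront
    template:
      pod:
        terminationGracePeriodSeconds: 120   # more headroom than the 30s default for a clean shutdown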

Example configuration

The following examples are NOT suitable for production or shared test environments, but they provide a reasonable starting point for sizing a development environment.

LTS release (ZooKeeper mode)

apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
  name: my-cluster
spec:
  kafka:
    replicas: 3
    config:
      num.partitions: 3
      default.replication.factor: 3
      min.insync.replicas: 2
      offsets.topic.replication.factor: 3
      transaction.state.log.replication.factor: 3
      transaction.state.log.min.isr: 2
    listeners:
      - name: plain
        port: 9092
        type: internal
        tls: false
      - name: tls
        port: 9093
        type: internal
        tls: true
    resources:
      limits:
        cpu: 1000m
        memory: 2Gi
      requests:
        cpu: 1000m
        memory: 2Gi
    storage:
      size: 100Gi
      type: persistent-claim
      deleteClaim: false
    template:
      pod:
        terminationGracePeriodSeconds: 60  # allow extra time for a clean broker shutdown (default: 30)
  zookeeper:
    replicas: 3
    resources:
      limits:
        cpu: 500m
        memory: 1Gi
      requests:
        cpu: 500m
        memory: 1Gi
    storage:
      size: 10Gi
      type: persistent-claim
      deleteClaim: false
  entityOperator:
    topicOperator:
      resources:
        limits:
          cpu: 500m
          memory: 512Mi
        requests:
          cpu: 500m
          memory: 256Mi
    userOperator:
      resources:
        limits:
          cpu: 500m
          memory: 512Mi
        requests:
          cpu: 500m
          memory: 256Mi

Latest release (KRaft mode)

apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaNodePool
metadata:
  name: controller
  labels:
    strimzi.io/cluster: my-cluster
spec:
  replicas: 3
  roles:
    - controller
  resources:
    limits:
      cpu: 500m
      memory: 1Gi
    requests:
      cpu: 500m
      memory: 1Gi
  storage:
    size: 10Gi
    type: persistent-claim
    deleteClaim: false
---
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaNodePool
metadata:
  name: broker
  labels:
    strimzi.io/cluster: my-cluster
spec:
  replicas: 3
  roles:
    - broker
  resources:
    limits:
      cpu: 1000m
      memory: 2Gi
    requests:
      cpu: 1000m
      memory: 2Gi
  storage:
    size: 100Gi
    type: persistent-claim
    deleteClaim: false
---
apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
  name: my-cluster
  annotations:
    strimzi.io/node-pools: enabled
    strimzi.io/kraft: enabled
spec:
  kafka:
    config:
      num.partitions: 3
      default.replication.factor: 3
      min.insync.replicas: 2
      offsets.topic.replication.factor: 3
      transaction.state.log.replication.factor: 3
      transaction.state.log.min.isr: 2
    listeners:
      - name: plain
        port: 9092
        type: internal
        tls: false
      - name: tls
        port: 9093
        type: internal
        tls: true
  entityOperator:
    topicOperator:
      resources:
        limits:
          cpu: 500m
          memory: 512Mi
        requests:
          cpu: 500m
          memory: 256Mi
    userOperator:
      resources:
        limits:
          cpu: 500m
          memory: 512Mi
        requests:
          cpu: 500m
          memory: 256Mi
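
To try either example, save the manifests to a file and apply them to the namespace the Cluster Operator is watching, then wait for the cluster to report readiness. The file name and namespace below are placeholders:

oc apply -f kafka.yaml -n myproject
oc wait kafka/my-cluster --for=condition=Ready --timeout=300s -n myproject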

Diagnostic Steps

Insufficient resources can cause pods to enter a CrashLoopBackOff state or degrade broker performance.
Check pod status, monitor resource usage, and review container and broker logs for signs of resource pressure.
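
For example, the following oc commands cover those checks. The pod names and namespace are illustrative (Strimzi derives pod names from the cluster and node pool names), so adjust them to your cluster:

# Pod status and recent events
oc get pods -n myproject
oc describe pod my-cluster-broker-0 -n myproject

# CPU and memory consumption (requires cluster metrics to be available)
oc adm top pods -n myproject

# Broker logs; the Kafka container inside a broker pod is named "kafka"
oc logs my-cluster-broker-0 -c kafka -n myproject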

