Chapter 9. Cruise Control for cluster rebalancing

Important

Cruise Control for cluster rebalancing is a Technology Preview only. Technology Preview features are not supported with Red Hat production service-level agreements (SLAs) and might not be functionally complete. Red Hat does not recommend implementing any Technology Preview features in production environments. This Technology Preview feature provides early access to upcoming product innovations, enabling you to test functionality and provide feedback during the development process. For more information about the support scope of Red Hat Technology Preview features, see Technology Preview Features Support Scope.

You can deploy Cruise Control to your AMQ Streams cluster and use it to rebalance the Kafka cluster.

Cruise Control is an open source system for automating Kafka operations, such as monitoring cluster workload, rebalancing a cluster based on predefined constraints, and detecting and fixing anomalies. It consists of four main components—​the Load Monitor, the Analyzer, the Anomaly Detector, and the Executor—​and a REST API for client interactions. AMQ Streams utilizes the REST API to support the following Cruise Control features:

  • Generating optimization proposals from multiple optimization goals.
  • Rebalancing a Kafka cluster based on an optimization proposal.

Other Cruise Control features are not currently supported, including anomaly detection, notifications, write-your-own goals, and changing the topic replication factor.

Example YAML files for Cruise Control are provided in examples/cruise-control/.

9.1. Why use Cruise Control?

Cruise Control reduces the time and effort involved in running an efficient and balanced Kafka cluster.

A typical cluster can become unevenly loaded over time. Partitions that handle large amounts of message traffic might be unevenly distributed across the available brokers. To rebalance the cluster, administrators must monitor the load on brokers and manually reassign busy partitions to brokers with spare capacity.

Cruise Control automates the cluster rebalancing process. It constructs a workload model of resource utilization for the cluster—​based on CPU, disk, and network load—​and generates optimization proposals (that you can approve or reject) for more balanced partition assignments. A set of configurable optimization goals is used to calculate these proposals.

When you approve an optimization proposal, Cruise Control applies it to your Kafka cluster. When the cluster rebalancing operation is complete, the broker pods are used more effectively and the Kafka cluster is more evenly balanced.

Additional resources

9.2. Optimization goals overview

To rebalance a Kafka cluster, Cruise Control uses optimization goals to generate optimization proposals, which you can approve or reject.

Optimization goals are constraints on workload redistribution and resource utilization across a Kafka cluster. With a few exceptions, AMQ Streams supports all the optimization goals developed in the Cruise Control project. The supported goals, in the default descending order of priority, are as follows:

  1. Rack-awareness
  2. Replica capacity
  3. Capacity: Disk capacity, Network inbound capacity, Network outbound capacity
  4. Replica distribution
  5. Potential network output
  6. Resource distribution: Disk utilization distribution, Network inbound utilization distribution, Network outbound utilization distribution
  7. Leader bytes-in rate distribution
  8. Topic replica distribution
  9. Leader replica distribution
  10. Preferred leader election

For more information on each optimization goal, see Goals in the Cruise Control Wiki.

Note

CPU goals, intra-broker disk goals, "Write your own" goals, and Kafka assigner goals are not yet supported.

Goals configuration in AMQ Streams custom resources

You configure optimization goals in Kafka and KafkaRebalance custom resources. Cruise Control has configurations for hard optimization goals that must be satisfied, as well as master, default, and user-provided optimization goals. Optimization goals are subject to any capacity limits on broker resources.

The following sections describe each goal configuration in more detail.

Hard goals and soft goals

Hard goals are goals that must be satisfied in optimization proposals. Goals that are not configured as hard goals are known as soft goals. You can think of soft goals as best effort goals: they do not need to be satisfied in optimization proposals, but are included in optimization calculations. An optimization proposal that violates one or more soft goals, but satisfies all hard goals, is valid.

Cruise Control will calculate optimization proposals that satisfy all the hard goals and as many soft goals as possible (in their priority order). An optimization proposal that does not satisfy all the hard goals is rejected by Cruise Control and not sent to the user for approval.

Note

For example, you might have a soft goal to distribute a topic’s replicas evenly across the cluster (the topic replica distribution goal). Cruise Control will ignore this goal if doing so enables all the configured hard goals to be met.

In Cruise Control, the following master optimization goals are preset as hard goals:

RackAwareGoal; ReplicaCapacityGoal; DiskCapacityGoal; NetworkInboundCapacityGoal; NetworkOutboundCapacityGoal

You configure hard goals in the Cruise Control deployment configuration, by editing the hard.goals property in Kafka.spec.cruiseControl.config.

  • To inherit the preset hard goals from Cruise Control, do not specify the hard.goals property in Kafka.spec.cruiseControl.config
  • To change the preset hard goals, specify the desired goals in the hard.goals property, using their fully-qualified domain names.

Example Kafka configuration for hard optimization goals

apiVersion: kafka.strimzi.io/v1beta1
kind: Kafka
metadata:
  name: my-cluster
spec:
  kafka:
    # ...
  zookeeper:
    # ...
  entityOperator:
    topicOperator: {}
    userOperator: {}
  cruiseControl:
    brokerCapacity:
      inboundNetwork: 10000KB/s
      outboundNetwork: 10000KB/s
    config:
      hard.goals: >
        com.linkedin.kafka.cruisecontrol.analyzer.goals.NetworkInboundCapacityGoal,
        com.linkedin.kafka.cruisecontrol.analyzer.goals.NetworkOutboundCapacityGoal
      # ...

Increasing the number of configured hard goals will reduce the likelihood of Cruise Control generating valid optimization proposals.

If skipHardGoalCheck: true is specified in the KafkaRebalance custom resource, Cruise Control does not check that the list of user-provided optimization goals (in KafkaRebalance.spec.goals) contains all the configured hard goals (hard.goals). Therefore, if some, but not all, of the user-provided optimization goals are in the hard.goals list, Cruise Control will still treat them as hard goals even if skipHardGoalCheck: true is specified.

Master optimization goals

The master optimization goals are available to all users. Goals that are not listed in the master optimization goals are not available for use in Cruise Control operations.

Unless you change the Cruise Control deployment configuration, AMQ Streams will inherit the following master optimization goals from Cruise Control, in descending priority order:

RackAwareGoal; ReplicaCapacityGoal; DiskCapacityGoal; NetworkInboundCapacityGoal; NetworkOutboundCapacityGoal; ReplicaDistributionGoal; PotentialNwOutGoal; DiskUsageDistributionGoal; NetworkInboundUsageDistributionGoal; NetworkOutboundUsageDistributionGoal; TopicReplicaDistributionGoal; LeaderReplicaDistributionGoal; LeaderBytesInDistributionGoal; PreferredLeaderElectionGoal

Five of these goals are preset as hard goals.

To reduce complexity, we recommend that you use the inherited master optimization goals, unless you need to completely exclude one or more goals from use in KafkaRebalance resources. The priority order of the master optimization goals can be modified, if desired, in the configuration for default optimization goals.

You configure master optimization goals, if necessary, in the Cruise Control deployment configuration: Kafka.spec.cruiseControl.config.goals

  • To accept the inherited master optimization goals, do not specify the goals property in Kafka.spec.cruiseControl.config.
  • If you need to modify the inherited master optimization goals, specify a list of goals, in descending priority order, in the goals configuration option.
Note

If you change the inherited master optimization goals, you must ensure that the hard goals, if configured in the hard.goals property in Kafka.spec.cruiseControl.config, are a subset of the master optimization goals that you configured. Otherwise, errors will occur when generating optimization proposals.

Default optimization goals

Cruise Control uses the default optimization goals to generate the cached optimization proposal. For more information about the cached optimization proposal, see Section 9.3, “Optimization proposals overview”.

You can override the default optimization goals by setting user-provided optimization goals in a KafkaRebalance custom resource.

Unless you specify default.goals in the Cruise Control deployment configuration, the master optimization goals are used as the default optimization goals. In this case, the cached optimization proposal is generated using the master optimization goals.

  • To use the master optimization goals as the default goals, do not specify the default.goals property in Kafka.spec.cruiseControl.config.
  • To modify the default optimization goals, edit the default.goals property in Kafka.spec.cruiseControl.config. You must use a subset of the master optimization goals.

Example Kafka configuration for default optimization goals

apiVersion: kafka.strimzi.io/v1beta1
kind: Kafka
metadata:
  name: my-cluster
spec:
  kafka:
    # ...
  zookeeper:
    # ...
  entityOperator:
    topicOperator: {}
    userOperator: {}
  cruiseControl:
    brokerCapacity:
      inboundNetwork: 10000KB/s
      outboundNetwork: 10000KB/s
    config:
      default.goals: >
         com.linkedin.kafka.cruisecontrol.analyzer.goals.RackAwareGoal,
         com.linkedin.kafka.cruisecontrol.analyzer.goals.ReplicaCapacityGoal,
         com.linkedin.kafka.cruisecontrol.analyzer.goals.DiskCapacityGoal
      # ...

If no default optimization goals are specified, the cached proposal is generated using the master optimization goals.

User-provided optimization goals

User-provided optimization goals narrow down the configured default goals for a particular optimization proposal. You can set them, as required, in spec.goals in the KafkaRebalance custom resource:

KafkaRebalance.spec.goals

User-provided optimization goals can generate optimization proposals for different scenarios. For example, you might want to optimize leader replica distribution across the Kafka cluster without considering disk capacity or disk utilization. So, you create a KafkaRebalance custom resource containing a single user-provided goal for leader replica distribution.

User-provided optimization goals must:

  • Include all configured hard goals, or an error occurs
  • Be a subset of the master optimization goals

To ignore the configured hard goals in an optimization proposal, add the skipHardGoalCheck: true option to the KafkaRebalance custom resource.

Additional resources

9.3. Optimization proposals overview

An optimization proposal is a summary of proposed changes that would produce a more balanced Kafka cluster, with partition workloads distributed more evenly among the brokers. Each optimization proposal is based on the set of optimization goals that was used to generate it, subject to any configured capacity limits on broker resources.

An optimization proposal is contained in the Status.Optimization Result property of a KafkaRebalance custom resource. The information provided is a summary of the full optimization proposal. Use the summary to decide whether to:

  • Approve the optimization proposal. This instructs Cruise Control to apply the proposal to the Kafka cluster and start a cluster rebalance operation.
  • Reject the optimization proposal. You can change the optimization goals and then generate another proposal.

All optimization proposals are dry runs: you cannot approve a cluster rebalance without first generating an optimization proposal. There is no limit to the number of optimization proposals that can be generated.

Cached optimization proposal

Cruise Control maintains a cached optimization proposal based on the configured default optimization goals. Generated from the workload model, the cached optimization proposal is updated every 15 minutes to reflect the current state of the Kafka cluster. If you generate an optimization proposal using the default optimization goals, Cruise Control returns the most recent cached proposal.

To change the cached optimization proposal refresh interval, edit the proposal.expiration.ms setting in the Cruise Control deployment configuration. Consider a shorter interval for fast changing clusters, although this increases the load on the Cruise Control server.

Contents of optimization proposals

The following table explains the properties contained in an optimization proposal:

JSON propertyDescription

numIntraBrokerReplicaMovements

The total number of partition replicas that will be transferred between the disks of the cluster’s brokers.

Performance impact during rebalance operation: Relatively high, but lower than numReplicaMovements.

excludedBrokersForLeadership

Not yet supported. An empty list is returned.

numReplicaMovements

The number of partition replicas that will be moved between separate brokers.

Performance impact during rebalance operation: Relatively high.

onDemandBalancednessScoreBefore, onDemandBalancednessScoreAfter

A measurement of the overall balancedness of a Kafka Cluster, before and after the optimization proposal was generated.

The score is calculated by subtracting the sum of the BalancednessScore of each violated soft goal from 100. Cruise Control assigns a BalancednessScore to every optimization goal based on several factors, including priority—​the goal’s position in the list of default.goals or user-provided goals.

The Before score is based on the current configuration of the Kafka cluster. The After score is based on the generated optimization proposal.

intraBrokerDataToMoveMB

The sum of the size of each partition replica that will be moved between disks on the same broker (see also numIntraBrokerReplicaMovements).

Performance impact during rebalance operation: Variable. The larger the number, the longer the cluster rebalance will take to complete. Moving a large amount of data between disks on the same broker has less impact than between separate brokers (see dataToMoveMB).

recentWindows

The number of metrics windows upon which the optimization proposal is based.

dataToMoveMB

The sum of the size of each partition replica that will be moved to a separate broker (see also numReplicaMovements).

Performance impact during rebalance operation: Variable. The larger the number, the longer the cluster rebalance will take to complete.

monitoredPartitionsPercentage

The percentage of partitions in the Kafka cluster covered by the optimization proposal. Affected by the number of excludedTopics.

excludedTopics

Not yet supported. An empty list is returned.

numLeaderMovements

The number of partitions whose leaders will be switched to different replicas. This involves a change to ZooKeeper configuration.

Performance impact during rebalance operation: Relatively low.

excludedBrokersForReplicaMove

Not yet supported. An empty list is returned.

9.4. Deploying Cruise Control

To deploy Cruise Control to your AMQ Streams cluster, define the configuration using the cruiseControl property in the Kafka resource, and then create or update the resource.

Deploy one instance of Cruise Control per Kafka cluster.

Prerequisites

  • An OpenShift cluster
  • A running Cluster Operator

Procedure

  1. Edit the Kafka resource and add the cruiseControl property.

    The properties you can configure are shown in this example configuration:

    apiVersion: kafka.strimzi.io/v1beta1
    kind: Kafka
    metadata:
      name: my-cluster
    spec:
      # ...
      cruiseControl:
        brokerCapacity: 1
          inboundNetwork: 10000KB/s
          outboundNetwork: 10000KB/s
          # ...
        config: 2
          default.goals: >
             com.linkedin.kafka.cruisecontrol.analyzer.goals.RackAwareGoal,
             com.linkedin.kafka.cruisecontrol.analyzer.goals.ReplicaCapacityGoal
             # ...
          cpu.balance.threshold: 1.1
          metadata.max.age.ms: 300000
          send.buffer.bytes: 131072
          # ...
        resources: 3
          requests:
            cpu: 200m
            memory: 64Mi
          limits:
            cpu: 500m
            memory: 128Mi
        logging: 4
            type: inline
            loggers:
              cruisecontrol.root.logger: "INFO"
        template: 5
          pod:
            metadata:
              labels:
                label1: value1
            securityContext:
              runAsUser: 1000001
              fsGroup: 0
            terminationGracePeriodSeconds: 120
        readinessProbe: 6
          initialDelaySeconds: 15
          timeoutSeconds: 5
        livenessProbe: 7
          initialDelaySeconds: 15
          timeoutSeconds: 5
    # ...
    1
    Specifies capacity limits for broker resources. For more information, see Capacity configuration.
    2
    Defines the Cruise Control configuration, including the default optimization goals (in default.goals) and any customizations to the master optimization goals (in goals) or the hard goals (in hard.goals). You can provide any standard Cruise Control configuration option apart from those managed directly by AMQ Streams. For more information on configuring optimization goals, see Section 9.2, “Optimization goals overview”.
    3
    CPU and memory resources reserved for Cruise Control. For more information, see Section 3.1.12, “CPU and memory resources”.
    4
    Defined loggers and log levels added directly (inline) or indirectly (external) through a ConfigMap. A custom ConfigMap must be placed under the log4j.properties key. Cruise Control has a single logger named cruisecontrol.root.logger. You can set the log level to INFO, ERROR, WARN, TRACE, DEBUG, FATAL or OFF. For more information, see Logging configuration.
    5
    6
    7
  2. Create or update the resource:

    oc apply -f kafka.yaml
  3. Verify that Cruise Control was successfully deployed:

    oc get deployments -l app.kubernetes.io/name=strimzi

Auto-created topics

The following table shows the three topics that are automatically created when Cruise Control is deployed. These topics are required for Cruise Control to work properly and must not be deleted or changed.

Auto-created topicCreated byFunction

strimzi.cruisecontrol.metrics

AMQ Streams Metrics Reporter

Stores the raw metrics from the Metrics Reporter in each Kafka broker.

strimzi.cruisecontrol.partitionmetricsamples

Cruise Control

Stores the derived metrics for each partition. These are created by the Metric Sample Aggregator.

strimzi.cruisecontrol.modeltrainingsamples

Cruise Control

Stores the metrics samples used to create the Cluster Workload Model.

To prevent the removal of records that are needed by Cruise Control, log compaction is disabled in the auto-created topics.

What to do next

After configuring and deploying Cruise Control, you can generate optimization proposals.

9.5. Cruise Control configuration

The config property in Kafka.spec.cruiseControl contains configuration options as keys with values as one of the following JSON types:

  • String
  • Number
  • Boolean
Note

Strings that look like JSON or YAML will need to be explicitly quoted.

You can specify and configure all the options listed in the "Configurations" section of the Cruise Control documentation, apart from those managed directly by AMQ Streams. Specifically, you cannot modify configuration options with keys equal to or starting with one of the following strings:

  • bootstrap.servers
  • zookeeper.
  • ssl.
  • security.
  • failed.brokers.zk.path
  • webserver.http.port
  • webserver.http.address
  • webserver.api.urlprefix
  • metric.reporter.sampler.bootstrap.servers
  • metric.reporter.topic
  • metric.reporter.topic.pattern
  • partition.metric.sample.store.topic
  • broker.metric.sample.store.topic
  • capacity.config.file
  • skip.sample.store.topic.rack.awareness.check
  • cruise.control.metrics.topic
  • sasl.

If restricted options are specified, they are ignored and a warning message is printed to the Cluster Operator log file. All the supported options are passed to Cruise Control.

An example Cruise Control configuration

apiVersion: kafka.strimzi.io/v1beta1
kind: Kafka
metadata:
  name: my-cluster
spec:
  # ...
  cruiseControl:
    # ...
    config:
      default.goals: >
         com.linkedin.kafka.cruisecontrol.analyzer.goals.RackAwareGoal,
         com.linkedin.kafka.cruisecontrol.analyzer.goals.ReplicaCapacityGoal
      cpu.balance.threshold: 1.1
      metadata.max.age.ms: 300000
      send.buffer.bytes: 131072
    # ...

Capacity configuration

Cruise Control uses capacity limits to determine if certain resource-based optimization goals are being broken.

You specify capacity limits for Kafka broker resources in the brokerCapacity property in Kafka.spec.cruiseControl . Capacity limits can be set for the following broker resources in the described units:

  • disk - Disk storage in bytes
  • cpuUtilization - CPU utilization as a percent (0-100)
  • inboundNetwork - Inbound network throughput in bytes per second
  • outboundNetwork - Outbound network throughput in bytes per second

Because AMQ Streams Kafka brokers are homogeneous, Cruise Control applies the same capacity limits to every broker it is monitoring.

An example Cruise Control brokerCapacity configuration

apiVersion: kafka.strimzi.io/v1beta1
kind: Kafka
metadata:
  name: my-cluster
spec:
  # ...
  cruiseControl:
    # ...
    brokerCapacity:
      disk: 100G
      cpuUtilization: 100
      inboundNetwork: 10000KB/s
      outboundNetwork: 10000KB/s
    # ...

Additional resources

For more information, refer to the Section B.67, “BrokerCapacity schema reference”.

Logging configuration

Cruise Control has its own configurable logger:

  • cruisecontrol.root.logger

Cruise Control uses the Apache log4j logger implementation.

Use the logging property to configure loggers and logger levels.

You can set the log levels by specifying the logger and level directly (inline) or use a custom (external) ConfigMap. If a ConfigMap is used, you set logging.name property to the name of the ConfigMap containing the external logging configuration. Inside the ConfigMap, the logging configuration is described using log4j.properties.

Here we see examples of inline and external logging.

Inline logging

apiVersion: kafka.strimzi.io/v1beta1
kind: Kafka
# ...
spec:
  cruiseControl:
    # ...
    logging:
      type: inline
      loggers:
        cruisecontrol.root.logger: "INFO"
    # ...

External logging

apiVersion: kafka.strimzi.io/v1beta1
kind: Kafka
# ...
spec:
  cruiseControl:
    # ...
    logging:
      type: external
      name: customConfigMap
    # ...

9.6. Generating optimization proposals

When you create or update a KafkaRebalance resource, Cruise Control generates an optimization proposal for the Kafka cluster based on the configured optimization goals.

Analyze the summary information in the optimization proposal and decide whether to approve it.

Prerequisites

Procedure

  1. Create a KafkaRebalance resource:

    1. To use the default optimization goals defined in the Kafka resource, leave the spec property empty:

      apiVersion: kafka.strimzi.io/v1alpha1
      kind: KafkaRebalance
      metadata:
        name: my-rebalance
        labels:
          strimzi.io/cluster: my-cluster
      spec: {}
    2. To configure user-provided optimization goals instead of using the default goals, add the goals property and enter one or more goals.

      In the following example, rack awareness and replica capacity are configured as user-provided optimization goals:

      apiVersion: kafka.strimzi.io/v1alpha1
      kind: KafkaRebalance
      metadata:
        name: my-rebalance
        labels:
          strimzi.io/cluster: my-cluster
      spec:
        goals:
          - RackAwareGoal
          - ReplicaCapacityGoal
  2. Create or update the resource:

    oc apply -f your-file

    The Cluster Operator requests the optimization proposal from Cruise Control. This might take a few minutes depending on the size of the Kafka cluster.

  3. Check the status of the KafkaRebalance resource:

    oc describe kafkarebalance rebalance-cr-name

    Cruise Control returns one of two statuses:

    • PendingProposal: The rebalance operator is polling the Cruise Control API to check if the optimization proposal is ready.
    • ProposalReady: The optimization proposal is ready for review and, if desired, approval. The optimization proposal is contained in the Status.Optimization Result property of the KafkaRebalance resource.
  4. Review the optimization proposal.

    oc describe kafkarebalance rebalance-cr-name

    Here is an example proposal:

    Status:
      Conditions:
        Last Transition Time:  2020-05-19T13:50:12.533Z
        Status:                ProposalReady
        Type:                  State
      Observed Generation:     1
      Optimization Result:
        Data To Move MB:  0
        Excluded Brokers For Leadership:
        Excluded Brokers For Replica Move:
        Excluded Topics:
        Intra Broker Data To Move MB:         0
        Monitored Partitions Percentage:      100
        Num Intra Broker Replica Movements:   0
        Num Leader Movements:                 0
        Num Replica Movements:                26
        On Demand Balancedness Score After:   81.8666802863978
        On Demand Balancedness Score Before:  78.01176356230222
        Recent Windows:                       1
      Session Id:                             05539377-ca7b-45ef-b359-e13564f1458c

    The properties in the Optimization Result section describe the pending cluster rebalance operation. For descriptions of each property, see Contents of optimization proposals.

9.7. Approving an optimization proposal

You can approve an optimization proposal generated by Cruise Control, if its status is ProposalReady. Cruise Control will then apply the optimization proposal to the Kafka cluster, reassigning partitions to brokers and changing partition leadership.

Caution

This is not a dry run. Before you approve an optimization proposal, you must:

Prerequisites

Procedure

Perform these steps for the optimization proposal that you want to approve:

  1. Unless the optimization proposal is newly generated, check that it is based on current information about the state of the Kafka cluster. To do so, refresh the optimization proposal to make sure it uses the latest cluster metrics:

    1. Annotate the KafkaRebalance resource in OpenShift with refresh:

      oc annotate kafkarebalance rebalance-cr-name strimzi.io/rebalance=refresh
    2. Check the status of the KafkaRebalance resource:

      oc describe kafkarebalance rebalance-cr-name
    3. Wait until the status changes to ProposalReady.
  2. Approve the optimization proposal that you want Cruise Control to apply.

    Annotate the KafkaRebalance resource in OpenShift:

    oc annotate kafkarebalance rebalance-cr-name strimzi.io/rebalance=approve
  3. The Cluster Operator detects the annotated resource and instructs Cruise Control to rebalance the Kafka cluster.
  4. Check the status of the KafkaRebalance resource:

    oc describe kafkarebalance rebalance-cr-name
  5. Cruise Control returns one of three statuses:

9.8. Stopping a cluster rebalance

Once started, a cluster rebalance operation might take some time to complete and affect the overall performance of the Kafka cluster.

If you want to stop a cluster rebalance operation that is in progress, apply the stop annotation to the KafkaRebalance custom resource. This instructs Cruise Control to finish the current batch of partition reassignments and then stop the rebalance. When the rebalance has stopped, completed partition reassignments have already been applied; therefore, the state of the Kafka cluster is different when compared to prior to the start of the rebalance operation. If further rebalancing is required, you should generate a new optimization proposal.

Note

The performance of the Kafka cluster in the intermediate (stopped) state might be worse than in the initial state.

Prerequisites

  • You have approved the optimization proposal by annotating the KafkaRebalance custom resource with approve.
  • The status of the KafkaRebalance custom resource is Rebalancing.

Procedure

  1. Annotate the KafkaRebalance resource in OpenShift:

    oc annotate kafkarebalance rebalance-cr-name strimzi.io/rebalance=stop
  2. Check the status of the KafkaRebalance resource:

    oc describe kafkarebalance rebalance-cr-name
  3. Wait until the status changes to Stopped.

9.9. Fixing problems with a KafkaRebalance resource

If an issue occurs when creating a KafkaRebalance resource or interacting with Cruise Control, the error is reported in the resource status, along with details of how to fix it. The resource also moves to the NotReady state.

To continue with the cluster rebalance operation, you must fix the problem in the KafkaRebalance resource itself. Problems might include the following:

  • A misconfigured parameter.
  • The Cruise Control server is not reachable.

After fixing the issue, you need to add the refresh annotation to the KafkaRebalance resource. During a “refresh”, a new optimization proposal is requested from the Cruise Control server.

Prerequisites

Procedure

  1. Get information about the error from the KafkaRebalance status:

    oc describe kafkarebalance rebalance-cr-name
  2. Attempt to resolve the issue in the KafkaRebalance resource.
  3. Annotate the KafkaRebalance resource in OpenShift:

    oc annotate kafkarebalance rebalance-cr-name strimzi.io/rebalance=refresh
  4. Check the status of the KafkaRebalance resource:

    oc describe kafkarebalance rebalance-cr-name
  5. Wait until the status changes to PendingProposal, or directly to ProposalReady.