Chapter 6. Known issues

This section contains the known issues for AMQ Streams 1.4 on OpenShift.

6.1. Scaling ZooKeeper 3.5.7 up or down

There is a known issue related to scaling ZooKeeper up or down. Scaling ZooKeeper up means adding servers to a ZooKeeper cluster. Scaling ZooKeeper down means removing servers from a ZooKeeper cluster.

Kafka 2.4.0 requires ZooKeeper 3.5.7.

The configuration procedure for ZooKeeper 3.5.7 servers is significantly different from that for ZooKeeper 3.4.x servers. Referred to as dynamic reconfiguration, the new procedure requires that servers be added or removed using the ZooKeeper CLI or Admin API. This ensures that a stable ZooKeeper cluster is maintained during the scale up or scale down process.
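
In practice, dynamic reconfiguration means running the reconfig command from a zookeeper-shell session. The general form for adding a voting member is reconfig -add server.<id>=<host>:<quorum-port>:<election-port>:<role>;<client-address>:<client-port>, and a member is removed with reconfig -remove <id>. As an illustration only, using values of the kind that appear in the procedures below:

    reconfig -add server.4=127.0.0.1:28883:38883:participant;127.0.0.1:21813
    reconfig -remove 5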

To scale a ZooKeeper 3.5.7 cluster up or down, you must perform the procedures described in this known issue.

Note

In future AMQ Streams releases, ZooKeeper scale up and scale down will be handled by the Cluster Operator.

Scaling up ZooKeeper 3.5.7 servers in an AMQ Streams 1.4 cluster

This procedure assumes that:

  • AMQ Streams is running in the namespace <namespace> and the Kafka cluster is named my-cluster.
  • A 3 node ZooKeeper cluster is running.
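
Before you begin, you can confirm the starting state by listing the ZooKeeper pods. A minimal check, assuming the strimzi.io/name label that the Cluster Operator applies to the pods it manages:

    # Expects three pods: <my-cluster>-zookeeper-0, -1, and -2
    kubectl get pods -n <namespace> -l strimzi.io/name=<my-cluster>-zookeeper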

Perform the following steps for each ZooKeeper server, one at a time:

  1. Edit the spec.zookeeper.replicas property in the Kafka custom resource. Set the replica count to 4 (n=4).

    apiVersion: kafka.strimzi.io/v1beta1
    kind: Kafka
    metadata:
      name: my-cluster
    spec:
      kafka:
      # ...
      zookeeper:
        replicas: 4
        storage:
          type: persistent-claim
          size: 100Gi
          deleteClaim: false
      # ...
  2. Allow the ZooKeeper server (zookeeper-3) to start up normally and establish a link to the existing quorum.

    You can check this by running:

    kubectl exec -n <namespace> -it <my-cluster>-zookeeper-3 -c zookeeper -- bash -c "echo 'srvr' | nc 127.0.0.1 21813 | grep 'Mode:'"

    The output of this command should be similar to: Mode: follower.

    Note

    The index number in the name of the new ZooKeeper node, zookeeper-x, matches the final number of the client port in the nc 127.0.0.1 2181x command.

  3. Open a zookeeper-shell session on one of the nodes in the original cluster (nodes 0, 1, or 2):

    kubectl exec -n <namespace> -it <my-cluster>-zookeeper-0 -c zookeeper -- ./bin/zookeeper-shell.sh localhost:21810
  4. In the shell session, enter the following line to add the new server to the quorum as a voting member:

    reconfig -add server.4=127.0.0.1:28883:38883:participant;127.0.0.1:21813
    Note

    Within the ZooKeeper cluster, nodes are indexed from one, not zero as in the node names. So, the new zookeeper-3 node is referred to as server.4 within the ZooKeeper configuration.

    This outputs the new cluster configuration:

    server.1=127.0.0.1:28880:38880:participant;127.0.0.1:21810
    server.2=127.0.0.1:28881:38881:participant;127.0.0.1:21811
    server.3=127.0.0.1:28882:38882:participant;127.0.0.1:21812
    server.4=127.0.0.1:28883:38883:participant;127.0.0.1:21813
    version=100000054

    The new configuration propagates to the other servers in the ZooKeeper cluster; the new server is now a full member of the quorum.

  5. In spec.zookeeper.replicas in the Kafka custom resource, increase the replica count by one (n=5).
  6. Allow the ZooKeeper server (zookeeper-<n-1>) to start up normally and establish a link to the existing quorum. You can check this by running:

    kubectl exec -n <namespace> -it <my-cluster>-zookeeper-<n-1> -c zookeeper -- bash -c "echo 'srvr' | nc 127.0.0.1 2181<n-1> | grep 'Mode:'"

    The output of the command should be similar to: Mode: follower.

  7. Open a zookeeper-shell session on one of the nodes in the original cluster (in this example, node <i>, where 0 <= i <= n-2):

    kubectl exec -n <namespace> -it <my-cluster>-zookeeper-<i> -c zookeeper -- ./bin/zookeeper-shell.sh localhost:2181<i>
  8. In the shell session, enter the following line to add the new ZooKeeper server to the quorum as a voting member:

    reconfig -add server.<n>=127.0.0.1:2888<n-1>:3888<n-1>:participant;127.0.0.1:2181<n-1>
  9. Repeat steps 5-8 for every server you want to add.
  10. When you have a cluster of the desired size, signal to the Cluster Operator that it is safe to roll the ZooKeeper cluster again by setting the strimzi.io/manual-zk-scaling annotation to false on the ZooKeeper StatefulSet. The Cluster Operator automatically sets this annotation to true when you change the number of ZooKeeper replicas.

    kubectl -n <namespace> annotate statefulset <my-cluster>-zookeeper strimzi.io/manual-zk-scaling=false --overwrite
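
After completing the procedure, you can confirm that every server has joined the quorum by repeating the srvr check against each node. A sketch, assuming a final cluster of five servers and the per-node client ports used above (adjust the loop range for your cluster size):

    # Each server should report Mode: leader or Mode: follower
    for i in 0 1 2 3 4; do
      kubectl exec -n <namespace> -it <my-cluster>-zookeeper-$i -c zookeeper -- \
        bash -c "echo 'srvr' | nc 127.0.0.1 2181$i | grep 'Mode:'"
    done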

Scaling down ZooKeeper 3.5.7 servers in an AMQ Streams 1.4 cluster

This procedure assumes that AMQ Streams is running in the namespace <namespace> and the Kafka cluster is named my-cluster.

When removing ZooKeeper nodes, servers are deleted in descending order, starting with the highest numbered server. Therefore, if you have a 5 node cluster and want to scale down to 3, you remove zookeeper-4 and zookeeper-3 and keep zookeeper-0, zookeeper-1, and zookeeper-2.

Note

Before proceeding, read the notes on "Removing servers" in the ZooKeeper documentation.

Perform the following steps for each ZooKeeper server, one at a time:

  1. Log in to the zookeeper-shell on one of the nodes that will be retained after the scale down:

    kubectl exec -n <namespace> -it <my-cluster>-zookeeper-0 -c zookeeper -- ./bin/zookeeper-shell.sh localhost:21810
    Note

    The index number in the ZooKeeper node’s name, zookeeper-x, matches the final number of the client port in the zookeeper-shell.sh localhost:2181x command.

  2. Output the existing cluster configuration using the config command:

    config

    Assuming you are scaling down from a cluster that had 5 ZooKeeper nodes, the output of the command should be similar to:

    server.1=127.0.0.1:28880:38880:participant;127.0.0.1:21810
    server.2=127.0.0.1:28881:38881:participant;127.0.0.1:21811
    server.3=127.0.0.1:28882:38882:participant;127.0.0.1:21812
    server.4=127.0.0.1:28883:38883:participant;127.0.0.1:21813
    server.5=127.0.0.1:28884:38884:participant;127.0.0.1:21814
    version=100000057
  3. Remove the highest numbered server, which in this case is server.5:

    reconfig -remove 5

    This outputs the new configuration that will propagate to all other members of the quorum:

    server.1=127.0.0.1:28880:38880:participant;127.0.0.1:21810
    server.2=127.0.0.1:28881:38881:participant;127.0.0.1:21811
    server.3=127.0.0.1:28882:38882:participant;127.0.0.1:21812
    server.4=127.0.0.1:28883:38883:participant;127.0.0.1:21813
    version=200000012
  4. When the propagation is complete, reduce the replica count in the spec.zookeeper.replicas property of the Kafka resource by one, as shown in the example after this procedure. This shuts down zookeeper-4 (server.5).
  5. Repeat steps 1-4 to incrementally reduce the cluster size. Remember to remove the servers in descending order.
  6. When you have a cluster of the desired size, signal to the Cluster Operator that it is safe to roll the ZooKeeper cluster again by setting the strimzi.io/manual-zk-scaling annotation to false on the ZooKeeper StatefulSet. The Cluster Operator automatically sets this annotation to true when you change the number of ZooKeeper replicas.

    kubectl -n <namespace> annotate statefulset <my-cluster>-zookeeper strimzi.io/manual-zk-scaling=false --overwrite
    Note

    It is possible to specify multiple servers to be removed at once; for example, you could enter reconfig -remove 4,5 to remove the two highest numbered servers at once and scale down from 5 to 3 in one step. However, this can lead to instability and is NOT recommended.
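
As referenced in step 4, the replica count is reduced in the same spec.zookeeper.replicas property used when scaling up. A sketch of the Kafka custom resource after removing server.5 from a five-node cluster, assuming the same storage configuration shown earlier:

    apiVersion: kafka.strimzi.io/v1beta1
    kind: Kafka
    metadata:
      name: my-cluster
    spec:
      kafka:
      # ...
      zookeeper:
        replicas: 4
        storage:
          type: persistent-claim
          size: 100Gi
          deleteClaim: false
      # ...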