Chapter 7. Known issues

This section lists the known issues for AMQ Streams 2.2 on OpenShift.

7.1. AMQ Streams Cluster Operator on IPv6 clusters

The AMQ Streams Cluster Operator does not start on Internet Protocol version 6 (IPv6) clusters.

Workaround

There are two workarounds for this issue.

Workaround one: Set the KUBERNETES_MASTER environment variable

  1. Display the address of the Kubernetes master node of your OpenShift Container Platform cluster:

    oc cluster-info
    Kubernetes master is running at <master_address>
    # ...

    Copy the address of the master node.

  2. List all Operator subscriptions:

    oc get subs -n <operator_namespace>
  3. Edit the Subscription resource for AMQ Streams:

    oc edit sub amq-streams -n <operator_namespace>
  4. In spec.config.env, add the KUBERNETES_MASTER environment variable, set to the address of the Kubernetes master node. For example:

    apiVersion: operators.coreos.com/v1alpha1
    kind: Subscription
    metadata:
      name: amq-streams
      namespace: <operator_namespace>
    spec:
      channel: amq-streams-1.8.x
      installPlanApproval: Automatic
      name: amq-streams
      source: mirror-amq-streams
      sourceNamespace: openshift-marketplace
      config:
        env:
        - name: KUBERNETES_MASTER
          value: MASTER-ADDRESS
  5. Save and exit the editor.
  6. Check that the Subscription was updated:

    oc get sub amq-streams -n <operator_namespace>
  7. Check that the Cluster Operator Deployment was updated to use the new environment variable:

    oc get deployment <cluster_operator_deployment_name>

Workaround two: Disable hostname verification

  1. List all Operator subscriptions:

    oc get subs -n <operator_namespace>
  2. Edit the Subscription resource for AMQ Streams:

    oc edit sub amq-streams -n <operator_namespace>
  3. In spec.config.env, add the KUBERNETES_DISABLE_HOSTNAME_VERIFICATION environment variable, set to true. For example:

    apiVersion: operators.coreos.com/v1alpha1
    kind: Subscription
    metadata:
      name: amq-streams
      namespace: <operator_namespace>
    spec:
      channel: amq-streams-1.8.x
      installPlanApproval: Automatic
      name: amq-streams
      source: mirror-amq-streams
      sourceNamespace: openshift-marketplace
      config:
        env:
        - name: KUBERNETES_DISABLE_HOSTNAME_VERIFICATION
          value: "true"
  4. Save and exit the editor.
  5. Check that the Subscription was updated:

    oc get sub amq-streams -n <operator_namespace>
  6. Check that the Cluster Operator Deployment was updated to use the new environment variable:

    oc get deployment <cluster_operator_deployment_name>

7.2. Cruise Control CPU utilization estimation

Cruise Control for AMQ Streams has a known issue that relates to the calculation of CPU utilization estimation. CPU utilization is calculated as a percentage of the defined capacity of a broker pod. The issue occurs when running Kafka brokers across nodes with varying CPU cores. For example, node1 might have 2 CPU cores and node2 might have 4 CPU cores. In this situation, Cruise Control can underestimate and overestimate CPU load of brokers The issue can prevent cluster rebalances when the pod is under heavy load.

Workaround

There are two workarounds for this issue.

Workaround one: Equal CPU requests and limits

You can set CPU requests equal to CPU limits in Kafka.spec.kafka.resources. That way, all CPU resources are reserved upfront and are always available. This configuration allows Cruise Control to properly evaluate the CPU utilization when preparing the rebalance proposals based on CPU goals.

Workaround two: Exclude CPU goals

You can exclude CPU goals from the hard and default goals specified in the Cruise Control configuration.

Example Cruise Control configuration without CPU goals

apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
  name: my-cluster
spec:
  kafka:
    # ...
  zookeeper:
    # ...
  entityOperator:
    topicOperator: {}
    userOperator: {}
  cruiseControl:
    brokerCapacity:
      inboundNetwork: 10000KB/s
      outboundNetwork: 10000KB/s
    config:
      hard.goals: >
        com.linkedin.kafka.cruisecontrol.analyzer.goals.RackAwareGoal,
        com.linkedin.kafka.cruisecontrol.analyzer.goals.MinTopicLeadersPerBrokerGoal,
        com.linkedin.kafka.cruisecontrol.analyzer.goals.ReplicaCapacityGoal,
        com.linkedin.kafka.cruisecontrol.analyzer.goals.DiskCapacityGoal,
        com.linkedin.kafka.cruisecontrol.analyzer.goals.NetworkInboundCapacityGoal,
        com.linkedin.kafka.cruisecontrol.analyzer.goals.NetworkOutboundCapacityGoal
      default.goals: >
        com.linkedin.kafka.cruisecontrol.analyzer.goals.RackAwareGoal,
        com.linkedin.kafka.cruisecontrol.analyzer.goals.MinTopicLeadersPerBrokerGoal,
        com.linkedin.kafka.cruisecontrol.analyzer.goals.ReplicaCapacityGoal,
        com.linkedin.kafka.cruisecontrol.analyzer.goals.DiskCapacityGoal,
        com.linkedin.kafka.cruisecontrol.analyzer.goals.NetworkInboundCapacityGoal,
        com.linkedin.kafka.cruisecontrol.analyzer.goals.NetworkOutboundCapacityGoal,
        com.linkedin.kafka.cruisecontrol.analyzer.goals.ReplicaDistributionGoal,
        com.linkedin.kafka.cruisecontrol.analyzer.goals.PotentialNwOutGoal,
        com.linkedin.kafka.cruisecontrol.analyzer.goals.DiskUsageDistributionGoal,
        com.linkedin.kafka.cruisecontrol.analyzer.goals.NetworkInboundUsageDistributionGoal,
        com.linkedin.kafka.cruisecontrol.analyzer.goals.NetworkOutboundUsageDistributionGoal,
        com.linkedin.kafka.cruisecontrol.analyzer.goals.TopicReplicaDistributionGoal,
        com.linkedin.kafka.cruisecontrol.analyzer.goals.LeaderReplicaDistributionGoal,
        com.linkedin.kafka.cruisecontrol.analyzer.goals.LeaderBytesInDistributionGoal

For more information, see Insufficient CPU capacity.

7.3. User Operator scalability

The User Operator can timeout when creating multiple users at the same time. Reconciliation can take too long.

Workaround

If you encounter this issue, reduce the number of users you are creating at the same time. And wait until they are ready before creating more users.