Strimzi Cluster Operator fails to connect to ZooKeeper
Issue
Even on a freshly installed Kafka cluster deployed from the examples shipped by Red Hat, the strimzi-cluster-operator cannot connect to the ZooKeeper pods on port 2181. The problem only becomes visible once the zookeeper section of the Kafka resource is updated, which triggers a reconciliation and a rolling restart.
[root@test cluster-operator]# oc edit Kafka my-cluster -n cg
Change:
  zookeeper:
    livenessProbe:
      initialDelaySeconds: 120
      timeoutSeconds: 5
    readinessProbe:
      initialDelaySeconds: 120
      timeoutSeconds: 5
    replicas: 3
    storage:
      type: ephemeral
To:
    livenessProbe:
      initialDelaySeconds: 60
      timeoutSeconds: 5
    readinessProbe:
      initialDelaySeconds: 60
      timeoutSeconds: 5
[root@test cluster-operator]# oc edit Kafka my-cluster -n cg
kafka.kafka.strimzi.io/my-cluster edited
The Strimzi Cluster Operator log then shows:
2019-05-15 03:37:20 INFO AbstractAssemblyOperator:167 - Reconciliation #9(timer) Kafka(cg/my-cluster): Assembly my-cluster should be created or updated
2019-05-15 03:37:20 INFO AbstractAssemblyOperator:312 - Reconciliation #9(timer) Kafka(cg/my-cluster): Assembly reconciled
2019-05-15 03:38:12 INFO AbstractAssemblyOperator:281 - Reconciliation #10(watch) Kafka(cg/my-cluster): Kafka my-cluster in namespace cg was MODIFIED
2019-05-15 03:38:12 INFO AbstractAssemblyOperator:167 - Reconciliation #10(watch) Kafka(cg/my-cluster): Assembly my-cluster should be created or updated
2019-05-15 03:38:12 INFO ZookeeperLeaderFinder:90 - Trusting certificate ca.crt from Secret my-cluster-cluster-ca-cert
2019-05-15 03:38:23 WARN ZookeeperLeaderFinder:253 - ZK my-cluster-zookeeper-0.my-cluster-zookeeper-nodes.cg.svc.cluster.local:2181: failed to connect to zookeeper:
2019-05-15 03:38:23 INFO ZookeeperLeaderFinder:192 - No leader found for cluster my-cluster in namespace cg; backing off for 0ms (cumulative 0ms)
2019-05-15 03:38:33 WARN ZookeeperLeaderFinder:253 - ZK my-cluster-zookeeper-0.my-cluster-zookeeper-nodes.cg.svc.cluster.local:2181: failed to connect to zookeeper:
2019-05-15 03:38:33 INFO ZookeeperLeaderFinder:192 - No leader found for cluster my-cluster in namespace cg; backing off for 5000ms (cumulative 5000ms)
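Connecting with the nc tool from inside the Cluster Operator pod also times out against the ZooKeeper services, while the Kafka services on ports 9092 and 9093 are reachable: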
[root@ndccsi-sesosm01 cluster-operator]# oc rsh strimzi-cluster-operator-6bccf4d586-n9xx8
sh-4.2$ nc -v my-cluster-zookeeper-0.my-cluster-zookeeper-client.cg.svc.cluster.local 2181
Ncat: Version 7.50 ( https://nmap.org/ncat )
Ncat: Connection timed out.
sh-4.2$ nc -v my-cluster-zookeeper-client 2181
Ncat: Version 7.50 ( https://nmap.org/ncat )
Ncat: Connection timed out.
sh-4.2$ nc -v my-cluster-zookeeper-nodes 2181
Ncat: Version 7.50 ( https://nmap.org/ncat )
Ncat: Connection to XX.XX.XX.XX failed: Connection timed out.
Ncat: Trying next address...
Ncat: Connection to XX.XX.XX.XX failed: Connection timed out.
Ncat: Trying next address...
Ncat: Connection timed out.
sh-4.2$ nc -v my-cluster-kafka-bootstrap 9092
Ncat: Version 7.50 ( https://nmap.org/ncat )
Ncat: Connected to XX.XX.XX.XX:9092.
^C
sh-4.2$ nc -v my-cluster-kafka-bootstrap 9093
Ncat: Version 7.50 ( https://nmap.org/ncat )
Ncat: Connected to XX.XX.XX.XX:9093.
^C
sh-4.2$ nc -v my-cluster-kafka-brokers 9092
Ncat: Version 7.50 ( https://nmap.org/ncat )
Ncat: Connected to XX.XX.XX.XX:9092.
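The manual nc checks above can be repeated in one pass with a small script. The following is a minimal sketch, not part of the original report: it assumes Python 3 is available in a pod in the same namespace (the Cluster Operator image may not include it), and the helper names check_tcp, check_all and report are hypothetical. The service names and ports are the ones used in the nc session.

```python
import socket


def check_tcp(host, port, timeout=5.0):
    """Attempt a TCP connection, roughly like `nc -v host port`.

    Returns a (reachable, detail) tuple instead of raising.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, "connected"
    except socket.timeout:
        # Same symptom as "Ncat: Connection timed out."
        return False, "connection timed out"
    except OSError as exc:
        # Covers connection refused, DNS resolution failures, etc.
        return False, str(exc)


# Targets copied from the nc session; adjust for your own cluster name.
TARGETS = [
    ("my-cluster-zookeeper-client", 2181),
    ("my-cluster-zookeeper-nodes", 2181),
    ("my-cluster-kafka-bootstrap", 9092),
    ("my-cluster-kafka-bootstrap", 9093),
    ("my-cluster-kafka-brokers", 9092),
]


def check_all(targets, timeout=5.0):
    """Check every (host, port) pair and return a dict of results."""
    return {(host, port): check_tcp(host, port, timeout) for host, port in targets}


def report(targets, timeout=5.0):
    """Return one human-readable line per target."""
    lines = []
    for (host, port), (ok, detail) in check_all(targets, timeout).items():
        status = "OK" if ok else "FAILED"
        lines.append(f"{host}:{port} -> {status} ({detail})")
    return lines
```

Calling print("\n".join(report(TARGETS))) from a pod in the cg namespace would list which services accept connections. With the symptoms described here, the two ZooKeeper entries would be expected to fail while the Kafka entries succeed.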
Environment
- Red Hat AMQ Streams 1.1.0
- OpenShift Container Platform 3.11