Chapter 2. Getting started

AMQ Streams is distributed in a ZIP file. The ZIP files contain installation artifacts for the following components:

  • Apache Kafka
  • Apache ZooKeeper
  • Apache Kafka Connect
  • Apache Kafka MirrorMaker
Note

The Kafka Bridge has separate installation files. For information on installing and using the Kafka Bridge, see Using the AMQ Streams Kafka Bridge.

2.1. Installation environment

AMQ Streams runs in Red Hat Enterprise Linux. The host (node) can be a physical or virtual machine (VM). Use the installation files provided with AMQ Streams to install Kafka components. You can install Kafka in a single-node or multi-node environment.

Single-node environment
A single-node Kafka cluster runs one or more Kafka brokers and ZooKeeper instances on a single host. This configuration is not suitable for a production environment.
Multi-node environment
A multi-node Kafka cluster runs one or more Kafka brokers and ZooKeeper instances on a multiple hosts.

We recommended that you run Kafka and other Kafka components, such as Kafka Connect, on separate hosts. By running the components in this way, its easier to maintain and upgrade each component.

Kafka clients establish a connection to the Kafka cluster using the bootstrap.servers configuration property. If you are using Kafka Connect, for example, the Kafka Connect configuration properties must include a bootstrap.servers value that specifies the hostname and port of the hosts where the Kafka brokers are running. If the Kafka cluster is running on more than one host with multiple Kafka brokers, you specify a hostname and port for each broker. Each Kafka broker is identified by a broker.id.

2.1.1. Supported environment versions

AMQ Streams must be running in a supported version of Red Hat Enterprise Linux. The environment must also be using a supported JVM version. For more information, see Supported Configurations.

2.1.2. Data storage considerations

An efficient data storage infrastructure is essential to the optimal performance of AMQ Streams.

Block storage is required. File storage, such as NFS, does not work with Kafka.

Choose from one of the following options for your block storage:

  • Cloud-based block storage solutions, such as Amazon Elastic Block Store (EBS)
  • Local storage
  • Storage Area Network (SAN) volumes accessed by a protocol such as Fibre Channel or iSCSI

2.1.3. File systems

Kafka uses a file system for storing messages. AMQ Streams is compatible with the XFS and ext4 file systems, which are commonly used with Kafka. Consider the underlying architecture and requirements of your deployment when choosing and setting up your file system.

For more information, refer to Filesystem Selection in the Kafka documentation.

2.1.4. Apache Kafka and ZooKeeper storage

Use separate disks for Apache Kafka and ZooKeeper.

Kafka supports JBOD (Just a Bunch of Disks) storage, a data storage configuration of multiple disks or volumes. JBOD provides increased data storage for Kafka brokers. It can also improve performance.

Solid-state drives (SSDs), though not essential, can improve the performance of Kafka in large clusters where data is sent to and received from multiple topics asynchronously. SSDs are particularly effective with ZooKeeper, which requires fast, low latency data access.

Note

You do not need to provision replicated storage because Kafka and ZooKeeper both have built-in data replication.

Additional resources

2.2. Downloading AMQ Streams

A ZIP file distribution of AMQ Streams is available for download from the Red Hat website. You can download the latest version of Red Hat AMQ Streams from the AMQ Streams software downloads page.

  • For Kafka and other Kafka components, download the amq-streams-<version>-bin.zip file
  • For Kafka Bridge, download the amq-streams-<version>-bridge-bin.zip file.

    For installation instructions, see Using the AMQ Streams Kafka Bridge.

2.3. Installing Kafka

Use the AMQ Streams ZIP files to install Kafka on Red Hat Enterprise Linux. You can install Kafka in a single-node or multi-node environment. In this procedure, a single Kafka broker and ZooKeeper instance are installed on a single host (node).

The AMQ Streams installation files include the binaries for running other Kafka components, like Kafka Connect, Kafka MirrorMaker 2.0, and Kafka Bridge. In a single-node environment, you can run these components from the same host where you installed Kafka. However, we recommended that you add the installation files and run other Kafka components on separate hosts.

Note

If you are using a multi-node environment, you install Kafka brokers and ZooKeeper instances on more than one host. Repeat the installation steps for each host. To identify each ZooKeeper instance and broker, you add a unique ID in the configuration. For more information, see Chapter 3, Running a multi-node environment.

Prerequisites

Procedure

Install Kafka with ZooKeeper on your host.

  1. Add a new kafka user and group:

    groupadd kafka
    useradd -g kafka kafka
    passwd kafka
  2. Extract and move the contents of the amq-streams-<version>-bin.zip file into the /opt/kafka directory.

    unzip amq-streams-<version>-bin.zip -d /opt
    mv /opt/kafka*redhat* /opt/kafka

    This step requires admin privileges.

  3. Change the ownership of the /opt/kafka directory to the kafka user:

    chown -R kafka:kafka /opt/kafka
  4. Create directory /var/lib/zookeeper for storing ZooKeeper data and set its ownership to the kafka user:

    mkdir /var/lib/zookeeper
    chown -R kafka:kafka /var/lib/zookeeper
  5. Create directory /var/lib/kafka for storing Kafka data and set its ownership to the kafka user:

    mkdir /var/lib/kafka
    chown -R kafka:kafka /var/lib/kafka

    You can now run a default configuration of Kafka as a single-node cluster.

    You can also use the installation to run other Kafka components, like Kafka Connect, on the same host.

    To run other components, specify the hostname and port to connect to the Kafka broker using the bootstrap.servers property in the component configuration.

    Example bootstrap servers configuration pointing to a single Kafka broker on the same host

    bootstrap.servers=localhost:9092

    However, we recommend installing and running Kafka components on separate hosts.

  6. (Optional) Install Kafka components on separate hosts.

    1. Extract the installation files to the /opt/kafka directory on each host.
    2. Change the ownership of the /opt/kafka directory to the kafka user.
    3. Add bootstrap.servers configuration that connects the component to the host (or hosts in a multi-node environment) running the Kafka brokers.

      Example bootstrap servers configuration pointing to Kafka brokers on different hosts

      bootstrap.servers=kafka0.<host_ip_address>:9092,kafka1.<host_ip_address>:9092,kafka2.<host_ip_address>:9092

      You can use this configuration for Kafka Connect, MirrorMaker 2.0, and the Kafka Bridge.

2.4. Running a single-node Kafka cluster

This procedure shows how to run a basic AMQ Streams cluster consisting of a single Apache ZooKeeper node and a single Apache Kafka node, both running on the same host. The default configuration files are used for ZooKeeper and Kafka.

Warning

A single node AMQ Streams cluster does not provide reliability and high availability and is suitable only for development purposes.

Prerequisites

  • AMQ Streams is installed on the host

Running the cluster

  1. Edit the ZooKeeper configuration file /opt/kafka/config/zookeeper.properties. Set the dataDir option to /var/lib/zookeeper/:

    dataDir=/var/lib/zookeeper/
  2. Edit the Kafka configuration file /opt/kafka/config/server.properties. Set the log.dirs option to /var/lib/kafka/:

    log.dirs=/var/lib/kafka/
  3. Switch to the kafka user:

    su - kafka
  4. Start ZooKeeper:

    /opt/kafka/bin/zookeeper-server-start.sh -daemon /opt/kafka/config/zookeeper.properties
  5. Check that ZooKeeper is running:

    jcmd | grep zookeeper

    Returns:

    number org.apache.zookeeper.server.quorum.QuorumPeerMain /opt/kafka/config/zookeeper.properties
  6. Start Kafka:

    /opt/kafka/bin/kafka-server-start.sh -daemon /opt/kafka/config/server.properties
  7. Check that Kafka is running:

    jcmd | grep kafka

    Returns:

    number kafka.Kafka /opt/kafka/config/server.properties

2.5. Sending and receiving messages from a topic

This procedure describes how to start the Kafka console producer and consumer clients and use them to send and receive several messages.

A new topic is automatically created in step one. Topic auto-creation is controlled using the auto.create.topics.enable configuration property (set to true by default). Alternatively, you can configure and create topics before using the cluster. For more information, see Topics.

Procedure

  1. Start the Kafka console producer and configure it to send messages to a new topic:

    /opt/kafka/bin/kafka-console-producer.sh --broker-list <bootstrap_address> --topic <topic-name>

    For example:

    /opt/kafka/bin/kafka-console-producer.sh --broker-list localhost:9092 --topic my-topic
  2. Enter several messages into the console. Press Enter to send each individual message to your new topic:

    >message 1
    >message 2
    >message 3
    >message 4

    When Kafka creates a new topic automatically, you might receive a warning that the topic does not exist:

    WARN Error while fetching metadata with correlation id 39 :
    {4-3-16-topic1=LEADER_NOT_AVAILABLE} (org.apache.kafka.clients.NetworkClient)

    The warning should not reappear after you send further messages.

  3. In a new terminal window, start the Kafka console consumer and configure it to read messages from the beginning of your new topic.

    /opt/kafka/bin/kafka-console-consumer.sh --bootstrap-server <bootstrap_address> --topic <topic-name> --from-beginning

    For example:

    /opt/kafka/bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic my-topic --from-beginning

    The incoming messages display in the consumer console.

  4. Switch to the producer console and send additional messages. Check that they display in the consumer console.
  5. Stop the Kafka console producer and then the consumer by pressing Ctrl+C.

2.6. Stopping the AMQ Streams services

You can stop the Kafka and ZooKeeper services by running a script. All connections to the Kafka and ZooKeeper services will be terminated.

Prerequisites

  • AMQ Streams is installed on the host
  • ZooKeeper and Kafka are up and running

Procedure

  1. Stop the Kafka broker.

    su - kafka
    /opt/kafka/bin/kafka-server-stop.sh
  2. Confirm that the Kafka broker is stopped.

    jcmd | grep kafka
  3. Stop ZooKeeper.

    su - kafka
    /opt/kafka/bin/zookeeper-server-stop.sh