LogScale on Bare Metal - Apache Kafka Installation using KRaft

To install Kafka using KRaft:

  1. Go to the /opt directory and download the release. The examples below use version 3.7.0; substitute the latest release. The package can be downloaded using wget:

    shell
    $ cd /opt
    $ wget https://downloads.apache.org/kafka/3.7.0/kafka_2.13-3.7.0.tgz
  2. Extract the archive:

    shell
    $ tar zxf kafka_2.13-3.7.0.tgz
  3. Now create the directories where the information will be stored. We will use the top-level directory /kafka, since that could be a mount point for a separate filesystem. We will also create a directory for application log files in /var/log/kafka. The chown commands assume a kafka user and group already exist; if they do not, create them first (for example with useradd --system kafka):

    shell
    $ mkdir -p /var/log/kafka
    $ chown kafka:kafka /var/log/kafka
    $ mkdir -p /kafka/kafka
    $ chown kafka:kafka /kafka/kafka

    Now link the application directory to /opt/kafka. This lets scripts and configuration refer to /opt/kafka, while an upgrade only requires downloading the new release and relinking /opt/kafka to the updated application directory:

    shell
    $ ln -s /opt/kafka_2.13-3.7.0 /opt/kafka
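
    An upgrade later only needs the new archive and a repointed symlink. The following is an illustrative sketch that simulates the relink in a scratch directory; the 3.8.0 version number is a made-up example:

    ```shell
    # Simulate the upgrade relink in a temporary directory; on a real host
    # the same `ln -sfn` swaps /opt/kafka between installed versions.
    tmp=$(mktemp -d)
    mkdir "$tmp/kafka_2.13-3.7.0" "$tmp/kafka_2.13-3.8.0"
    ln -s "$tmp/kafka_2.13-3.7.0" "$tmp/kafka"    # initial install
    ln -sfn "$tmp/kafka_2.13-3.8.0" "$tmp/kafka"  # upgrade: repoint the link
    readlink "$tmp/kafka"                         # now shows the 3.8.0 directory
    ```

    The -n flag makes ln replace the link itself rather than creating a new link inside the old target directory.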
  4. Using a text editor, open the Kafka properties file, server.properties, located in the /opt/kafka/config sub-directory. The following options configure Kafka in KRaft mode; each node must know the hostnames and ports of every node in the cluster. Configuration files for the three nodes are shown below:

    Node 1
    ini
    node.id=1
    process.roles=broker,controller
    controller.quorum.voters=1@kafka1:9093,2@kafka2:9093,3@kafka3:9093
    listeners=PLAINTEXT://kafka1:9092,CONTROLLER://kafka1:9093
    controller.listener.names=CONTROLLER
    advertised.listeners=PLAINTEXT://kafka1:9092
    num.partitions=6
    log.dirs=/kafka/kafka
    Node 2
    ini
    node.id=2
    process.roles=broker,controller
    controller.quorum.voters=1@kafka1:9093,2@kafka2:9093,3@kafka3:9093
    listeners=PLAINTEXT://kafka2:9092,CONTROLLER://kafka2:9093
    controller.listener.names=CONTROLLER
    advertised.listeners=PLAINTEXT://kafka2:9092
    num.partitions=6
    log.dirs=/kafka/kafka
    Node 3
    ini
    node.id=3
    process.roles=broker,controller
    controller.quorum.voters=1@kafka1:9093,2@kafka2:9093,3@kafka3:9093
    listeners=PLAINTEXT://kafka3:9092,CONTROLLER://kafka3:9093
    controller.listener.names=CONTROLLER
    advertised.listeners=PLAINTEXT://kafka3:9092
    num.partitions=6
    log.dirs=/kafka/kafka

    The node.id value must be unique for each node. The process.roles setting runs each node as both broker and controller, and controller.quorum.voters lists the nodes that elect the active controller and decide how work is distributed. The listeners and advertised.listeners lines set the hostnames, ports, and protocol types. The log.dirs setting points at the data directory created earlier.

    Update the ownership of the installation directory so that the kafka user can run the application:

    shell
    $ chown -R kafka:kafka /opt/kafka

    Modify the directory according to the version of Kafka that has been installed.
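
    Since the three node files differ only in the node ID and hostname, they can also be generated in a loop. The following is a minimal sketch under the settings above; the server-node1.properties style of output file name is illustrative:

    ```shell
    # Generate a KRaft server.properties file for each of the three nodes.
    # Review each file, then copy it into place as /opt/kafka/config/server.properties.
    for i in 1 2 3; do
      {
        echo "node.id=$i"
        echo "process.roles=broker,controller"
        echo "controller.quorum.voters=1@kafka1:9093,2@kafka2:9093,3@kafka3:9093"
        echo "listeners=PLAINTEXT://kafka$i:9092,CONTROLLER://kafka$i:9093"
        echo "controller.listener.names=CONTROLLER"
        echo "advertised.listeners=PLAINTEXT://kafka$i:9092"
        echo "num.partitions=6"
        echo "log.dirs=/kafka/kafka"
      } > "server-node$i.properties"
    done
    ```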

  5. If deploying a multi-node Kafka cluster, make sure that each node can resolve the hostname of each other node in the cluster. One way to achieve this is to edit the /etc/hosts file on each node with the host information:

    ini
    192.168.1.15 kafka1
    192.168.1.16 kafka2
    192.168.1.17 kafka3

    Important

    Be aware that in some Linux distributions, the hosts file may contain a line that by default resolves the machine's own hostname to a localhost address such as 127.0.0.1 or 127.0.1.1, for example:

    ini
    127.0.1.1 kafka1

    This causes the server to listen only on the localhost address, making it inaccessible to other hosts on the network. In this case, change the address on that line to the public address of the host, or remove the hostname from the localhost line so that the entry added above is used instead.
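
    Before moving on, it can be worth confirming that every hostname resolves on each node. The following is a hedged sketch; the check_hosts helper name is an invention for illustration, and getent consults the same resolver order (including /etc/hosts) that the broker will use:

    ```shell
    # Report whether each given hostname resolves via the system resolver.
    check_hosts() {
      for h in "$@"; do
        if getent hosts "$h" > /dev/null; then
          echo "$h resolves"
        else
          echo "$h does NOT resolve"
        fi
      done
    }

    # Run on each node in the cluster:
    check_hosts kafka1 kafka2 kafka3
    ```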

  6. Now create a service file for Kafka.

    Create the file /etc/systemd/system/kafka.service, then edit it and add the following lines:

    ini
    [Unit]
    Description=Apache Kafka
    After=network.target

    [Service]
    Type=simple
    User=kafka
    LimitNOFILE=800000
    Environment="LOG_DIR=/var/log/kafka"
    Environment="GC_LOG_ENABLED=true"
    Environment="KAFKA_HEAP_OPTS=-Xms512M -Xmx4G"
    ExecStart=/opt/kafka/bin/kafka-server-start.sh /opt/kafka/config/server.properties
    Restart=on-failure
    TimeoutSec=900
    
    [Install]
    WantedBy=multi-user.target
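
    Before starting the service for the first time, the storage directory on every node must be formatted with a shared cluster ID; in KRaft mode the broker will not start against an unformatted directory. Generate the ID once, then format each node with it:

    ```shell
    # Generate a cluster ID once, on any single node:
    /opt/kafka/bin/kafka-storage.sh random-uuid

    # Then, on every node, format the log.dirs location configured in
    # server.properties, substituting the ID printed above for <cluster-id>:
    /opt/kafka/bin/kafka-storage.sh format -t <cluster-id> -c /opt/kafka/config/server.properties
    ```

    Run the format command as the kafka user, or chown the data directory afterwards, so the service retains write access.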
  7. Now start the Kafka service, verify that it is running, and enable it to start at boot:

    shell
    $ systemctl start kafka
    $ systemctl status kafka
    $ systemctl enable kafka

    These steps must be repeated on each host in a multi-node deployment.
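
    Once all nodes are running, a quick way to verify the cluster is to create, inspect, and remove a replicated test topic from any node; the cluster-check topic name is an arbitrary example:

    ```shell
    # Create a topic replicated across all three brokers, confirm its
    # placement, then clean it up.
    /opt/kafka/bin/kafka-topics.sh --create --topic cluster-check \
        --bootstrap-server kafka1:9092 --partitions 3 --replication-factor 3
    /opt/kafka/bin/kafka-topics.sh --describe --topic cluster-check \
        --bootstrap-server kafka1:9092
    /opt/kafka/bin/kafka-topics.sh --delete --topic cluster-check \
        --bootstrap-server kafka1:9092
    ```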