Switch Kafka and ZooKeeper

There are three options for switching Kafka and ZooKeeper:

Set up a new Kafka/ZooKeeper cluster and re-configure LogScale to use the new cluster.
Delete the Kafka and ZooKeeper data.
Create new Kafka queues and topics with new names in the Kafka cluster.

The first method is the simplest.

Delete Kafka & ZooKeeper Data and Re-Use Cluster

You can reset the current Kafka and ZooKeeper cluster by deleting their data directories from the filesystem on each node. This is equivalent to starting a new, empty Kafka/ZooKeeper cluster. To do this, delete everything inside Kafka's data directory, then delete the folder version-2 inside ZooKeeper's data directory.

Do not delete the file myid in ZooKeeper's data directory. The myid file is a configuration file that contains the ID of the ZooKeeper node within the ZooKeeper cluster, and it must be present at startup.

After doing this, you will have created completely new ZooKeeper and Kafka clusters.
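The deletion steps above can be sketched as a small Python script. This is only an illustration: the function names are hypothetical, and the data directory paths are passed in as arguments because they depend on your installation. It assumes Kafka and ZooKeeper are stopped on the node before it is run.

```python
import shutil
from pathlib import Path


def reset_kafka_data(kafka_data_dir: Path) -> None:
    """Delete everything inside Kafka's data directory, keeping the directory itself."""
    for entry in kafka_data_dir.iterdir():
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)
        else:
            entry.unlink()


def reset_zookeeper_data(zookeeper_data_dir: Path) -> None:
    """Delete only the version-2 folder; the myid file is never touched."""
    version_dir = zookeeper_data_dir / "version-2"
    if version_dir.is_dir():
        shutil.rmtree(version_dir)
    # myid must still be present when ZooKeeper starts, so fail loudly if it is not.
    if not (zookeeper_data_dir / "myid").is_file():
        raise FileNotFoundError(f"myid is missing from {zookeeper_data_dir}")
```

Run both functions on every node of the cluster, passing the data directories configured for that node.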

Create New Kafka Queues/Topics with New Names in Kafka Cluster

Instead of resetting Kafka as described above, you can let LogScale use a new set of queues in the existing Kafka cluster. When reusing the same Kafka cluster, LogScale must be configured with a new HUMIO_KAFKA_TOPIC_PREFIX to detect the changes.

To make this process easier, the ALLOW_KAFKA_RESET_UNTIL_TIMESTAMP_MS variable can be used to allow Kafka resets until a specific timestamp. Once that timestamp is reached, the default behavior (resets disabled) applies again. It provides a six hour window for the reset to occur. To use it, set the environment variable to a time in the future and then follow the instructions in Switching Kafka to reset the Kafka configuration.
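As a minimal sketch, a value for this variable could be computed as follows. It assumes, based on the _MS suffix, that the variable expects milliseconds since the Unix epoch. The helper name reset_deadline_ms and the one-hour offset are illustrative only.

```python
from datetime import datetime, timedelta, timezone


def reset_deadline_ms(now: datetime, hours_ahead: float) -> int:
    """Return the moment `hours_ahead` hours after `now` as milliseconds since the Unix epoch."""
    deadline = now + timedelta(hours=hours_ahead)
    return int(deadline.timestamp() * 1000)


if __name__ == "__main__":
    # Assumption: the variable takes epoch milliseconds; one hour ahead is an arbitrary choice.
    value = reset_deadline_ms(datetime.now(timezone.utc), hours_ahead=1)
    print(f"ALLOW_KAFKA_RESET_UNTIL_TIMESTAMP_MS={value}")
```

The printed line can then be added to the environment of the LogScale node before it is started.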

Note that deleting and recreating topics with the same names does not work, because LogScale then cannot detect the Kafka switch. If Kafka is managed by LogScale (KAFKA_MANAGED_BY_HUMIO), the new topics are created automatically when LogScale starts up. Otherwise, you must create the topics externally before you start LogScale.

Restarting Kafka, ZooKeeper and LogScale

Now you're ready to start the Kafka/ZooKeeper cluster. This is typically done by starting the ZooKeeper nodes first. Wait for all nodes to be running and verify the ZooKeeper cluster. Then start all Kafka nodes, wait for them to be running, and verify the Kafka cluster.
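One possible way to check that a ZooKeeper node is up is ZooKeeper's ruok four-letter command, to which a healthy server replies imok. The sketch below makes several assumptions: the function name is hypothetical, the default client port 2181 is used, and the ruok command is enabled on the server (recent ZooKeeper versions require it to be listed in 4lw.commands.whitelist). It only shows that a single node responds, not that the ensemble has formed a quorum.

```python
import socket


def zookeeper_responds_ok(host: str, port: int = 2181, timeout: float = 5.0) -> bool:
    """Send 'ruok' to a ZooKeeper node and report whether it answers 'imok'."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.sendall(b"ruok")
            conn.shutdown(socket.SHUT_WR)  # signal that the request is complete
            reply = b""
            while True:
                chunk = conn.recv(64)
                if not chunk:  # the server closes the connection after replying
                    break
                reply += chunk
    except OSError:
        # Connection refused, timed out, or reset: treat the node as not ready.
        return False
    return reply == b"imok"
```

Call the function once per ZooKeeper node and start Kafka only after every node returns True.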

Once Kafka and ZooKeeper have started, start the LogScale nodes. It's important to start one LogScale node first. This node will detect the Kafka switch and create a new epoch in LogScale.

To verify that the Kafka switch was detected and handled, look for this line in the LogScale debug log:

Switching epoch to=${epochKey} from=${latestEpoch.kafkaClusterId} - I'm the first cluster member to get here for this kafka. newEpoch=${newEpoch}

When the first node is up and running and the above log line confirms that a new epoch has been created, the rest of the LogScale nodes can be started.
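The log check can be automated with a short script. This is a sketch under assumptions: the function name is hypothetical, the pattern matches only the leading "Switching epoch to=... from=..." part of the message, and it assumes the substituted values contain no whitespace. The location of the debug log depends on your installation and is not assumed here.

```python
import re
from typing import Dict, Iterable, Optional

# Matches the start of the epoch-switch message and captures both identifiers.
EPOCH_SWITCH = re.compile(r"Switching epoch to=(?P<to>\S+) from=(?P<previous>\S+)")


def find_epoch_switch(lines: Iterable[str]) -> Optional[Dict[str, str]]:
    """Return the identifiers from the first epoch-switch line, or None if there is none."""
    for line in lines:
        match = EPOCH_SWITCH.search(line)
        if match:
            return match.groupdict()
    return None
```

An open file object can be passed directly as lines, so the log is scanned without being loaded into memory. A return value of None means the switch has not been logged yet.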

At that point, the LogScale cluster should be running again. Check the cluster nodes in the administrative section of the LogScale user interface: http://$HUMIOHOST/system/administration/partitions/ingest
