Kafka Usage

LogScale uses Apache Kafka internally for queuing incoming messages and for storing shared state when running LogScale in a cluster setup. This page describes how LogScale uses Kafka.

LogScale creates the following queues in Kafka:

  • global-events

    This is LogScale's event-sourced database queue. It allows LogScale nodes to exchange key configuration information, which is ultimately stored in the Global Database.

    The queue has the following characteristics:

    • Relatively low throughput

    • No log data is saved to this queue

    • The queue should be configured with a high number of replicas to ensure that the LogScale cluster's state information is not lost (see the replication check sketch after this list).

  • humio-ingest

    Ingested events, i.e. user log data, are sent to this queue after the Ingestion: Ingest Phase and before the data is stored during the Ingestion: Digest Phase.

    LogScale's ingest process accepts the incoming ingest requests, parses them, and puts the events on the queue. LogScale's digest process takes events from the queue and stores them in the datastore.

    This Kafka queue has high throughput corresponding to the ingest load. The number of replicas can be configured in accordance with data size, latency and throughput requirements, and how important it is not to lose in-flight data (see the topic sizing sketch after this list).

  • transientChatter-events

    This queue is used for chatter between LogScale nodes and only carries transient data. LogScale raises the number of replicas on this queue to 3 if there are at least three brokers in the Kafka cluster. The queue can have a short retention period; it is not important to keep the data, as it goes stale very quickly (see the retention sketch after this list).
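
Because the replication factor of global-events and humio-ingest determines how well cluster state and in-flight events survive broker failures, it can be worth checking what Kafka actually reports for these topics. The following is a minimal sketch using Kafka's Java AdminClient; the broker address is an assumption, and the topic names are the ones listed above.

    import java.util.List;
    import java.util.Map;
    import java.util.Properties;

    import org.apache.kafka.clients.admin.AdminClient;
    import org.apache.kafka.clients.admin.AdminClientConfig;
    import org.apache.kafka.clients.admin.TopicDescription;

    public class CheckLogScaleTopicReplication {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            // Assumed broker address; point this at your Kafka cluster.
            props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

            try (AdminClient admin = AdminClient.create(props)) {
                Map<String, TopicDescription> topics = admin
                        .describeTopics(List.of("global-events", "humio-ingest", "transientChatter-events"))
                        .all()
                        .get();

                topics.forEach((name, description) -> {
                    // The replication factor is the replica count of any one partition.
                    int replicationFactor = description.partitions().get(0).replicas().size();
                    System.out.printf("%s: %d partition(s), replication factor %d%n",
                            name, description.partitions().size(), replicationFactor);
                });
            }
        }
    }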
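
If you size the humio-ingest topic yourself rather than relying on the topic LogScale creates, the partition count and replication factor are where the throughput and durability trade-offs described above are expressed. The sketch below is illustrative only; the partition count, replication factor, and broker address are assumptions, not recommendations.

    import java.util.List;
    import java.util.Properties;

    import org.apache.kafka.clients.admin.AdminClient;
    import org.apache.kafka.clients.admin.AdminClientConfig;
    import org.apache.kafka.clients.admin.NewTopic;

    public class CreateIngestTopic {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            // Assumed broker address; point this at your Kafka cluster.
            props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

            try (AdminClient admin = AdminClient.create(props)) {
                // Illustrative sizing only: more partitions allow more parallel digest
                // work, and replication factor 3 lets in-flight events survive the
                // loss of a single broker.
                NewTopic ingest = new NewTopic("humio-ingest", 24, (short) 3);
                admin.createTopics(List.of(ingest)).all().get();
            }
        }
    }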
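
Because transientChatter-events only carries transient data that goes stale quickly, its retention can be kept short. The following sketch sets retention.ms on the topic via the AdminClient's incremental config API; the one-hour value and broker address are assumptions chosen for illustration.

    import java.util.List;
    import java.util.Map;
    import java.util.Properties;

    import org.apache.kafka.clients.admin.AdminClient;
    import org.apache.kafka.clients.admin.AdminClientConfig;
    import org.apache.kafka.clients.admin.AlterConfigOp;
    import org.apache.kafka.clients.admin.ConfigEntry;
    import org.apache.kafka.common.config.ConfigResource;

    public class SetChatterRetention {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            // Assumed broker address; point this at your Kafka cluster.
            props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

            try (AdminClient admin = AdminClient.create(props)) {
                ConfigResource topic =
                        new ConfigResource(ConfigResource.Type.TOPIC, "transientChatter-events");
                // Illustrative value: keep transient chatter for one hour only.
                AlterConfigOp setRetention = new AlterConfigOp(
                        new ConfigEntry("retention.ms", "3600000"), AlterConfigOp.OpType.SET);

                admin.incrementalAlterConfigs(Map.of(topic, List.of(setRetention))).all().get();
            }
        }
    }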