Ingesting Data

Ingesting data into LogScale is a multi-stage process designed to accept data at high speed without blocking clients from sending information. This ensures that each client can send data as quickly as possible, and that LogScale can queue and store the data ready to be queried.

The basic process is shown in the diagram below:

```mermaid
graph LR;
  API["API"]
  IN["Ingest"]
  K(Kafka Queue)
  DN["Digest"]
  SN["Store"]
  API --> IN --> K --> DN --> SN
  click API "training-arch-op-ingest-api" "APIs and Log Shippers"
  click IN "training-arch-op-ingest-ingest" "Ingestion"
  click K "training-arch-op-ingest-kafka" "Kafka"
  click DN "training-arch-op-ingest-digest" "Digest"
  click SN "training-arch-op-ingest-store" "Storage"
```

Each stage is a critical part of the ingestion and storage process:

  1. Ingestion: API Phase

    Log shippers, including the Falcon Log Collector, send data to LogScale through an API. A minimal HTTP example is sketched after this list.

  2. Ingestion: Ingest Phase

    Incoming data is received and processed so that it is suitable for storage in LogScale. This includes parsing the incoming events into individual fields and assigning tags and datasources.

  3. Ingestion: Kafka Phase

    Once incoming data has been received and parsed, the last step performed by the ingestion layer is to place the data on a Kafka queue. The Kafka queue acts as durable storage for the incoming data before it is digested and stored by LogScale. A conceptual producer sketch appears at the end of this section.

  4. Ingestion: Digest Phase

    The digest process handles live queries and translates the incoming stream of events into the on-disk segment files that LogScale uses to store and query data.

  5. Ingestion: Storage Phase

    The completed segments are written to storage. Two types of storage are used: fast local storage (for example, local SSD or NVMe drives) and object-based bucket storage such as Amazon S3 or Google Cloud Storage.
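
To make the API and ingest phases more concrete, the following sketch sends a small batch of structured events to LogScale over HTTP. It assumes the structured ingest endpoint (`/api/v1/ingest/humio-structured`), a repository ingest token, and a hypothetical host name; check your cluster's API documentation for the exact endpoint and payload shape.

```python
import datetime
import requests

# Hypothetical values -- replace with your LogScale host and a repository ingest token.
LOGSCALE_URL = "https://logscale.example.com/api/v1/ingest/humio-structured"
INGEST_TOKEN = "YOUR-INGEST-TOKEN"

# A batch of structured events. Tags help LogScale route events to datasources;
# attributes become searchable fields once the events have been digested.
payload = [
    {
        "tags": {"host": "web-01", "source": "access-log"},
        "events": [
            {
                "timestamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),
                "attributes": {"method": "GET", "path": "/index.html", "status": 200},
            }
        ],
    }
]

response = requests.post(
    LOGSCALE_URL,
    json=payload,
    headers={"Authorization": f"Bearer {INGEST_TOKEN}"},
    timeout=10,
)
response.raise_for_status()  # A successful response means the events were accepted for ingestion.
```

In production, a log shipper such as the Falcon Log Collector handles batching, retries, and compression, so hand-rolled HTTP calls like this are mainly useful for testing an ingest token or a parser.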

To dig deeper into the specifics of each phase, use the links in the diagram above.
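
The Kafka phase is internal to LogScale, so there is no API to call directly, but the idea of using a Kafka topic as a durable buffer between a fast producer and a slower consumer can be illustrated with a small sketch. This example uses the third-party kafka-python package, a hypothetical broker address, and a hypothetical topic name; it is a conceptual illustration only, not LogScale's internal implementation.

```python
import json
from kafka import KafkaProducer  # pip install kafka-python

# Hypothetical broker and topic, standing in for the internal ingest queue.
producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda event: json.dumps(event).encode("utf-8"),
    acks="all",  # wait until the event is durably written before treating it as sent
)

# Parsed events are appended to the topic; digest workers consume them later at their own pace.
producer.send("ingest-queue", {"host": "web-01", "message": "GET /index.html 200"})
producer.flush()
```

Because the queue persists events before digest work begins, a slow or restarting digest node does not force clients to resend their data.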