Architecture of LogScale
LogScale is built from the ground up as a distributed system, with high availability and horizontal scaling as design goals. It targets hosting on either physical nodes with local disks or virtual machines in the cloud with ephemeral local disks.
To help you plan and manage your LogScale installation, this overview describes the internal architecture of the LogScale server.
LogScale relies on Kafka for internal messaging and for temporary storage of work in progress. LogScale can use most Kafka installations and is not tied to the Kafka version or build that is shipped with LogScale. It is also possible to configure LogScale not to manage the Kafka topics itself, but to rely on the operator to do so.
The LogScale UI is a web application accessing the server via HTTP.
Internal Components
LogScale contains a number of internal components that manage the internal structure, data, and systems that enable the rest of the LogScale service to function.
Global – The metadata database
Global is an event-sourced, in-memory shared state across the cluster, implemented on top of the global-events topic in Kafka. This in-memory state receives changes by subscribing to the topic and is not modified by any other source. To avoid having to retain all events in that topic, the nodes in the LogScale cluster periodically write a snapshot (the global-snapshot.json file) of the current state to their local file system, which allows them to pull only changes more recent than the snapshot at startup.
Global is where all metadata on repositories, users, and all the other objects you can create through the UI is stored. It also holds the metadata on the segment files that hold the events shipped to LogScale, but not the events themselves.
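
As a rough illustration of this snapshot-plus-replay pattern (a minimal sketch, not LogScale's implementation: only the topic name and snapshot file name come from the text above; the client library, state layout, and single-partition assumption are mine):

    # Rebuild the in-memory Global state from a snapshot, then replay newer events.
    # Uses the kafka-python client; assumes global-events has a single partition.
    import json
    import os

    from kafka import KafkaConsumer, TopicPartition

    SNAPSHOT_FILE = "global-snapshot.json"   # local snapshot of the shared state
    GLOBAL_TOPIC = "global-events"


    def apply_event(state, event):
        # All changes to Global flow through events; nothing else mutates the state.
        state["entities"][event["id"]] = event["payload"]


    def run_global_subscriber(bootstrap_servers):
        state = {"entities": {}, "offset": -1}
        if os.path.exists(SNAPSHOT_FILE):
            with open(SNAPSHOT_FILE) as f:
                state = json.load(f)             # start from the last snapshot

        consumer = KafkaConsumer(bootstrap_servers=bootstrap_servers,
                                 enable_auto_commit=False)
        tp = TopicPartition(GLOBAL_TOPIC, 0)
        consumer.assign([tp])
        consumer.seek(tp, state["offset"] + 1)   # only pull changes newer than the snapshot

        for record in consumer:                  # runs for the lifetime of the node
            apply_event(state, json.loads(record.value))
            state["offset"] = record.offset
            if record.offset % 1000 == 0:        # periodically persist a fresh snapshot
                with open(SNAPSHOT_FILE, "w") as f:
                    json.dump(state, f)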
HTTP API
HTTP is used mainly for three tasks: serving the LogScale UI (which accesses the server over HTTP only), ingesting events, and handling cluster-internal requests.
The UI connects to the LogScale server through an HTTP API that consists mainly of the GraphQL endpoint but also includes a few REST endpoints.
The HTTP API interacts with the Global state, extracting subsets for display in the UI and publishing changes to the global-events topic.
The ingest endpoints in the HTTP API run the desired parser on the incoming events (the parsers are stored in Global) and publish the resulting events to partitions of the humio-ingest topic in Kafka. Only after getting an acknowledgment from Kafka is the HTTP response 200 OK sent to the HTTP client.
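
The ordering (publish to Kafka, wait for the acknowledgment, only then answer 200 OK) can be sketched as follows. This is a hypothetical handler for illustration; the topic name is from the text above, while the parser interface and client library are assumptions:

    # Parse incoming events, publish them to humio-ingest, and reply only after
    # Kafka has acknowledged the writes. Uses the kafka-python client.
    import json

    from kafka import KafkaProducer

    INGEST_TOPIC = "humio-ingest"

    producer = KafkaProducer(bootstrap_servers="localhost:9092",
                             acks="all")         # wait for Kafka replicas to confirm


    def handle_ingest_request(raw_body, parser):
        """Return an (HTTP status, body) pair for one ingest request."""
        events = parser(raw_body)                # run the repository's parser
        futures = [producer.send(INGEST_TOPIC, json.dumps(e).encode())
                   for e in events]
        try:
            for f in futures:
                f.get(timeout=10)                # block until Kafka acknowledges
        except Exception:
            return 500, "failed to persist events"
        return 200, "OK"                         # only sent after all acknowledgments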
The cluster-internal requests include copying segment files between nodes and submitting and polling the state of pending searches.
Digest Engine
The digest engine builds compressed mini segment files from events consumed from the humio-ingest topic in Kafka.
A node runs the digest engine only if digest partitions are assigned to it. The digest engine is active only if the node is currently the primary node on some partition, or if all nodes listed with higher priority are currently off-line, as decided by the cluster coordination component.
The digest engine closes mini segment files after at most 30 minutes to limit the amount of work in progress that would be lost if the node crashed. The mini segments are compressed using a fast compression algorithm. After 24 hours the digest engine selects a fresh merge target for the next set of mini segments.
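
The time-bounded flush might look roughly like this sketch (only the 30-minute limit is from the text; the compressor, persistence helper, and class layout are assumptions):

    # Close a mini segment after at most 30 minutes of accumulated work in progress.
    import time
    import zlib                                  # stand-in for a fast compressor

    MAX_OPEN_SECONDS = 30 * 60                   # limit mentioned in the text above


    def write_mini_segment(blob, path="segment.mini"):
        with open(path, "wb") as f:              # simplistic persistence for the sketch
            f.write(blob)


    class MiniSegmentBuilder:
        def __init__(self):
            self.opened_at = time.monotonic()
            self.events = []

        def add(self, event_bytes):
            self.events.append(event_bytes)
            if time.monotonic() - self.opened_at >= MAX_OPEN_SECONDS:
                self.close()

        def close(self):
            # Fast, low-ratio compression keeps digest latency low; the segment
            # file merger recompresses at a higher level later.
            write_mini_segment(zlib.compress(b"\n".join(self.events), level=1))
            self.opened_at = time.monotonic()
            self.events = []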
Segment File Merger
The segment file merger monitors the set of mini segments and merge targets. Once all mini segments for a merge target exist, it merges them into a completed segment file, applying a high compression level that typically yields twice the compression of the mini segments. The nodes that should hold the completed segment are selected using the storage partition assignments.
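
A sketch of the merge step (the roughly two-to-one compression gain is from the text; the file naming scheme and choice of compressor are assumptions):

    # Merge all mini segments for a merge target into one highly compressed segment.
    import glob
    import zlib


    def merge_mini_segments(merge_target_id):
        paths = sorted(glob.glob(f"{merge_target_id}.mini.*"))   # assumed naming scheme
        events = []
        for path in paths:
            with open(path, "rb") as f:
                events.append(zlib.decompress(f.read()))          # undo the fast pass
        blob = zlib.compress(b"\n".join(events), level=9)         # higher compression level
        with open(f"{merge_target_id}.segment", "wb") as f:
            f.write(blob)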
Segment Mover
When a digest partition is assigned more than one node, the nodes not currently acting as the primary on that partition fetch completed mini segment files from the primary via the internal HTTP API. Completed segments are also transferred by this internal task.
Bucket Storage
The Bucket Storage component manages upload and download of segment files and global snapshot files to and from persistent buckets such as AWS S3. This is an optional feature that provides a backup of the data, which LogScale can use to restore all segment files stored on any node in the cluster in case a node is lost.
Cluster Coordination Component
All nodes monitor the aliveness of the others through their messaging on the global-events topic. If a node does not ping that topic for a while, it is deemed off-line by the others, and the nodes that are currently leaders of ingest push messages to the global topic setting new primaries on the digest partitions whose leader was deemed off-line.
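
In essence this is heartbeat-based failure detection over the shared topic; a minimal sketch (the timeout value and data structures are assumptions):

    # Deem a node off-line when its pings on global-events stop, and promote the
    # next listed node on the digest partitions it led.
    import time

    OFFLINE_AFTER_SECONDS = 90                   # assumed threshold, not from the text
    last_ping = {}                               # node id -> time of last heartbeat


    def on_ping(node_id):
        """Called for every ping message read from the global-events topic."""
        last_ping[node_id] = time.monotonic()


    def offline_nodes():
        now = time.monotonic()
        return {n for n, t in last_ping.items() if now - t > OFFLINE_AFTER_SECONDS}


    def reassign_partitions(partition_table):
        """partition_table: digest partition -> node ids in priority order.
        Moves off-line nodes to the back so the highest-priority live node becomes
        primary; in LogScale the corresponding change is published to global-events."""
        down = offline_nodes()
        for nodes in partition_table.values():
            nodes[:] = ([n for n in nodes if n not in down] +
                        [n for n in nodes if n in down])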
Search Engine
The search engine is the component that accepts an input string ("the query") from a user and orchestrates the execution of that search across the cluster.
When a search is submitted to a LogScale server node through the HTTP API, that node uses the Global metadata to determine which other nodes need to participate in executing the search and which subset of the events each of those nodes needs to work on. It then submits (via the internal HTTP API) a request to each of those nodes to launch a search of its subset, and starts retrieving and merging the results from those nodes as they become available, making them available to the client that submitted the search.
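
The scatter-gather flow can be illustrated with a small coordinator-side sketch (the internal endpoint path, request shape, and result format are assumptions, not LogScale's actual internal API):

    # Fan a search out to the selected nodes and merge results as they arrive.
    from concurrent.futures import ThreadPoolExecutor, as_completed

    import requests                               # assumed HTTP client for the sketch


    def run_search(query, plan):
        """plan maps node base URL -> the subset of segments that node should search."""
        merged = []
        with ThreadPoolExecutor(max_workers=max(1, len(plan))) as pool:
            futures = {
                pool.submit(requests.post,
                            f"{node}/internal/search",            # hypothetical path
                            json={"query": query, "segments": segments},
                            timeout=60): node
                for node, segments in plan.items()
            }
            for future in as_completed(futures):
                merged.extend(future.result().json()["events"])   # merge incrementally
        return merged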
High-Availability (HA) Implementation Architecture of LogScale
LogScale uses a combination of replicated files and data in Kafka topics to ensure that no data is lost. LogScale relies on Kafka as the commit log for recent and in-progress work, and on local file systems (or Bucket Storage) as the trusted, persistent long-term storage.
When events or requests arrive on the HTTP API, the 200 OK response is issued only after Kafka has acknowledged all writes that stem from them. This delegates the responsibility for not losing incoming changes to Kafka, which is a proven solution for this task.
When the digest nodes construct segment files from incoming events, they include the offsets from the Kafka partition the events went into as part of the metadata in Global. Segment files are then replicated to other nodes in the cluster. The cluster calculates, for each partition, the offset up to which all events are stored in properly replicated files, and then tells Kafka that it is okay to delete events older than that.
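
The offset handed to Kafka is effectively the highest offset for which every older event is covered by a sufficiently replicated segment file; a small sketch of that calculation (the metadata shape is assumed):

    # Compute, per ingest partition, the offset up to which all events are stored
    # in segment files replicated to enough nodes.
    REPLICATION_FACTOR = 2                        # matches the two-node case described below


    def safe_delete_offset(segments, partition):
        """segments: Global metadata entries carrying partition, end_offset, and the
        nodes currently holding a copy of each segment file."""
        covered = [s["end_offset"] for s in segments
                   if s["partition"] == partition
                   and len(s["replica_nodes"]) >= REPLICATION_FACTOR]
        # Kafka may delete everything up to the highest fully covered offset;
        # a real implementation must also check that the coverage has no gaps.
        return max(covered, default=-1)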
LogScale can be configured to be resilient against data loss in more than one way depending on the hosting environment.
In all cases the hosting environment must ensure that Kafka has sufficient replicas and fail-over nodes to support the desired resilience.
Without Bucket Storage
LogScale is able to retain all ingested events in the case of a single node loss if all digest and storage partitions are assigned 2 or more nodes. Events may be deleted from Kafka once the resulting segment files have been copied to 2 nodes.
Using Bucket Storage
With Bucket Storage enabled on top of ephemeral disks, LogScale is able to retain all ingested events even when only a single node is assigned to every partition: LogScale deletes ingested events from Kafka only once the resulting segment files have been uploaded to the bucket and the remote checksum of the uploaded file has been verified.
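
That upload, verify, then delete ordering might be sketched like this (AWS S3 via boto3 is an assumption for illustration; S3 rejects the write when the received bytes do not match the supplied Content-MD5, which serves as the remote checksum verification in this sketch):

    # Upload a segment file and verify the remote copy before the corresponding
    # ingested events may be deleted from Kafka.
    import base64
    import hashlib

    import boto3                                  # assumed bucket backend: AWS S3

    s3 = boto3.client("s3")


    def upload_and_verify(path, bucket, key):
        with open(path, "rb") as f:
            body = f.read()
        md5 = base64.b64encode(hashlib.md5(body).digest()).decode()
        # A successful call implies the stored object matched the local checksum.
        s3.put_object(Bucket=bucket, Key=key, Body=body, ContentMD5=md5)
        return True                               # only now release the Kafka offsets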
Ingest
Ingest can run as long as any LogScale node is reachable and provided that node is able to publish the incoming events to Kafka.
Digest and Search
Digest runs as long as the cluster coordinator is able to assign nodes to all digest partitions.
Searches that depend on recent data (live searches) restart to reflect changes in the active digester set.
With the default snapshot interval of 30 minutes for mini segment files, every change to the set of live digest nodes may require replaying up to 30 minutes of event data from Kafka. This implies that changing the live set at intervals shorter than this may stall the system.
Digest and Search using Bucket Storage
Segment files in the bucket are available to any host in the cluster, so the cluster can continue to operate as long as enough of the nodes with digest partitions assigned are alive to manage the load.
Digest and Search without Bucket Storage
The segment files are available only on the nodes assigned in the digest and storage partition tables. Some event data may become temporarily unavailable if all nodes assigned to a given partition are off-line at the same time.