Logical Node Roles

LogScale runs as a cluster of one or more physical nodes. A minimum of three nodes is recommended for resilience.

All nodes in the cluster run the same software, which can be configured to assume a combination of different logical roles:

Table: Node Role Feature Matrix

Feature                                 all    httponly    ingestonly
--------------------------------------  -----  ----------  ----------
Accept ingest request on HTTP           Yes    Yes         Yes
Accept ingest request on TCP/UDP        Yes    Yes         Yes
Accept full set of HTTP API requests    Yes    Yes         No
Accept file uploads                     Yes    Yes         No
Can coordinate queries                  Yes    Yes         No
Can coordinate alerts                   Yes    Yes         No
Visible in cluster management UI        Yes    Yes         No
Requires significant local storage      Yes    No          No
Can store segment files (events)        Yes    No          No
Can upload segment files to bucket      Yes    No          No
Executes queries on events              Yes    No          No
Can be considered stateless             No     No          Yes
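The matrix can also be read as a lookup table. The following is a minimal sketch that transcribes the rows above into a Python mapping. The names ROLES, FEATURE_MATRIX, roles_supporting, and features_of are illustrative and are not part of LogScale.

```python
# Column order of the feature matrix (the three NODE_ROLES values).
ROLES = ("all", "httponly", "ingestonly")

# Rows transcribed from the Node Role Feature Matrix: True means "Yes".
FEATURE_MATRIX = {
    "Accept ingest request on HTTP": (True, True, True),
    "Accept ingest request on TCP/UDP": (True, True, True),
    "Accept full set of HTTP API requests": (True, True, False),
    "Accept file uploads": (True, True, False),
    "Can coordinate queries": (True, True, False),
    "Can coordinate alerts": (True, True, False),
    "Visible in cluster management UI": (True, True, False),
    "Requires significant local storage": (True, False, False),
    "Can store segment files (events)": (True, False, False),
    "Can upload segment files to bucket": (True, False, False),
    "Executes queries on events": (True, False, False),
    "Can be considered stateless": (False, False, True),
}


def roles_supporting(feature):
    """Return the NODE_ROLES values whose column says "Yes" for a feature."""
    return [role for role, yes in zip(ROLES, FEATURE_MATRIX[feature]) if yes]


def features_of(role):
    """Return the features marked "Yes" for one NODE_ROLES value."""
    column = ROLES.index(role)
    return [name for name, row in FEATURE_MATRIX.items() if row[column]]
```

For example, roles_supporting("Can coordinate queries") lists the values whose nodes can act as query coordinators, and features_of("ingestonly") shows how little an ingest-only node is asked to do.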

A node can have any combination of the four logical roles described below (ingest, HTTP API, digest, and storage), all of which play a part in LogScale's data ingest flow.

Assigning specific node roles lets you tune cluster performance and cost more precisely. See Node Role Examples for information on how this works in practice.

Ingest Node

An ingest node is a node that is responsible for servicing:

  • Ingest-only parts of the HTTP API

  • TCP and UDP ingest listeners

  • Parsing incoming data

The ingest node receives data from external systems, parses it into Events, and passes the data on to the Digest Node.

A node can be configured as a stateless ingest-only node by adding NODE_ROLES=ingestonly to the configuration.

To remain stateless, a node in this role does not join the cluster from the perspective of the other nodes: it does not show up in the cluster management UI and it is not assigned a node ID. As a result, any TCP/UDP ingest listeners that need to run on these nodes must be configured to run on all nodes rather than being tied to a specific node.

HTTP API Node

An HTTP API node is a node that is responsible for servicing:

  • The Web UI

  • The full HTTP API

  • TCP and UDP ingest listeners

  • Parsing incoming data

The HTTP API node handles all types of HTTP requests, including those handled by the ingest node. An HTTP API node is visible in the cluster management user interface. It uses its local data directory to cache files.

A node can be configured as an HTTP API node by adding NODE_ROLES=httponly to the configuration of the node.
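As an illustration of how the NODE_ROLES setting might be checked before a node is started, here is a minimal sketch. It assumes the configuration is a plain text file of KEY=VALUE lines in which lines beginning with "#" are comments; that file format and the function name node_roles_from_config are assumptions made for the example. The accepted values and the default of all come from this page.

```python
# The three NODE_ROLES values listed in the Node Role Feature Matrix.
VALID_NODE_ROLES = {"all", "httponly", "ingestonly"}


def node_roles_from_config(config_text):
    """Return the NODE_ROLES value found in KEY=VALUE style config text.

    Assumes one setting per line; blank lines and lines starting with "#"
    are skipped. Falls back to "all", the documented default, when the
    setting is absent.
    """
    value = "all"
    for raw_line in config_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        key, separator, setting = line.partition("=")
        if separator and key.strip() == "NODE_ROLES":
            value = setting.strip()  # a later occurrence overrides an earlier one
    if value not in VALID_NODE_ROLES:
        raise ValueError(f"unsupported NODE_ROLES value: {value!r}")
    return value
```

Calling node_roles_from_config("NODE_ROLES=httponly") returns "httponly", while text that does not mention NODE_ROLES yields "all".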

Digest Node

A digest node is responsible for:

  • Constructing segment files for incoming events (the internal storage format in LogScale)

  • Executing the real-time part of searches

  • Executing the historical part of searches on recent events (older events are handled by the storage nodes)

Once a segment file is completed, it is passed on to the storage nodes.

Digest nodes are designated by adding them to the cluster's Digest Rules. Any node that appears in the digest rules is a Digest Node.

A digest node must have NODE_ROLES=all in its configuration. Because this is the default value, the setting can also be omitted.

Storage Node

A storage node is a node that saves data to disk. Storage nodes are responsible for:

  • Storing events (segment files constructed by digest nodes)

  • Executing the historical part of searches (the most recent results are handled by digest nodes)

The data directory of a storage node is used to store the segment files. Segment files make up the bulk of all data in LogScale.

Storage nodes are configured using the cluster's Storage Rules. Any node that appears in the storage rules is considered a Storage Node. A node that was previously in a storage rule can still contain segment files that are used for querying.

The Storage Rules are used to configure the data Replication Factor.

A storage node must have NODE_ROLES=all in its configuration (as this is the default value, the setting can also be omitted).
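The way the four logical roles follow from NODE_ROLES and from membership in the Digest Rules and Storage Rules can be summarised in a short sketch. This is one reading of the descriptions on this page, not LogScale's own logic. The function name logical_roles, its parameters, and the role labels are hypothetical.

```python
def logical_roles(node_roles="all", in_digest_rules=False, in_storage_rules=False):
    """Derive the logical roles of a node as described on this page.

    node_roles       -- the node's NODE_ROLES value ("all" is the default)
    in_digest_rules  -- whether the node appears in the cluster's Digest Rules
    in_storage_rules -- whether the node appears in the cluster's Storage Rules
    """
    if node_roles not in {"all", "httponly", "ingestonly"}:
        raise ValueError(f"unsupported NODE_ROLES value: {node_roles!r}")
    # Digest and storage nodes must run with NODE_ROLES=all.
    if node_roles != "all" and (in_digest_rules or in_storage_rules):
        raise ValueError("digest and storage nodes require NODE_ROLES=all")

    roles = {"ingest"}  # every NODE_ROLES value accepts ingest requests
    if node_roles in ("all", "httponly"):
        roles.add("http-api")  # full HTTP API and Web UI
    if in_digest_rules:
        roles.add("digest")
    if in_storage_rules:
        roles.add("storage")
    return roles
```

With this reading, a node left at the default and listed in both rule sets carries all four roles, whereas logical_roles("ingestonly") yields only the ingest role.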