Logical Node Roles

LogScale runs as a cluster of one or more physical nodes; a minimum of three nodes is recommended for resilience.

All nodes in the cluster run the same software, which can be configured to assume a combination of different logical roles:

Table: Node Role Feature Matrix

Feature	`all`	`httponly`	`ingestonly`
Accept ingest request on HTTP	Yes	Yes	Yes
Accept ingest request on TCP/UDP	Yes	Yes	Yes
Accept full set of HTTP API requests	Yes	Yes	No
Accept file uploads	Yes	Yes	No
Can coordinate queries	Yes	Yes	No
Can coordinate alerts	Yes	Yes	No
Visible in cluster management UI	Yes	Yes	No
Requires significant local storage	Yes	No	No
Can store segment files (events)	Yes	No	No
Can upload segment files to bucket	Yes	No	No
Executes queries on events	Yes	No	No
Can be considered stateless	No	No	Yes
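
As a quick way to query the matrix above, the following minimal Python sketch transcribes the table into a lookup structure. The short feature keys (for example store_segments) and the helper names supports and roles_supporting are hypothetical names chosen for this illustration; they are not part of LogScale.

```python
# Column order matches the table: NODE_ROLES values all, httponly, ingestonly.
ROLES = ("all", "httponly", "ingestonly")

# One entry per table row; the keys are hypothetical short names for the features.
MATRIX = {
    "http_ingest":         (True, True, True),
    "tcp_udp_ingest":      (True, True, True),
    "full_http_api":       (True, True, False),
    "file_uploads":        (True, True, False),
    "coordinate_queries":  (True, True, False),
    "coordinate_alerts":   (True, True, False),
    "visible_in_ui":       (True, True, False),
    "significant_storage": (True, False, False),
    "store_segments":      (True, False, False),
    "upload_to_bucket":    (True, False, False),
    "execute_queries":     (True, False, False),
    "stateless":           (False, False, True),
}


def supports(role, feature):
    """Return True if a node with the given NODE_ROLES value has the feature."""
    return MATRIX[feature][ROLES.index(role)]


def roles_supporting(feature):
    """Return the NODE_ROLES values that provide the feature, in table order."""
    return [role for role in ROLES if supports(role, feature)]
```

For example, roles_supporting("store_segments") lists only the all role, which matches the table: only nodes with NODE_ROLES=all can store segment files.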

A node can have any combination of the four logical roles described below (ingest, HTTP API, digest, and storage), and all of them play a part in LogScale's data ingest flow.

The use of specific node roles can allow you to better tune cluster performance and cost. See Node Role Examples for information on how this works in practice.

Ingest Node

An ingest node is a node that is responsible for servicing:

Ingest-only parts of the HTTP API
TCP and UDP ingest listeners
Parsing incoming data

The ingest node receives data from external systems, parses it into Events, and passes the data on to the Digest Node.

A node can be configured as a stateless ingest-only node by adding NODE_ROLES=ingestonly to the configuration.

To remain stateless, a node in this role does not join the cluster from the point of view of the other nodes. It does not show up in the cluster management UI and it does not get a node ID. This means that TCP/UDP ingest listeners that need to run on these nodes must be configured to run on all nodes, not tied to a specific node.

HTTP API Node

An HTTP API node is a node that is responsible for servicing:

The Web UI
The full HTTP API
TCP and UDP ingest listeners
Parsing incoming data

The HTTP API node handles all types of HTTP requests, including those handled by the ingest node. An HTTP API node is visible in the cluster management user interface. It uses storage for lookup files, global snapshots, and caching of query states. Local storage such as SSD is ideal; if it is not available, remote mounted block storage can be used. For an HTTP-only node, this storage is typically sized at 100GB to 1TB.

A node can be configured as an HTTP API node by adding NODE_ROLES=httponly to the configuration of the node.

Digest Node

A digest node is responsible for:

Constructing segment files for incoming events (the internal storage format in LogScale)
Executing the real-time part of searches
Executing the historical part of searches on recent events (older events are handled by the storage nodes)

Once a segment file is completed it is passed on to storage nodes.

Digest nodes are designated by adding them to the cluster's Digest Rules. Any node that appears in the digest rules is a Digest Node.

A digest node must have NODE_ROLES=all in its configuration; as that is the default value, leaving the setting out works too.
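
The default behavior described above can be sketched as a small Python helper that reads the role from configuration text. This is an illustration only: it assumes the configuration is a set of KEY=VALUE lines with # comments, and it treats the three values shown in the feature matrix as the only valid ones. The function name resolve_node_roles is hypothetical.

```python
# Values taken from the feature matrix; treating them as exhaustive is an assumption.
VALID_NODE_ROLES = {"all", "httponly", "ingestonly"}


def resolve_node_roles(config_text):
    """Return the NODE_ROLES value from KEY=VALUE lines, defaulting to 'all'."""
    value = "all"  # default when NODE_ROLES is left out
    for line in config_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, sep, val = line.partition("=")
        if sep and key.strip() == "NODE_ROLES":
            value = val.strip()  # the last occurrence wins in this sketch
    if value not in VALID_NODE_ROLES:
        raise ValueError(f"unsupported NODE_ROLES value: {value!r}")
    return value
```

With this sketch, an empty configuration and one containing NODE_ROLES=all resolve to the same role, which is why a digest or storage node can omit the setting.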

Storage Node

A storage node is a node that saves data to disk. Storage nodes are responsible for:

Storing events (segment files constructed by digest nodes)
Executing the historical part of searches (the most recent results are handled by digest nodes)

The data directory of a storage node is used to store the segment files. Segment files make up the bulk of all data in LogScale.

Storage nodes are configured using the cluster's Storage Rules. Any node that appears in the storage rules is considered a Storage Node. A node that was previously in a storage rule can still contain segment files that are used for querying.

The Storage Rules are also used to configure the data Replication Factor.

A storage node must have NODE_ROLES=all in its configuration; as that is the default value, leaving the setting out works too.
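
To tie the four roles together, here is a minimal Python sketch of how a node's logical roles could be derived from its NODE_ROLES value and its membership in the digest and storage rules. It is one reading of the descriptions on this page, not LogScale's actual logic: the function name logical_roles, the role labels, and the representation of the rules as plain sets of node IDs are all assumptions made for the example.

```python
def logical_roles(node_roles, node_id=None, digest_rule_nodes=(), storage_rule_nodes=()):
    """Derive the logical roles of a node (illustrative interpretation only).

    node_roles         -- the NODE_ROLES value: 'all', 'httponly' or 'ingestonly'
    node_id            -- the node's ID, or None for ingest-only nodes (they get no ID)
    digest_rule_nodes  -- node IDs that appear in the digest rules (hypothetical shape)
    storage_rule_nodes -- node IDs that appear in the storage rules (hypothetical shape)
    """
    roles = {"ingest"}  # every NODE_ROLES value accepts ingest requests
    if node_roles == "ingestonly":
        return roles  # stateless: no node ID, so it cannot appear in any rule
    roles.add("http_api")  # 'all' and 'httponly' serve the full HTTP API
    if node_roles == "all":
        if node_id in digest_rule_nodes:
            roles.add("digest")
        if node_id in storage_rule_nodes:
            roles.add("storage")
    return roles
```

In this sketch, a node with NODE_ROLES=httponly never becomes a digest or storage node even if its ID is listed in a rule, reflecting the requirement that digest and storage nodes have NODE_ROLES=all.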