Logical Node Roles
LogScale runs as a cluster of one or more physical nodes; a minimum of three nodes is recommended for resilience.
All nodes in the cluster run the same software, which can be configured to assume a combination of different logical roles:
Table: Node Role Feature Matrix
Feature | all | httponly | ingestonly |
---|---|---|---|
Accept ingest request on HTTP | Yes | Yes | Yes |
Accept ingest request on TCP/UDP | Yes | Yes | Yes |
Accept full set of HTTP API requests | Yes | Yes | No |
Accept file uploads | Yes | Yes | No |
Can coordinate queries | Yes | Yes | No |
Can coordinate alerts | Yes | Yes | No |
Visible in cluster management UI | Yes | Yes | No |
Requires significant local storage | Yes | No | No |
Can store segment files (events) | Yes | No | No |
Can upload segment files to bucket | Yes | No | No |
Executes queries on events | Yes | No | No |
Can be considered stateless | No | No | Yes |
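The matrix above can also be read as a lookup table. The following Python sketch transcribes it into a dictionary so that a capability can be checked for a given NODE_ROLES value. The names ROLES, FEATURES, supports, and roles_for are chosen for this illustration only and are not part of LogScale.

```python
# Column order of the Node Role Feature Matrix.
ROLES = ("all", "httponly", "ingestonly")

# Feature label -> (all, httponly, ingestonly), transcribed from the table.
FEATURES = {
    "Accept ingest request on HTTP": (True, True, True),
    "Accept ingest request on TCP/UDP": (True, True, True),
    "Accept full set of HTTP API requests": (True, True, False),
    "Accept file uploads": (True, True, False),
    "Can coordinate queries": (True, True, False),
    "Can coordinate alerts": (True, True, False),
    "Visible in cluster management UI": (True, True, False),
    "Requires significant local storage": (True, False, False),
    "Can store segment files (events)": (True, False, False),
    "Can upload segment files to bucket": (True, False, False),
    "Executes queries on events": (True, False, False),
    "Can be considered stateless": (False, False, True),
}


def supports(role: str, feature: str) -> bool:
    """Return True if a node with the given role has the given feature."""
    return FEATURES[feature][ROLES.index(role)]


def roles_for(feature: str) -> list:
    """Return the roles that have the given feature, in table column order."""
    return [role for role in ROLES if supports(role, feature)]
```

For example, roles_for("Can coordinate queries") returns the roles all and httponly, which matches the corresponding table row.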
A node can have any combination of the four logical roles described below (ingest, HTTP API, digest, and storage), and all play a part in LogScale's data ingest flow.
The use of specific node roles can allow you to better tune cluster performance and cost. See Node Role Examples for information on how this works in practice.
Ingest Node
An ingest node is a node that is responsible for servicing:
Ingest-only parts of the HTTP API
TCP and UDP ingest listeners
Parsing incoming data
The ingest node receives data from external systems, parses it into Events, and passes the data on to the Digest Node.
A node can be configured as a stateless ingest-only node by adding NODE_ROLES=ingestonly to the configuration.
To remain stateless, a node in this role does not join the cluster as a member visible to the other nodes: it does not show up in the cluster management UI and it does not get a node ID. As a result, TCP/UDP ingest listeners that need to run on these nodes must be configured to run on all nodes rather than being tied to a specific node.
HTTP API Node
An HTTP API node is a node that is responsible for servicing:
The Web UI
The full HTTP API
TCP and UDP ingest listeners
Parsing incoming data
The HTTP API node handles all types of HTTP requests, including those served by the ingest node. An HTTP API node is visible in the cluster management user interface. It uses storage for lookup files, global snapshots, and caching of query states. Local storage such as SSD is ideal; if that is not available, remote mounted block storage can be used. For an HTTP-only node, this storage is typically sized at 100GB to 1TB.
A node can be configured as an HTTP API node by adding NODE_ROLES=httponly to the configuration of the node.
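As a rough illustration of how the NODE_ROLES setting might be checked before a node is started, the Python sketch below reads the value from configuration text. It assumes the configuration is a set of KEY=VALUE lines and that lines starting with # are comments; both are assumptions of this sketch, as is the function name node_role_from_config. The accepted values come from the feature matrix, and the fallback to all reflects the default that this page states for digest and storage nodes.

```python
VALID_ROLES = {"all", "httponly", "ingestonly"}
DEFAULT_ROLE = "all"  # the page states that "all" is the default value


def node_role_from_config(text: str) -> str:
    """Return the NODE_ROLES value found in KEY=VALUE configuration text.

    Falls back to the default when the setting is absent and raises
    ValueError when the value is not one of the documented roles.
    """
    role = DEFAULT_ROLE
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Assumption of this sketch: blank lines and "#" lines are ignored.
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep and key.strip() == "NODE_ROLES":
            role = value.strip()  # a later occurrence overrides an earlier one
    if role not in VALID_ROLES:
        raise ValueError(f"unsupported NODE_ROLES value: {role!r}")
    return role
```

Calling node_role_from_config("NODE_ROLES=httponly") returns httponly, while configuration text without the setting yields all.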
Digest Node
A digest node is responsible for:
Constructing segment files for incoming events (the internal storage format in LogScale)
Executing the real-time part of searches
Executing the historical part of searches on recent events (older events are handled by the storage nodes)
Once a segment file is completed it is passed on to storage nodes.
Digest nodes are designated by adding them to the cluster's Digest Rules. Any node that appears in the digest rules is a Digest Node.
A digest node must have NODE_ROLES=all in the configuration of the node; as that is the default value, leaving it out works too.
Storage Node
A storage node is a node that saves data to disk. Storage nodes are responsible for:
Storing events (segment files constructed by digest nodes)
Executing the historical part of searches (the most recent results are handled by digest nodes)
The data directory of a storage node is used to store the segment files. Segment files make up the bulk of all data in LogScale.
Storage nodes are configured using the cluster's Storage Rules. Any node that appears in the storage rules is considered a Storage Node. A node that was previously in a storage rule can still contain segment files that are used for querying.
The Storage Rules are also used to configure the data Replication Factor.
A storage node must have NODE_ROLES=all in the configuration of the node; as this is the default value, leaving it out works too.
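Because both digest and storage nodes require NODE_ROLES=all, it can be useful to check a planned set of digest or storage rules against the roles of the nodes they name. The Python sketch below shows one way to do that. The inputs are hypothetical: rule_nodes is a list of node names taken from a rule, and node_roles maps each node name to its configured NODE_ROLES value, or None when the setting is left out. Neither structure corresponds to a LogScale API.

```python
from typing import Dict, List, Optional


def can_appear_in_rules(node_role: Optional[str]) -> bool:
    """Return True if a node may be listed in digest or storage rules.

    A missing value (None) is treated as the default, "all".
    """
    return (node_role or "all") == "all"


def ineligible_nodes(
    rule_nodes: List[str], node_roles: Dict[str, Optional[str]]
) -> List[str]:
    """Return the nodes in a rule whose role prevents digest or storage work.

    Nodes missing from node_roles are treated as unset, hence as "all".
    """
    return [
        name
        for name in rule_nodes
        if not can_appear_in_rules(node_roles.get(name))
    ]
```

With node_roles set to {"a": "all", "b": None, "c": "httponly"}, calling ineligible_nodes(["a", "b", "c"], node_roles) reports only c, since an httponly node cannot store segment files or execute queries on events.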