Insights Segments & Datasources Dashboard

LogScale organizes stored data using two concepts: Segments and Data Sources.

A Segment is a file stored on a LogScale node that contains the compressed ingested data. Segment files sit under the directories of Data Sources.

Data Sources are part of repositories. Each Data Source corresponds to one combination of a repository and its Event Tags, and it is where the stream of data for that repository and tag combination is stored.

This dashboard shows information about your cluster's segments and datasources.

Segment Merges Per Hour

As data is ingested, LogScale builds segment files and stores them for later use in queries. It first builds mini-segments, which are later merged into the final segment files.

This widget shows the CPU time spent per LogScale host per hour on merging these segment files.

Merged Segments Sizes (Bytes)

This widget shows the median, 75th percentile, and 95th percentile of the file size of segments created by merging mini-segments.

This value will typically be around 50-500MB. Keeping it below 1GB is a sign of a healthy cluster. Sizes above that may be a concern and are worth raising with LogScale Support.
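As a rough illustration of how the figures in this widget relate to the guidance above, the following Python sketch computes the median, 75th and 95th percentiles from a list of merged segment sizes and flags a 95th percentile at or above 1GB. The sample sizes, the function names, the decimal definition of 1GB, and the nearest-rank percentile method are all assumptions made for illustration; the widget itself may calculate percentiles differently.

```python
import math

ONE_GB = 1_000_000_000  # assumption: decimal gigabyte, from the 1GB guidance above


def percentile(sorted_sizes, pct):
    """Nearest-rank percentile of an ascending list of sizes."""
    rank = max(1, math.ceil(pct / 100 * len(sorted_sizes)))
    return sorted_sizes[rank - 1]


def summarize_segment_sizes(sizes_bytes):
    """Return the median, 75th and 95th percentiles plus a health flag."""
    ordered = sorted(sizes_bytes)
    summary = {
        "p50": percentile(ordered, 50),
        "p75": percentile(ordered, 75),
        "p95": percentile(ordered, 95),
    }
    # Treat the cluster as healthy only if even the 95th percentile is under 1GB.
    summary["healthy"] = summary["p95"] < ONE_GB
    return summary


# Hypothetical merged segment sizes in bytes (not real cluster data).
sample_sizes = [60_000_000, 120_000_000, 250_000_000, 400_000_000, 1_200_000_000]
result = summarize_segment_sizes(sample_sizes)
print(result)
```

With this sample, the one 1.2GB segment pushes the 95th percentile over the threshold, so the summary is flagged as unhealthy even though the median is well inside the typical range.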

Number Of Datasources In Repositories

A Data Source is a combination of a repository and the tags created within that repository. This widget shows the number of datasources per repository, which essentially means the total number of Event Tag values in each repository.
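To make the relationship between tags and datasources more concrete, here is a minimal Python sketch that counts distinct tag combinations per repository. The event layout (a `repo` name plus a `tags` dictionary), the repository names, and the tag names are hypothetical. This is a simplified model for illustration, not a description of how LogScale tracks datasources internally.

```python
from collections import defaultdict


def count_datasources(events):
    """Count the distinct tag combinations seen in each repository."""
    combos = defaultdict(set)
    for event in events:
        # Sort the tag items so the same tags in a different order count once.
        key = tuple(sorted(event["tags"].items()))
        combos[event["repo"]].add(key)
    return {repo: len(keys) for repo, keys in combos.items()}


# Hypothetical events; repository and tag names are made up for illustration.
events = [
    {"repo": "web", "tags": {"#host": "a", "#type": "access"}},
    {"repo": "web", "tags": {"#host": "b", "#type": "access"}},
    {"repo": "web", "tags": {"#type": "access", "#host": "a"}},
    {"repo": "db", "tags": {"#host": "a"}},
]
datasource_counts = count_datasources(events)
print(datasource_counts)
```

In this model, every new tag value adds at least one new combination, which is why tagging on a field with many distinct values makes the datasource count grow quickly.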

LogScale has a hard limit of 10k datasources to keep it efficient, and in almost all cases you shouldn't need to go above this limit. If you exceed it, ingest is blocked for that repository to stop it from creating too many datasources.

The maximum number of datasources in a LogScale cluster is controlled by the MAX_DATASOURCES configuration.

In a healthy system, the number of tags in a repository should not exceed 100, and should only approach that number when absolutely necessary.
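The limit described above can be turned into a simple check over the per-repository counts that this widget displays. The sketch below is only an illustration: the repository names and counts are invented, the function name is hypothetical, and the 80% warning threshold is an assumption rather than a LogScale setting.

```python
DATASOURCE_LIMIT = 10_000  # the 10k limit described above


def check_datasource_counts(counts, limit=DATASOURCE_LIMIT, warn_ratio=0.8):
    """Label each repository as 'ok', 'warning' or 'over_limit'.

    warn_ratio is an assumed early-warning fraction of the limit.
    """
    statuses = {}
    for repo, count in counts.items():
        if count > limit:
            statuses[repo] = "over_limit"  # ingest would be blocked here
        elif count >= warn_ratio * limit:
            statuses[repo] = "warning"
        else:
            statuses[repo] = "ok"
    return statuses


# Hypothetical datasource counts per repository.
sample_counts = {"web": 120, "audit": 8_500, "iot": 10_001}
statuses = check_datasource_counts(sample_counts)
print(statuses)
```

If your cluster sets MAX_DATASOURCES to a different value, pass that value as `limit` instead of relying on the default.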

Global Snapshot File Size

LogScale maintains a file called global-data-snapshot.json (see Architecture of Humio), which holds all of the key information about the LogScale cluster and is constantly updated across all nodes. It stores the metadata for repositories, users, and all the other objects you can create through the UI. It also holds the metadata on the segment files that hold the events shipped to LogScale.

This "Global" file is managed by LogScale and should be kept as small as possible to keep LogScale performing well. In a healthy system the Global snapshot file should not exceed 1GB and is ideally under 500MB. If this is not the case, discuss it with LogScale Support.
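If you want to check the snapshot size outside the dashboard, a small script along these lines could classify it against the thresholds above. This is a sketch under stated assumptions: the function names are hypothetical, the thresholds use decimal megabytes and gigabytes, and the demonstration uses a temporary file as a stand-in because the location of global-data-snapshot.json depends on your installation.

```python
import os
import tempfile

IDEAL_MAX_BYTES = 500 * 1000**2  # assumption: 500MB in decimal units
HARD_MAX_BYTES = 1000**3         # assumption: 1GB in decimal units


def classify_snapshot_size(size_bytes):
    """Map a snapshot size in bytes to the guidance given above."""
    if size_bytes < IDEAL_MAX_BYTES:
        return "ideal"
    if size_bytes <= HARD_MAX_BYTES:
        return "acceptable"
    return "discuss with LogScale Support"


def snapshot_status(path):
    """Classify the file at `path` by its size on disk."""
    return classify_snapshot_size(os.path.getsize(path))


# Demonstration with a tiny temporary file standing in for the real snapshot.
with tempfile.NamedTemporaryFile(suffix=".json", delete=False) as tmp:
    tmp.write(b"{}")
    demo_path = tmp.name
demo_status = snapshot_status(demo_path)
os.remove(demo_path)
print(demo_status)
```

To use it for real, call `snapshot_status` with the path to the snapshot file on one of your LogScale nodes.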