Insights Segments & Datasources Dashboard

LogScale organizes stored data using two concepts: segments and datasources.

A segment is a file stored on a LogScale node that contains compressed ingested data. Segment files sit under the directories of datasources.

Datasources are part of repositories: each datasource corresponds to one combination of a repository and its tags. The stream of data for each repository and tag combination is stored under its datasource.

This dashboard shows information about your cluster's segments and datasources.

Segment Merges Per Hour

As ingest comes in, LogScale builds segment files and stores them for later use in queries. While building a segment file, LogScale first writes mini-segments, which are later merged into the segment file.

This widget shows the CPU time spent per LogScale host per hour on merging these segment files.

Merged Segments Sizes (Bytes)

This widget shows the median, 75th percentile, and 95th percentile of the file size of segments created by merging mini-segments.

This value is typically around 50-500 MB. Keeping it below 1 GB is a sign of a healthy cluster. Values above that may be a concern; contact LogScale Support for help.
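
As a rough illustration of how the three percentiles relate to the 1 GB guidance, the following Python sketch computes nearest-rank percentiles over a list of merged segment sizes and reports which of them exceed 1 GB. The function names, the sample sizes, and the choice of 1 GB = 10^9 bytes are assumptions made for this example; it is not how the widget itself is implemented.

```python
import math

ONE_GB = 1_000_000_000  # assumption: decimal gigabyte, in bytes


def percentile(sizes, pct):
    """Return the nearest-rank percentile of a non-empty list of sizes."""
    if not sizes:
        raise ValueError("sizes must not be empty")
    ordered = sorted(sizes)
    # Nearest-rank method: smallest value with at least pct% of data at or below it.
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def summarize(sizes):
    """Compute the median, 75th and 95th percentiles and list those above 1 GB."""
    summary = {p: percentile(sizes, p) for p in (50, 75, 95)}
    over_limit = [p for p, value in summary.items() if value > ONE_GB]
    return summary, over_limit


# Hypothetical merged segment sizes in bytes (100 MB up to 1.2 GB).
sample = [100_000_000 * n for n in range(1, 13)]
summary, over_limit = summarize(sample)
print(summary)     # percentile -> size in bytes
print(over_limit)  # percentiles whose size exceeds 1 GB
```

In this made-up sample only the 95th percentile is above 1 GB, which is the kind of result that would be worth a closer look.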

Number Of Datasources In Repositories

A datasource is a combination of a repository and the tags created within that repository. This widget shows the number of datasources per repository, which is essentially the total number of tag value combinations in each repository.
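
To make the relationship between tags and datasources concrete, here is a minimal Python sketch that counts distinct tag combinations per repository. The event representation, the repository names, and the tag names are hypothetical, and the sketch only models the counting idea described above, not LogScale's internal bookkeeping.

```python
from collections import defaultdict


def count_datasources(events):
    """Count distinct tag-value combinations seen per repository."""
    combos = defaultdict(set)
    for repo, tags in events:
        # Sort the tag pairs so that key order does not create a new combination.
        combos[repo].add(tuple(sorted(tags.items())))
    return {repo: len(seen) for repo, seen in combos.items()}


# Hypothetical events: (repository name, tag fields on the event).
events = [
    ("web", {"#host": "a", "#type": "nginx"}),
    ("web", {"#type": "nginx", "#host": "a"}),  # same combination as the line above
    ("web", {"#host": "b", "#type": "nginx"}),
    ("audit", {"#type": "syslog"}),
]
counts = count_datasources(events)
print(counts)
```

Each new value of a tag multiplies the possible combinations, which is why tagging on a high-cardinality field can make the datasource count grow quickly.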

There is a hard limit in LogScale of 10,000 datasources to keep LogScale efficient, and in almost all cases you shouldn't need to go above this limit. If you exceed this limit, ingest is blocked for that repository to stop it from creating too many datasources.

The maximum number of datasources in a LogScale cluster is controlled by the MAX_DATASOURCES configuration.
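
A monitoring script could compare the per-repository counts from this widget against the limit. The sketch below assumes the counts are already available as a Python dictionary; the function name, the 80% warning ratio, and the sample numbers are illustrative assumptions, and the default of 10,000 simply mirrors the limit mentioned above.

```python
def repos_near_limit(counts, max_datasources=10_000, warn_ratio=0.8):
    """Return (repos at or over the limit, repos approaching the limit)."""
    warn_at = max_datasources * warn_ratio  # assumption: warn at 80% of the limit
    at_limit = sorted(r for r, n in counts.items() if n >= max_datasources)
    approaching = sorted(
        r for r, n in counts.items() if warn_at <= n < max_datasources
    )
    return at_limit, approaching


# Hypothetical datasource counts per repository.
counts = {"web": 120, "metrics": 8_500, "debug": 10_000}
at_limit, approaching = repos_near_limit(counts)
print(at_limit)
print(approaching)
```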

In a healthy system, the number of tags in a repository is unlikely to exceed 100, and should do so only if absolutely necessary.

Global Snapshot File Size

LogScale maintains a file called global-data-snapshot.json, also known as the Global Snapshot, which holds the key information about the LogScale cluster and is constantly updated across all nodes. It stores the metadata on repositories, users, and all the other objects you can create through the UI. It also holds the metadata on the segment files that hold the events shipped to LogScale.

This "Global" file is handled by LogScale and should be kept as small as possible to keep LogScale performing well. The size of the global snapshot file depends on a number of factors, including the number of tag and datasource combinations.

A larger global snapshot is not necessarily considered unhealthy and should be evaluated case by case.

The size of the global snapshot also affects memory usage and the time needed to upload it to S3, and may affect cluster startup times. The snapshot is compared with the local version when the cluster starts. If memory usage and the snapshot size are a concern, please reach out to Support.