S3 Archiving (Cloud)

LogScale supports archiving ingested logs to Amazon S3. The archived logs are then available for further processing in any external system that integrates with S3. The files written by LogScale in this format are not searchable by LogScale — this is an export meant for other systems to consume.
Archiving works by running a periodic job inside all LogScale nodes, which looks for new, unarchived segment files. The segment files are read from disk, streamed to an S3 bucket, and marked as archived in LogScale.
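The flow above can be sketched as follows. This is an illustrative model only, not LogScale's internal implementation: the segment representation, the `archive_new_segments` helper, and the `FakeS3` stand-in for a real S3 client are all assumptions made for the example.

```python
class FakeS3:
    """Stand-in for an S3 client so the sketch is self-contained."""
    def __init__(self):
        self.objects = {}

    def upload(self, bucket, key, data):
        self.objects[(bucket, key)] = data


def archive_new_segments(segments, s3, bucket):
    """Upload every segment not yet marked as archived, then mark it.

    Mirrors the periodic job described above: find unarchived segment
    files, stream them to the bucket, record them as archived.
    """
    uploaded = []
    for seg in segments:
        if seg.get("archived"):
            continue  # already in S3, skip
        s3.upload(bucket, seg["key"], seg["data"])
        seg["archived"] = True
        uploaded.append(seg["key"])
    return uploaded
```

In the real system this job runs periodically inside every node; the sketch only shows the idempotency idea, where already-archived segments are skipped on the next pass.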
An admin user needs to set up archiving per repository. After selecting a repository on the LogScale front page, the configuration page is available under Settings.
Note
For slow-moving datasources it can take some time before segment files are completed on disk and made available to the archiving job. In the worst case, a segment file is not completed until it contains a gigabyte of uncompressed data or 30 minutes have passed, whichever comes first. The exact thresholds are those configured as the limits on mini segments.
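The completion rule above amounts to a simple either/or check. A minimal sketch, assuming illustrative constants (the real limits come from the mini segment configuration, not these values):

```python
# Illustrative thresholds only; the actual limits are the configured
# mini segment limits mentioned above.
MAX_UNCOMPRESSED_BYTES = 1_000_000_000  # ~1 GB of uncompressed data
MAX_AGE_MINUTES = 30                    # or 30 minutes elapsed


def segment_is_complete(uncompressed_bytes: int, age_minutes: float) -> bool:
    """A segment file closes when either threshold is reached."""
    return (uncompressed_bytes >= MAX_UNCOMPRESSED_BYTES
            or age_minutes >= MAX_AGE_MINUTES)
```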
Important
S3 archiving is not supported for S3 buckets where object locking is enabled.
For more information on segment files and datasources, see Ingest Flow and Data Sources.
S3 Layout (Cloud-hosted LogScale)
When uploading a segment file, LogScale creates the S3 object key based on the tags, start date, and repository name of the segment file. The resulting object key makes the archived data browseable through the S3 management console.
LogScale uses the following pattern:
REPOSITORY/TYPE/TAG_KEY_1/TAG_VALUE_1/../TAG_KEY_N/TAG_VALUE_N/YEAR/MONTH/DAY/START_TIME-SEGMENT_ID.gz
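The pattern can be sketched as a small key builder. The function name, the tag representation, and in particular the `START_TIME` formatting are assumptions for illustration; the exact time format LogScale emits is not specified here.

```python
from datetime import datetime


def s3_object_key(repo, typ, tags, start, segment_id):
    """Build an object key following the layout pattern above.

    tags is a sequence of (key, value) pairs, interleaved into the path
    as TAG_KEY_1/TAG_VALUE_1/.../TAG_KEY_N/TAG_VALUE_N.
    """
    tag_parts = []
    for key, value in tags:
        tag_parts += [key, value]
    return "/".join([
        repo,
        typ,
        *tag_parts,
        f"{start.year}",
        f"{start.month:02d}",
        f"{start.day:02d}",
        # The START_TIME format here is an assumption for the example.
        f"{start.strftime('%H-%M-%S')}-{segment_id}.gz",
    ])
```

For example, a segment for repository `myrepo` with tag `host=web1` starting on 2024-05-01 lands under `myrepo/.../host/web1/2024/05/01/`, which is what makes the bucket browsable by date and tag in the S3 console.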
Read more about Event Tags.
Format
The default archiving format is NDJSON; raw log lines are available as an option. When using NDJSON, the parsed fields are available along with the raw log line. This incurs some extra storage cost compared to raw log lines, but makes the logs easier to process in an external system.
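Consuming an NDJSON archive in an external system is then one JSON document per line. A minimal sketch, assuming the objects are gzipped as the `.gz` suffix in the layout pattern suggests; the field names in the usage below (such as `@rawstring`) are illustrative:

```python
import gzip
import io
import json


def read_ndjson_archive(blob: bytes):
    """Decompress a gzipped NDJSON archive and yield one event per line."""
    with gzip.open(io.BytesIO(blob), "rt", encoding="utf-8") as fh:
        for line in fh:
            if line.strip():  # tolerate a trailing blank line
                yield json.loads(line)
```

Usage: `list(read_ndjson_archive(blob))` on a downloaded object yields dicts containing the parsed fields alongside the raw line.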
Cloud setup
Enabling Cloud-hosted LogScale to write to your S3 bucket means setting up AWS cross-account access.
In AWS:
Log in to the AWS console and navigate to your S3 service page.
Click the name of the bucket where archived logs should be written.
Note
Follow the AWS documentation on bucket naming conventions. In particular, use dashes rather than periods as separators, and avoid consecutive dashes or periods.
Click the Permissions tab.
Scroll down to Access control list (ACL) and click Edit.
Scroll down and click the Add grantee button.
Enter the canonical ID for LogScale:
f2631ff87719416ac74f8d9d9a88b3e3b67dc4e7b1108902199dea13da892780
Give the grantee Read and Write access on the Bucket ACL.
Additionally give it Write on the objects.
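The grants from the console steps above map onto S3's ACL permission names: Write on objects is `WRITE`, and Read/Write on the bucket ACL are `READ_ACP`/`WRITE_ACP`. A hedged sketch of the equivalent grant structure; note that passing it to an API such as boto3's `put_bucket_acl` would additionally require the bucket's Owner section, which is omitted here:

```python
# Canonical ID for LogScale, as given in the steps above.
LOGSCALE_CANONICAL_ID = (
    "f2631ff87719416ac74f8d9d9a88b3e3b67dc4e7b1108902199dea13da892780"
)

grantee = {"Type": "CanonicalUser", "ID": LOGSCALE_CANONICAL_ID}

grants = [
    {"Grantee": grantee, "Permission": "WRITE"},      # write objects
    {"Grantee": grantee, "Permission": "READ_ACP"},   # read the bucket ACL
    {"Grantee": grantee, "Permission": "WRITE_ACP"},  # write the bucket ACL
]
```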
In LogScale:
Go to the repository you want to archive and select Settings --> S3 Archiving.
Configure the bucket name and region, then click Save.
Tag Grouping
If tag grouping is defined for a repository, the segment files are split by each unique combination of tags present in a file. This results in one file in S3 per unique combination of tags. The same layout pattern is used as in the normal case. This makes it easier for a human operator to determine whether a given log file is relevant.
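The splitting behavior can be sketched as grouping events by their tag combination, with each group destined for its own S3 object. This helper and the event shape are illustrative, not LogScale internals:

```python
from collections import defaultdict


def split_by_tag_combination(events):
    """Group events so each unique tag combination maps to its own file.

    Returns a dict keyed by a canonical (sorted) tuple of tag pairs;
    each value is the list of events that would land in that S3 object.
    """
    files = defaultdict(list)
    for event in events:
        combo = tuple(sorted(event["tags"].items()))
        files[combo].append(event)
    return dict(files)
```

With events tagged `host=web1` and `host=web2`, this yields two groups and hence two archived files, matching the one-file-per-combination behavior described above.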