Archive Data

Ingested logs can be archived to Amazon S3 and Google Cloud Storage; Google Cloud Storage is only available for self-hosted installations. The archived logs are then available for further processing in any external system that integrates with the archiving provider. The files written in this format are not searchable from within the cluster: this is an export meant for other systems to consume.
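
As a sketch of the consumer side, the Python snippet below lists archived objects in an S3 bucket and parses them for downstream processing. The bucket name, key prefix, and file format (gzipped newline-delimited JSON) are illustrative assumptions, not guarantees about the archive layout; consult the archiving format documentation for the exact structure.

```python
# Sketch: reading archived log files from S3 in an external system.
# Assumptions (not from the docs above): the bucket name, the key prefix,
# and that files are gzipped newline-delimited JSON.
import gzip
import io
import json

import boto3

BUCKET = "my-log-archive"    # hypothetical bucket name
PREFIX = "archive/my-repo/"  # hypothetical key prefix

s3 = boto3.client("s3")
paginator = s3.get_paginator("list_objects_v2")

for page in paginator.paginate(Bucket=BUCKET, Prefix=PREFIX):
    for obj in page.get("Contents", []):
        raw = s3.get_object(Bucket=BUCKET, Key=obj["Key"])["Body"].read()
        # Decompress and parse one JSON event per line.
        with gzip.open(io.BytesIO(raw), "rt", encoding="utf-8") as fh:
            for line in fh:
                event = json.loads(line)
                print(event)  # hand the event to the external system here
```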

For information about using S3 or Google Cloud Storage as storage for segments in a format that the cluster itself can read back, see Bucket Storage.

When archiving is enabled, all existing events in the repository are backfilled into the archiving platform. New events are then archived by a periodic job that runs inside every node and looks for new, unarchived segment files. The segment files are read from disk, streamed to a bucket on the archiving provider's platform, and marked as archived.
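
The loop below is a minimal sketch of that flow, for illustration only: it is not the product's implementation, and the helper functions, object keys, and interval are hypothetical placeholders.

```python
# Minimal sketch of the periodic archiving job described above.
# All names and the interval are made up for illustration.
import time

import boto3

s3 = boto3.client("s3")
BUCKET = "my-log-archive"  # hypothetical destination bucket

def unarchived_segments():
    """Yield paths of completed segment files not yet marked as archived.

    Placeholder: in the real system this state lives inside the cluster.
    """
    return []

def mark_archived(segment_path):
    """Record that a segment has been archived (placeholder)."""

def archive_once():
    for path in unarchived_segments():
        # Stream the segment file from disk into the bucket.
        with open(path, "rb") as fh:
            s3.upload_fileobj(fh, BUCKET, "segments/" + path)
        mark_archived(path)

if __name__ == "__main__":
    while True:  # the real job runs periodically inside every node
        archive_once()
        time.sleep(60)  # illustrative interval
```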

An administrator must first configure access to the cloud provider, then set up archiving per repository. The cloud providers are:

- Amazon S3
- Google Cloud Storage (self-hosted installations only)

After configuring the cloud provider and selecting a repository, the archiving configuration page is available under Settings.

Important

Archiving cannot be configured at the organization level in the user interface until it has been configured at the cluster level.
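
Before enabling archiving, it can be worth verifying that the configured credentials can actually write to the destination bucket. The snippet below is a sketch of such a pre-flight check against S3; the bucket name is hypothetical, and the archiving configuration itself happens in the Settings page, not in code.

```python
# Sketch: pre-flight check that the configured credentials can write to
# the archiving bucket. The bucket name is hypothetical.
import uuid

import boto3

BUCKET = "my-log-archive"  # hypothetical: the bucket chosen for archiving

s3 = boto3.client("s3")
key = f"preflight-{uuid.uuid4()}"

# Write and delete a marker object; a failure here means archiving will
# also fail once enabled.
s3.put_object(Bucket=BUCKET, Key=key, Body=b"ok")  # needs s3:PutObject
s3.delete_object(Bucket=BUCKET, Key=key)           # needs s3:DeleteObject
print("credentials can write to the archiving bucket")
```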

Note

For slow-moving datasources it can take some time before segment files are completed on disk and made available to the archiving job. In the worst case, a segment file is not completed until it contains a gigabyte of uncompressed data or 30 minutes have passed, whichever comes first. The exact thresholds are those configured as the limits on mini segments.
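
To make that worst case concrete, the small calculation below (assuming the limits quoted above of one gigabyte or 30 minutes) shows how the completion delay depends on a datasource's ingest rate.

```python
# Back-of-the-envelope: worst-case delay before a segment file completes,
# using the limits quoted above (1 GB of uncompressed data or 30 minutes,
# whichever is reached first). The ingest rates are illustrative.
ONE_GB = 10**9      # bytes
TIME_CAP = 30 * 60  # seconds

def worst_case_delay(bytes_per_second: float) -> float:
    """Seconds until a segment hits 1 GB or the 30-minute cap."""
    return min(ONE_GB / bytes_per_second, TIME_CAP)

print(worst_case_delay(100_000))    # slow source: capped at 1800 s (30 min)
print(worst_case_delay(5_000_000))  # fast source: fills 1 GB in 200 s
```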

For more information on segment files and datasources, see Segment Files and Datasources.