S3 Archiving


Humio supports archiving ingested logs to Amazon S3. The archived logs are then available for further processing in any external system that integrates with S3. The files written by Humio in this format are not searchable by Humio—this is an export meant for other systems to consume. See Bucket Storage Amazon or Bucket Storage Google Cloud for using S3 as storage for segments in a format that Humio can read.

Archiving works by running a periodic job inside all Humio nodes, which looks for new, unarchived segment files. The segment files are read from disk, streamed to an S3 bucket, and marked as archived in Humio.

An admin user needs to set up archiving per repository. After selecting a repository on the Humio front page, the configuration page is available under Settings.


For slow-moving datasources it can take some time before segment files are completed on disk and made available to the archiving job. In the worst case, a segment file is not completed until it contains a gigabyte of uncompressed data or 30 minutes have passed, whichever comes first. The exact thresholds are those configured as the limits on mini segments.

For more information on segment files and datasources, see segment files and datasources.

S3 Layout

When uploading a segment file, Humio creates the S3 object key based on the tags, start date, and repository name of the segment file. The resulting object key makes the archived data browseable through the S3 management console.

Humio uses the following pattern:
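As an illustration (the placeholder names below are assumptions, not necessarily Humio's exact key format), an archived segment's object key follows a shape like:

```
<repository>/<tag-key-1>/<tag-value-1>/.../<year>/<month>/<day>/<start-time>-<segment-id>.gz
```

For example, a segment from repository my-repo tagged host=server-1 and starting on 2023-05-01 would land under a prefix like my-repo/host/server-1/2023/05/01/, which is what makes the data browsable in the S3 console.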


Read more about Tags.


Logs can be archived in one of two formats: NDJSON (the default) or raw log lines. When using NDJSON, the parsed fields are available along with the raw log line. This incurs some extra storage cost compared to raw log lines, but makes the logs easier to process in an external system.
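As a sketch of the difference, the snippet below parses one hypothetical NDJSON record; the field names, such as @rawstring, are illustrative, since the actual set depends on the repository's parser:

```python
import json

# A hedged example of one NDJSON record as written by the archiver.
# In the raw-log-lines format, only the "@rawstring" content would be written.
record = json.loads(
    '{"@timestamp": 1683021600000, '
    '"@rawstring": "2023-05-02 10:00:00 INFO Starting service", '
    '"level": "INFO"}'
)

# The parsed fields ride alongside the original raw line:
print(record["level"])       # INFO
print(record["@rawstring"])  # the original log line
```

An external consumer can thus filter on parsed fields (here, level) without re-parsing the raw line itself.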


For a self-hosted installation of Humio, you need an IAM user with write access to the buckets used for archiving. That user must have programmatic access to S3, so when adding a new user through the AWS console make sure programmatic access is ticked:

Figure 1, Add User with Programmatic Access

Later in the process, you can retrieve the access key and secret key:

Figure 2, Access Key & Secret Key

This is needed in Humio in the following configuration:
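The configuration consists of the access key and secret key pair; the variable names below are taken from Humio's S3 archiving settings, so verify them against the configuration reference for your version:

```
S3_ARCHIVING_ACCESSKEY=$ACCESS_KEY
S3_ARCHIVING_SECRETKEY=$SECRET_KEY
```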


The keys are used for authenticating the user against the S3 service. For more guidance on how to retrieve S3 access keys, see AWS access keys. For more details on creating a new user, see creating a new user in IAM.

Configuring the user to have write access to a bucket can be done by attaching a policy to the user.

IAM user example policy

The following JSON is an example policy configuration. Replace BUCKET_NAME with the name of your bucket; the two statements shown here grant list access on the bucket and write access to objects in it, which is the minimal set of permissions archiving typically needs:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::BUCKET_NAME"
      ]
    },
    {
      "Effect": "Allow",
      "Action": [
        "s3:PutObject"
      ],
      "Resource": [
        "arn:aws:s3:::BUCKET_NAME/*"
      ]
    }
  ]
}
```

The policy can be used as an inline policy attached directly to the user through the AWS console:

Figure 3, Attach Inline Policy

Tag Grouping

If tag grouping is defined for a repository, segment files are split by each unique combination of tags present in the file. This results in one file in S3 per unique combination of tags. The same layout pattern is used as in the normal case. This makes it easier for a human operator to determine whether a log file is relevant.
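For example (with a hypothetical repository name and a host tag), a segment containing events from two hosts would be split into separate objects under keys like:

```
my-repo/host/server-1/2023/05/01/...
my-repo/host/server-2/2023/05/01/...
```

An operator looking for server-2 logs can then skip every object outside that prefix.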

Other Options

HTTP Proxy

If Humio is set up to use an HTTP proxy, it is by default also used for communicating with S3. This can be disabled with the following configuration:

```
# Use the globally configured HTTP proxy for communicating with S3.
# Default is true.
S3_ARCHIVING_USE_HTTP_PROXY=false
```

Non-default endpoints

You can point Humio at your own endpoint for S3 archiving if you host an S3-compatible service such as MinIO.


Virtual host style (default)

Humio will construct virtual host-style URLs like https://my-bucket.my-own-s3:8080/path/inside/bucket/file.txt.

For this style of access, you need to set your base URL so that it contains a placeholder for the bucket name.
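A sketch of such a configuration, reusing the example hostname above (the variable name S3_ARCHIVING_ENDPOINT_BASE is assumed from Humio's archiving settings):

```
S3_ARCHIVING_ENDPOINT_BASE=https://{bucket}.my-own-s3:8080
```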


Humio will replace the placeholder {bucket} with the relevant bucket name at runtime.


Some services do not support virtual host style access, and require path-style access. Such URLs have the format https://my-own-s3:8080/my-bucket/path/inside/bucket/file.txt. If you are using such a service, your endpoint base URL should not contain a bucket placeholder.


Additionally, you must set S3_ARCHIVING_PATH_STYLE_ACCESS to true.
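Continuing the example above, a path-style configuration might look like this (the endpoint variable name is assumed, as in the virtual-host case):

```
S3_ARCHIVING_ENDPOINT_BASE=https://my-own-s3:8080
S3_ARCHIVING_PATH_STYLE_ACCESS=true
```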

IBM Cloud Storage compatibility

S3 Archiving can be used with IBM Cloud Storage by setting S3_ARCHIVING_IBM_COMPAT to true.
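For example, in the configuration file:

```
S3_ARCHIVING_IBM_COMPAT=true
```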