AWS Cloud Reference Deployment and Automation

This guide provides information on using the LogScale Reference Automations for AWS. The GitHub repository with the automations is available here: https://github.com/CrowdStrike/logscale-aws.

Note

It is recommended you follow the LogScale Kubernetes Reference Architecture guide and then decide how running LogScale fits into your infrastructure.

Diagram showing the typical architecture layout for an AWS deployment

Prerequisites

The following prerequisite components are required before installation:

Requirements

Before using this guide, the following requirements must be met:

CPU Architecture

As of now, Humio Operator only supports the x86_64/amd64 CPU architecture. Therefore, using Graviton CPU instances in AWS will not work for LogScale.

Bucket Storage

AWS provides NVMe storage in the form of local SSDs, which are directly attached to the virtual machine instances. Local SSDs offer high Input/Output Operations Per Second(IOPS) and low latency. When utilizing ephemeral instances bucket storage is required for a production environment as it acts as the persistent storage for the cluster.

Kubernetes

The minimum Kubernetes version supported by the Humio Operator, see Version Matrix.

Kafka

LogScale relies on Kafka as a fault tolerant event bus and internal cluster communication system. You must have an available Kafka cluster before deploying LogScale.

AWS MSK is a great option for this, and a provisioning can be done using the provided terraform components code.

TLS

By default the Humio Operator utilizes cert-manager to create an internal certificate authority for use by the LogScale cluster. In addition, support for provisioning certificates for external connectivity can be used in conjunction with cert-manager's external issuer support. If LogScale is configured to expose its APIs using HTTPS, which is the default, LogScale assumes Kafka connectivity will also utilize TLS, this is configurable. In some environments that employ service meshes that implement TLS or mTLS, TLS support can be disabled completely.

Nginx Ingress

The humio-operator contains one built-in ingress implementation, which relies on ingress-nginx to expose the cluster to the outside of the Kubernetes cluster. The built-in support for ingress-nginx should be seen mostly as a good starting point and source of inspiration, so if it does not match certain requirements it is possible to point alternative ingress controllers to the Service resource(s) pointing to the cluster pods. The built-in support for ingress-nginx only works if there is a single node pool with all nodes performing all tasks.

SSD RAID 0 setup

As long as we use Kafka (MSK) and bucket storage with our LogScale cluster, using RAID 0 on our Kubernetes workers is a safe option for ensuring very fast data processing, as long as Kafka is stable and bucket storage is working. To achieve a RAID0 setup. We're using AWS's i4i instances, coupled with TopoLVM for an easier volume management.