GCP Deployment Prerequisites

Before following this guide, make sure the following prerequisites and tooling are in place:

GCP Required Tool Components

The following tools are required to follow this architecture guide:

LogScale on GCP Requirements

The following requirements exist for any LogScale deployment:

  • Bucket Storage

    GCP provides NVMe storage in the form of local SSDs, which are directly attached to the virtual machine instances. Local SSDs offer high input/output operations per second (IOPS) and low latency. When using ephemeral instances, bucket storage is required for a production environment because it acts as the persistent storage for the cluster.

  • Kubernetes

    The minimum Kubernetes version supported by the Humio Operator can be found in the Version Matrix.

  • Strimzi Operator

    You can install the Strimzi Operator using Helm.

    LogScale relies on Kafka as a fault tolerant event bus and internal cluster communication system. You must have an available Kafka cluster before deploying LogScale.

    See the Deploying and Upgrading Strimzi guide for more information.

    The recommended deployment uses rack awareness in the Kafka configuration (based on the topology.kubernetes.io/zone label) to spread replicas across different racks, data centers, or availability zones.

  • TLS

    By default, the Humio Operator uses cert-manager to create an internal certificate authority for use by the LogScale cluster. In addition, certificates for external connectivity can be provisioned in conjunction with cert-manager's external issuer support. If LogScale is configured to expose its APIs using HTTPS, which is the default, LogScale assumes that Kafka connectivity will also use TLS; this behavior is configurable. In environments that employ a service mesh that implements TLS or mTLS, TLS support can be disabled completely.

  • Nginx Ingress

    The humio-operator contains one built-in ingress implementation, which relies on ingress-nginx to expose the LogScale cluster outside the Kubernetes cluster. The built-in support for ingress-nginx should be seen mostly as a good starting point and source of inspiration. If it does not match your requirements, you can point alternative ingress controllers to the "Service" resource(s) that point to the cluster pods. The built-in support for ingress-nginx only works if there is a single node pool with all nodes performing all tasks.

  • TopoLVM for preparing NVMe disks

    HumioCluster resources assume that disks are prepared on the underlying Kubernetes worker nodes. We use RAID 0 on the local SSDs (or, as GCP calls them, ephemeral local SSDs) in combination with bucket storage. As long as Kafka is stable and bucket storage is working, using RAID 0 on the individual Kubernetes workers is fine. TopoLVM provides dynamic volume provisioning using LVM, making it easier to manage disk space for Kubernetes pods.

  • Setup Workload Identity for Google Cloud Storage

    Workload Identity allows us to associate a Kubernetes service account in Google Kubernetes Engine (GKE) with a specific Google Cloud service account. This removes the need to embed GCS credentials directly in our application or pod configuration, reducing the risk of exposure. Service account keys are long-lived credentials that, if compromised, could lead to security risks. With Workload Identity, there is no need to manually rotate service account keys. GKE manages the credentials automatically, reducing administrative overhead.

  • Instance Sizing

    When sizing LogScale, your choice depends on your usage patterns, so we recommend first doing an example setup to see how LogScale works with your data. Examples of recommended deployment configurations are documented in Instance Sizing.
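
To make the Workload Identity and Kafka rack awareness requirements above more concrete, the following minimal Python sketch builds the identifiers and manifest fragments involved. The project, namespace, and service account names are hypothetical placeholders, not values from this guide. The sketch assumes the standard GKE Workload Identity conventions (the iam.gke.io/gcp-service-account annotation on the Kubernetes service account, and an IAM member of the form serviceAccount:PROJECT_ID.svc.id.goog[NAMESPACE/KSA_NAME] that is granted roles/iam.workloadIdentityUser on the Google Cloud service account) and the Strimzi rack.topologyKey setting. It only prints data and does not call any GCP or Kubernetes API.

```python
import json

# Hypothetical placeholder values for illustration; replace with your own.
PROJECT_ID = "example-project"
NAMESPACE = "logging"
KSA_NAME = "logscale"          # Kubernetes service account used by the pods
GSA_NAME = "logscale-storage"  # Google Cloud service account with bucket access


def gsa_email(project_id, gsa_name):
    """Email address of a Google Cloud service account."""
    return f"{gsa_name}@{project_id}.iam.gserviceaccount.com"


def workload_identity_member(project_id, namespace, ksa_name):
    """IAM member that identifies a Kubernetes service account.

    This member is typically granted roles/iam.workloadIdentityUser
    on the Google Cloud service account.
    """
    return f"serviceAccount:{project_id}.svc.id.goog[{namespace}/{ksa_name}]"


def service_account_manifest(project_id, namespace, ksa_name, gsa_name):
    """Kubernetes ServiceAccount annotated with the Google service account to use."""
    return {
        "apiVersion": "v1",
        "kind": "ServiceAccount",
        "metadata": {
            "name": ksa_name,
            "namespace": namespace,
            "annotations": {
                "iam.gke.io/gcp-service-account": gsa_email(project_id, gsa_name),
            },
        },
    }


def kafka_rack_fragment():
    """Fragment of a Strimzi Kafka resource that enables rack awareness by zone."""
    return {
        "spec": {
            "kafka": {
                "rack": {"topologyKey": "topology.kubernetes.io/zone"},
            },
        },
    }


if __name__ == "__main__":
    manifest = service_account_manifest(PROJECT_ID, NAMESPACE, KSA_NAME, GSA_NAME)
    print(json.dumps(manifest, indent=2))
    print(workload_identity_member(PROJECT_ID, NAMESPACE, KSA_NAME))
    print(json.dumps(kafka_rack_fragment(), indent=2))
```

The printed dictionaries can be serialized to YAML or JSON and merged into your own manifests. Keeping the naming rules in small functions makes it easier to check that the annotation on the Kubernetes service account and the IAM binding refer to the same pair of accounts.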