Operations Guide

This guide covers setting up, verifying, and operating disaster recovery (DR) for LogScale on Google Kubernetes Engine (GKE). It assumes familiarity with Terraform, GKE, and the LogScale platform.

The DR architecture uses an active-standby model across two GCP regions:

- The primary cluster handles all production traffic (ingest, search, UI).
- The secondary cluster runs a minimal-footprint standby: Kafka brokers, cert-manager, and the humio-operator deployment (scaled to 0 replicas). No LogScale pods run on standby: with the operator at zero replicas, no pods are created regardless of the HumioCluster CR spec.
- Failover can be triggered in three ways, all of which are Beta features:
  1. Automated (GLB + Cloud Function): an uptime check detects primary failure, and a Cloud Function scales the operator and flips GLB capacity.
  2. Manual (GLB): the operator is scaled to 0 on the primary, and the GLB capacity_scaler is flipped manually via gcloud or Terraform.
  3. DNS WRR routing: Cloud DNS weighted round-robin with a manual weight change.
- Encryption keys are synchronized to the secondary cluster via Terraform remote state references.
- The secondary cluster has read-only cross-region access to the primary's GCS bucket for snapshot recovery. This path applies when primary bucket data is not replicated to the secondary bucket (e.g., via GCS Transfer Service or dual-region storage). If replication is configured, set GCP_RECOVER_FROM_REPLACE_BUCKET to rewrite segment paths to the local copy instead. Both the cross-region read path and the replicated bucket path are Beta features.
- Promotion from standby to active uses a two-phase pool routing switch to avoid a traffic blackhole during service selector changes. This does NOT guarantee zero data loss: events in flight during the failover window (30-60s GLB detection plus Cloud Function trigger delay) may be lost. RPO depends on client-side retry and buffering capabilities.
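
To make the manual GLB option more concrete, the following is a minimal sketch of how a capacity_scaler flip could be expressed in Terraform. It is not taken from the module itself: the variable names (`active_site`, `primary_neg`, `secondary_neg`, `health_check`), the resource name, and the rate limit are all assumptions for illustration, and the module may wire its backend service differently.

```terraform
# Hypothetical input: which site should currently receive traffic.
variable "active_site" {
  type        = string
  default     = "primary"
  description = "Site that should receive traffic: primary or secondary."

  validation {
    condition     = contains(["primary", "secondary"], var.active_site)
    error_message = "The active_site value must be primary or secondary."
  }
}

# Hypothetical inputs: self-links of existing NEGs and a health check.
variable "primary_neg" { type = string }
variable "secondary_neg" { type = string }
variable "health_check" { type = string }

resource "google_compute_backend_service" "logscale" {
  name                  = "logscale-dr-backend" # illustrative name
  protocol              = "HTTPS"
  load_balancing_scheme = "EXTERNAL_MANAGED"
  health_checks         = [var.health_check]

  # Primary region: full capacity while active, drained otherwise.
  backend {
    group                 = var.primary_neg
    balancing_mode        = "RATE"
    max_rate_per_endpoint = 1000 # placeholder value
    capacity_scaler       = var.active_site == "primary" ? 1.0 : 0.0
  }

  # Secondary region: drained while on standby, full capacity after failover.
  backend {
    group                 = var.secondary_neg
    balancing_mode        = "RATE"
    max_rate_per_endpoint = 1000 # placeholder value
    capacity_scaler       = var.active_site == "secondary" ? 1.0 : 0.0
  }
}
```

Under these assumptions, applying with `-var active_site=secondary` would drain the primary backend and send traffic to the secondary. Driving both scalers from one variable keeps the two backends from being set to the same state by mistake.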

Cluster Types

| Type | Node Pools | Use Case |
|------|------------|----------|
| basic | Digest + Kafka | Dev/test, small workloads |
| dedicated-ui | Digest + UI + Kafka | Separate UI serving from query processing |
| advanced | Digest + UI + Ingest + Kafka | Full production topology with dedicated ingest |

Cluster Sizes

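| Size | Digest Nodes | Machine Type | Estimated Daily Ingest |
|------|--------------|--------------|------------------------|
| xsmall | 3 | n2-highmem-16 | Up to 1 TB/day |
| small | 9 | n2-highmem-16 | 1-5 TB/day |
| medium | 21 | n2-highmem-32 | 5-20 TB/day |
| large | 42 | n2-highmem-32 | 20-50 TB/day |
| xlarge | 78 | n2-highmem-64 | 50+ TB/day |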

Network Access

| Setting | API Access |
|---------|------------|
| `kubernetes_private_cluster_enabled = true` | Internal networks only |
| `kubernetes_private_cluster_enabled = false` + `ip_ranges_allowed_to_kubeapi = []` | GCP public CIDRs only |
| `kubernetes_private_cluster_enabled = false` + `ip_ranges_allowed_to_kubeapi = ["0.0.0.0/0"]` | Unrestricted |

See Google's network isolation guide for recommended configurations.
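
As an illustration of a middle ground between the settings listed above, the fragment below sketches a public API endpoint restricted to a single allowed range. The two variable names come from this guide; the CIDR is a placeholder from the RFC 5737 documentation range and is an assumption, not a recommended value.

```terraform
# terraform.tfvars fragment (sketch)
kubernetes_private_cluster_enabled = false

# Placeholder CIDR (RFC 5737 documentation range); replace with your own
# office or VPN egress range.
ip_ranges_allowed_to_kubeapi = ["203.0.113.0/24"]
```

Listing specific ranges, rather than `"0.0.0.0/0"`, limits Kubernetes API access to the networks you name.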

Quick Start

```terraform
module "logscale" {
  source = "."

  project_id            = "my-project"
  region                = "us-central1"
  zone                  = "us-central1-a"
  logscale_cluster_type = "basic"
  logscale_cluster_size = "xsmall"
  public_url            = "logscale.example.com"

  # Versions (minimum supported)
  humio_operator_chart_version = "0.29.2"
  humio_operator_version       = "0.29.2"
  logscale_image_version       = "1.207.0"
  strimzi_operator_version     = "0.45.0"
}
```
