Stage 1: DR Configuration Setup

Plan Your Naming

Choose deterministic, region-scoped names for all resources. This avoids collisions and makes cross-region references unambiguous.

terraform
Primary:
  Region:                us-central1
  Infrastructure prefix: logscale-primary
  GCS bucket:            logscale-primary-us-central1-<project-id>

Secondary:
  Region:                us-west1
  Infrastructure prefix: logscale-secondary
  GCS bucket:            logscale-secondary-us-west1-<project-id>

Note

Replace <project-id> with your actual GCP project ID. Bucket names must be globally unique.
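
One way to keep these names deterministic is to derive every bucket name from the same three inputs. The following is a minimal sketch; `bucket_name` is a hypothetical helper for illustration and is not part of the Terraform module.

```shell
# bucket_name is a hypothetical helper, not part of the Terraform module.
# Usage: bucket_name <infrastructure-prefix> <region> <project-id>
bucket_name() {
  printf '%s-%s-%s\n' "$1" "$2" "$3"
}

PROJECT_ID="your-project"   # placeholder; replace with your GCP project ID

# Print the primary and secondary bucket names used in this guide.
bucket_name logscale-primary   us-central1 "$PROJECT_ID"
bucket_name logscale-secondary us-west1    "$PROJECT_ID"
```

The two printed values are what you would paste into gcs_bucket_name, dr_primary_gcs_bucket, and gcp_recover_from_bucket in the tfvars files, so both clusters agree on the names.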

Deploy Primary Cluster

Create primary.tfvars:

terraform
project_id            = "your-project"
region                = "us-central1"
infrastructure_prefix = "logscale-primary"
dr                    = "active"
logscale_cluster_type = "advanced"
logscale_cluster_size = "small"

# Deterministic bucket naming
gcs_bucket_name = "logscale-primary-us-central1-your-project"

# Global Load Balancer (optional, for health-based failover)
enable_global_lb      = true
enable_glb_named_port = true

# DNS
manage_global_dns           = true
global_dns_zone_name        = "your-dns-zone"
global_logscale_hostname    = "logscale"
primary_logscale_hostname   = "dr-primary"
secondary_logscale_hostname = "dr-secondary"
public_dns_zone_name        = "your-public-dns-zone"
public_url                  = "logscale.yourdomain.com"

# Cross-region: primary needs to know secondary's bucket name
dr_primary_gcs_bucket = "logscale-secondary-us-west1-your-project"

# Versions (same on both clusters)
humio_operator_chart_version = "0.29.2"
humio_operator_version       = "0.29.2"
logscale_image_version       = "1.228.1"
# ... (all other version variables)

Deploy in targeted order:

terraform
terraform init
terraform apply -var-file=primary.tfvars -target=module.vpc
terraform apply -var-file=primary.tfvars -target=module.gke
# ... (continue with remaining modules per the setup guide)
terraform apply -var-file=primary.tfvars -target=module.global_lb     # GLB for DR
terraform apply -var-file=primary.tfvars -target=module.dns_failover  # DNS records

Important

Deploy modules in dependency order. The GLB and DNS modules depend on the GKE cluster and its services being ready.
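
If you script the targeted applies, it helps to stop at the first failure so that a later module is never applied on top of a broken dependency. A minimal sketch, assuming variables are passed with -var-file; `apply_targets` and the `TF_BIN` override are hypothetical names introduced here, not part of the repository.

```shell
# apply_targets is a hypothetical wrapper, not part of the repository.
# Usage: apply_targets <var-file> <target> [<target> ...]
# TF_BIN can be overridden (for example with "echo") to preview the commands.
apply_targets() {
  var_file=$1
  shift
  for target in "$@"; do
    # Stop immediately if any targeted apply fails.
    "${TF_BIN:-terraform}" apply -var-file="$var_file" -target="$target" || return 1
  done
}

# Example; the remaining modules from the setup guide belong between
# module.gke and module.global_lb:
# apply_targets primary.tfvars module.vpc module.gke module.global_lb module.dns_failover
```

Setting TF_BIN=echo prints each command instead of running it, which is a cheap way to review the order before a real deploy.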

Deploy Secondary (Standby) Cluster

Create secondary.tfvars:

terraform
project_id            = "your-project"
region                = "us-west1"
infrastructure_prefix = "logscale-secondary"
dr                    = "standby"
logscale_cluster_type = "advanced"   # MUST match primary; cost savings come from running the operator at 0 replicas, not from the cluster type
logscale_cluster_size = "xsmall"

# Deterministic bucket naming
gcs_bucket_name = "logscale-secondary-us-west1-your-project"

# GLB named port (required so primary GLB can route to secondary)
enable_glb_named_port = true

# Remote state: read primary's outputs for encryption key sync
primary_remote_state_config = {
  backend   = "gcs"
  workspace = "default"
  config = {
    bucket = "your-tf-state-bucket"
    prefix = "logscale/gcp/primary/terraform/tf.state"
  }
}

# Recovery configuration
dr_primary_gcs_bucket           = "logscale-primary-us-central1-your-project"
gcp_recover_from_bucket         = "logscale-primary-us-central1-your-project"
gcp_recover_from_replace_region = "us-central1/us-west1"

# Cloud Function for automated failover (optional)
dr_cloud_function_enabled                      = true
dr_cloud_function_target_node_count            = 2
dr_cloud_function_pre_failover_failure_seconds = 180

# Versions (MUST match primary)
humio_operator_chart_version = "0.29.2"
humio_operator_version       = "0.29.2"
logscale_image_version       = "1.228.1"

public_url = "logscale.yourdomain.com"
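
Because the version variables must be identical on both clusters, a quick comparison of the two tfvars files before applying can catch drift. This is a rough sketch, assuming every version variable name ends in `version` and sits on its own line without a trailing comment; `version_lines` and `versions_match` are hypothetical helpers.

```shell
# Hypothetical helpers, not part of the repository.

# Print the "*version = ..." assignments from a tfvars file with all
# whitespace removed and lines sorted, so layout differences are ignored.
version_lines() {
  grep -E '^[[:space:]]*[A-Za-z0-9_]*version[[:space:]]*=' "$1" \
    | sed -E 's/[[:space:]]+//g' \
    | sort
}

# Succeeds (exit 0) only when both files pin exactly the same versions.
versions_match() {
  [ "$(version_lines "$1")" = "$(version_lines "$2")" ]
}

# Example:
# versions_match primary.tfvars secondary.tfvars || echo "version mismatch" >&2
```

The check also fails when one file defines a version variable that the other omits, which is usually what you want here.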

Deploy in targeted order:

terraform
terraform init
terraform apply -var-file=secondary.tfvars -target=module.vpc
terraform apply -var-file=secondary.tfvars -target=module.gke
# ... (targeted deployment per the setup guide)
terraform apply -var-file=secondary.tfvars -target=module.dr_failover_function  # Cloud Function

GLB Backend Registration

The standby cluster self-registers its instance groups into the primary's GLB backend service on first deploy (via terraform_data.glb_self_register). No primary redeploy is required.

Verify both backends are registered after standby deploy:

shell
gcloud compute backend-services get-health <BACKEND_SERVICE_NAME> \
  --global --format='table(status.healthStatus[].ipAddress,status.healthStatus[].healthState)'
# Expected: 2+ IPs (primary HEALTHY, standby UNHEALTHY because the standby has no LogScale pods)

If only one backend appears, check that enable_glb_named_port = true and primary_remote_state_config are both set in the standby cluster's secondary.tfvars.
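
To make the health check scriptable, you can count how often each state appears in the get-health output. A minimal sketch, assuming the states show up as the words HEALTHY and UNHEALTHY somewhere in the text; `count_state` is a hypothetical helper and does not depend on the exact table layout.

```shell
# count_state is a hypothetical helper, not part of gcloud.
# Reads text on stdin and prints how many times the given state occurs
# as a whole word, so HEALTHY does not also match inside UNHEALTHY.
count_state() {
  grep -o -w "$1" | wc -l | tr -d ' '
}

# Example (<BACKEND_SERVICE_NAME> is the same placeholder as above):
# health=$(gcloud compute backend-services get-health <BACKEND_SERVICE_NAME> --global)
# echo "healthy:   $(printf '%s\n' "$health" | count_state HEALTHY)"
# echo "unhealthy: $(printf '%s\n' "$health" | count_state UNHEALTHY)"
```

With the standby at rest you would expect at least one HEALTHY entry for the primary and at least one UNHEALTHY entry for the standby.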

After a standby-to-active-to-standby round trip (for example, a DR test followed by failback), the encryption key recovery secret on the standby may be empty. Verify it:

shell
# On STANDBY: must NOT be empty (SHA256 of empty = e3b0c44298fc...)
kubectl get secret <RECOVERY_SECRET> -n log \
  -o jsonpath='{.data.gcp-storage-encryption-key}' | base64 -d | shasum -a 256

If the hash is e3b0c44298fc1c149afbf4c8996fb924..., the key is empty and DR recovery will fail. Redeploy the standby to re-read the primary's encryption key from remote state.
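
The comparison can be automated by checking the computed digest against the full SHA-256 of empty input. A minimal sketch; `key_hash_is_empty` is a hypothetical helper, and <RECOVERY_SECRET> remains a placeholder for your secret name.

```shell
# SHA-256 digest of zero bytes of input.
EMPTY_SHA256="e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"

# key_hash_is_empty is a hypothetical helper.
# Succeeds (exit 0) when the given hex digest is the digest of empty input.
key_hash_is_empty() {
  [ "$1" = "$EMPTY_SHA256" ]
}

# Example, run against the standby cluster:
# hash=$(kubectl get secret <RECOVERY_SECRET> -n log \
#   -o jsonpath='{.data.gcp-storage-encryption-key}' | base64 -d | shasum -a 256 | cut -d' ' -f1)
# if key_hash_is_empty "$hash"; then
#   echo "encryption key is empty; redeploy the standby" >&2
# fi
```

Comparing the whole digest rather than a prefix avoids a false alarm on a real key whose hash happens to start with the same characters.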