Terraform Configuration

This section covers the DR-specific Terraform modules, workspace setup, and deployment sequence for the secondary (standby) cluster.

Key DR mechanisms managed by Terraform:

Encryption key synchronization -- the primary generates the key; the secondary copies it via TFE outputs or remote state. See S3 Storage for DR.
Automated failover -- a Lambda function scales the Humio operator from 0 to 1 when the primary becomes unhealthy. See DR Failover Lambda (module.dr-failover-lambda) for the full event chain, timing, and configuration options.
Health check FQDN locking -- during failover, the Lambda swaps the primary Route53 health check FQDN to failover-locked.invalid to prevent automatic DNS failback. See DNS Architecture and Traffic Flow.
Recovery environment variables -- the S3_RECOVER_FROM_* environment variables are set on the standby cluster at provisioning time but are only consumed when the LogScale pod starts during failover.
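The FQDN lock described above can be checked from a terminal. The following is a minimal sketch, not part of the Terraform modules: the function names are hypothetical, and the only value taken from this page is the failover-locked.invalid sentinel.

```shell
# Sentinel FQDN that the failover Lambda writes to the primary health check.
LOCKED_FQDN="failover-locked.invalid"

# Succeeds (exit 0) when the given health check FQDN is the lock sentinel.
is_failback_locked() {
  [ "$1" = "$LOCKED_FQDN" ]
}

# Prints a one-line summary for the given health check FQDN.
describe_lock_state() {
  if is_failback_locked "$1"; then
    echo "locked: automatic DNS failback is blocked"
  else
    echo "unlocked: health check targets $1"
  fi
}

describe_lock_state "failover-locked.invalid"
describe_lock_state "logscale-dr-primary.example.com"
```

To feed it a live value, the FQDN can be read with `aws route53 get-health-check --health-check-id <id> --query 'HealthCheck.HealthCheckConfig.FullyQualifiedDomainName' --output text`, where `<id>` is the primary health check ID.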

DR Modules

Three DR-specific modules automate failover operations. They are gated by manage_global_dns and dr_failover_lambda_enabled flags and are not deployed in standalone dr="" mode.

Global DNS (`module.global-dns`)

Provides automatic traffic failover between primary and secondary clusters using Route53 Failover Routing. Deployed on the primary cluster only (manage_global_dns = true, requires dr = "active").

Key resources:

Resource	Purpose
`aws_route53_health_check` (primary)	HTTPS health check on /api/v1/status (interval=10s, threshold=3)
`aws_route53_health_check` (secondary)	TCP health check on port 443 (interval=30s, threshold=2)
`aws_route53_record` (primary)	PRIMARY failover record → primary ALB
`aws_route53_record` (secondary)	SECONDARY failover record → secondary ALB

[Figure: Global DNS routing]

Important

Both clusters must use the same global_logscale_hostname value. Mismatched values cause HTTP 404 errors on failover.
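A pre-flight comparison of the two tfvars files can catch a hostname mismatch before any apply. This is a sketch under assumptions: the helper names are hypothetical, and the parser only handles simple single-line `name = "value"` assignments, not full HCL.

```shell
# Print the value of a simple `name = "value"` assignment from a tfvars file.
# $1 = variable name, $2 = path to the tfvars file
tfvars_value() {
  sed -n "s/^[[:space:]]*$1[[:space:]]*=[[:space:]]*\"\([^\"]*\)\".*/\1/p" "$2" | head -n 1
}

# Succeed only when both files set the same, non-empty global_logscale_hostname.
# $1 = primary tfvars file, $2 = secondary tfvars file
check_global_hostname() {
  cgh_primary=$(tfvars_value global_logscale_hostname "$1")
  cgh_secondary=$(tfvars_value global_logscale_hostname "$2")
  if [ -n "$cgh_primary" ] && [ "$cgh_primary" = "$cgh_secondary" ]; then
    echo "OK: global_logscale_hostname = $cgh_primary"
  else
    echo "MISMATCH: primary='$cgh_primary' secondary='$cgh_secondary'" >&2
    return 1
  fi
}
```

For example, `check_global_hostname primary-us-west-2.tfvars secondary-us-east-2.tfvars` exits non-zero if the values differ or if either file omits the variable.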

For details on failback prevention (FQDN locking), ExternalDNS annotation requirements, and DNS configuration by DR mode, see Global DNS Details and DNS Architecture — FQDN Locking Details.

DR Failover Lambda (`module.dr-failover-lambda`)

Automatically scales humio-operator from 0 → 1 on the secondary cluster when the primary becomes unhealthy, and locks the primary health check to prevent DNS failback. Deployed on the standby cluster only (dr = "standby", dr_failover_lambda_enabled = true).

Key resources:

Resource	Purpose
`aws_lambda_function`	Python 3.12 — scales operator and locks primary health check
`aws_cloudwatch_metric_alarm`	Fires when primary health check becomes unhealthy
`aws_sns_topic` + subscription	Connects alarm → Lambda
`aws_iam_role` + policies	EKS API, Route53, and KMS access
`aws_eks_access_entry`	Kubernetes RBAC for operator scaling
`aws_kms_key`	Encryption for Lambda environment variables

Failover chain: Health Check fails → CloudWatch Alarm → SNS → Lambda validates failure duration → cleans stale TLS secrets → scales operator 0 → 1 → locks primary health check FQDN → Operator reconciles HumioCluster → LogScale pod recovers from primary bucket.

Key configuration (tfvars):

Variable	Default	Description
`dr_failover_lambda_pre_failover_failure_seconds`	180	Minimum seconds primary must be failing before failover (0 for testing)
`dr_failover_lambda_enabled`	true	Enable/disable Lambda deployment
`dr_failover_lambda_timeout`	60	Lambda execution timeout (seconds)

Health check IDs are auto-resolved from primary remote state — no manual tfvars entry needed. For the full variable list, internal defaults, EKS access details, and retry logic, see Lambda Function Internals and Lambda Configuration Details.
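The gating rule behind dr_failover_lambda_pre_failover_failure_seconds can be illustrated in a few lines. The Lambda itself is Python and its source is not shown in this section, so this is only a sketch of the rule as described here; the function name and the timestamps are hypothetical.

```shell
# Succeed when the primary has been failing for at least the threshold.
# $1 = epoch seconds when the primary was first seen unhealthy
# $2 = current epoch seconds
# $3 = threshold in seconds (dr_failover_lambda_pre_failover_failure_seconds)
should_fail_over() {
  [ $(( $2 - $1 )) -ge "$3" ]
}

# 100 seconds of failure against the default threshold of 180: keep waiting.
should_fail_over 1000 1100 180 || echo "below threshold: no failover yet"

# Exactly 180 seconds of failure: the threshold is met.
should_fail_over 1000 1180 180 && echo "threshold met: proceed with failover"
```

With a threshold of 0, as suggested for testing, the check passes as soon as the primary is reported unhealthy.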

S3 Storage for DR

The primary writes to its own S3 bucket. The secondary reads the primary bucket during recovery via S3_RECOVER_FROM_* environment variables and uses its own bucket for new writes.

Encryption key synchronization: Primary generates the key (random_password) and exports it as a sensitive output. Secondary reads it via TFE outputs or terraform_remote_state and stores it in the <cluster-name>-s3-storage-encryption Kubernetes secret.

For cross-region IAM policy details and security controls, see S3 Storage for DR — Implementation Details.

EKS Node Group Topology — DR Modes

Node Group	Primary (dr="active")	Standby (dr="standby")	Purpose
Digest	Deployed	Deployed	Core LogScale processing
Kafka	Deployed*	Deployed*	Kafka broker nodes
Ingress	Deployed	Deployed	Load balancer / ingress
UI	Deployed	Not created	Web UI serving
Ingest	Deployed†	Not created	High-volume ingestion

* When provision_kafka_servers = true.

† When cluster_type = "advanced". UI and Ingest node groups are omitted on standby to reduce cost; they are created during promotion via terraform apply.

Component	Active	Standby
Humio operator replicas	1	0
HumioCluster `nodeCount`	`cluster_size`	1 (declared, not running)
Replication factor	Production	1
S3 `force_destroy`	`false`	`true`

For the full topology comparison including non-DR mode, see EKS Node Group Topology — DR Modes.

Workspace Setup for DR Pairs

DR deployments require two Terraform workspaces: one for the primary cluster and one for the secondary. The workspace names used below (primary and secondary) are illustrative; you can choose any names that suit your environment.

First-time setup (create both workspaces):

shell

# 1. Initialize with primary backend config (first time only)
terraform init -backend-config=backend-configs/primary-aws.hcl
# 2. Create the primary workspace (only needed once)
terraform workspace new primary
# 3. Switch to secondary backend config
terraform init -backend-config=backend-configs/secondary-aws.hcl -reconfigure
# 4. Create the secondary workspace (only needed once)
terraform workspace new secondary

Switching between cluster workspaces:

shell

# Switch to primary cluster
terraform workspace select primary
# Switch to secondary cluster
terraform workspace select secondary

Remote State Data Flow

The primary and secondary clusters exchange critical data via terraform_remote_state (or TFE outputs).

Configuration: The secondary cluster's primary_remote_state_config must specify workspace and config.key matching the primary's backend config. For S3 backends, locals.tf constructs the full state path (env:/<workspace>/<key>) automatically — see Terraform Configuration for details on the workspace path workaround.
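To see which object terraform_remote_state will read, the state path can be reproduced by hand. The sketch below assumes the S3 backend's default workspace_key_prefix of `env:`, under which non-default workspaces are stored at `env:/<workspace>/<key>` and the default workspace at `<key>`; the function name is hypothetical and the repository's locals.tf may differ in detail.

```shell
# Print the S3 object key that holds the state for a workspace.
# $1 = workspace name, $2 = backend key from the backend config
remote_state_path() {
  if [ "$1" = "default" ]; then
    # The default workspace is stored directly at the configured key.
    printf '%s\n' "$2"
  else
    printf 'env:/%s/%s\n' "$1" "$2"
  fi
}

remote_state_path primary logscale-aws-eks
# prints: env:/primary/logscale-aws-eks
```

The printed key can be checked against the backend bucket, for example with `aws s3 ls "s3://<bucket>/$(remote_state_path primary <key>)"`, to confirm that the secondary points at a state file that actually exists.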

Secondary reads from primary:

Data	Output Name	Purpose
Encryption key	`s3_storage_encryption_key`	Decrypt/encrypt data in both buckets
Encryption key K8s secret name	`s3_encryption_key_secret_name`	Name of the K8s Secret containing the encryption key
Bucket name	`s3_bucket_id`	`S3_RECOVER_FROM_BUCKET`
Bucket region	`s3_bucket_region`	`S3_RECOVER_FROM_REGION`
Health check ID (primary)	`primary_health_check_id`	Lambda monitors this health check
Health check ID (secondary)	`secondary_health_check_id`	Used for failover DNS routing

Note

Health check IDs are automatically resolved from remote state when primary_remote_state_config is set. You do not need to manually specify dr_primary_health_check_id or dr_secondary_health_check_id in your tfvars. The resolution priority is: explicit tfvars variable > remote state from primary > empty string (same pattern as s3_storage_encryption_key).
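The resolution priority in the note above amounts to "first non-empty value wins". The following sketch mirrors that rule outside Terraform; the function name and sample values are hypothetical, and the actual logic lives in the Terraform configuration.

```shell
# Print the effective value: explicit tfvars value, else remote state, else empty.
# $1 = value set explicitly in tfvars (may be empty)
# $2 = value read from the primary's remote state (may be empty)
resolve_setting() {
  if [ -n "$1" ]; then
    printf '%s\n' "$1"
  elif [ -n "$2" ]; then
    printf '%s\n' "$2"
  else
    printf '\n'
  fi
}

resolve_setting "hc-explicit" "hc-remote"   # prints: hc-explicit
resolve_setting "" "hc-remote"              # prints: hc-remote
```

An empty result means neither source supplied a value, which for the health check IDs leaves the Lambda without a health check to monitor.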

Module Deployment Matrix

Module	`dr=""`	`dr="active"`	`dr="standby"`	Notes
`module.vpc`	Yes	Yes	Yes	VPC, subnets, NAT gateways
`module.eks`	Yes	Yes	Yes	EKS cluster + node groups
`module.pre-install`	Yes	Yes	Yes	Namespaces, encryption secret
`module.logscale`	Yes (operator replicas: 1)	Yes (operator replicas: 1)	Yes (operator replicas: 0)	Kafka, Nginx, HumioCluster
`module.global-dns`	Always instantiated; resources gated by `manage_global_dns=true` (requires `dr="active"`)	Resources deploy when `manage_global_dns=true`	Instantiated but precondition blocks `manage_global_dns=true`	Route53 zone, health checks, failover records
`module.dr-failover-lambda`	No	No	When `dr_failover_lambda_enabled=true`	Lambda, alarm, SNS

Notes:

DR module conditions: module.global-dns is always instantiated (no count guard) but its resources only deploy when manage_global_dns=true, which requires dr="active" (enforced by precondition). module.dr-failover-lambda only deploys when dr="standby" and dr_failover_lambda_enabled=true.
Keep manage_global_dns=true only in a single workspace to avoid two states managing the same failover records/zone.
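The gating rules in the matrix can be expressed as a small decision function, which is handy for sanity-checking a tfvars combination before planning. This is a sketch of the rules as stated on this page; the function name is hypothetical and Terraform's own preconditions remain the authoritative check.

```shell
# Print the DR modules whose resources deploy for a given flag combination.
# $1 = dr ("", "active" or "standby")
# $2 = manage_global_dns ("true" or "false")
# $3 = dr_failover_lambda_enabled ("true" or "false")
dr_modules_for() {
  if [ "$2" = "true" ] && [ "$1" != "active" ]; then
    echo "error: manage_global_dns=true requires dr=\"active\"" >&2
    return 1
  fi
  if [ "$2" = "true" ]; then
    echo "module.global-dns"
  fi
  if [ "$1" = "standby" ] && [ "$3" = "true" ]; then
    echo "module.dr-failover-lambda"
  fi
}

dr_modules_for active true false    # prints: module.global-dns
dr_modules_for standby false true   # prints: module.dr-failover-lambda
```

Standalone mode (`dr_modules_for "" false true`) prints nothing, matching the "No" entries in the matrix.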

Module Dependency Graph

Follow this order to apply Terraform safely and avoid dependency issues.

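Each module references outputs from upstream modules, so modules must be applied in dependency order: module.vpc, then module.eks, module.pre-install, module.logscale.module.crds, module.logscale, and finally module.global-dns (primary only) or module.dr-failover-lambda (standby only). Deploying out of order results in missing references or Terraform errors.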

Note

module.eks creates the S3 bucket, IAM roles, and ACM certificates that module.pre-install and module.logscale consume. When module.eks and one of these consuming modules are included in the same targeted apply (-target), Terraform resolves the dependency automatically and creates the module.eks resources first.

Primary Cluster - DR-Specific Settings

The primary cluster is provisioned as usual with dr="active".

Minimal primary-us-west-2.tfvars:

terraform

dr = "active"
aws_region = "us-west-2"
cluster_name = "dr-primary"
# Global DNS (only on primary)
manage_global_dns = true
global_logscale_hostname = "logscale-dr"
primary_logscale_hostname = "logscale-dr-primary"
secondary_logscale_hostname = "logscale-dr-secondary"
zone_name = "<your-domain.example.com>"

Commands:

shell

# Initialize the primary backend (add -reconfigure when switching from the secondary backend),
# then select the primary workspace
terraform init -backend-config=backend-configs/primary-aws.hcl
terraform workspace select primary # or: terraform workspace new primary
# 1. VPC, subnets, NAT gateway, security groups
terraform apply -var-file=primary-us-west-2.tfvars \
-target="module.vpc"
# 2. EKS cluster, node groups, S3 bucket, IAM roles, ACM certificate
terraform apply -var-file=primary-us-west-2.tfvars \
-target="module.eks"
# 3. Pre-install (namespace, encryption key secret, ALB controller, ExternalDNS)
terraform apply -var-file=primary-us-west-2.tfvars \
-target="module.pre-install"
# 4. CRDs (cert-manager, strimzi, humio-operator CRDs must exist before LogScale resources)
terraform apply -var-file=primary-us-west-2.tfvars \
-target="module.logscale.module.crds"
# 5. LogScale application stack (Strimzi Kafka, cert-manager, Nginx, HumioCluster)
terraform apply -var-file=primary-us-west-2.tfvars \
-target="module.logscale"
# 6. Global DNS -- Route53 health checks and failover records (primary only)
terraform apply -var-file=primary-us-west-2.tfvars \
-target="module.global-dns"
# Final: full apply to ensure all resources are in sync
terraform apply -var-file=primary-us-west-2.tfvars

Verify:

shell

aws eks describe-cluster --name dr-primary --region us-west-2 --query 'cluster.tags.dr'
# => "active"
terraform output
# shows s3_bucket_id, s3_bucket_region, and a sensitive s3_storage_encryption_key

Workspace Safety Validation

Existing precondition blocks in the Terraform modules prevent dangerous cross-workspace misconfigurations at plan time:

Module	Precondition	Blocks if
global-dns	`manage_global_dns` requires `dr="active"`	Standby cluster tries to manage DNS records
pre-install/s3	`dr="standby"` requires `existing_s3_encryption_key`	Standby applied without primary encryption key

Recommended additional guard: add a workspace_name variable to each tfvars file and a matching precondition that checks terraform.workspace. The example.tfvars already includes a commented workspace_name field for this purpose.

terraform

# primary-us-west-2.tfvars
workspace_name = "primary"
dr = "active"
# secondary-us-east-2.tfvars
workspace_name = "secondary"
dr = "standby"

If implemented, Terraform will block any apply where the workspace does not match:

text

WORKSPACE MISMATCH - EXECUTION BLOCKED
Current workspace: 'default'
tfvars workspace: 'secondary'
Fix: terraform workspace select secondary
OR use the correct tfvars file for 'default' workspace
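Until such a precondition exists in the modules, a similar guard can run in the shell before each apply. This is a sketch only: the function name is hypothetical, and the message loosely follows the format shown above rather than reproducing Terraform's output.

```shell
# Succeed only when the current workspace matches the tfvars workspace_name.
# $1 = current workspace, $2 = workspace_name declared in the tfvars file
check_workspace() {
  if [ "$1" = "$2" ]; then
    return 0
  fi
  echo "WORKSPACE MISMATCH - EXECUTION BLOCKED" >&2
  echo "Current workspace: '$1'" >&2
  echo "tfvars workspace: '$2'" >&2
  echo "Fix: terraform workspace select $2" >&2
  return 1
}
```

A typical invocation chains the check in front of the apply, for example `check_workspace "$(terraform workspace show)" secondary && terraform apply -var-file=secondary-us-east-2.tfvars`, so the apply never starts in the wrong workspace.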

Secondary Cluster Deployment

The secondary cluster deploys the same shared infrastructure modules plus the DR failover Lambda. Set dr = "standby" in your tfvars. The standby cluster reads the primary's state to obtain storage credentials, encryption keys, and health check IDs — all automatically via terraform_remote_state when primary_remote_state_config is configured.

Minimal secondary-us-east-2.tfvars:

terraform

dr = "standby"
aws_region = "us-east-2"
cluster_name = "dr-secondary"
# Global DNS hostname (must match primary)
global_logscale_hostname = "logscale-dr"
primary_logscale_hostname = "logscale-dr-primary"
secondary_logscale_hostname = "logscale-dr-secondary"
zone_name = "<your-domain.example.com>"
manage_global_dns = false # Important: avoid two states managing global DNS
# Remote state to fetch primary outputs
primary_remote_state_config = {
  backend = "s3"
  workspace = "primary"
  config = {
    bucket = "logscale-tf-backend"
    key = "env:/logscale-aws-eks"
    region = "us-west-2"
    profile = "your-aws-profile"
    encrypt = true
  }
}
# Recovery hints (fallback if remote state is unavailable)
s3_recover_from_region = "us-west-2"
s3_recover_from_bucket = "<primary-bucket-name>"
s3_recover_from_encryption_key_secret_name = "dr-secondary-s3-storage-encryption"
s3_recover_from_encryption_key_secret_key = "s3-storage-encryption-key"

Auto-resolved from remote state (no need to set in tfvars when primary_remote_state_config is configured):

s3_storage_encryption_key — fetched as existing_s3_encryption_key
primary_health_check_id and secondary_health_check_id — used by Lambda and CloudWatch alarm

Important

primary_remote_state_config alignment: The workspace and config.key values must exactly match the primary cluster's backend configuration (see Terraform Configuration). Misaligned values cause terraform_remote_state to read the wrong state file, resulting in a different encryption key. This causes AEADBadTagException when the secondary LogScale pod tries to decrypt the global snapshot. Verify by comparing encryption key hashes:

shell

# These must produce identical hashes
kubectl get secret -n logging dr-primary-s3-storage-encryption --context dr-primary -o jsonpath='{.data.s3-storage-encryption-key}' | base64 -d | shasum -a 256
kubectl get secret -n logging dr-secondary-s3-storage-encryption --context dr-secondary -o jsonpath='{.data.s3-storage-encryption-key}' | base64 -d | shasum -a 256

Deployment sequence:

shell

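# Initialize the secondary backend first (-reconfigure is needed when switching from the primary backend)
terraform init -backend-config=backend-configs/secondary-aws.hcl -reconfigure
terraform workspace select secondary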

terraform apply -var-file=secondary-us-east-2.tfvars -target="module.vpc"
terraform apply -var-file=secondary-us-east-2.tfvars -target="module.eks"
terraform apply -var-file=secondary-us-east-2.tfvars -target="module.pre-install"
terraform apply -var-file=secondary-us-east-2.tfvars -target="module.logscale.module.crds"
terraform apply -var-file=secondary-us-east-2.tfvars -target="module.logscale"
terraform apply -var-file=secondary-us-east-2.tfvars -target="module.dr-failover-lambda"
terraform apply -var-file=secondary-us-east-2.tfvars  # final full apply

Standby Readiness Checklist:

Check	Command	Expected
Humio operator scaled to 0	`kubectl --context dr-secondary -n logging get deploy humio-operator`	replicas: 0
Kafka pods running	`kubectl --context dr-secondary -n logging get pods | grep kafka`	All pods Running
Ingress has ALB	`kubectl --context dr-secondary -n logging get ingress`	ALB address assigned
S3 recovery env vars set	`kubectl --context dr-secondary -n logging get humiocluster -o yaml | grep S3_RECOVER`	Env vars present
Encryption keys match	Compare `shasum -a 256` output above	Identical hashes
Lambda exists	`aws lambda get-function --function-name <prefix>-handler --region us-east-2`	Function listed

Note

Kafka must be running before LogScale starts. Strimzi generates the Kafka TLS truststore secret only after Kafka is up; if LogScale starts before this secret exists, the pod crashloops. For the full standby topology (which node groups and pods are running vs. not), see EKS Node Group Topology — DR Modes.
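When scripting around this ordering constraint, a small polling helper can wait for the truststore secret before anything starts LogScale. This is a generic sketch: the helper name is hypothetical, and the secret name is not given on this page, so it appears only as a placeholder in the usage note.

```shell
# Run a command repeatedly until it succeeds or the attempts run out.
# $1 = maximum attempts, $2 = seconds to sleep between attempts, rest = command
wait_until() {
  wu_attempts=$1
  wu_delay=$2
  shift 2
  wu_i=1
  while [ "$wu_i" -le "$wu_attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    # Sleep only when another attempt will follow.
    if [ "$wu_i" -lt "$wu_attempts" ]; then
      sleep "$wu_delay"
    fi
    wu_i=$(( wu_i + 1 ))
  done
  return 1
}
```

For example, `wait_until 30 10 kubectl --context dr-secondary -n logging get secret <kafka-truststore-secret>` polls for up to about five minutes and returns non-zero if the secret never appears.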

Accessing the Clusters

Terraform does not require a kubeconfig file -- the Kubernetes and Helm providers read EKS credentials directly from module.eks outputs. Cluster-specific kubeconfig files are auto-generated on terraform apply as kubeconfig-<cluster-name>.yaml in the repository root (git-ignored).

shell

# Single cluster access
export KUBECONFIG=./kubeconfig-dr-primary.yaml
kubectl get nodes

DR dual-cluster access:

shell

# Merge both kubeconfigs (one per workspace)
export KUBECONFIG=./kubeconfig-dr-primary.yaml:./kubeconfig-dr-secondary.yaml
# Use contexts (context name = cluster_name from tfvars)
kubectl --context dr-primary get nodes
kubectl --context dr-secondary get nodes

Note

The kubeconfig uses aws eks get-token with the aws_profile set in your tfvars. Ensure your AWS CLI profile has valid credentials before running kubectl commands.

Kubernetes Access

Ensure kubectl contexts are configured for both clusters:

shell

# Configure contexts (run once)
aws eks update-kubeconfig --name dr-primary --region us-west-2 --alias dr-primary
aws eks update-kubeconfig --name dr-secondary --region us-east-2 --alias dr-secondary
# Verify access
kubectl --context dr-primary cluster-info
kubectl --context dr-secondary cluster-info
