Operations Guide

This is the operations guide.

Two clusters are managed using separate Terraform state files:

Primary (Region A): production cluster, dr="active".
Secondary (Region B): standby cluster, dr="standby", running at minimal capacity. It reads the primary's Azure Blob Storage container using the same encryption key, which is pulled from the primary's remote state, and keeps all LogScale pods scaled to zero until a failover/promotion is initiated.
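
As a rough illustration of how the standby might read the primary's encryption key from remote state, the sketch below uses Terraform's `terraform_remote_state` data source with the `azurerm` backend. The backend settings, the local name, and the `storage_encryption_key` output name are all assumptions made for illustration; substitute whatever your primary state actually exposes.

```hcl
# Minimal sketch, not the module's actual code.
variable "dr" {
  type        = string
  description = "DR mode: \"active\" for the primary, \"standby\" for the secondary."
}

# Only the standby needs to look up the primary's state.
data "terraform_remote_state" "primary" {
  count   = var.dr == "standby" ? 1 : 0
  backend = "azurerm"

  config = {
    resource_group_name  = "example-tfstate-rg" # assumption
    storage_account_name = "exampletfstate"     # assumption
    container_name       = "tfstate"            # assumption
    key                  = "primary.tfstate"    # assumption
  }
}

locals {
  # "storage_encryption_key" is a hypothetical output name on the primary.
  primary_encryption_key = var.dr == "standby" ? data.terraform_remote_state.primary[0].outputs.storage_encryption_key : null
}
```

Reading the key from state rather than copying it into tfvars keeps the two clusters in sync without hardcoding the secret in either configuration.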

Region flexibility:

Primary and secondary clusters must be deployed in Azure paired regions. Storage VNet service endpoint ACLs require paired regions for cross-region access. Update azure_resource_group_region in your tfvars, the remote state configuration, and any region-specific references (for example Traffic Manager/DNS) to match your chosen regions.
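
A pair of tfvars files for the two clusters could look like the sketch below. The file names are assumptions; `azure_resource_group_region` and `dr` are the settings named in this guide, and centralus/eastus2 are the example regions used in the capabilities table.

```hcl
# primary.tfvars (file name is an assumption)
azure_resource_group_region = "centralus"
dr                          = "active"

# secondary.tfvars (file name is an assumption; keep it as a separate file)
# azure_resource_group_region = "eastus2"
# dr                          = "standby"
```

If you pick a different pair of regions, change both files together and update the remote state configuration and any Traffic Manager/DNS references at the same time.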

Key features:

Automated encryption key synchronization (no hardcoding). Standby apply requires the primary key (remote state or explicit value).
Cross-region storage access: the primary storage firewall is updated to allow the secondary NAT gateway IP, and the primary storage account key is read from remote state (AZURE_RECOVER_FROM_ACCOUNTKEY). Terraform also assigns the Storage Blob Data Reader role to the standby AKS managed identity.
Alerts toggle automatically via ENABLE_ALERTS based on dr (true for active, false for standby).
Standby keeps the Humio operator scaled to 0; an Azure Function (or a manual step) scales the operator to 1 on failover. nodeCount is already set to 1 in the HumioCluster manifest; running terraform apply on the secondary resets the operator to 0 replicas whenever dr="standby".
Manual, controlled promotion by changing dr and applying Terraform.
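
The dr-driven behavior listed above can be pictured as a handful of derived locals. This is a standalone sketch under assumed names (`enable_alerts`, `operator_replicas`), not the module's real internals; only `dr`, its two values, and the resulting settings come from this guide.

```hcl
variable "dr" {
  type        = string
  description = "DR mode for this cluster."

  validation {
    condition     = contains(["active", "standby"], var.dr)
    error_message = "The dr value must be \"active\" or \"standby\"."
  }
}

locals {
  is_standby = var.dr == "standby"

  # Value for ENABLE_ALERTS: "true" when active, "false" when standby.
  enable_alerts = tostring(!local.is_standby)

  # Humio operator replicas: 0 on standby until failover, 1 when active.
  operator_replicas = local.is_standby ? 0 : 1
}
```

Under this model, promotion amounts to setting dr = "active" in the secondary's tfvars and running terraform apply, which flips every derived value in one step.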

Key capabilities:

| Feature | Primary (Active) | Secondary (Standby) |
| --- | --- | --- |
| Region | Region A (e.g., centralus) | Region B (e.g., eastus2) |
| Cluster Type | Advanced (full production) | Standby (Humio operator off) |
| Node Pools | All pools per cluster_size (system/digest/ingress/ingest/ui/kafka) | System/digest/ingress/kafka only; UI and Ingest node pools not created |
| Humio `nodeCount` | `cluster_size` digest count | `nodeCount=1` declared, but no pods run until operator is scaled up |
| Humio operator | 1 replica | 0 replicas until failover |
| Replication Factor | Production value | 1 (overridden) |
| Auto Rebalance | Enabled | Disabled |
| Storage Container | `terraform output -raw storage_acct_container_name` | `terraform output -raw storage_acct_container_name` |
| Encryption Key | Generated on first deploy | Pulled from primary state (required for standby apply) |
| Terraform Workspace | primary | secondary |
| DR Mode | `dr = "active"` | `dr = "standby"` |