Stage 2: Verification

After deploying both clusters, verify the following before considering DR operational.

Note

See the DR Post-Install Checklist page for step-by-step commands with expected outputs.

Encryption Key Sync

The secondary cluster must have the same storage encryption key as the primary. Without this, snapshot recovery will fail silently.

The primary stores its key in <INFRA_PREFIX>-gcp-storage-encryption-key. The standby stores a copy in a recovery secret whose name is configured by gcp_recover_from_encryption_key_secret_name (common names: gcs-storage-encryption-recovery or dr-secondary-gcs-storage-encryption).

shell
# On PRIMARY
kubectl get secret <INFRA_PREFIX>-gcp-storage-encryption-key \
  -n log -o jsonpath='{.data.gcp-storage-encryption-key}' | base64 -d | shasum -a 256

# On STANDBY, find the recovery secret name first:
kubectl get secrets -n log | grep -iE 'recovery|dr.*encrypt'
# Then hash it:
kubectl get secret <RECOVERY_SECRET> \
  -n log -o jsonpath='{.data.gcp-storage-encryption-key}' | base64 -d | shasum -a 256

# Both SHA256 hashes MUST match. If not, re-run terraform apply on the secondary.
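
The comparison above can be scripted so that an empty or missing secret is never mistaken for a match (two failed lookups would otherwise hash to the same value). The following is a minimal sketch, not part of the deployment tooling: the helper names `hash_b64_secret` and `compare_key_hashes` and the `<primary-context>` / `<standby-context>` kubectl context placeholders are assumptions made for illustration.

```shell
# Sketch only: helper names and kubectl contexts are hypothetical.

# Read a base64-encoded secret value on stdin and print its SHA-256 hash.
# Fails on empty input so that two failed lookups never compare as equal.
hash_b64_secret() {
  hbs_value=$(cat)
  if [ -z "$hbs_value" ]; then
    echo "error: secret value is empty" >&2
    return 1
  fi
  # Prefer sha256sum (common on Linux); fall back to shasum (common on macOS).
  if command -v sha256sum >/dev/null 2>&1; then
    printf '%s' "$hbs_value" | base64 -d | sha256sum | awk '{print $1}'
  else
    printf '%s' "$hbs_value" | base64 -d | shasum -a 256 | awk '{print $1}'
  fi
}

# Print MATCH and return 0 only when both hashes are non-empty and identical.
compare_key_hashes() {
  if [ -n "$1" ] && [ "$1" = "$2" ]; then
    echo "MATCH"
  else
    echo "MISMATCH"
    return 1
  fi
}

# Example usage (replace the placeholders before running):
#   primary_hash=$(kubectl --context <primary-context> get secret <INFRA_PREFIX>-gcp-storage-encryption-key \
#     -n log -o jsonpath='{.data.gcp-storage-encryption-key}' | hash_b64_secret)
#   standby_hash=$(kubectl --context <standby-context> get secret <RECOVERY_SECRET> \
#     -n log -o jsonpath='{.data.gcp-storage-encryption-key}' | hash_b64_secret)
#   compare_key_hashes "$primary_hash" "$standby_hash"
```

A non-zero exit status from `compare_key_hashes` makes the check easy to use as a gate in a post-deploy script.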

Cross-Region GCS Access

The secondary cluster's service account must have read access to the primary's GCS bucket.

shell
# From SECONDARY cluster's service account
gcloud storage ls gs://logscale-primary-us-central1-your-project/ \
  --impersonate-service-account=<secondary-sa>@<project>.iam.gserviceaccount.com

If access is denied, verify IAM bindings and re-run terraform apply -target=module.gke on the secondary.

Workload Identity

Confirm that the Kubernetes service account on the secondary is annotated with the correct GCP service account.

shell
# On SECONDARY
kubectl get sa <humio-sa> -n log \
  -o jsonpath='{.metadata.annotations.iam\.gke\.io/gcp-service-account}'
# Should output the GCP SA email
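
An empty result from the command above means the annotation is missing. As a hedged sketch, the output can be checked against the general shape of a GCP service account email; `is_gcp_sa_email` is a hypothetical helper, and the pattern is a loose format check only. It does not confirm that the account exists or that the Workload Identity binding is in place.

```shell
# Sketch only: is_gcp_sa_email is a hypothetical helper.
# Returns 0 when the argument looks like <name>@<project>.iam.gserviceaccount.com.
is_gcp_sa_email() {
  printf '%s\n' "$1" | grep -Eq '^[a-z0-9-]+@[a-z0-9.-]+\.iam\.gserviceaccount\.com$'
}

# Example usage (replace <humio-sa> before running):
#   annotation=$(kubectl get sa <humio-sa> -n log \
#     -o jsonpath='{.metadata.annotations.iam\.gke\.io/gcp-service-account}')
#   if is_gcp_sa_email "$annotation"; then
#     echo "annotation present: $annotation"
#   else
#     echo "annotation missing or malformed"
#   fi
```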

Node Pool Topology

shell
# PRIMARY (should have all pools: digest, ui, ingest, kafka)
gcloud container node-pools list --cluster=<primary-cluster> --region=us-central1

# SECONDARY (same GKE node pools as primary; cost savings come from the operator running at 0 replicas, not from fewer pools)
gcloud container node-pools list --cluster=<secondary-cluster> --region=us-west1
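
Because the secondary is expected to have the same node pools as the primary, the two listings can be compared by name instead of by eye. This is a sketch under one assumption: pool names are identical across the two clusters. If your naming includes a per-cluster prefix, strip it before comparing. `compare_pool_names` is a hypothetical helper.

```shell
# Sketch only: compare_pool_names is a hypothetical helper.
# Takes two newline-separated lists of node pool names and compares them,
# ignoring order and duplicates.
compare_pool_names() {
  cpn_a=$(printf '%s\n' "$1" | sort -u)
  cpn_b=$(printf '%s\n' "$2" | sort -u)
  if [ "$cpn_a" = "$cpn_b" ]; then
    echo "node pools match"
  else
    echo "node pools differ"
    return 1
  fi
}

# Example usage (replace the cluster placeholders before running):
#   primary_pools=$(gcloud container node-pools list --cluster=<primary-cluster> \
#     --region=us-central1 --format='value(name)')
#   secondary_pools=$(gcloud container node-pools list --cluster=<secondary-cluster> \
#     --region=us-west1 --format='value(name)')
#   compare_pool_names "$primary_pools" "$secondary_pools"
```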

HumioCluster State

shell
# PRIMARY
kubectl get humiocluster -n log -o yaml | grep -A5 'nodePools'
# Should show dedicated pool definitions for each node role

# SECONDARY
kubectl get humiocluster -n log -o yaml | grep -A5 'nodePools'
# Should show nodePools with nodeCount=1 per pool (ui, ingest-only).
# Pods are NOT created because operator is at 0 replicas.

Recovery Environment Variables

On the standby cluster, verify that the recovery environment variables are configured in the HumioCluster CR spec. Pods are not running on the standby, so check the CR rather than using pod exec:

shell
kubectl get humiocluster -n log -o jsonpath='{.items[0].spec.environmentVariables}' | \
  python3 -m json.tool | grep -A1 GCP_RECOVER

Expected variables (the command prints them as JSON name/value pairs; values will match your worker config):

terraform
GCP_RECOVER_FROM_BUCKET=<primary-bucket-name>
GCP_RECOVER_FROM_WORKLOAD_IDENTITY=true
GCP_RECOVER_FROM_REPLACE_REGION=<primary-region>/<standby-region>

GCP_RECOVER_FROM_REPLACE_BUCKET should only be present if bucket replication is configured.

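Also verify GCP_RECOVER_FROM_REGION is NOT set (GCS does not need it). Pods are not running on the standby, so check the CR rather than a pod's environment:

shell
kubectl get humiocluster -n log -o jsonpath='{.items[0].spec.environmentVariables[*].name}' | \
  tr ' ' '\n' | grep -x GCP_RECOVER_FROM_REGION || echo 'NOT SET (correct)'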
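
The presence and absence checks in this section can be combined into one pass over the variable names in the CR. The sketch below assumes the three variables listed above are the required set for your configuration; `check_recovery_env` is a hypothetical helper, and GCP_RECOVER_FROM_REPLACE_BUCKET is deliberately not checked because it is optional.

```shell
# Sketch only: check_recovery_env is a hypothetical helper.
# Reads space- or newline-separated environment variable names on stdin,
# reports any required name that is missing, and flags GCP_RECOVER_FROM_REGION
# if it is present.
check_recovery_env() {
  cre_names=$(tr -s ' ' '\n')
  cre_status=0
  for cre_required in GCP_RECOVER_FROM_BUCKET \
                      GCP_RECOVER_FROM_WORKLOAD_IDENTITY \
                      GCP_RECOVER_FROM_REPLACE_REGION; do
    if ! printf '%s\n' "$cre_names" | grep -qx "$cre_required"; then
      echo "MISSING: $cre_required"
      cre_status=1
    fi
  done
  if printf '%s\n' "$cre_names" | grep -qx 'GCP_RECOVER_FROM_REGION'; then
    echo "UNEXPECTED: GCP_RECOVER_FROM_REGION"
    cre_status=1
  fi
  if [ "$cre_status" -eq 0 ]; then
    echo "recovery variables OK"
  fi
  return "$cre_status"
}

# Example usage on the standby:
#   kubectl get humiocluster -n log \
#     -o jsonpath='{.items[0].spec.environmentVariables[*].name}' | check_recovery_env
```

The helper only checks names. Values still need to be compared against your worker config.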

Global Load Balancer

shell
# Check GLB backend health
gcloud compute backend-services get-health <backend-service-name> --global

# Check DNS resolution
dig +short logscale.yourdomain.com
dig +short dr-primary.yourdomain.com
dig +short dr-secondary.yourdomain.com

All three hostnames should resolve. The global hostname should point to the primary's IP while the primary is healthy.
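
The last condition can be checked by comparing the records returned for the global and primary hostnames. This is a rough sketch: `same_records` is a hypothetical helper, and it assumes both names resolve directly to A records. If either name is a CNAME, `dig +short` also prints the intermediate target, and the lists may differ even when the final IP is the same.

```shell
# Sketch only: same_records is a hypothetical helper.
# Compares two newline-separated record lists, ignoring order.
# An empty first list counts as a failure (the name did not resolve).
same_records() {
  [ -n "$1" ] || return 1
  [ "$(printf '%s\n' "$1" | sort)" = "$(printf '%s\n' "$2" | sort)" ]
}

# Example usage:
#   global_ips=$(dig +short logscale.yourdomain.com A)
#   primary_ips=$(dig +short dr-primary.yourdomain.com A)
#   if same_records "$global_ips" "$primary_ips"; then
#     echo "global hostname points at the primary"
#   else
#     echo "global hostname does not match the primary"
#   fi
```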

Cloud Function (if enabled)

shell
gcloud functions describe <function-name> --region=us-west1 --gen2

Verify the function is deployed and its environment variables reference the correct secondary cluster and node pool targets.