Proposal

This section states the intent, audience, scope, and success criteria of the disaster recovery (DR) runbook.

Objective

Provide a clear, repeatable procedure to configure LogScale DR (a primary and a secondary cluster) with Terraform, validate that the secondary is ready, and promote it to active when required.

Audience

DevOps engineers with OCI and Terraform access.

Scope

  • In scope:

    • Terraform workspaces and DR-specific variables (dr, dr_primary_bucket_name, primary_remote_state_config, etc.).

    • Encryption-key synchronization via remote state and Kubernetes secrets.

    • DR environment variables on the HumioCluster (S3_RECOVER_FROM_*, ENABLE_ALERTS, etc.).

    • Verification steps, failover flow (Humio pod + snapshot), and promotion from standby to active.

  • Out of scope:

    • Foundational OCI networking/bootstrap (VCN, subnets, base OKE cluster, shared DNS/account setup).

    • Automatic replication of LogScale assets (repositories, saved searches, alerts, dashboards, widgets, etc.) between clusters.

    • Automatic synchronization of Kubernetes secrets referenced by the HumioCluster (OAuth/SAML, SMTP/Postmark, ingest tokens, custom application secrets).

    • Client application changes or reconfiguration of producers/senders beyond pointing them at the global DR FQDN.

Success criteria

  • With dr="standby", the secondary cluster is fully provisioned (Kafka, ingress, cert-manager) and ready to start LogScale on failover; LogScale pods remain scaled to 0 until the operator is scaled up.

  • After failover (operator scaled 0→1), the secondary LogScale pod fetches the global snapshot from the PRIMARY Object Storage bucket and becomes Ready.
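The two success criteria above can be read as simple pass/fail checks. The sketch below is illustrative only: the SecondaryState record, its field names, and the component keys are hypothetical, and it assumes the observed values (operator replica count, pod readiness, the bucket the snapshot was fetched from) have already been collected by whatever tooling the runbook uses. It does not call Terraform, Kubernetes, or LogScale.

```python
from dataclasses import dataclass, field

# Supporting components named in the standby criterion.
# The dictionary keys are hypothetical labels, not real resource names.
REQUIRED_COMPONENTS = ("kafka", "ingress", "cert-manager")


@dataclass
class SecondaryState:
    """Observed state of the secondary cluster (hypothetical record)."""
    dr_mode: str                                  # value of the Terraform variable `dr`
    components_ready: dict = field(default_factory=dict)
    operator_replicas: int = 0                    # replicas of the operator deployment
    logscale_pods: int = 0                        # number of LogScale pods present
    logscale_pod_ready: bool = False              # True once the LogScale pod is Ready
    snapshot_bucket: str = ""                     # bucket the global snapshot came from


def standby_ok(state: SecondaryState) -> bool:
    """Criterion 1: provisioned and waiting, with LogScale scaled to 0."""
    return (
        state.dr_mode == "standby"
        and all(state.components_ready.get(c, False) for c in REQUIRED_COMPONENTS)
        and state.operator_replicas == 0
        and state.logscale_pods == 0
    )


def failover_ok(state: SecondaryState, primary_bucket: str) -> bool:
    """Criterion 2: operator scaled up, snapshot read from the primary bucket, pod Ready."""
    return (
        state.operator_replicas >= 1
        and state.logscale_pod_ready
        and state.snapshot_bucket == primary_bucket
    )
```

A caller would build one SecondaryState before failover and another after scaling the operator, then pass the primary bucket name (for example, the value supplied as dr_primary_bucket_name) to failover_ok.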