Why nodePools = null on Standby

When dr="standby", the HumioCluster spec sets nodePools = null to prevent the humio-operator from entering a reconciliation loop. The shared logscale-kubernetes module generates nodePool specs for all pool types (digest, UI, ingest), emitting nodeCount=0 for pools that are not deployed on standby. The operator interprets these zero-count pools as stale status entries and cleans them up each cycle; that churn prevents even the digest pod from being created. Passing null instead of a list of zero-count pools gives the operator nothing to clean up.
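A minimal sketch of how this could look in the Terraform module (the local and module output names here are hypothetical; only the `var.dr` conditional is taken from the source):

```hcl
locals {
  # nodePools value passed into the HumioCluster spec.
  # Keyed off var.dr only -- dr_use_dedicated_routing plays no part here.
  # On standby we pass null rather than the generated pool list, because
  # the generated list would contain nodeCount=0 entries that the operator
  # would repeatedly "clean up" as stale status.
  humiocluster_node_pools = (
    var.dr == "standby"
      ? null
      : module.logscale_kubernetes.node_pools  # hypothetical module output
  )
}
```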

Important

nodePools is tied to var.dr == "standby", not to dr_use_dedicated_routing.

During two-phase promotion:

  • Phase 1 (dr="active", dr_use_dedicated_routing=false): nodePools are restored so UI/Ingest pods begin scaling up

  • Phase 2 (dr_use_dedicated_routing=true): Pool-specific selectors are enabled once pods are ready

Nulling nodePools during Phase 1 would cause a 503 outage in Phase 2 because selectors would update instantly but pods would take minutes to start.
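The two-phase promotion above might be driven by tfvars changes like the following (variable names are taken from the text; the file layout is illustrative):

```hcl
# Phase 1: promote standby to active. nodePools are restored and
# UI/Ingest pods start scaling up, but routing is left unchanged.
dr                       = "active"
dr_use_dedicated_routing = false

# Phase 2: applied only after the UI/Ingest pods report Ready.
# Flipping this earlier would repoint selectors at pods that
# do not exist yet, causing 503s.
# dr_use_dedicated_routing = true
```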

Terraform Implementation

The node pool logic is in the EKS node groups module.

The node group configuration uses a conditional expression based on the dr variable. When dr="standby", only digest, kafka, and ingress node groups are deployed. When dr="active", the full set of node groups (including UI and ingest) is deployed based on the logscale_cluster_type setting.
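A sketch of that conditional, assuming hypothetical locals for the individual group definitions (the split between standby and active groups follows the text; everything else is illustrative):

```hcl
locals {
  # Always deployed, even on standby: digest keeps replicated data
  # flowing, kafka backs it, and ingress terminates traffic.
  standby_node_groups = {
    digest  = local.digest_group
    kafka   = local.kafka_group
    ingress = local.ingress_group
  }

  # Full set for an active site; which pools exist here would depend
  # on the logscale_cluster_type setting.
  active_node_groups = merge(local.standby_node_groups, {
    ui     = local.ui_group
    ingest = local.ingest_group
  })

  node_groups = (
    var.dr == "standby"
      ? local.standby_node_groups
      : local.active_node_groups
  )
}
```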