Deployment for High Availability

When constructing the HumioCluster resource, you can leverage Kubernetes features such as pod topology spread constraints, pod affinity and anti-affinity, and taints and tolerations. These can all be configured per node pool, and they should be configured so that pods are spread uniformly across availability zones. For example, with a node pool of nodeCount 9 and worker nodes in 3 availability zones, the goal is to place 3 pods in each zone.
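As a sketch, a topology spread constraint on the well-known zone label can express this even spread. The field layout below follows the HumioCluster custom resource shape; the node pool name and the pod label used in `labelSelector` are illustrative assumptions, so verify both against the CRD version and pod labels the operator actually applies in your cluster:

```yaml
apiVersion: core.humio.com/v1alpha1
kind: HumioCluster
metadata:
  name: example-cluster
spec:
  nodePools:
    - name: segments            # illustrative node pool name
      spec:
        nodeCount: 9
        topologySpreadConstraints:
          # Spread the 9 pods evenly across zones: with worker nodes in
          # 3 zones, this yields 3 pods per zone.
          - maxSkew: 1
            topologyKey: topology.kubernetes.io/zone
            whenUnsatisfiable: DoNotSchedule
            labelSelector:
              matchLabels:
                app.kubernetes.io/instance: example-cluster  # assumed pod label
```

`whenUnsatisfiable: DoNotSchedule` makes the spread a hard requirement; `ScheduleAnyway` would relax it to a best-effort preference.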

It is possible to configure one node pool per availability zone and use affinity and/or tolerations so that the pods of each node pool are placed on worker nodes in a specific availability zone. In most cases, however, it is easier to have a single node pool scheduled across multiple availability zones. To make sure the LogScale pods know their availability zone, an init container in the cluster pods gets the Kubernetes worker node name and looks up the availability zone using well-known labels on the worker node resource in the Kubernetes cluster. The zone information collected by the init container is automatically passed to the LogScale pod through the ZONE configuration option.
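Conceptually, the init container's lookup can be sketched as below. This is an illustrative approximation only, not the operator's actual init image or command; the image choice, file path, and volume name are all assumptions, and the operator wires ZONE for you without any of this manual configuration:

```yaml
initContainers:
  - name: zone-lookup
    image: bitnami/kubectl        # hypothetical image providing kubectl
    env:
      - name: NODE_NAME
        valueFrom:
          fieldRef:
            fieldPath: spec.nodeName   # the worker node this pod landed on
    command:
      - sh
      - -c
      # Read the well-known zone label from the node object and share it
      # with the LogScale container via a shared volume.
      - >-
        kubectl get node "$NODE_NAME"
        -o jsonpath='{.metadata.labels.topology\.kubernetes\.io/zone}'
        > /shared/availability-zone
    volumeMounts:
      - name: shared              # assumed shared emptyDir volume
        mountPath: /shared
```

Reading node labels this way requires a service account with permission to get node objects; the main container can then export the file's contents as the ZONE configuration option before starting LogScale.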

With zone information available, LogScale uses it when making placement decisions, such as how to assign digest partitions and where to place segments.