Kubernetes Deployment Requirements
There are certain requirements that must be met before deploying LogScale:
LogScale Requirements
Bucket Storage
When running LogScale in production, it is recommended that the instances running LogScale have local NVMe storage. Depending on the environment in which LogScale is being deployed, these disks may be ephemeral, as is the case with AWS instance-store volumes or Google Cloud local SSDs. When utilizing ephemeral storage, bucket storage is required for a production environment, as it acts as the persistent storage for the cluster.
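As an illustration, bucket storage is typically enabled through LogScale environment variables on the HumioCluster resource. The following is a minimal sketch assuming an S3 bucket; the bucket name, region, encryption key, and the S3_STORAGE_* variable names should be checked against the LogScale configuration reference for the version in use.

    apiVersion: core.humio.com/v1alpha1
    kind: HumioCluster
    metadata:
      name: example-cluster
    spec:
      environmentVariables:
        # Illustrative bucket and region; replace with your own values.
        - name: S3_STORAGE_BUCKET
          value: "example-logscale-bucket"
        - name: S3_STORAGE_REGION
          value: "us-east-1"
        # Example encryption key for data written to the bucket.
        - name: S3_STORAGE_ENCRYPTION_KEY
          value: "example-encryption-key"
        # Required when the local disks are ephemeral and the bucket holds the persistent copy.
        - name: USING_EPHEMERAL_DISKS
          value: "true"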
Kubernetes
The minimum Kubernetes version supported by the Humio Operator can be found in the Version Matrix.
Any Kubernetes platform is supported, but the implementation and usage of some features may differ.
Kafka
LogScale relies on Kafka as a fault-tolerant event bus and internal cluster communication system. The minimum supported version of Kafka can be found here: Kafka Version
In general, we recommend using the latest Kafka version available in the given environment.
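As a sketch, the Kafka connection is typically provided through the KAFKA_SERVERS environment variable on the HumioCluster resource; the bootstrap address below is a placeholder.

    spec:
      environmentVariables:
        # Comma-separated list of Kafka bootstrap servers (placeholder address).
        - name: KAFKA_SERVERS
          value: "kafka-bootstrap.kafka.svc.cluster.local:9092"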
TLS
By default, the Humio Operator utilizes cert-manager to create an internal certificate authority for use by the LogScale cluster. In addition, certificates for external connectivity can be provisioned in conjunction with cert-manager's external issuer support. If LogScale is configured to expose its APIs using HTTPS, which is the default, LogScale assumes Kafka connectivity will also use TLS; this is configurable. In environments that employ service meshes implementing TLS or mTLS, TLS support can be disabled completely.
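For example, in an environment where a service mesh already provides TLS or mTLS, operator-managed certificates can be turned off on the HumioCluster resource. The excerpt below is a sketch assuming the spec.tls.enabled field of the HumioCluster CRD; verify the field against the operator version in use.

    spec:
      tls:
        # Disable operator-managed certificates, e.g. when a service mesh provides mTLS.
        enabled: false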
Kubernetes Environment Requirements
The components of a Kubernetes cluster will vary depending on the environment, but they generally include building blocks such as DNS, ingress, ingress controllers, networking, and storage classes. Depending on these components, some implementation details may vary and care must be taken during implementation. For example, if the environment uses a service mesh that provides TLS connectivity for pods and services, the TLS provisioning feature built into the operator should be disabled.
For a production cluster, an ingress controller with a routing rule to the Humio service is required. The TLS configuration for this controller and for the Humio service will differ depending on whether the environment uses an internal PKI implementation or the built-in automation the operator provides with cert-manager.
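As a sketch, a routing rule to the Humio service could be expressed with a standard Kubernetes Ingress like the one below; the hostname, ingress class, TLS secret, and service name are placeholders, and the operator also exposes its own ingress configuration on the HumioCluster resource that can be used instead.

    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      name: logscale
    spec:
      ingressClassName: nginx             # placeholder ingress controller class
      tls:
        - hosts:
            - logscale.example.com        # placeholder hostname
          secretName: logscale-ingress-tls
      rules:
        - host: logscale.example.com
          http:
            paths:
              - path: /
                pathType: Prefix
                backend:
                  service:
                    name: example-cluster # service created for the HumioCluster (name is illustrative)
                    port:
                      number: 8080        # LogScale HTTP port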
LogScale is a highly multi-threaded piece of software, and we recommend running a single cluster pod per underlying Kubernetes worker node. This can be achieved using Kubernetes pod anti-affinity. Running one cluster pod per worker node lets LogScale manage its own thread priorities, and it limits disk access so that only one cluster pod can access a given data directory. Multiple cluster pods must not use the same data directory, and ensuring they run on separate machines helps achieve that.
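A minimal sketch of such a rule on the HumioCluster resource is shown below. It assumes the spec.affinity field accepts a standard Kubernetes affinity block and that cluster pods carry an app.kubernetes.io/name=humio label; adjust the label selector to match the labels your operator version actually sets.

    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            # Keep LogScale cluster pods on separate worker nodes (label selector is illustrative).
            - labelSelector:
                matchExpressions:
                  - key: app.kubernetes.io/name
                    operator: In
                    values:
                      - humio
              topologyKey: kubernetes.io/hostname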
Kubernetes Worker Requirements
All Kubernetes worker nodes should have labels set for the zones they are located in, using the worker node label topology.kubernetes.io/zone. With this label available on the underlying worker nodes, the LogScale cluster pods become zone aware and use the zone information, e.g. to distribute and replicate data. It is important to consider the best strategy for placing worker nodes across zones so the LogScale cluster pods get scheduled to worker nodes uniformly across multiple availability zones, data centers, or racks.
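For reference, the zone label on a worker node looks like the following; the node name and zone value are placeholders, and managed Kubernetes offerings usually set this label automatically.

    apiVersion: v1
    kind: Node
    metadata:
      name: worker-node-1                          # placeholder node name
      labels:
        topology.kubernetes.io/zone: us-east-1a    # placeholder zone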
There are two overall paths for configuring disks for LogScale. One is a combination of ephemeral storage and bucket storage, and the other is to use network-attached block storage. We recommend using ephemeral disks and bucket storage for production clusters if bucket storage is available. The operator does not partition, format, or mount the local storage on the worker instance; this can be accomplished, for example, with an AWS user-data script, as sketched below. We recommend not mixing instance types or sizes within the same pool of LogScale nodes that share the same configuration, and we do not recommend using spot instances as offered by some cloud providers.
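The following is a sketch of such a user-data script expressed as a cloud-init configuration. It assumes two instance-store NVMe devices at /dev/nvme1n1 and /dev/nvme2n1 and a mount point of /mnt/disks/vol1; device names and paths vary by instance type and setup.

    #cloud-config
    runcmd:
      # Combine the instance-store NVMe devices into a single RAID-0 array (device names are illustrative).
      - mdadm --create /dev/md0 --level=0 --raid-devices=2 /dev/nvme1n1 /dev/nvme2n1
      # Format the array and mount it where the LogScale data volume will point.
      - mkfs.ext4 /dev/md0
      - mkdir -p /mnt/disks/vol1
      - mount /dev/md0 /mnt/disks/vol1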
Ephemeral disk & bucket storage
Worker nodes running LogScale cluster pods must be configured with fast local storage, such as NVMe drives. If there are multiple NVMe drives attached to the machine, they can be combined using RAID-0. Preparing disks for use can be done using features like user data. It is important to configure LogScale to know that the disks are ephemeral by ensuring the environment variables include USING_EPHEMERAL_DISKS=true. This configuration allows LogScale to make better and safer decisions when managing data in the cluster. hostPath mounts can be used to access the locally-attached fast storage from the cluster pods, or potentially any other Kubernetes storage provider which grants direct, unrestricted access to the underlying fast storage.
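The excerpt below sketches how this could look on the HumioCluster resource, assuming the dataVolumeSource field and the /mnt/disks/vol1 mount point prepared by the user-data sketch above; verify the field names against the operator version in use.

    spec:
      environmentVariables:
        # Tell LogScale the local disks are ephemeral so it can manage data accordingly.
        - name: USING_EPHEMERAL_DISKS
          value: "true"
      dataVolumeSource:
        hostPath:
          # Mount point prepared on the worker node (illustrative path).
          path: /mnt/disks/vol1
          type: Directory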
Local PVCs are supported by the Humio Operator. This alternative to hostPath does not require formatting and RAID configuration of multiple disks, but it does require initial setup that depends on the Kubernetes environment.
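A sketch of the PVC-based alternative is shown below, assuming the dataVolumePersistentVolumeClaimSpecTemplate field of the HumioCluster CRD and a pre-created StorageClass named local-storage backed by local PersistentVolumes; the class name and size are placeholders.

    spec:
      dataVolumePersistentVolumeClaimSpecTemplate:
        storageClassName: local-storage    # placeholder StorageClass for local volumes
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 500Gi                 # illustrative size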
It is technically possible to run with emptyDir volume types on the cluster pods, but that is not recommended since the lifecycle of emptyDir volumes follows the lifecycle of the pods. Upgrading LogScale clusters or changing configurations replaces cluster pods, which would wipe the data directories for LogScale cluster pods every time, which is not desired. The cluster should reuse ephemeral disks as much as possible, even if data is technically safe elsewhere, since performance takes a huge hit on every upgrade or restart if data directories are not reused, because LogScale would need to fetch everything from bucket storage again.
Network block storage
Using this type of underlying storage for LogScale clusters is generally much slower and more expensive compared to the alternative above, i.e. the combination of ephemeral disks and bucket storage. The typical use of network block storage for Kubernetes pods is to dynamically create disks and attach them to the Kubernetes worker nodes as needed. Network block storage should not be used for I/O-heavy LogScale nodes, such as storage and digest nodes. However, it can be used for ingest, query coordinator, and UI/API nodes.
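If network block storage is used, the dynamic provisioning typically goes through a StorageClass referenced by the cluster's PVC template. The sketch below assumes the AWS EBS CSI driver and a gp3 volume type; the class name, provisioner, and parameters are placeholders to adapt to your environment.

    apiVersion: storage.k8s.io/v1
    kind: StorageClass
    metadata:
      name: logscale-block                 # placeholder name
    provisioner: ebs.csi.aws.com           # example provisioner (AWS EBS CSI driver)
    parameters:
      type: gp3
    volumeBindingMode: WaitForFirstConsumer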