Upgrading Humio Operator on Kubernetes


The upgrade procedure for the Humio Operator depends on how the Operator was installed. If the installation was performed using Helm with installCRDs=true, follow only the Helm upgrade step. Otherwise, upgrade the CRDs first and then upgrade the Helm chart.

The Helm chart version should match that of the Humio Operator. For this reason, it is not recommended to change the image.tag value of the Humio Operator Helm chart; instead, update the chart itself to the desired version.

Upgrading the CRDs

Obtain the version from the Releases page.

```shell
export HUMIO_OPERATOR_VERSION=x.x.x
kubectl apply -f https://raw.githubusercontent.com/humio/humio-operator/humio-operator-${HUMIO_OPERATOR_VERSION}/config/crd/bases/core.humio.com_humioclusters.yaml
kubectl apply -f https://raw.githubusercontent.com/humio/humio-operator/humio-operator-${HUMIO_OPERATOR_VERSION}/config/crd/bases/core.humio.com_humioexternalclusters.yaml
kubectl apply -f https://raw.githubusercontent.com/humio/humio-operator/humio-operator-${HUMIO_OPERATOR_VERSION}/config/crd/bases/core.humio.com_humioingesttokens.yaml
kubectl apply -f https://raw.githubusercontent.com/humio/humio-operator/humio-operator-${HUMIO_OPERATOR_VERSION}/config/crd/bases/core.humio.com_humioparsers.yaml
kubectl apply -f https://raw.githubusercontent.com/humio/humio-operator/humio-operator-${HUMIO_OPERATOR_VERSION}/config/crd/bases/core.humio.com_humiorepositories.yaml
kubectl apply -f https://raw.githubusercontent.com/humio/humio-operator/humio-operator-${HUMIO_OPERATOR_VERSION}/config/crd/bases/core.humio.com_humioviews.yaml
kubectl apply -f https://raw.githubusercontent.com/humio/humio-operator/humio-operator-${HUMIO_OPERATOR_VERSION}/config/crd/bases/core.humio.com_humioalerts.yaml
kubectl apply -f https://raw.githubusercontent.com/humio/humio-operator/humio-operator-${HUMIO_OPERATOR_VERSION}/config/crd/bases/core.humio.com_humioactions.yaml
```
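
Since the file names all follow one URL pattern, the per-CRD commands can equally be generated in a loop; a sketch, assuming the release tag layout shown above:

```shell
# Apply all operator CRDs for the pinned release; the CRD list must match
# the files shipped under config/crd/bases for that release.
export HUMIO_OPERATOR_VERSION=x.x.x
base="https://raw.githubusercontent.com/humio/humio-operator/humio-operator-${HUMIO_OPERATOR_VERSION}/config/crd/bases"
for crd in humioclusters humioexternalclusters humioingesttokens humioparsers \
           humiorepositories humioviews humioalerts humioactions; do
  kubectl apply -f "${base}/core.humio.com_${crd}.yaml"
done
```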

Note: It is possible to skip this step when using installCRDs=true at Helm chart install time. This is not recommended, because uninstalling the Helm chart will then also remove the custom resource definitions, and with them the custom resources.

Helm

```shell
helm upgrade humio-operator humio-operator/humio-operator \
  --namespace logging \
  --version="${HUMIO_OPERATOR_VERSION}"
```

Upgrade Notes

Version 0.13.0

This release bumps the default helper container image, which causes cluster pods to be recreated when the operator is upgraded so that they pick up the new helper image tag. If pod recreation is undesired, the helper image can be locked by setting helperImage in the HumioCluster resource to the current version before upgrading the operator. If the helper image tag is locked this way, we recommend removing the explicit helperImage setting during the next Humio cluster upgrade.
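
For example, to lock the helper image before upgrading the operator (the repository humio/humio-operator-helper is the default helper image, and the tag below is a placeholder for whatever your cluster currently runs):

```yaml
apiVersion: core.humio.com/v1alpha1
kind: HumioCluster
metadata:
  name: example-humiocluster
spec:
  # Pin the helper image to the tag currently in use so the operator
  # upgrade does not trigger pod recreation.
  helperImage: humio/humio-operator-helper:<current-tag>
```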

Important highlights:

  • Fixes bug where a HumioView is updated on every reconcile even when it hasn’t changed (credit: Crevil)

  • Fixes multiple bugs where the controller performs unnecessary reconciles resulting in high CPU usage

Version 0.12.0

Important highlights:

  • Adds a startupProbe to the Humio pods

  • Fixes issue where the livenessProbe and readinessProbe on the Humio pods may fail and cause a cascading failure

  • Fixes issue where the operator may become stuck when the Humio cluster does not respond to requests

  • Adds feature to specify secret references for certain fields of HumioAction resources (credit: Crevil)

  • Adds feature to pull the value of a Humio image from a configmap

  • Adds feature to pull the value of Humio pod’s environment variables from a configmap or secret

  • Mounts the Humio pod’s tmp volume under the same container mount that is used for the humio-data directory (applies to Humio versions 1.33.0+)

  • Fixes a number of conflicts where the operator attempts to update old versions of resources it manages

  • Requires Kubernetes 1.19 or higher

  • Requires cert-manager API v1.0

  • Updates the cert-manager API usage to cert-manager.io/v1 instead of cert-manager.io/v1beta1

Version 0.11.0

Important highlights:

  • Fixes a bug where pods may not be created as quickly as they should during an upgrade or restart of Humio.

  • Improved logging

  • Requires Kubernetes 1.16 or higher

  • Requires cert-manager API v0.16 or higher

Version 0.10.2

Version 0.10.2 of the operator no longer works for Kubernetes versions prior to 1.19. This is because the operator now uses the networking/v1 api which does not exist in Kubernetes 1.18 and older.

Important highlights:

  • Updates the default Humio version to 1.28.0

  • Uses networking/v1 instead of the deprecated networking/v1beta1

  • Fixes a bug around installing and validating the license when running multiple HumioClusters

Version 0.10.1

Version 0.10.0 was released with the default operator image tag 0.9.1, while the intention was to use a default image tag of 0.10.0. This release fixes that, so the new default image is 0.10.1, which includes all the fixes described in the notes for 0.10.0.

Version 0.10.0

This release bumps the default helper container image, which causes cluster pods to be recreated when the operator is upgraded so that they pick up the new helper image tag. If pod recreation is undesired, the helper image can be locked by setting helperImage in the HumioCluster resource to the current version before upgrading the operator. If the helper image tag is locked this way, we recommend removing the explicit helperImage setting during the next Humio cluster upgrade.

Important highlights:

  1. Operator now reuses HTTP connections when possible for communicating with the Humio API

  2. Sidecar now reuses HTTP connections when possible for communicating with the Humio API

Version 0.9.1

No changes, see release notes for version 0.9.0.

Version 0.9.0

This release drops support for Humio versions prior to 1.26.0 and speeds up cluster bootstrapping significantly. With this release, the Bootstrapping state for the HumioCluster CRD has been removed entirely, so before upgrading it is important to make sure that no HumioCluster resource is in the Bootstrapping state.

This release also bumps the default helper container image, which causes cluster pods to be recreated when the operator is upgraded so that they pick up the new helper image tag. If pod recreation is undesired, the helper image can be locked by setting helperImage in the HumioCluster resource to the current version before upgrading the operator. If the helper image tag is locked this way, we recommend removing the explicit helperImage setting during the next Humio cluster upgrade.

Important highlights:

  1. Drop support for Humio versions prior to 1.26.0.

  2. Drop the use of Bootstrapping state for HumioCluster resources.

  3. Set more detailed release version, commit and date. This version information is logged out during container startup, and is also set as a custom User-Agent HTTP header for requests to the Humio API.

  4. Switch operator container logs to RFC 3339 format with second precision. Humio container logs are unaffected, as this only changes the logs from the operator container.

  5. Fixes the liveness and readiness probes for the HumioCluster CRD so it is now possible to set an empty probe. If an empty probe is used, the operator skips configuring that probe.

  6. Adds additional logging for HumioExternalCluster when the API token test fails. Previously it would fail silently and the HumioExternalCluster would be stuck in the Unknown state.

  7. Fixes a bug where a license update is triggered even if the license was not changed.

  8. Fixes a bug so that Humio storage and digest partition counts are correct when new clusters get created. Previously, clusters would create storage and digest partitions based on Humio’s built-in defaults rather than the user-defined storagePartitionsCount and digestPartitionsCount values in the HumioCluster resource.
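
For example, to skip configuring a probe entirely (item 5 above), an empty probe can be set on the HumioCluster resource. A minimal sketch, assuming the probe fields are named containerLivenessProbe and containerReadinessProbe in the CRD:

```yaml
spec:
  # An empty probe tells the operator to skip configuring that probe
  # on the Humio containers.
  containerLivenessProbe: {}
  containerReadinessProbe: {}
```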

Version 0.8.1

This release contains a fix for installing the Humio license during the Bootstrapping state for the HumioCluster CRD.

Version 0.8.0

This release adds support for Humio 1.26.0 and newer. Upgrading to Humio 1.26.0 is not supported with humio-operator versions prior to 0.8.0.

Important highlights:

  1. License is now a required field on the HumioCluster resources. This must be present for both existing clusters and for bootstrapping new clusters.

  2. Default Humio image tag version has been updated to 1.24.3.

Version 0.7.0

This release contains small bugfixes, exposes Humio liveness and readiness probes, and updates operator-sdk and supporting tooling.

Important highlights:

  1. Fixes a bug where the operator tries to clean up the CA Issuer even when not using cert-manager, resulting in logged warnings.

  2. Allows overriding of the Humio liveness and readiness probes.

  3. Fixes a bug where the HumioCluster may get stuck in a ConfigError state even when the cluster is healthy.

  4. Fixes a bug where the operator may panic when the Humio pods are down.

Version 0.6.1

This release fixes a bug where the RBAC rules in the Helm chart had not been updated to include the new CRDs introduced in version 0.6.0.

Version 0.6.0

This release contains new HumioAlerts and HumioActions custom resources. This means these new CRDs must be applied before the upgrade, although it’s recommended to apply CRDs during every upgrade.

Important highlights:

  1. Adds Humio Alerts and Actions support.

  2. Adds the ability to look up the hostname from a secret.

Version 0.5.1

This release fixes a bug where ingress resources may still be created when spec.hostname and spec.esHostname are not set.

Version 0.5.0

Important highlights:

  1. Upgrading to this release will replace the current HumioCluster pods.

  2. The default json log format for Humio has changed if running Humio version 1.20.1 or later. See internal-logging.

  3. The default Humio version has been updated to 1.20.1.

Version 0.4.0

Important highlights:

  1. Upgrading to this release will replace the current HumioCluster pods.

  2. Fixes a bug where UUIDs are not assigned properly when not using USING_EPHEMERAL_DISKS=true. See below for additional information.

  3. Adds support for managing Humio licenses.

  4. Requires explicitly defined storage. See below for additional information.

Additional information:

  • It is now required to explicitly define the storage configuration. This is because until now, the default has been emptyDir, which will result in loss of data if not also using bucket storage. If relying on the default storage configuration, it is now required to set either spec.dataVolumeSource or spec.dataVolumePersistentVolumeClaimSpecTemplate. It is necessary to use either a persistent storage medium or bucket storage to avoid data loss. See the [example resources](/installation/kubernetes/operator/resources) section on how to configure ephemeral or persistent storage.

  • Symptoms of the fixed UUID bug when not using USING_EPHEMERAL_DISKS=true include the appearance of missing nodes and nodes with no partitions assigned in the Cluster Administration page in the Humio UI.

  • Fixes a bug where partitions may not be auto-balanced by the operator

  • Fixes the rolling restart logic to ensure that pods are only restarted one at a time

  • Updates various operator-managed resources so they now include the ConfigError state

  • Fixes a bug where a restart or update may fail if an existing pod is not in a Running state

  • Changes the default Humio version to 1.18.1

  • Allows additional labels for ingest token secrets
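
The explicit storage requirement above can be met with either spec.dataVolumeSource or spec.dataVolumePersistentVolumeClaimSpecTemplate. A minimal persistent-storage sketch; the storage class and size below are illustrative only:

```yaml
spec:
  dataVolumePersistentVolumeClaimSpecTemplate:
    storageClassName: standard
    accessModes:
      - ReadWriteOnce
    resources:
      requests:
        storage: 10Gi
```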

Version 0.3.0

Important highlights:

  1. Upgrading to this release will replace the current HumioCluster pods.

  2. Adds support for Humio 1.19.0. Humio 1.19.0 introduces changes to how logging is performed that are not taken into account by humio-operator versions prior to this release.

Additional information:

  • New field added to HumioCluster CRD: helperImage

    This field makes it possible to override the default container image used for the helper image. This is useful in scenarios where images should be pulled from a local container image registry.

  • New field added to HumioCluster CRD: disableInitContainer

    The init container is used to extract information about the availability zone from the Kubernetes worker node. If enabled, auto partition rebalancing will use this to assign digest and storage partitions with availability zones in mind. When running in a single availability zone setup, it can make sense to disable the init container to tighten the permissions needed to run the pods of a HumioCluster.

  • New field added to HumioCluster CRD: terminationGracePeriodSeconds

    Previously, pods were created without an explicit termination grace period, which meant they inherited the Kubernetes default of 30 seconds. In general, Humio should be able to terminate gracefully by itself, and when running with bucket storage and ephemeral nodes the termination period should allow time for the Humio node to upload data to bucket storage. The new default value is 300 seconds, which can be overridden using this field.

  • Bump default Humio version to 1.18.0

    If the image property on the HumioCluster resource is left out, the cluster will be upgraded to this new default. Make sure to read the release notes for Humio 1.18.0 to confirm this migration is safe to do: https://library.humio.com/stable/notices/releases/stable/#v1.18.0

  • Leverage new suggested partition layouts

    With Humio 1.17.0+, the operator relies on Humio to suggest partition layouts for both digest and storage partitions. The benefit is that the suggested partition layouts take into account which availability zone each Humio cluster node is located in.
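
The three new HumioCluster fields described above can be combined in one spec; all values below are illustrative only:

```yaml
spec:
  # Override the default helper image, e.g. to pull from a local registry
  # (repository and tag are placeholders).
  helperImage: registry.example.com/humio-operator-helper:<tag>
  # Skip the init container, e.g. in a single availability zone setup.
  disableInitContainer: true
  # Time allowed for graceful termination (the new default is 300 seconds).
  terminationGracePeriodSeconds: 300
```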

Version 0.2.0

Note that there is a new HumioViews custom resource. This means the HumioViews CRD must be applied before the upgrade (though it is recommended to apply CRDs during every upgrade). There are a number of new features and bug fixes in this release, which are described in the Release Notes.

Version 0.1.2

This release fixes a bug where Humio nodes using persistent storage may receive a NodeExists error when starting up. This applies to Humio clusters using persistent storage, and not clusters using ephemeral disks and bucket storage.

If your cluster is using persistent storage (for example, Persistent Volume Claims), it is important to either omit the environment variable USING_EPHEMERAL_DISKS or set it to false.

If your cluster is using ephemeral disks and bucket storage, it is important to set the environment variable USING_EPHEMERAL_DISKS to true. Note that this is included in the example resources.
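
Assuming environment variables are provided through the HumioCluster spec’s environmentVariables list, the ephemeral-disk setting can be sketched as:

```yaml
spec:
  environmentVariables:
    # Set to "true" only for clusters using ephemeral disks plus bucket
    # storage; omit it or set "false" when using persistent storage.
    - name: USING_EPHEMERAL_DISKS
      value: "true"
```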

This version also upgrades the helper image which is used as the init container and sidecar container for pods tied to a HumioCluster resource. This will be treated as an upgrade procedure, so all pods will be replaced.

Version 0.1.1

No changes required.

Version 0.1.0

No changes required, but it is important to note this version upgrades the helper image which is used as the init container and sidecar container for pods tied to a HumioCluster resource. This will be treated as an upgrade procedure, so all pods will be replaced.

Version 0.0.14

Version 0.0.14 of the Humio Operator contains changes related to how Node UUIDs are set. This fixes an issue where pods may lose their node identity and show as red/missing in the Humio Cluster Administration page under Cluster Nodes when they are scheduled in different availability zones.

When upgrading to version 0.0.14 of the Humio Operator, it is necessary to add the following to the HumioCluster spec to maintain compatibility with previous versions of how the Operator set UUID prefixes:

```yaml
spec:
  nodeUUIDPrefix: "humio_{{.Zone}}_"
```

This change must be completed in the following order:

  1. Shut down the Humio Operator by deleting its deployment: kubectl delete deployment humio-operator -n humio-operator

  2. Make the above node UUID change to the HumioCluster spec

  3. Upgrade the Humio Operator

If creating a fresh Humio cluster, the nodeUUIDPrefix field should be left unset.

Migration to the new node UUID Prefix

The simplest way to migrate to the new UUID prefix is by starting with a fresh HumioCluster. Otherwise, the effect of this change depends on how the HumioCluster is configured.

If using S3 with ephemeral disks, Humio nodes will lose their identity when scheduled to new nodes with fresh storage if this change is not made. To migrate to the new node UUID prefix, ensure autoRebalancePartitions: false and then perform the upgrade. In the Humio Cluster Administration page under Cluster Nodes, the old nodes will show as red/missing and the new nodes will have no partitions. It is necessary to migrate the storage and digest partitions from the old nodes to the new nodes and then remove the old nodes. You may need to terminate the instances containing Humio data one at a time so they generate new UUIDs, ensuring the partitions are migrated before terminating the next instance. Once all old nodes are removed, autoRebalancePartitions can be set back to true if desired.
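
For example, auto rebalancing can be disabled on the HumioCluster resource before performing the upgrade:

```yaml
spec:
  # Disable automatic partition rebalancing while old nodes are migrated
  # out; set back to true once the migration is complete.
  autoRebalancePartitions: false
```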

If using PVCs, it is not strictly necessary to adjust the nodeUUIDPrefix field, as the node UUID is stored in the PVC. If the PVC is bound to a zone (such as with AWS), this is not an issue. If the PVC is not bound to a zone, nodes may still lose their identity when scheduled in different availability zones. In that case, nodes must be manually removed from the Humio Cluster Administration page under Cluster Nodes, taking care to first migrate storage and digest partitions away from each node before removing it from the cluster.

Version 0.0.13

There are no special tasks required during this upgrade. However, the operator-sdk version changed in version 0.0.13, so it is important that the Helm chart version matches the operator version; otherwise the humio-operator pod will fail to start due to a missing /manager entrypoint.

Version 0.0.12

The selector labels changed in version 0.0.12, so it is necessary to delete the humio-operator deployment prior to upgrading the Helm chart. The upgrade steps are:

  1. Delete the humio-operator deployment by running: kubectl delete deployment humio-operator -n humio-operator

  2. Run the helm upgrade command as documented above

If the humio-operator deployment is not removed before the upgrade, the upgrade will fail with:

```
Error: UPGRADE FAILED: cannot patch "humio-operator" with kind Deployment: Deployment.apps "humio-operator" is invalid: spec.selector: Invalid value: v1.LabelSelector{MatchLabels:map[string]string{"app":"humio-operator", "app.kubernetes.io/instance":"humio-operator", "app.kubernetes.io/name":"humio-operator"}, MatchExpressions:[]v1.LabelSelectorRequirement(nil)}: field is immutable
```

Pre-0.0.12

No special changes are necessary when upgrading the Humio Operator between versions 0.0.0 and 0.0.11.