Migrating from Helm Chart to Operator

This guide describes how to migrate from an existing cluster running the Humio Helm Chart to the Humio Operator and HumioCluster custom resource.
Prerequisites
Identify Method of Deployment
There are two different approaches to migration depending on how the existing helm chart is deployed.
Using ephemeral nodes with bucket storage
Using PVCs
By default, the original Helm Chart uses PVCs. If the existing chart is deployed with the environment variable S3_STORAGE_BUCKET, then it is using ephemeral nodes with bucket storage.
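One quick way to check is to look for that variable on the running Humio pods. A minimal sketch, assuming the pods carry the app=humio-core label used elsewhere in this guide and substituting your namespace:
# prints the bucket name if the cluster uses ephemeral nodes with bucket storage; empty output means PVCs
kubectl get pods -n <namespace> -l app=humio-core \
  -o jsonpath='{.items[0].spec.containers[0].env[?(@.name=="S3_STORAGE_BUCKET")].value}'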
Migrate Kafka & Zookeeper
The Humio Operator does not run Kafka and Zookeeper built-in alongside Humio as the Humio Helm Charts do. In order to migrate to the Operator, Humio must point to a Kafka and Zookeeper that is not managed by the Humio Helm Chart. There are a number of open source operators for running Kafka and Zookeeper, for example Strimzi.
If you're running on AWS, then Amazon MSK is recommended for ease of use.
It is necessary to perform the Kafka and Zookeeper migration before continuing with the migration to the operator. This can be done by taking these steps:
Start up Kafka and Zookeeper (not managed by the operator)
Shut down Humio nodes
Reconfigure the values.yaml to use the new Kafka and Zookeeper connection. For example:
humio-core:
  external:
    kafkaBrokers: 192.168.0.10:9092,192.168.1.10:9092,192.168.2.10:9092
    zookeeperServers: 192.168.0.20:2181,192.168.1.20:2181,192.168.2.20:2181
Start Humio back up
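The reconfiguration and restart can be applied to the existing release with helm. A minimal sketch, assuming the release is named humio as in the rest of this guide:
helm upgrade --values values.yaml humio humio/humio-helm-charts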
Migrating Using Ephemeral Nodes & Bucket Storage
When migrating to the Operator using ephemeral nodes and bucket storage, first install the Operator, but bring down the existing Humio pods prior to creating the HumioCluster. Configure the new HumioCluster to use the same Kafka and Zookeeper servers as the existing cluster. The Operator will create pods that assume the identity of the existing nodes and will pull data from bucket storage as needed.
Install the Operator according to the Operator Installation Guide.
Bring down existing pods by changing the replicas of the Humio statefulset to 0 (see the sketch after these steps).
Create a HumioCluster by referring to the HumioCluster Resource. Ensure that this resource is configured the same as the existing chart's values.yaml file. See special considerations. Ensure that TLS is disabled for the HumioCluster, see TLS. Ensure that autoRebalancePartitions is set to false (default).
Validate that the new Humio pods are running with the existing node identities and that they show up in the Cluster Administration page of the Humio UI.
Follow either Ingress Migration or Service Migration depending on whether you are using services or ingress to access the Humio cluster.
Modify the Humio Helm Chart values.yaml so that it no longer manages Humio. If using fluentbit, ensure es autodiscovery is turned off:
humio-core:
  enabled: false
humio-fluentbit:
  es:
    autodiscovery: false
Then run: helm upgrade --values values.yaml humio humio/humio-helm-charts. This will keep fluentbit and/or metricbeat if they are enabled. If you do not wish to keep fluentbit and/or metricbeat, or they are not enabled, you can uninstall the Humio Helm Chart by running helm delete --purge humio, where humio is the name of the original Helm Chart release. Be careful to delete the original Helm Chart release and not the Helm Chart used to install the Operator.
Enable TLS
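For reference, a minimal sketch of the scale-down in step 2, assuming the Helm Chart created a statefulset named humio-humio-core (verify the actual name with kubectl get statefulsets):
kubectl scale statefulset humio-humio-core --replicas=0 -n <namespace>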
Migrating Using PVCs
When migrating to the Operator using PVCs, install the Operator while the existing cluster is running and configure the new HumioCluster to use the same Kafka and Zookeeper servers as the existing cluster. The Operator will create new nodes as part of the existing cluster. From there, change the partition layout so that partitions are assigned only to the new nodes, and then uninstall the old Helm Chart.
Install the Operator according to the Operator Installation Guide.
Create a HumioCluster by referring to the HumioCluster Resource. Ensure that this resource is configured the same as the existing chart's values.yaml file. See special considerations. Ensure that TLS is disabled for the HumioCluster, see TLS. Ensure that autoRebalancePartitions is set to false (default).
Validate that the new Humio pods are running and show up in the Cluster Administration page of the Humio UI.
Manually migrate digest partitions from the old pods created by the Helm Chart to the new pods created by the Operator.
Manually migrate storage partitions from the old pods created by the Helm Chart to the new pods created by the Operator. After the partitions have been re-assigned, for each of the new nodes, click Show Options and then Start Transfers. This will begin the migration of data.
Wait until all new nodes contain all the data and the old nodes contain no data.
Follow either Ingress Migration or Service Migration depending on whether you are using services or ingress to access the Humio cluster.
Modify the Humio Helm Chart values.yaml so that it no longer manages Humio. If using fluentbit, ensure es autodiscovery is turned off:
humio-core:
  enabled: false
humio-fluentbit:
  es:
    autodiscovery: false
Then run: helm upgrade --values values.yaml humio humio/humio-helm-charts. This will keep fluentbit and/or metricbeat if they are enabled. If you do not wish to keep fluentbit and/or metricbeat, or they are not enabled, you can uninstall the Humio Helm Chart by running helm delete --purge humio, where humio is the name of the original Helm Chart release. Be careful to delete the original Helm Chart release and not the Helm Chart used to install the Operator.
Enable TLS.
Service Migration
This section is only applicable if the method of accessing the cluster is via the service resources. If you are using ingress, refer to the Ingress Migration.
The Humio Helm Chart manages three services: the http service, the es service, and a headless service which is required by the statefulset. All of these services will be replaced by a single service which is named after the HumioCluster.
After migrating the pods, it will no longer be possible to access the cluster using any of the old services. Ensure that the new service for the HumioCluster is exposed the same way (e.g., type: LoadBalancer) and then begin using the new service to access the cluster.
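To confirm the new entry point before cutting traffic over, the operator-managed service can be inspected directly. A minimal sketch, assuming the HumioCluster resource is named my-cluster:
# the single replacement service carries the name of the HumioCluster resource
kubectl get service my-cluster -n <namespace>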
Ingress Migration
This section is only applicable if the method of accessing the cluster is via the ingress resources. If you are using services, refer to the Service Migration.
When migrating using ingress, be sure to enable and configure the HumioCluster ingress using the same hostnames that the Helm Chart uses. See Ingress. As long as the ingress resources use the same ingress controller, they should migrate seamlessly, as DNS will resolve to the same nginx controller. The ingress resources managed by the Helm Chart will be deleted when the Helm Chart is removed or when humio-core.enabled is set to false in the values.yaml.
If you wish to use the same certificates that were generated for the old ingress resources for the new ingresses, you must copy the old secrets to the new name format of <cluster name>-certificate and <cluster name>-es-certificate. It is possible to use a custom secret name for the certificates by setting spec.ingress.secretName and spec.ingress.esSecretName on the HumioCluster resource; however, you cannot simply point these at the existing secrets, as they are managed by the Helm Chart and will be deleted when the Helm Chart is removed or when humio-core.enabled is set to false in the values.yaml.
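A minimal sketch of copying the old certificate secrets under the new names, assuming the old secrets are named my-cluster-crt and my-cluster-es-crt (as in the ingress example later in this guide), the HumioCluster is named my-cluster, and jq is available:
# copy the general ingress certificate to the name format expected by the operator
kubectl get secret my-cluster-crt -n <namespace> -o json \
  | jq '.metadata.name = "my-cluster-certificate"
        | del(.metadata.uid, .metadata.resourceVersion, .metadata.creationTimestamp, .metadata.ownerReferences, .metadata.managedFields)' \
  | kubectl apply -n <namespace> -f -
# repeat for my-cluster-es-crt, renaming it to my-cluster-es-certificate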
Special Considerations
There are many situations when migrating from the Humio Helm Chart to the Operator where the configuration does not transfer directly from the values.yaml to the HumioCluster resource. This section lists some common configurations from the original Helm Chart values.yaml and the replacement HumioCluster spec configuration. Only the relevant parts of the configuration are shown, starting from the top-level key for the subset of the resource.
It is not necessary to migrate every one of the listed configurations, but instead use these as a reference on how to migrate only the configurations that are relevant to your cluster.
TLS
The Humio Helm Chart supports TLS for Kafka communication but does not support TLS for Humio-to-Humio communication. This section refers to Humio-to-Humio TLS. For Kafka, see Extra Kafka Configs.
By default, TLS is enabled when creating a HumioCluster resource. This is recommended; however, when performing a migration from the Helm Chart, TLS should be disabled and then enabled after the migration is complete.
Humio Helm Chart
Not supported
HumioCluster
spec:
  tls:
    enabled: false
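Once the migration is complete and the old Helm Chart release has been removed, TLS can be turned back on with the same flag:
spec:
  tls:
    enabled: true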
Host Path
The Operator creates Humio pods with a stricter security context than the Humio Helm Charts. To support this stricter context, the hostPath.path (i.e. the path on the Kubernetes node that is mounted into the Humio pods) must have a group owner of the nobody user, which is user id 65534.
Humio Helm Chart
humio-core:
  primaryStorage:
    type: hostPath
  hostPath:
    path: /mnt/disks/vol1
    type: Directory
HumioCluster
spec:
  dataVolumeSource:
    hostPath:
      path: /mnt/disks/vol1
      type: Directory
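The group ownership can be set on each Kubernetes node before the pods start. A minimal sketch, assuming the /mnt/disks/vol1 path from the example above:
# run on every node that exposes the host path
sudo chgrp -R 65534 /mnt/disks/vol1
sudo chmod -R g+rwx /mnt/disks/vol1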
Persistent Volumes
By default, the Helm Chart uses persistent volumes for storage of the Humio data volume. This changed in the Operator, where it is now required to define the storage medium.
Humio Helm Chart
humio-core:
  storageVolume:
    size: 50Gi
HumioCluster
spec:
  dataVolumePersistentVolumeClaimSpecTemplate:
    accessModes: [ReadWriteOnce]
    resources:
      requests:
        storage: 50Gi
Custom Storage Class for Persistent Volumes
Humio Helm Chart
Create a storage class:
humio-core:
  storageClass:
    provisioner: kubernetes.io/gce-pd
    parameters:
      type: pd-ssd
Use a custom storage class:
humio-core:
  storageClassName: custom-storage-class-name
HumioCluster
Creating a storage class is no longer supported. First, create your storage class by following the official docs, and then use the following configuration to use it.
spec:
  dataVolumePersistentVolumeClaimSpecTemplate:
    storageClassName: my-storage-class
Pod Resources
Humio Helm Chart
humio-core:
  resources:
    limits:
      cpu: "4"
      memory: 6Gi
    requests:
      cpu: 2
      memory: 4Gi
HumioCluster
spec:
  resources:
    limits:
      cpu: "4"
      memory: 6Gi
    requests:
      cpu: 2
      memory: 4Gi
JVM Settings
Humio Helm Chart
humio-core:
  jvm:
    xss: 2m
    xms: 256m
    xmx: 1536m
    maxDirectMemorySize: 1536m
    extraArgs: "-XX:+UseParallelGC"
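HumioCluster
The Operator has no dedicated jvm block; JVM flags are instead passed through the container environment. A minimal sketch of the equivalent settings, assuming the Humio container reads the HUMIO_JVM_ARGS environment variable:
spec:
  environmentVariables:
    # equivalent of the xss/xms/xmx/maxDirectMemorySize/extraArgs values above
    - name: HUMIO_JVM_ARGS
      value: "-Xss2m -Xms256m -Xmx1536m -XX:MaxDirectMemorySize=1536m -XX:+UseParallelGC"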
Pod Anti-Affinity
It is highly recommended to have anti-affinity policies in place, and they are required when using hostPath for storage.
Note that the Humio pod labels are different between the Helm Chart and the Operator. In the Helm Chart, the pod label that is used for anti-affinity is app=humio-core, while in the Operator it is app.kubernetes.io/name=humio.
If migrating PVCs, it is important to ensure that the new pods created by the Operator are not scheduled on the nodes that run the old pods created by the Humio Helm Chart. To do this, ensure there is a matchExpressions with DoesNotExist on the app key. See the example below.
Humio Helm Chart
humio-core:
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchExpressions:
              - key: app
                operator: In
                values:
                  - humio-core
          topologyKey: kubernetes.io/hostname
HumioCluster
spec:
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchExpressions:
              - key: app.kubernetes.io/name
                operator: In
                values:
                  - humio
              - key: app
                operator: DoesNotExist
          topologyKey: kubernetes.io/hostname
Service Type
Humio Helm Chart
humio-core:
  service:
    type: LoadBalancer
HumioCluster
spec:
  humioServiceType: LoadBalancer
Ingress
Humio Helm Chart
humio-core:
  ingress:
    enabled: true
    config:
      - name: general
        annotations:
          certmanager.k8s.io/acme-challenge-type: http01
          certmanager.k8s.io/cluster-issuer: letsencrypt-prod
          kubernetes.io/ingress.class: nginx
          kubernetes.io/tls-acme: "true"
        hosts:
          - host: my-cluster.example.com
            paths:
              - /
        tls:
          - secretName: my-cluster-crt
            hosts:
              - my-cluster.example.com
      - name: ingest-es
        annotations:
          certmanager.k8s.io/acme-challenge-type: http01
          cert-manager.io/cluster-issuer: letsencrypt-prod
          kubernetes.io/ingress.class: nginx
          kubernetes.io/tls-acme: "true"
        rules:
          - host: my-cluster-es.humio.com
            http:
              paths:
                - path: /
                  backend:
                    serviceName: humio-humio-core-es
                    servicePort: 9200
        tls:
          - secretName: my-cluster-es-crt
            hosts:
              - my-cluster-es.humio.com
      ...
HumioCluster
spec:
  hostname: "my-cluster.example.com"
  esHostname: "my-cluster-es.example.com"
  ingress:
    enabled: true
    controller: nginx
    # optional secret names. do not set these to the secrets created by the helm chart as they will be
    # deleted when the helm chart is removed
    # secretName: my-cluster-certificate
    # esSecretName: my-cluster-es-certificate
    annotations:
      use-http01-solver: "true"
      cert-manager.io/cluster-issuer: letsencrypt-prod
      kubernetes.io/ingress.class: nginx
Bucket Storage GCP
Humio Helm Chart
humio-core:
  bucketStorage:
    backend: gcp
  env:
    - name: GCP_STORAGE_BUCKET
      value: "example-cluster-storage"
    - name: GCP_STORAGE_ENCRYPTION_KEY
      value: "example-random-encryption-string"
    - name: LOCAL_STORAGE_PERCENTAGE
      value: "80"
    - name: LOCAL_STORAGE_MIN_AGE_DAYS
      value: "7"
HumioCluster
spec:
  extraHumioVolumeMounts:
    - name: gcp-storage-account-json-file
      mountPath: /var/lib/humio/gcp-storage-account-json-file
      subPath: gcp-storage-account-json-file
      readOnly: true
  extraVolumes:
    - name: gcp-storage-account-json-file
      secret:
        secretName: gcp-storage-account-json-file
  environmentVariables:
    - name: GCP_STORAGE_ACCOUNT_JSON_FILE
      value: "/var/lib/humio/gcp-storage-account-json-file"
    - name: GCP_STORAGE_BUCKET
      value: "my-cluster-storage"
    - name: GCP_STORAGE_ENCRYPTION_KEY
      value: "my-encryption-key"
    - name: LOCAL_STORAGE_PERCENTAGE
      value: "80"
    - name: LOCAL_STORAGE_MIN_AGE_DAYS
      value: "7"
Bucket Storage S3
The S3 bucket storage configuration is the same, with the exception of how the environment variables are set.
Humio Helm Chart
humio-core:
  env:
    - name: S3_STORAGE_BUCKET
      value: "example-cluster-storage"
    - name: S3_STORAGE_REGION
      value: "us-west-2"
    - name: S3_STORAGE_ENCRYPTION_KEY
      value: "example-random-encryption-string"
    - name: LOCAL_STORAGE_PERCENTAGE
      value: "80"
    - name: LOCAL_STORAGE_MIN_AGE_DAYS
      value: "7"
    - name: S3_STORAGE_PREFERRED_COPY_SOURCE
      value: "true"
HumioCluster
spec:
  environmentVariables:
    - name: S3_STORAGE_BUCKET
      value: "example-cluster-storage"
    - name: S3_STORAGE_REGION
      value: "us-west-2"
    - name: S3_STORAGE_ENCRYPTION_KEY
      value: "example-random-encryption-string"
    - name: LOCAL_STORAGE_PERCENTAGE
      value: "80"
    - name: LOCAL_STORAGE_MIN_AGE_DAYS
      value: "7"
    - name: S3_STORAGE_PREFERRED_COPY_SOURCE
      value: "true"
Ephemeral Nodes and Cluster Identity
There are three main parts to using ephemeral nodes: setting the USING_EPHEMERAL_DISKS environment variable, selecting Zookeeper cluster identity, and setting AWS Bucket Storage or Google Cloud Bucket Storage (described in the separate linked sections). In the Helm Chart, Zookeeper identity is explicitly configured, but the Operator now defaults to using Zookeeper for identity regardless of the ephemeral disks setting.
Humio Helm Chart
humio-core:
  clusterIdentity:
    type: zookeeper
  env:
    - name: ZOOKEEPER_URL_FOR_NODE_UUID
      value: "$(ZOOKEEPER_URL)"
    - name: USING_EPHEMERAL_DISKS
      value: "true"
HumioCluster
spec:
  environmentVariables:
    - name: USING_EPHEMERAL_DISKS
      value: "true"
Cache Configuration
Cache configuration is no longer supported in the Humio operator. It's recommended to use ephemeral nodes and bucket storage instead.
Humio Helm Chart
humio-core:
  cache:
    localVolume:
      enabled: true
HumioCluster
Not supported
Authentication - OAuth Google
Humio Helm Chart
humio-core:
  authenticationMethod: oauth
  oauthConfig:
    autoCreateUserOnSuccessfulLogin: true
    publicUrl: https://my-cluster.example.com
  env:
    - name: GOOGLE_OAUTH_CLIENT_SECRET
      valueFrom:
        secretKeyRef:
          name: humio-google-oauth-secret
          key: supersecretkey
    - name: GOOGLE_OAUTH_CLIENT_ID
      value: YOURCLIENTID
HumioCluster
spec:
  environmentVariables:
    - name: AUTHENTICATION_METHOD
      value: oauth
    - name: AUTO_CREATE_USER_ON_SUCCESSFUL_LOGIN
      value: "true"
    - name: PUBLIC_URL
      value: https://my-cluster.example.com
    - name: GOOGLE_OAUTH_CLIENT_SECRET
      valueFrom:
        secretKeyRef:
          name: humio-google-oauth-secret
          key: supersecretkey
    - name: GOOGLE_OAUTH_CLIENT_ID
      value: YOURCLIENTID
Authentication - OAuth GitHub
Humio Helm Chart
humio-core:
  authenticationMethod: oauth
  env:
    - name: PUBLIC_URL
      value: https://my-cluster.example.com
    - name: GITHUB_OAUTH_CLIENT_ID
      value: client-id-from-github-oauth
    - name: GITHUB_OAUTH_CLIENT_SECRET
      value: client-secret-from-github-oauth
HumioCluster
spec:
  environmentVariables:
    - name: AUTHENTICATION_METHOD
      value: oauth
    - name: AUTO_CREATE_USER_ON_SUCCESSFUL_LOGIN
      value: "true"
    - name: PUBLIC_URL
      value: https://my-cluster.example.com
    - name: GITHUB_OAUTH_CLIENT_ID
      value: client-id-from-github-oauth
    - name: GITHUB_OAUTH_CLIENT_SECRET
      value: client-secret-from-github-oauth
Authentication - OAuth BitBucket
Humio Helm Chart
humio-core:
  authenticationMethod: oauth
  env:
    - name: PUBLIC_URL
      value: https://my-cluster.example.com
    - name: BITBUCKET_OAUTH_CLIENT_ID
      value: client-id-from-bitbucket-oauth
    - name: BITBUCKET_OAUTH_CLIENT_SECRET
      value: client-secret-from-bitbucket-oauth
HumioCluster
spec:
  environmentVariables:
    - name: AUTHENTICATION_METHOD
      value: oauth
    - name: AUTO_CREATE_USER_ON_SUCCESSFUL_LOGIN
      value: "true"
    - name: BITBUCKET_OAUTH_CLIENT_ID
      value: client-id-from-bitbucket-oauth
    - name: BITBUCKET_OAUTH_CLIENT_SECRET
      value: client-secret-from-bitbucket-oauth
Authentication - SAML
When using SAML, it's necessary to follow the SAML instructions and, once the IDP certificate is obtained, create a secret containing that certificate using kubectl. The secret name is slightly different for the HumioCluster vs the Helm Chart, as the HumioCluster secret must be prefixed with the cluster name.
Creating the secret:
Helm Chart:
kubectl create secret generic idp-certificate --from-file=idp-certificate=./my-idp-certificate.pem -n <namespace>
HumioCluster:
kubectl create secret generic <cluster-name>-idp-certificate --from-file=idp-certificate.pem=./my-idp-certificate.pem -n <namespace>
Humio Helm Chart
humio-core:
  authenticationMethod: saml
  samlConfig:
    publicUrl: https://my-cluster.example.com
    idpSignOnUrl: https://accounts.google.com/o/saml2/idp?idpid=idptoken
    idpEntityId: https://accounts.google.com/o/saml2/idp?idpid=idptoken
  env:
    - name: GOOGLE_OAUTH_CLIENT_SECRET
      valueFrom:
        secretKeyRef:
          name: humio-google-oauth-secret
          key: supersecretkey
    - name: GOOGLE_OAUTH_CLIENT_ID
      value: YOURCLIENTID
HumioCluster
spec:
  environmentVariables:
    - name: AUTHENTICATION_METHOD
      value: saml
    - name: AUTO_CREATE_USER_ON_SUCCESSFUL_LOGIN
      value: "true"
    - name: PUBLIC_URL
      value: https://my-cluster.example.com
    - name: SAML_IDP_SIGN_ON_URL
      value: https://accounts.google.com/o/saml2/idp?idpid=idptoken
    - name: SAML_IDP_ENTITY_ID
      value: https://accounts.google.com/o/saml2/idp?idpid=idptoken
Authentication - By Proxy
Humio Helm Chart
humio-core:
  authenticationMethod: byproxy
  authByProxyConfig:
    headerName: name-of-http-header
HumioCluster
spec:
  environmentVariables:
    - name: AUTHENTICATION_METHOD
      value: byproxy
    - name: AUTH_BY_PROXY_HEADER_NAME
      value: name-of-http-header
Authentication - Single User
The Helm Chart generated a password for the developer user when using single-user mode. The Operator does not do this, so you must supply your own password. This can be done via a plain-text environment variable or using a Kubernetes secret that is referenced by an environment variable. If supplying a secret, you must populate this secret prior to creating the HumioCluster resource, otherwise the pods will fail to start.
Humio Helm Chart
humio-core:
  authenticationMethod: single-user
HumioCluster
Note that AUTHENTICATION_METHOD defaults to single-user.
By setting a password using an environment variable plain text value:
spec:
  environmentVariables:
    - name: "SINGLE_USER_PASSWORD"
      value: "MyVeryS3cretPassword"
By setting a password using an environment variable secret reference:
spec:
  environmentVariables:
    - name: "SINGLE_USER_PASSWORD"
      valueFrom:
        secretKeyRef:
          name: developer-user-password
          key: password
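The referenced secret must exist before the HumioCluster resource is created. A minimal sketch, assuming the secret name and key from the example above:
kubectl create secret generic developer-user-password \
  --from-literal=password='MyVeryS3cretPassword' -n <namespace>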
Extra Kafka Configs
Humio Helm Chart
humio-core:
  extraKafkaConfigs: "security.protocol=SSL"
HumioCluster
spec:
  extraKafkaConfigs: "security.protocol=SSL"
Prometheus
The Humio Helm Chart supported setting the prometheus.io/port and prometheus.io/scrape annotations on the Humio pods. The Operator no longer supports this.
Humio Helm Chart
humio-core:
  prometheus:
    enabled: true
HumioCluster
Not supported
Pod Security Context
Humio Helm Chart
humio-core:
  podSecurityContext:
    runAsUser: 1000
    runAsGroup: 3000
    fsGroup: 2000
HumioCluster
spec:
  podSecurityContext:
    runAsUser: 1000
    runAsGroup: 3000
    fsGroup: 2000
Container Security Context
Humio Helm Chart
humio-core:
  containerSecurityContext:
    capabilities:
      add: ["SYS_NICE"]
HumioCluster
spec:
  containerSecurityContext:
    capabilities:
      add: ["SYS_NICE"]
Initial Partitions
The Helm Chart accepted both ingest.initialPartitionsPerNode and storage.initialPartitionsPerNode. The Operator no longer supports the per-node setting, so it's up to the administrator to set the initial partitions such that they are divisible by the node count.
Humio Helm Chart
humio-core:
  ingest:
    initialPartitionsPerNode: 4
  storage:
    initialPartitionsPerNode: 4
HumioCluster
Assuming a three-node cluster (3 nodes × 4 partitions per node = 12):
spec:
  environmentVariables:
    - name: "INGEST_QUEUE_INITIAL_PARTITIONS"
      value: "12"
    - name: "DEFAULT_PARTITION_COUNT"
      value: "12"
Log Storage
The Helm Chart supports the use of separate storage for logs. This is not supported in the Operator, which instead defaults to running Humio with the environment variable LOG4J_CONFIGURATION=log4j2-json-stdout.xml, which outputs logs to stdout in JSON format.
Humio Helm Chart
humio-core:
  jvm:
    xss: 2m
    xms: 256m
    xmx: 1536m
    maxDirectMemorySize: 1536m
    extraArgs: "-XX:+UseParallelGC"
HumioCluster
Not supported