Advanced Architecture Configuration

We can split out responsibilities for cluster nodes so that each responsibility has its own dedicated set of cluster pods, known as a node pool. This makes it possible to define an update strategy for a set of cluster pods that serves a single purpose, or affinity rules that schedule those pods on a specific set of Kubernetes worker nodes.

Pros:

  • Easy to scale individual logical components independently

  • Lower cost, because each logical component can run on workers sized for its workload, e.g. nodes that handle only ingest can be scaled out on cheap Kubernetes workers.

Cons:

  • Requires additional configuration

yaml
apiVersion: core.humio.com/v1alpha1
kind: HumioCluster
metadata:
  name: advanced-cluster-1
  namespace: example-clusters
spec:
  license:
    secretKeyRef:
      name: advanced-cluster-1-license
      key: data
  targetReplicationFactor: 2
  storagePartitionsCount: 720
  digestPartitionsCount: 720
  image: "humio/humio-core:1.82.0"
  nodeCount: 3
  resources:
    limits:
      memory: 128Gi
      cpu: 31
    requests:
      memory: 128Gi
      cpu: 31
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: kubernetes_worker_node_group
            operator: In
            values:
            - humio-workers
          - key: kubernetes.io/arch
            operator: In
            values:
            - amd64
          - key: kubernetes.io/os
            operator: In
            values:
            - linux
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: app.kubernetes.io/name
            operator: In
            values:
            - humio
        topologyKey: kubernetes.io/hostname
  dataVolumeSource:
    hostPath:
      path: "/mnt/disks/vol1"
      type: "Directory"
  humioServiceAccountAnnotations:
    eks.amazonaws.com/role-arn: "arn:aws:iam::111111111111:role/SomeRoleWithDesiredS3Access"
  environmentVariables:
    - name: QUERY_COORDINATOR
      value: "false"
    - name: S3_STORAGE_BUCKET
      value: "advanced-cluster-1-storage"
    - name: S3_STORAGE_REGION
      value: "us-west-2"
    - name: S3_STORAGE_ENCRYPTION_KEY
      valueFrom:
        secretKeyRef:
          name: advanced-cluster-1-bucket-storage
          key: encryption-key
    - name: USING_EPHEMERAL_DISKS
      value: "true"
    - name: S3_STORAGE_PREFERRED_COPY_SOURCE
      value: "true"
    - name: KAFKA_SERVERS
      value: "kafka-advanced-cluster-1-bootstrap:9092"
    - name: AUTHENTICATION_METHOD
      value: saml
    - name: AUTO_CREATE_USER_ON_SUCCESSFUL_LOGIN
      value: "true"
    - name: PUBLIC_URL
      value: https://advanced-cluster-1.logscale.local
    - name: SAML_IDP_SIGN_ON_URL
      value: https://accounts.google.com/o/saml2/idp?idpid=idptoken
    - name: SAML_IDP_ENTITY_ID
      value: https://accounts.google.com/o/saml2/idp?idpid=idptoken
    - name: INGEST_QUEUE_REPLICATION_FACTOR
      value: "3"
  nodePools:
    - name: "httponly"
      spec:
        image: "humio/humio-core:1.82.0"
        nodeCount: 3
        resources:
          limits:
            memory: 64Gi
            cpu: 15
          requests:
            memory: 64Gi
            cpu: 15
        affinity:
          nodeAffinity:
            requiredDuringSchedulingIgnoredDuringExecution:
              nodeSelectorTerms:
              - matchExpressions:
                - key: kubernetes_worker_node_group
                  operator: In
                  values:
                  - humio-workers-httponly
                - key: kubernetes.io/arch
                  operator: In
                  values:
                  - amd64
                - key: kubernetes.io/os
                  operator: In
                  values:
                  - linux
          podAntiAffinity:
            requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchExpressions:
                - key: app.kubernetes.io/name
                  operator: In
                  values:
                  - humio
              topologyKey: kubernetes.io/hostname
        dataVolumeSource:
          hostPath:
            path: "/mnt/disks/vol1"
            type: "Directory"
        humioServiceAccountAnnotations:
          eks.amazonaws.com/role-arn: "arn:aws:iam::111111111111:role/SomeRoleWithDesiredS3Access"
        environmentVariables:
          - name: NODE_ROLES
            value: "httponly"
          - name: S3_STORAGE_BUCKET
            value: "advanced-cluster-1-storage"
          - name: S3_STORAGE_REGION
            value: "us-west-2"
          - name: S3_STORAGE_ENCRYPTION_KEY
            valueFrom:
              secretKeyRef:
                name: advanced-cluster-1-bucket-storage
                key: encryption-key
          - name: USING_EPHEMERAL_DISKS
            value: "true"
          - name: S3_STORAGE_PREFERRED_COPY_SOURCE
            value: "true"
          - name: KAFKA_SERVERS
            value: "kafka-advanced-cluster-1-bootstrap:9092"
          - name: AUTHENTICATION_METHOD
            value: saml
          - name: AUTO_CREATE_USER_ON_SUCCESSFUL_LOGIN
            value: "true"
          - name: PUBLIC_URL
            value: https://advanced-cluster-1.logscale.local
          - name: SAML_IDP_SIGN_ON_URL
            value: https://accounts.google.com/o/saml2/idp?idpid=idptoken
          - name: SAML_IDP_ENTITY_ID
            value: https://accounts.google.com/o/saml2/idp?idpid=idptoken
          - name: INGEST_QUEUE_REPLICATION_FACTOR
            value: "3"

Figure 8. Kubernetes Deployment Advanced Cluster Definition
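
The cluster definition in Figure 8 dedicates one node pool to HTTP traffic. The ingest-only case mentioned under Pros, and a per-pool update strategy, could be sketched as one more entry under nodePools. This is a sketch, not a verified configuration: the pool name, the humio-workers-ingestonly worker group label value, the resource sizes, and the minReadySeconds value are hypothetical. The updateStrategy field names and the ingestonly value for NODE_ROLES reflect the humio-operator CRD and LogScale node roles as understood here, and should be checked against the versions you run.

```yaml
# Hypothetical fragment: an additional entry under spec.nodePools of the
# HumioCluster shown in Figure 8. Names and sizes are illustrative only.
spec:
  nodePools:
    - name: "ingestonly"
      spec:
        image: "humio/humio-core:1.82.0"
        nodeCount: 3
        # Assumed per-pool update strategy; confirm the supported fields
        # and values against the installed HumioCluster CRD.
        updateStrategy:
          type: RollingUpdate
          minReadySeconds: 60
        resources:
          limits:
            memory: 32Gi
            cpu: 7
          requests:
            memory: 32Gi
            cpu: 7
        affinity:
          nodeAffinity:
            requiredDuringSchedulingIgnoredDuringExecution:
              nodeSelectorTerms:
              - matchExpressions:
                # Hypothetical label value for a cheaper worker group
                - key: kubernetes_worker_node_group
                  operator: In
                  values:
                  - humio-workers-ingestonly
        environmentVariables:
          # Restricts these pods to ingest handling (assumed role value)
          - name: NODE_ROLES
            value: "ingestonly"
          - name: KAFKA_SERVERS
            value: "kafka-advanced-cluster-1-bootstrap:9092"
          # The remaining settings (data volume source, bucket storage,
          # authentication, PUBLIC_URL) are omitted for brevity and would
          # be repeated here as in the httponly pool.
```

Because each pool carries its own nodeCount, resources, and affinity rules, the ingest pool can be resized or moved to different workers without touching the pods in the other pools.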