LogScale Collector Helm Chart

A Helm chart is a package that contains all the resources needed to deploy an application to a Kubernetes cluster. This includes YAML configuration files for deployments, services, secrets, and config maps that define the desired state of your application.

Helm uses a packaging format called charts. A chart is a collection of files that describe a related set of Kubernetes resources.

Directions for installing Helm on your operating system can be found at https://github.com/helm/helm.

Note

If you are using the Helm chart with Falcon CWP, see Helm Chart with Falcon CWP (Cloud Workload Protection) for details on the issues that may be displayed.

Using Helm Chart to Ship Kubernetes Logs

This guide describes how to send application logs from a Kubernetes cluster to a LogScale cluster.

  1. Create a Kubernetes secret that contains an ingest token for the LogScale repository that will ingest the pod logs.

    kubectl create secret generic logscale-collector-token --from-literal=ingestToken="YOUR INGEST TOKEN HERE"
  2. Create a file named logscale-collector.yaml with the following content, and replace the humioAddress value with the URL of your LogScale cluster.

    # image: <registry path>/logscale-collector:x.y.z LogScale Collector version    
    # Uncomment to use the publicly available docker image provided by CrowdStrike. (TBD)
    # imagePullPolicy: Always
    # imagePullSecrets: []
     
    humioAddress: https://<your logscale cluster URL>
     
    humioIngestTokenSecretName: "logscale-collector-token" # kubernetes secret containing ingest token to receive pod logs.
    humioIngestTokenSecretKey: "ingestToken" # key inside above kubernetes secret
     
    #humioDebugTokenSecretName: "" # Optional: kubernetes secret containing ingest token to receive LogScale Collector debug logs.
    #humioDebugTokenSecretKey: "" # key inside above kubernetes secret.
    #logLevel: "warn" # LogScale Collector stderr log level. Valid values: [trace (highest verbosity), debug, info, warn, error (default), fatal]
    
    #queueMemoryLimitMB: "8" # Uncomment to override the default memory queue size.
    #queueFlushTimeoutMillis: "100" # Uncomment to override the default memory queue flush timeout.
     
    #collectJournal: true # Uncomment to also collect Journald logs from the kubernetes nodes.
    #additionalSinkOptions: {} # Uncomment if it's necessary to include additional sink options, such as tls, proxy or similar.
  3. Run the install commands below (this requires access to the Helm repository that hosts the chart), replacing my-install-name with a preferred name, or use the --generate-name argument instead.

    helm repo add logscale-collector-helm https://registry.crowdstrike.com/log-collector-us1-prod 
    helm install my-install-name logscale-collector-helm/logscale-collector --values logscale-collector.yaml
What is included in the Helm release?

This section describes the details of what is accomplished with the Helm chart.

Part of the Helm chart package is a file called config.yaml. This file defines the configuration of the LogScale Collector.

sources:
  containers:
    type: file
    include: /var/log/containers/*.log
    exclude: /var/log/containers/{{ .Release.Name }}-logscale-collector-*.log
    transforms:
      - type: kubernetes
    sink: humio
  {{- if .Values.collectJournal }}
  journal:
    type: journald
    sink: humio
  {{- end }}

Two sources are defined:

  1. A file source named containers, for application logs from all application pods.

  2. A journald source named journal, for journald logs from the node. It is included only when collectJournal is set to true.

The containers source includes the /var/log/containers/*.log files (the application logs from the application pods running on the node) and excludes the debug log of the LogScale Collector pod itself.
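The include and exclude rules above can be illustrated with a small sketch. This is a minimal approximation in Python, not the Collector's implementation: it assumes shell-style glob matching via the standard fnmatch module (the Collector's own matching rules may differ), and the helper names exclude_pattern and is_collected, the release name my-install-name, and the sample file names are all hypothetical.

```python
from fnmatch import fnmatchcase

# Include rule taken from config.yaml.
INCLUDE = "/var/log/containers/*.log"


def exclude_pattern(release_name):
    """Exclude rule from config.yaml with {{ .Release.Name }} filled in."""
    return f"/var/log/containers/{release_name}-logscale-collector-*.log"


def is_collected(path, release_name):
    """Return True if path matches the include rule and not the exclude rule."""
    if not fnmatchcase(path, INCLUDE):
        return False
    return not fnmatchcase(path, exclude_pattern(release_name))


# Hypothetical file names, for illustration only.
app_log = "/var/log/containers/web-1_default_web-0123abcd.log"
own_log = (
    "/var/log/containers/my-install-name-logscale-collector-x7k2p"
    "_default_logscale-collector-89abcdef.log"
)

print(is_collected(app_log, "my-install-name"))  # True
print(is_collected(own_log, "my-install-name"))  # False
```

Excluding the Collector's own log file prevents a feedback loop in which the Collector ships the log lines it writes about shipping logs.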

This source has an associated transform of type kubernetes, which enriches application logs with the Kubernetes metadata of the pod.

The filenames in /var/log/containers are parsed to obtain the information required to query the Kubernetes API for metadata, which is then used to enrich the data with the following fields:

  • kubernetes.image

  • kubernetes.image_hash

  • kubernetes.container_name

  • kubernetes.container_id

  • kubernetes.node_name

  • kubernetes.namespace

  • kubernetes.pod_name

  • kubernetes.pod_id

  • A list of labels:

    • kubernetes.labels.<label_1>

    • kubernetes.labels.<label_n>

  • A list of annotations:

    • kubernetes.annotations.<annotation_1>

    • kubernetes.annotations.<annotation_n>

The connection used to fetch the metadata relies on the following assumptions:

A token and a certificate can be read from these paths:

  • Token file: /var/run/secrets/kubernetes.io/serviceaccount/token

  • Trusted root certificate: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt

The following environment variables are set:

  • Kubernetes API hostname: KUBERNETES_SERVICE_HOST

  • Kubernetes API port: KUBERNETES_SERVICE_PORT_HTTPS

The fetched metadata is cached for performance.
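The connection details above can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the Collector's actual code: the function names api_base_url, pod_url and fetch_pod are hypothetical, and the /api/v1/namespaces/<namespace>/pods/<pod> path is the standard Kubernetes API endpoint for reading a pod, assumed here to be the kind of request a metadata lookup would make. The fetch_pod function only works when run inside a pod whose service account is allowed to read pods.

```python
import json
import os
import ssl
import urllib.request

# Paths listed in the assumptions above.
TOKEN_FILE = "/var/run/secrets/kubernetes.io/serviceaccount/token"
CA_FILE = "/var/run/secrets/kubernetes.io/serviceaccount/ca.crt"


def api_base_url(env=os.environ):
    """Build the API server base URL from the in-cluster environment variables."""
    host = env["KUBERNETES_SERVICE_HOST"]
    port = env["KUBERNETES_SERVICE_PORT_HTTPS"]
    if ":" in host:
        # IPv6 literals must be bracketed inside a URL.
        host = f"[{host}]"
    return f"https://{host}:{port}"


def pod_url(base_url, namespace, pod_name):
    """Standard Kubernetes endpoint for reading a single pod (assumed lookup)."""
    return f"{base_url}/api/v1/namespaces/{namespace}/pods/{pod_name}"


def fetch_pod(namespace, pod_name):
    """Fetch pod metadata using the service account token and root certificate."""
    with open(TOKEN_FILE) as token_file:
        token = token_file.read().strip()
    # Trust only the cluster's root certificate when talking to the API server.
    context = ssl.create_default_context(cafile=CA_FILE)
    request = urllib.request.Request(
        pod_url(api_base_url(), namespace, pod_name),
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(request, context=context) as response:
        return json.load(response)
```

Inside a pod, fetch_pod("kube-system", "kube-apiserver-docker-desktop") would return the pod object, from which fields such as labels and annotations can be read.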

The transform extracts the pod name, namespace, container name and container ID from the log file name, for example:

kube-apiserver-docker-desktop_kube-system_kube-apiserver-479ebd6ee0267ba8b215e90d389517be2f4896cb42340e5ba05f357e213ffed2.log

Using the regex:

^([^_]+)_([^_]+)_(.+)-([A-Fa-f0-9]+)\.log$

To produce the following:

Pod name: kube-apiserver-docker-desktop
Namespace: kube-system
Container name: kube-apiserver
Container ID: 479ebd6ee0267ba8b215e90d389517be2f4896cb42340e5ba05f357e213ffed2
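The parsing step can be reproduced with a short sketch. It applies the regex shown above to the example file name using Python's standard re module. The function name parse_container_log_name and the dictionary keys are hypothetical and chosen only for illustration, and the Collector's regex engine may differ from Python's in details that this pattern does not exercise.

```python
import re

# Regex from the documentation: pod name, namespace, container name, container ID.
FILENAME_PATTERN = re.compile(r"^([^_]+)_([^_]+)_(.+)-([A-Fa-f0-9]+)\.log$")


def parse_container_log_name(filename):
    """Split a /var/log/containers file name into its four components.

    Returns None when the file name does not match the expected layout.
    """
    match = FILENAME_PATTERN.match(filename)
    if match is None:
        return None
    pod_name, namespace, container_name, container_id = match.groups()
    return {
        "pod_name": pod_name,
        "namespace": namespace,
        "container_name": container_name,
        "container_id": container_id,
    }


example = (
    "kube-apiserver-docker-desktop_kube-system_kube-apiserver-"
    "479ebd6ee0267ba8b215e90d389517be2f4896cb42340e5ba05f357e213ffed2.log"
)
for key, value in parse_container_log_name(example).items():
    print(f"{key}: {value}")
```

Because the third group is greedy, the container name keeps any hyphens it contains, and only the final hyphen separates it from the container ID.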