Collecting Kubernetes Pod Logs

Kubernetes is a portable, extensible, open source platform for managing containerized workloads and services.

Kubernetes runs your workload by placing containers into pods to run on nodes. A node may be a virtual or physical machine, depending on the cluster.

  • A node is a worker machine in Kubernetes and may be either a virtual or a physical machine, depending on the cluster.

  • A pod is a Kubernetes abstraction that represents a group of one or more application containers (such as Docker), and some shared resources for those containers. A Pod models an application-specific "logical host" and can contain different application containers which are relatively tightly coupled.

  • A container image is a ready-to-run software package containing everything needed to run an application: the code and any runtime it requires, application and system libraries, and default values for any essential settings.

The following sections describe how to deploy the LogScale Collector in a Kubernetes cluster and how to use the Helm chart to ship pod logs to LogScale.

Deploying LogScale Collector for Log Forwarding

When it comes to managing microservices in a Kubernetes cluster, LogScale is a great way to get insights into your applications.

The LogScale Collector can be deployed in a Kubernetes cluster to forward log messages from the applications deployed in the cluster.

When an application crashes on a virtual machine, its logs remain available until they are deleted. In Kubernetes, however, when pods crash, are deleted, or are rescheduled on a new node, the logs from the application containers are lost. For this reason, if you want insight into why a crash occurred, you need the logs forwarded to a centralized log management solution such as LogScale.

Several different deployments are possible, but the model described below is node-level logging using the DaemonSet approach, delivered as an out-of-the-box solution consisting of a LogScale Collector Helm chart and container image. In this approach a node-level LogScale Collector runs on every node and handles logging for all application containers in pods on that node.

Node level logging overview

Figure 42. Node Level Logging


Using Node Level Logging

In this scenario the LogScale Collector is deployed as a DaemonSet on a Kubernetes node to ingest logs from applications running in pods on that node.

The LogScale Collector is deployed as a Kubernetes DaemonSet, a feature that lets you run a pod on all cluster nodes that meet certain criteria. Every time a new node is added to the cluster, the pod is added to it, and when a node is removed from the cluster, the pod is removed.

Node-level logging creates one LogScale Collector per node and does not require any changes to the applications running on the node.

Containers write to stdout and stderr, but with no agreed format. A node-level LogScale Collector collects these logs and forwards them in real time to LogScale for live analysis, storage, and future analysis.

This is accomplished by running the LogScale Collector in a container that has access to a directory with log files from all of the application containers in all pods on that node.

CrowdStrike provides a LogScale Collector Helm chart for deploying the LogScale Collector in Kubernetes as a DaemonSet, collecting logs from pods.
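The Helm chart takes care of this wiring for you, but conceptually the deployment boils down to something like the following sketch. The resource names, labels, and image reference here are illustrative only and are not the chart's actual manifest:

# Minimal sketch of a node-level logging DaemonSet (illustrative only;
# the real manifest is generated by the LogScale Collector Helm chart).
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: logscale-collector            # hypothetical name
spec:
  selector:
    matchLabels:
      app: logscale-collector
  template:
    metadata:
      labels:
        app: logscale-collector
    spec:
      containers:
        - name: logscale-collector
          image: logscale-collector:x.y.z   # placeholder image reference
          volumeMounts:
            - name: varlogcontainers
              mountPath: /var/log/containers   # pod log files on the node
              readOnly: true
      volumes:
        - name: varlogcontainers
          hostPath:
            path: /var/log/containers          # mounted from the node itself
            # Note: /var/log/containers typically contains symlinks, so a real
            # deployment also mounts the directories those links point to.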

Helm

A Helm chart is a package that contains all the necessary resources to deploy an application to a Kubernetes cluster. This includes YAML configuration files for deployments, services, secrets, and config maps that define the desired state of your application.

Helm uses a packaging format called charts. A chart is a collection of files that describe a related set of Kubernetes resources.

Directions for installing Helm for your particular OS can be found on the https://github.com/helm/helm page.
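For convenience, on a Linux or macOS workstation Helm 3 can typically be installed with the official installer script or a package manager; refer to the Helm page above for the authoritative, up-to-date instructions:

# Install Helm 3 using the official installer script
curl -fsSL https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash

# Or, on macOS with Homebrew
brew install helm

# Verify the installation
helm version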

Note

If you are using the Helm chart with Falcon CWP, see Helm Chart with Falcon CWP (Cloud Workload Protection) for details on the issues that are reported.

Using Helm Chart to Ship Kubernetes Logs

This guide describes how to send application logs from your Kubernetes cluster to your LogScale cluster.

  1. Create a Kubernetes secret that contains an ingest token for the LogScale repository that should receive the pod logs.

    kubectl create secret generic logscale-collector-token --from-literal=ingestToken="YOUR INGEST TOKEN HERE"
  2. Create a file named logscale-collector.yaml with the following content, substituting the URL of your LogScale cluster.

    # image: <registry path>/logscale-collector:x.y.z LogScale Collector version    
    # Uncomment if you do not want to use the publicly available docker image provided by CrowdStrike. (TBD)
    # imagePullPolicy: Always
    # imagePullSecrets: []
     
    humioAddress: https://<your logscale cluster URL>
     
    humioIngestTokenSecretName: "logscale-collector-token" # kubernetes secret containing ingest token to receive pod logs.
    humioIngestTokenSecretKey: "ingestToken" # key inside above kubernetes secret
     
    #humioDebugTokenSecretName: "" # Optional: kubernetes secret containing ingest token to receive LogScale Collector debug logs.
    #humioDebugTokenSecretKey: "" # key inside above kubernetes secret.
    #logLevel: "warn" # LogScale Collector stderr log level. Valid values: [trace (highest verbosity), debug, info, warn, error (default), fatal]
    
    #queueMemoryLimitMB: "8" # Uncomment to override the default memory queue size.
    #queueFlushTimeoutMillis: "100" # Uncomment to override the default memory queue flush timeout.
     
    #collectJournal: true # Uncomment to also collect Journald logs from the kubernetes nodes.
    #additionalSinkOptions: {} # Uncomment if it's necessary to include additional sink options, such as tls, proxy or similar.
  3. Run the install commands below (this requires a Helm repository with access to the LogScale Collector Helm chart), replacing my-install-name with your preferred release name, or use the --generate-name argument.

    helm repo add logscale-collector-helm https://registry.crowdstrike.com/log-collector-us1-prod 
    helm install my-install-name logscale-collector-helm/logscale-collector --values logscale-collector.yaml
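Once the release is installed, you can check that a collector pod is running on each node. Assuming default naming (where the DaemonSet is named after your release), for example:

# List the Helm release (add -n <namespace> if you installed into a specific namespace)
helm list

# Check that the DaemonSet reports one pod per node
kubectl get daemonset

# Inspect the collector pods and the nodes they run on
kubectl get pods -o wide
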
What is included in the Helm release?

This section describes the details of what is accomplished with the Helm chart.

Part of the Helm chart package is a file called config.yaml, which defines the configuration of the LogScale Collector.

sources:
  containers:
    type: file
    include: /var/log/containers/*.log
    exclude: /var/log/containers/{{ .Release.Name }}-logscale-collector-*.log
    transforms:
      - type: kubernetes
    sink: humio
  {{- if .Values.collectJournal }}
  journal:
    type: journald
    sink: humio
  {{- end }}

Two sources are defined:

  1. A file source named containers, for application logs from all application pods.

  2. A journald source for journald logs from the node.

The containers source includes /var/log/containers/*.log files (these are the files with the application logs from the application pods running on the node), and excludes the debug log from the LogScale Collector pod.

This source has an associated transform of type kubernetes, which enriches application logs with Kubernetes metadata associated with the pod.

The filenames in /var/log/containers are parsed to obtain information required to query the Kubernetes API for metadata which will be used to enrich the data with the following fields:

  • kubernetes.image

  • kubernetes.image_hash

  • kubernetes.container_name

  • kubernetes.container_id

  • kubernetes.node_name

  • kubernetes.namespace

  • kubernetes.pod_name

  • kubernetes.pod_id

  • A list of labels:

    • kubernetes.labels.<label_1>

    • kubernetes.labels.<label_n>

  • A list of annotations:

    • kubernetes.annotations.<annotation_1>

    • kubernetes.annotations.<annotation_n>

The connection used to fetch the metadata relies on the following assumptions:

A token and certificate can be fetched from these paths:

  • Token file: /var/run/secrets/kubernetes.io/serviceaccount/token

  • Trusted root certificate: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt

In addition, the following environment variables are relied upon:

  • Kubernetes API hostname: KUBERNETES_SERVICE_HOST

  • Kubernetes API port: KUBERNETES_SERVICE_PORT_HTTPS

The fetched metadata is cached for performance.
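These paths and variables are provided automatically by Kubernetes for pods that run with a service account. If you need to confirm they are available inside a running collector pod, a quick check looks like this (the pod name is a placeholder):

# Confirm the service account token and CA certificate are mounted
kubectl exec <collector-pod-name> -- ls /var/run/secrets/kubernetes.io/serviceaccount

# Confirm the Kubernetes API environment variables are set
kubectl exec <collector-pod-name> -- env | grep KUBERNETES_SERVICE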

The transform extracts the pod name, namespace, and container ID from the log file name. For example, the file name:

kube-apiserver-docker-desktop_kube-system_kube-apiserver-479ebd6ee0267ba8b215e90d389517be2f4896cb42340e5ba05f357e213ffed2.log

matched against the regex:

^([^_]+)_([^_]+)_(.+)-([A-Fa-f0-9]+)\.log$

produces the following:

Pod Name:       kube-apiserver-docker-desktop
Namespace:      kube-system
Container name: kube-apiserver
Container ID:   479ebd6ee0267ba8b215e90d389517be2f4896cb42340e5ba05f357e213ffed2
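If you want to reproduce this parsing step yourself, you can apply the same regular expression in a shell. This is a sketch for illustration only; the Collector performs this parsing internally:

# Illustrative only: apply the documented regex to a container log file name
fname="kube-apiserver-docker-desktop_kube-system_kube-apiserver-479ebd6ee0267ba8b215e90d389517be2f4896cb42340e5ba05f357e213ffed2.log"
re='^([^_]+)_([^_]+)_(.+)-([A-Fa-f0-9]+)\.log$'

if [[ $fname =~ $re ]]; then
  echo "Pod Name:       ${BASH_REMATCH[1]}"
  echo "Namespace:      ${BASH_REMATCH[2]}"
  echo "Container name: ${BASH_REMATCH[3]}"
  echo "Container ID:   ${BASH_REMATCH[4]}"
fi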

Helm Chart Adding Additional Metadata

Available: Additional Metadata v1.0.4

This feature is available as of version 1.0.4 of the Helm chart.

You can add additional fields to the logs collected using the Helm chart installation. The fields are added using the static_fields transform, and support environment variable expansion. Additionally, it's now possible to augment the environment using the Kubernetes Downward API.

To configure additional fields on the container logs, add the following section to your values file, and update your installation:

sources:
  containers:
    staticFields:
      FieldName: Value

To propagate Kubernetes Downward API information this way, you can also add the following section, for example:

additionalEnv:
  - name: MY_NODE_NAME
    valueFrom:
      fieldRef:
        fieldPath: spec.nodeName

This adds an environment variable MY_NODE_NAME to the container, which can then be used as a static field value:

sources:
  containers:
    staticFields:
      NodeName: "${MY_NODE_NAME}"
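Putting the two pieces together, a values file excerpt that propagates the node name onto every collected event could look like this. The field name NodeName and the variable MY_NODE_NAME are examples, not required names:

additionalEnv:
  - name: MY_NODE_NAME              # exposed to the collector container via the Downward API
    valueFrom:
      fieldRef:
        fieldPath: spec.nodeName

sources:
  containers:
    staticFields:
      NodeName: "${MY_NODE_NAME}"   # added as a field on every container log event

After updating the values file, apply the change to an existing installation, for example with helm upgrade my-install-name logscale-collector-helm/logscale-collector --values logscale-collector.yaml.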

Helm Chart with Falcon CWP (Cloud Workload Protection)

The following issues have been noted when using the Helm chart with Falcon CWP and can be safely ignored.

  • Container Running As Root (MEDIUM)

  • Container Running With Low UID (MEDIUM)

  • Non Kube System Pod With Host Mount (MEDIUM)

  • Readiness Probe Is Not Configured (MEDIUM)

  • Service Account Token Automount Not Disabled (MEDIUM)

  • Volume Mount With OS Directory Write Permissions (MEDIUM)

  • Workload Mounting With Sensitive OS Directory (MEDIUM)

  • Liveness Probe Is Not Defined (LOW)

  • Missing AppArmor Profile (LOW)

  • Pod or Container Without LimitRange (LOW)

  • Pod or Container Without ResourceQuota (LOW)

  • Secrets As Environment Variables (LOW)

  • Ensure Administrative Boundaries Between Resources (INFO)