Ingesting with OpenTelemetry

LogScale supports OpenTelemetry (OTel), an open, vendor-neutral framework for collecting telemetry data (logs, metrics, and traces). LogScale can receive data sent using the OpenTelemetry Protocol (OTLP), which makes it possible to collect data from any source that uses OpenTelemetry. Falcon LogScale's OTLP endpoint accepts binary-encoded Protobuf data sent over HTTP. If the source emits JSON-encoded Protobuf, the data must be routed through the OpenTelemetry Collector.
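
As a minimal sketch of that routing (not a complete configuration), the Collector below accepts OTLP over HTTP, including JSON-encoded payloads, and forwards the data to LogScale using the otlphttp exporter, which sends binary-encoded Protobuf by default. The endpoint, port, and token values are illustrative placeholders:

yaml
receivers:
  otlp:
    protocols:
      http:
        endpoint: 0.0.0.0:4318   # accepts OTLP/HTTP requests, including JSON-encoded Protobuf

exporters:
  otlphttp:
    endpoint: "https://$YOUR_LOGSCALE_URL/api/v1/ingest/otlp"
    headers:
      Authorization: "Bearer $INGEST_TOKEN"

service:
  pipelines:
    logs:
      receivers: [otlp]
      exporters: [otlphttp]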

When you use OpenTelemetry Protocol over HTTP (OTLP/HTTP) to send data to LogScale, the ingested data is translated into a JSON structure, and a parser can then be applied to it. For authentication, use the Authorization header with an ingest token:

http
Authorization: "Bearer ingest token".

OpenTelemetry works well for ingesting logs into LogScale. You can also ingest traces and metrics, taking advantage of Falcon LogScale as a scalable storage solution. Because of how LogScale handles event data, some metrics are treated as ordinary events, and some traces may require additional querying using the LogScale query language.

The following example shows how to configure an OpenTelemetry Collector exporter for traces, metrics, and logs pipelines; a sketch of the matching pipeline definitions follows the example. The data is automatically routed to the correct place within LogScale.

yaml
exporters:
  otlphttp:
    endpoint: "http://$YOUR_LOGSCALE_URL/api/v1/ingest/otlp"
    headers:
      Authorization: "Bearer $INGEST_TOKEN"

Using OpenTelemetry in Kubernetes

When working within Kubernetes, the OpenTelemetry Collector requires some configuration changes to ensure data is transferred correctly within the Kubernetes-specific environment.

To make these changes, update the collector configuration file: first the exporters, then the data pipelines, and finally the receiver that feeds information to Falcon LogScale:

  1. Define exporters in the configuration file:

    yaml
    exporters:
          otlphttp:
            endpoint: "https://$YOUR_LOGSCALE_URL/api/v1/ingest/otlp"
            headers:
              Authorization: "Bearer <replaceable>$INGEST_TOKEN</replaceable>"

    The configuration updates the following:

    • Using otlphttp as the export method means the ingested data is translated into a JSON structure, so additional parsing can be applied.

    • Adds an Authorization header carrying the Bearer ingest token, as required by LogScale.

  2. Define the pipelines in the collector configuration:

    yaml
    pipelines:
            logs/humio:
              exporters:
              - otlphttp
              processors:
              - memory_limiter
              - k8sattributes
              - filter/logs
              - batch
              - resource
              - transform/logs
              - resource/logs
              - resourcedetection
              receivers:
              - filelog/humio
              - fluentforward
              - otlp

    This wires the exporter, processors, and receivers into a logs pipeline in the collector configuration. Each component referenced here must also be defined in its corresponding top-level exporters, processors, or receivers section; a sketch of those definitions follows this procedure.

  3. Define the filelog receiver and configure the format and pre-processing of the log data as part of the receivers section of the configuration:

    yaml
    filelog/humio:
            encoding: utf-8
            fingerprint_size: 1kb
            force_flush_period: "0"
            include:
            - /var/*.log
            include_file_name: false
            include_file_path: true
            max_concurrent_files: 1024
            max_log_size: 1MiB
            operators:
            - id: get-format
              routes:
              - expr: body matches "^\\{"
                output: parser-docker
              - expr: body matches "^[^ Z]+ "
                output: parser-crio
              - expr: body matches "^[^ Z]+Z"
                output: parser-containerd
              type: router
            - id: parser-crio
              regex: ^(?P<time>[^ Z]+) (?P<stream>stdout|stderr) (?P<logtag>[^ ]*) ?(?P<log>.*)$
              timestamp:
                layout: "2006-01-02T15:04:05.999999999-07:00"
                layout_type: gotime
                parse_from: attributes.time
              type: regex_parser
            - combine_field: attributes.log
              combine_with: ""
              id: crio-recombine
              is_last_entry: attributes.logtag == 'F'
              max_log_size: 1048576
              output: handle_empty_log
              source_identifier: attributes["log.file.path"]
              type: recombine
            - id: parser-containerd
              regex: ^(?P<time>[^ ^Z]+Z) (?P<stream>stdout|stderr) (?P<logtag>[^ ]*) ?(?P<log>.*)$
              timestamp:
                layout: '%Y-%m-%dT%H:%M:%S.%LZ'
                parse_from: attributes.time
              type: regex_parser
            - combine_field: attributes.log
              combine_with: ""
              id: containerd-recombine
              is_last_entry: attributes.logtag == 'F'
              max_log_size: 1048576
              output: handle_empty_log
              source_identifier: attributes["log.file.path"]
              type: recombine
            - id: parser-docker
              timestamp:
                layout: '%Y-%m-%dT%H:%M:%S.%LZ'
                parse_from: attributes.time
              type: json_parser
            - combine_field: attributes.log
              combine_with: ""
              id: docker-recombine
              is_last_entry: attributes.log endsWith "\n"
              max_log_size: 1048576
              output: handle_empty_log
              source_identifier: attributes["log.file.path"]
              type: recombine
            - field: attributes.log
              id: handle_empty_log
              if: attributes.log == nil
              type: add
              value: ""
            - parse_from: attributes["log.file.path"]
              regex: ^\/var\/log\/pods\/(?P<namespace>[^_]+)_(?P<pod_name>[^_]+)_(?P<uid>[^\/]+)\/(?P<container_name>[^\._]+)\/(?P<restart_count>\d+)\.log$
              type: regex_parser
            - from: attributes.uid
              to: resource["k8s.pod.uid"]
              type: move
            - from: attributes.restart_count
              to: resource["k8s.container.restart_count"]
              type: move
            - from: attributes.container_name
              to: resource["k8s.container.name"]
              type: move
            - from: attributes.namespace
              to: resource["k8s.namespace.name"]
              type: move
            - from: attributes.pod_name
              to: resource["k8s.pod.name"]
              type: move
            - from: attributes.stream
              to: attributes["log.iostream"]
              type: move
            - from: attributes.log
              id: clean-up-log-record
              to: body
              type: move
            poll_interval: 200ms 
            start_at: beginning
            storage: file_storage

    For more details, refer to the OpenTelemetry Collector documentation.
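
The pipeline and filelog receiver above reference several components that must also be defined elsewhere in the collector configuration, including a file_storage extension for the storage setting. The following is a minimal, illustrative sketch of some of those definitions; the endpoints, limits, and storage directory are assumptions, and the filter, transform, resource, and resourcedetection processors are omitted here and must be configured to suit your environment:

yaml
extensions:
  file_storage:
    directory: /var/lib/otelcol/file_storage   # assumed path for filelog checkpoints

receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318
  fluentforward:
    endpoint: 0.0.0.0:8006

processors:
  memory_limiter:
    check_interval: 5s
    limit_percentage: 80
    spike_limit_percentage: 25
  k8sattributes: {}   # enriches events with Kubernetes metadata using default settings
  batch: {}

service:
  extensions:
    - file_storage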

Once the changes are in place, deploy the OpenTelemetry Collector as a DaemonSet so that logs are shipped to LogScale from every node.
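
As a minimal sketch of such a deployment, the manifest below assumes the collector configuration is stored in a ConfigMap named otel-collector-config and that container logs live under /var/log; the namespace, names, and image tag are illustrative and not prescribed by LogScale:

yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: otel-collector
  namespace: observability            # assumed namespace
spec:
  selector:
    matchLabels:
      app: otel-collector
  template:
    metadata:
      labels:
        app: otel-collector
    spec:
      containers:
        - name: otel-collector
          image: otel/opentelemetry-collector-contrib:latest   # pin a specific version in production
          args: ["--config=/etc/otelcol/config.yaml"]
          volumeMounts:
            - name: otel-config
              mountPath: /etc/otelcol
            - name: varlog
              mountPath: /var/log
              readOnly: true
      volumes:
        - name: otel-config
          configMap:
            name: otel-collector-config   # assumed ConfigMap holding the collector configuration
        - name: varlog
          hostPath:
            path: /var/log

In practice the collector also needs a ServiceAccount with RBAC permissions for the k8sattributes processor, and the ingest token is typically injected from a Kubernetes Secret rather than written directly into the configuration.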