Ingestion: API Phase

Ingestion is supported through a number of different APIs: some are specific to LogScale, and others support the protocols and standards of other applications. Most organizations will ingest data using a dedicated log shipper, such as Falcon LogScale Collector or one of the Third-Party Log Shippers.

Common protocols supported in LogScale are:

  • OpenTelemetry

    An open standard, part of the CNCF, for handling logs, metrics and traces.

  • HTTP Event Collector (HEC)

    This protocol was originally created to send data to Splunk. It is now used by many log shippers and third-party tools, and serves as a common standard for ingesting large volumes of data.

    The protocol focuses on being a fast and efficient way of sending data in a structured format. The format is very close to JSON and supports key-value pairs in log events.

  • Elasticsearch Bulk Ingest API

    A high-volume ingest API that supports loading large blocks of data for processing.

  • StatsD

    The StatsD protocol is a simple way of shipping raw metrics to LogScale, and is supported by many tools.
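
To illustrate why HEC is described as "very close to being JSON", here is a minimal Python sketch that builds a small HEC-style batch. It assumes the commonly used HEC event keys `time`, `event` and `fields`, and that a batch is sent as a sequence of JSON objects rather than as a single JSON array. The helper names `hec_event` and `hec_body` are hypothetical, and the exact keys your LogScale version accepts should be checked against its HEC documentation.

```python
import json


def hec_event(message, fields, timestamp):
    """Build one HEC-style event (assumed keys: time, event, fields)."""
    return {"time": timestamp, "event": message, "fields": fields}


def hec_body(events):
    """Serialize events as one JSON object per line.

    Assumption: an HEC batch is a sequence of JSON objects,
    not a JSON array wrapping them.
    """
    return "\n".join(json.dumps(e, sort_keys=True) for e in events)


# Timestamps are seconds since the Unix epoch (UTC).
events = [
    hec_event("user1 logged in", {"host": "webhost1"}, 1509630506),
    hec_event("user2 logged in", {"host": "webhost1"}, 1509630549),
]
body = hec_body(events)
print(body)
```

Each line of the body is valid JSON on its own, but the body as a whole is not a single JSON document.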

There are a number of supported Ingest API formats:

  • Ingest Raw Data

    An endpoint that accepts raw log lines that can be processed by the LogScale parser to extract the information for storage.

  • Ingest Structured Data

    An endpoint for accepting structured data, that is, data where fields and values have already been identified and formatted and which then only need to be processed and stored.

  • Ingest Raw JSON Data

    An endpoint that accepts raw data formatted as JSON data that can be processed or augmented by the parser and stored.
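
As a rough illustration of what "structured data" means here, the following Python sketch builds a payload in which the fields have already been extracted by the sender. The layout (a list of objects with `tags` and `events`, each event carrying a `timestamp` and `attributes`) is an assumption based on the LogScale structured ingest format and should be verified against the API reference. The function name `structured_payload` and the attribute names are hypothetical.

```python
import json
from datetime import datetime, timezone


def structured_payload(tags, records):
    """Build a structured ingest payload (assumed layout: tags + events).

    `records` is an iterable of (datetime, attributes) pairs, where
    `attributes` holds the fields already extracted by the sender.
    """
    events = [
        {"timestamp": ts.isoformat(), "attributes": attrs}
        for ts, attrs in records
    ]
    return [{"tags": tags, "events": events}]


payload = structured_payload(
    {"host": "webhost1"},
    [
        (
            datetime(2017, 11, 2, 13, 48, 26, tzinfo=timezone.utc),
            {"client": "192.168.1.21", "user": "user1", "status": 200},
        ),
    ],
)
print(json.dumps(payload, indent=2))
```

Because the attributes arrive already separated into fields, no parser is needed to extract them from a raw log line.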

The intention with each API is to make it as easy as possible for data to be sent to LogScale. For example, using the 'unstructured' native LogScale endpoint allows for sending a series of raw log rows:

json
[
  {
    "fields": {
      "host": "webhost1"
    },
    "messages": [
      "192.168.1.21 - user1 [02/Nov/2017:13:48:26 +0000] \"POST /humio/api/v1/ingest/elastic-bulk HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.015 664 0.015",
      "192.168.1.49 - user1 [02/Nov/2017:13:48:33 +0000] \"POST /humio/api/v1/ingest/elastic-bulk HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.014 657 0.014",
      "192.168.1.21 - user2 [02/Nov/2017:13:49:09 +0000] \"POST /humio/api/v1/ingest/elastic-bulk HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.013 565 0.013",
      "192.168.1.54 - user1 [02/Nov/2017:13:49:10 +0000] \"POST /humio/api/v1/ingest/elastic-bulk HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.015 650 0.015"
    ]
  }
]
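
A payload like the one above could be prepared for sending with a few lines of Python. This is a sketch only: it assumes the unstructured endpoint lives at `/api/v1/ingest/humio-unstructured` and that the ingest token is passed as a bearer token, both of which should be confirmed for your deployment. The function name `build_unstructured_request` and the token placeholder are hypothetical. The request is built but deliberately not sent.

```python
import json
import urllib.request


def build_unstructured_request(base_url, ingest_token, host, lines):
    """Prepare (but do not send) a POST carrying raw log lines."""
    payload = [{"fields": {"host": host}, "messages": list(lines)}]
    return urllib.request.Request(
        # Assumed endpoint path; verify against your LogScale version.
        url=base_url.rstrip("/") + "/api/v1/ingest/humio-unstructured",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": "Bearer " + ingest_token,
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_unstructured_request(
    "https://logscale.example.com:443",
    "YOUR_INGEST_TOKEN",  # placeholder, not a real token
    "webhost1",
    [
        '192.168.1.21 - user1 [02/Nov/2017:13:48:26 +0000] '
        '"POST /humio/api/v1/ingest/elastic-bulk HTTP/1.1" '
        '200 0 "-" "useragent" 0.015 664 0.015',
    ],
)
print(request.get_method(), request.full_url)
# To actually send it: urllib.request.urlopen(request)
```

Keeping request construction separate from sending makes it easy to inspect the payload before any data leaves the machine.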

More typically, data is ingested into LogScale through a log shipper that is configured with the log files to send. The log shipper handles reading the files and sending the information to LogScale. For example, Falcon LogScale Collector uses a configuration file that defines the log files and the target LogScale instance:

yaml
sources:
  var_log:
    type: file
    include: /var/log/*
    sink: logscale
sinks:
  logscale:
    type: humio
    token: 960e4dec-33b6-4bc3-8a9c-b4aa6df15416
    url: https://logscale.example.com:443

Regardless of the log shipper or other tool you are using to upload data to LogScale, the data will need to be processed by LogScale before being stored so that the data can be queried.