Ingesting Data

After establishing a Humio Cloud account or installing Humio on a server, you’ll want to put in place a system to feed data automatically into Humio; this loading of information into Humio is known as ingesting data.

  • Create a Repository

    A repository is the core storage mechanism within Humio and is used to collate the information from one or more logs so that it can be indexed and queried.

  • Generate Ingest Tokens

    Each repository can have one or more ingest tokens associated with it. Ingest tokens are used with the Ingest API to enable data to be routed to the right repository, and to associate a suitable parser.

  • Install & Configure Log Shippers

    Log shippers use the Ingest API to send one or more logs to Humio. A log shipper can handle multiple logs and multiple log types, manage log storage on disk, and pre-process logs before sending them to Humio. Log shippers are covered in more detail in Log Shippers.

  • Parse the Data

    Parsing ingested data enables the information to be tagged, specific fields and elements of the log data to be extracted, and an additional level of detail to be captured. A parser also lets you configure the types of the data and of the extracted fields, supporting metrics, graphing, and dashboards.
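
As an illustration of what a parser does, the following Python sketch extracts typed fields from a raw access-log line. This is only a conceptual sketch: real Humio parsers are written in Humio's own parser language, not Python, and the log format here is an example.

```python
import re

# Example raw log line; a Humio parser would perform a similar
# extraction using its own parser language rather than Python.
RAW = '127.0.0.1 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326'

# Regex capturing the client IP, timestamp, request, status, and size.
PATTERN = re.compile(
    r'(?P<client>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\d+)'
)

def parse_line(line: str) -> dict:
    """Extract structured fields from one access-log line."""
    m = PATTERN.match(line)
    if m is None:
        return {}
    fields = m.groupdict()
    # Type the numeric fields so they can drive metrics and dashboards.
    fields["status"] = int(fields["status"])
    fields["size"] = int(fields["size"])
    return fields
```

Once fields like status are typed as numbers rather than text, they can be aggregated, graphed, and used in dashboards.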

In most cases you will want to use a log shipper or one of our platform integrations. If you are interested in getting some data into Humio quickly, see the Ingesting Application Logs tutorial page.

Humio is optimized for live streaming of events in real time. If you ship data that are not live, you need to observe some basic rules so that the resulting events are stored in Humio as efficiently as if they had been received live. See Backfilling.

You may use the Ingest API directly or through one of Humio’s client libraries.

See the Ingest API reference page for more information. For a list of supported software, see Software Libraries in the Appendix.
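
As a rough sketch of direct Ingest API use, the Python function below builds the URL, headers, and JSON body for a structured ingest call, authenticated with an ingest token. The endpoint path, tag names, and payload layout here should be treated as assumptions; consult the Ingest API reference page for the authoritative shape.

```python
import json

def build_ingest_request(base_url: str, ingest_token: str, events: list) -> tuple:
    """Build the URL, headers, and JSON body for a structured ingest call.

    `events` is a list of dicts, each with a "timestamp" and an
    "attributes" dict of fields. The exact payload shape is documented
    on the Ingest API reference page.
    """
    url = f"{base_url}/api/v1/ingest/humio-structured"
    headers = {
        # The repository's ingest token routes the data and selects a parser.
        "Authorization": f"Bearer {ingest_token}",
        "Content-Type": "application/json",
    }
    body = json.dumps([{
        "tags": {"host": "webserver-1"},  # example tag, not required
        "events": [
            {"timestamp": e["timestamp"], "attributes": e["attributes"]}
            for e in events
        ],
    }])
    return url, headers, body

# The request itself can then be sent with any HTTP client, e.g.:
#   requests.post(url, headers=headers, data=body)
```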

Log Shippers

Figure 1, Ingest Process

A Log Shipper is a system tool that looks at files and system properties on a server and sends them to Humio. Log shippers take care of buffering, retransmitting lost messages, log file rolling, network disconnects, and other log management functions to ensure that you send data to Humio in a reliable and regular fashion.

In Figure 1, Your Application is writing logs to a log file. The log shipper reads the data and pre-processes it (for example, converting a multiline stack trace into a single event). It then ships the data to Humio using one of our Ingest APIs.
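
The multiline pre-processing step mentioned above can be sketched as follows. This is a simplified illustration, not how any particular log shipper is implemented; real shippers use configurable multiline rules.

```python
def merge_multiline(lines):
    """Group raw log lines into events.

    A line that starts with whitespace (e.g. a stack-trace frame) is
    treated as a continuation and appended to the previous event, so a
    multiline stack trace becomes a single event.
    """
    events = []
    for line in lines:
        if line.startswith((" ", "\t")) and events:
            events[-1] += "\n" + line
        else:
            events.append(line)
    return events
```

For example, an ERROR line followed by two indented stack-trace frames would become one three-line event, rather than three separate events.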

Data shipping can be done in a few ways; you can find a list of supported log shippers in the Log Shippers section of the documentation.

Platform Integrations

Figure 2, Platform Integration

Depending on your platform, the data flow will look slightly different from Figure 2. Some systems use a built-in logging subsystem, while others have you start a container running a log shipper. Usually you will tag containers or instances in some way to indicate which repository and parser should be used at ingestion.

If you want to get logs and metrics from your deployment platform, like a Kubernetes cluster or your company PaaS, see the Provisioning and Containers installation sections.

Take a look at the list of integration options for more details.