Ingest API

There are different ways of getting data into LogScale. This page shows how to send data using the HTTP API.

There are two endpoints: one for sending data that needs to be parsed using a specified parser, and another for sending data that is already structured. When ingesting text logs such as syslogs, access logs, or application logs, you typically use the endpoint where a parser is specified.

Ingest API Response Codes

The Ingest API responds with standard HTTP response codes:

  • 200

    Data has been received and committed.

  • 201-399

    An error has occurred. Check the error text to confirm the cause, then retry if possible.

  • 401 or 403

    Indicates that the authorization token is incorrect. The operation can be retried with a new API token.

  • 4xx (excluding 401 and 403)

    Cannot be retried.

  • 5xx

    Errors in the 5xx range can be retried as it may be a temporary error.
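The retry rules above can be sketched as a small helper. This is an illustrative function, not part of the API; adapt it to your own client:

```python
def should_retry(status: int) -> bool:
    """Decide whether an ingest request is worth retrying automatically."""
    if status == 200:
        return False            # data committed, nothing to do
    if status in (401, 403):
        return False            # fix the token first, then retry manually
    if 500 <= status < 600:
        return True             # likely a temporary server-side error
    if 201 <= status < 400:
        return True             # check the error text, retry if possible
    return False                # other 4xx errors are permanent
```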

Ingest via API Best Practices

When sending POST requests with logs for ingestion, we recommend batching logs together in single requests, as sending one request per log message does not scale well.

A good starting strategy is to batch log messages in five-second windows and send all messages from each window in one request.

However, requests can also grow too large. We recommend that ingest requests contain no more than 5000 events and take up no more than 5 MB of space (uncompressed). If your requests grow larger than this within your batching time frame, break the logs into multiple requests. Refer to Limits & Standards when ingesting large bulks of events.
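The batch-splitting rule can be sketched as follows. This is a minimal illustration of the recommended limits, not library code:

```python
# Recommended per-request limits from the guidance above.
MAX_EVENTS = 5000
MAX_BYTES = 5 * 1024 * 1024  # 5 MB uncompressed

def split_batch(messages):
    """Yield lists of messages, each within the recommended request limits."""
    batch, size = [], 0
    for msg in messages:
        msg_size = len(msg.encode("utf-8"))
        # Start a new request if adding this message would break a limit.
        if batch and (len(batch) >= MAX_EVENTS or size + msg_size > MAX_BYTES):
            yield batch
            batch, size = [], 0
        batch.append(msg)
        size += msg_size
    if batch:
        yield batch
```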

Ingest Unstructured Data Using a Parser

This endpoint should be used when you have unstructured logs in some text format, and you want to give those logs structure to enable easier searching in LogScale. You can either use a built-in parser or create a custom parser.

Note

Another option is to use the Falcon LogScale Collector.

http
POST /api/v1/ingest/humio-unstructured

Example sending 4 accesslog lines to LogScale

json
[
  {
    "fields": {
      "host": "webhost1"
    },
    "messages": [
      "192.168.1.21 - user1 [02/Nov/2017:13:48:26 +0000] \"POST /humio/api/v1/ingest/elastic-bulk HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.015 664 0.015",
      "192.168.1.49 - user1 [02/Nov/2017:13:48:33 +0000] \"POST /humio/api/v1/ingest/elastic-bulk HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.014 657 0.014",
      "192.168.1.21 - user2 [02/Nov/2017:13:49:09 +0000] \"POST /humio/api/v1/ingest/elastic-bulk HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.013 565 0.013",
      "192.168.1.54 - user1 [02/Nov/2017:13:49:10 +0000] \"POST /humio/api/v1/ingest/elastic-bulk HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.015 650 0.015"
    ]
  }
]

The above example sends four accesslog lines to LogScale. In this case an accesslog parser is attached to the ingest token we are using. See Parsing Data for details.

The fields section specifies fields that should be added to each event when it is parsed. In the example, every accesslog event gets a host field indicating that it came from webhost1. It is possible to send events of different types in the same request by adding a new element to the outer array in the example above.
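For example, a request mixing events from two different sources could be built like this sketch (host names and messages are illustrative placeholders):

```python
import json

# Each element of the outer array carries its own "fields" and "messages",
# so a single request can mix events of different types.
payload = [
    {"fields": {"host": "webhost1"}, "messages": ["GET / 200"]},
    {"fields": {"host": "webhost2"}, "messages": ["GET /login 302", "GET / 200"]},
]
body = json.dumps(payload)
```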

Tags can be specified through the parser.

Events

When sending events, you can set the following standard fields:

Name Required Description
messages yes The raw strings representing the events. Each string will be parsed by the parser.
type no The parser to use if no parser is attached to the ingest token. See Parsing Data.
fields no Annotate each of the messages with these key-value pairs. Values must be strings.
tags no Annotate each of the messages with these key-value pairs as Tags. See the tags documentation before using them.
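A request body combining these fields might look like the following sketch. The parser name, tag, and message are hypothetical placeholders:

```python
import json

# Hypothetical payload exercising all four standard fields.
payload = [{
    "type": "accesslog",              # parser to use if none is attached to the token
    "fields": {"host": "webhost1"},   # key-values added to every parsed event
    "tags": {"source": "frontend"},   # stored as tags; consult the tags documentation first
    "messages": [
        '192.168.1.21 - user1 [02/Nov/2017:13:48:26 +0000] "GET / HTTP/1.1" 200 0',
    ],
}]
body = json.dumps(payload)
```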

Examples

Ingesting a single string of data:

Mac OS or Linux (curl)
shell
curl -v -X POST $YOUR_LOGSCALE_URL/api/v1/ingest/humio-unstructured \
    -H "Authorization: Bearer $INGEST_TOKEN" \
    -H "Content-Type: application/json" \
    -d '[02/Nov/2017:13:48:26 +0000] Daemon started'
Mac OS or Linux (curl) One-line
shell
curl -v -X POST $YOUR_LOGSCALE_URL/api/v1/ingest/humio-unstructured -H "Authorization: Bearer $INGEST_TOKEN" -H "Content-Type: application/json" -d '[02/Nov/2017:13:48:26 +0000] Daemon started'
Windows Cmd and curl
cmd
curl -v -X POST $YOUR_LOGSCALE_URL/api/v1/ingest/humio-unstructured ^
    -H "Authorization: Bearer $INGEST_TOKEN" ^
    -H "Content-Type: application/json" ^
    -d "[02/Nov/2017:13:48:26 +0000] Daemon started"
Windows Powershell and curl
powershell
curl.exe -X POST `
    -H "Authorization: Bearer $INGEST_TOKEN" `
    -H "Content-Type: application/json" `
    -d '[02/Nov/2017:13:48:26 +0000] Daemon started' `
    "$YOUR_LOGSCALE_URL/api/v1/ingest/humio-unstructured"
Perl
perl
#!/usr/bin/perl

use HTTP::Request;
use LWP;
my $TOKEN = "TOKEN";
my $uri = '$YOUR_LOGSCALE_URL/api/v1/ingest/humio-unstructured';
my $json = '[02/Nov/2017:13:48:26 +0000] Daemon started';
my $req = HTTP::Request->new("POST", $uri );
$req->header("Authorization" => "Bearer $TOKEN");
$req->header("Content-Type" => "application/json");
$req->content( $json );
my $lwp = LWP::UserAgent->new;
my $result = $lwp->request( $req );
print $result->content, "\n";
Python
python
#! /usr/local/bin/python3

import requests

url = '$YOUR_LOGSCALE_URL/api/v1/ingest/humio-unstructured'
mydata = r'''[02/Nov/2017:13:48:26 +0000] Daemon started'''

resp = requests.post(url,
                     data = mydata,
                     headers = {
   "Authorization" : "Bearer $INGEST_TOKEN",
   "Content-Type" : "application/json"
}
)

print(resp.text)
Node.js
javascript
const https = require('https');

const data = '[02/Nov/2017:13:48:26 +0000] Daemon started';

const options = {
  hostname: '$YOUR_LOGSCALE_URL',
  path: '/api/v1/ingest/humio-unstructured',
  port: 443,
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'Content-Length': data.length,
    Authorization: 'Bearer ' + process.env.INGEST_TOKEN,
    'User-Agent': 'Node',
  },
};

const req = https.request(options, (res) => {
  let data = '';
  console.log(`statusCode: ${res.statusCode}`);

  res.on('data', (d) => {
    data += d;
  });
  res.on('end', () => {
    console.log(data);
  });
});

req.on('error', (error) => {
  console.error(error);
});

req.write(data);
req.end();

Ingesting four lines of unstructured data:

Mac OS or Linux (curl)
shell
curl -v -X POST $YOUR_LOGSCALE_URL/api/v1/ingest/humio-unstructured \
    -H "Authorization: Bearer $INGEST_TOKEN" \
    -H "Content-Type: application/json" \
    -d @- << EOF
[
  {
    "fields": {
      "host": "webhost1"
    },
    "messages": [
       "192.168.1.21 - user1 [02/Nov/2017:13:48:26 +0000] \"POST /humio/api/v1/ingest/elastic-bulk HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.015 664 0.015",
       "192.168.1.49 - user1 [02/Nov/2017:13:48:33 +0000] \"POST /humio/api/v1/ingest/elastic-bulk HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.014 657 0.014",
       "192.168.1.21 - user2 [02/Nov/2017:13:49:09 +0000] \"POST /humio/api/v1/repositories/humio HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.013 565 0.013",
       "192.168.1.54 - user1 [02/Nov/2017:13:49:10 +0000] \"POST /humio/api/v1/repositories/humio/queryjobs HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.015 650 0.015"
    ]
  }
]
EOF
Mac OS or Linux (curl) One-line
shell
curl -v -X POST $YOUR_LOGSCALE_URL/api/v1/ingest/humio-unstructured -H "Authorization: Bearer $INGEST_TOKEN" -H "Content-Type: application/json" -d '[{"fields":{"host":"webhost1"},"messages":["192.168.1.21 - user1 [02/Nov/2017:13:48:26 +0000] \"POST /humio/api/v1/ingest/elastic-bulk HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.015 664 0.015","192.168.1.49 - user1 [02/Nov/2017:13:48:33 +0000] \"POST /humio/api/v1/ingest/elastic-bulk HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.014 657 0.014","192.168.1.21 - user2 [02/Nov/2017:13:49:09 +0000] \"POST /humio/api/v1/repositories/humio HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.013 565 0.013","192.168.1.54 - user1 [02/Nov/2017:13:49:10 +0000] \"POST /humio/api/v1/repositories/humio/queryjobs HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.015 650 0.015"]}]'
Windows Cmd and curl
cmd
REM Save the JSON body shown above as payload.json, then:
curl -v -X POST $YOUR_LOGSCALE_URL/api/v1/ingest/humio-unstructured ^
    -H "Authorization: Bearer $INGEST_TOKEN" ^
    -H "Content-Type: application/json" ^
    -d @payload.json
Windows Powershell and curl
powershell
$json = @'
[
  {
    "fields": {
      "host": "webhost1"
    },
    "messages": [
       "192.168.1.21 - user1 [02/Nov/2017:13:48:26 +0000] \"POST /humio/api/v1/ingest/elastic-bulk HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.015 664 0.015",
       "192.168.1.49 - user1 [02/Nov/2017:13:48:33 +0000] \"POST /humio/api/v1/ingest/elastic-bulk HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.014 657 0.014",
       "192.168.1.21 - user2 [02/Nov/2017:13:49:09 +0000] \"POST /humio/api/v1/repositories/humio HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.013 565 0.013",
       "192.168.1.54 - user1 [02/Nov/2017:13:49:10 +0000] \"POST /humio/api/v1/repositories/humio/queryjobs HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.015 650 0.015"
    ]
  }
]
'@
curl.exe -X POST `
    -H "Authorization: Bearer $INGEST_TOKEN" `
    -H "Content-Type: application/json" `
    -d $json `
    "$YOUR_LOGSCALE_URL/api/v1/ingest/humio-unstructured"
Perl
perl
#!/usr/bin/perl

use HTTP::Request;
use LWP;
my $TOKEN = "TOKEN";
my $uri = '$YOUR_LOGSCALE_URL/api/v1/ingest/humio-unstructured';
my $json = '[
  {
    "fields": {
      "host": "webhost1"
    },
    "messages": [
       "192.168.1.21 - user1 [02/Nov/2017:13:48:26 +0000] \"POST /humio/api/v1/ingest/elastic-bulk HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.015 664 0.015",
       "192.168.1.49 - user1 [02/Nov/2017:13:48:33 +0000] \"POST /humio/api/v1/ingest/elastic-bulk HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.014 657 0.014",
       "192.168.1.21 - user2 [02/Nov/2017:13:49:09 +0000] \"POST /humio/api/v1/repositories/humio HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.013 565 0.013",
       "192.168.1.54 - user1 [02/Nov/2017:13:49:10 +0000] \"POST /humio/api/v1/repositories/humio/queryjobs HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.015 650 0.015"
    ]
  }
]';
my $req = HTTP::Request->new("POST", $uri );
$req->header("Authorization" => "Bearer $TOKEN");
$req->header("Content-Type" => "application/json");
$req->content( $json );
my $lwp = LWP::UserAgent->new;
my $result = $lwp->request( $req );
print $result->content, "\n";
Python
python
#! /usr/local/bin/python3

import requests

url = '$YOUR_LOGSCALE_URL/api/v1/ingest/humio-unstructured'
mydata = r'''[
  {
    "fields": {
      "host": "webhost1"
    },
    "messages": [
       "192.168.1.21 - user1 [02/Nov/2017:13:48:26 +0000] \"POST /humio/api/v1/ingest/elastic-bulk HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.015 664 0.015",
       "192.168.1.49 - user1 [02/Nov/2017:13:48:33 +0000] \"POST /humio/api/v1/ingest/elastic-bulk HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.014 657 0.014",
       "192.168.1.21 - user2 [02/Nov/2017:13:49:09 +0000] \"POST /humio/api/v1/repositories/humio HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.013 565 0.013",
       "192.168.1.54 - user1 [02/Nov/2017:13:49:10 +0000] \"POST /humio/api/v1/repositories/humio/queryjobs HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.015 650 0.015"
    ]
  }
]'''

resp = requests.post(url,
                     data = mydata,
                     headers = {
   "Authorization" : "Bearer $INGEST_TOKEN",
   "Content-Type" : "application/json"
}
)

print(resp.text)
Node.js
javascript
const https = require('https');

const data = JSON.stringify(
    [
  {
    "fields": {
      "host": "webhost1"
    },
    "messages": [
       "192.168.1.21 - user1 [02/Nov/2017:13:48:26 +0000] \"POST /humio/api/v1/ingest/elastic-bulk HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.015 664 0.015",
       "192.168.1.49 - user1 [02/Nov/2017:13:48:33 +0000] \"POST /humio/api/v1/ingest/elastic-bulk HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.014 657 0.014",
       "192.168.1.21 - user2 [02/Nov/2017:13:49:09 +0000] \"POST /humio/api/v1/repositories/humio HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.013 565 0.013",
       "192.168.1.54 - user1 [02/Nov/2017:13:49:10 +0000] \"POST /humio/api/v1/repositories/humio/queryjobs HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.015 650 0.015"
    ]
  }
]
);


const options = {
  hostname: '$YOUR_LOGSCALE_URL',
  path: '/api/v1/ingest/humio-unstructured',
  port: 443,
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'Content-Length': data.length,
    Authorization: 'Bearer ' + process.env.INGEST_TOKEN,
    'User-Agent': 'Node',
  },
};

const req = https.request(options, (res) => {
  let data = '';
  console.log(`statusCode: ${res.statusCode}`);

  res.on('data', (d) => {
    data += d;
  });
  res.on('end', () => {
    console.log(data);
  });
});

req.on('error', (error) => {
  console.error(error);
});

req.write(data);
req.end();

Ingest Structured Data

This API should be used when data is already structured. An extra parsing step is possible by attaching a parser to the ingest token used.

http
POST /api/v1/ingest/humio-structured

The following example request contains two events. Both these events share the same tags:

json
[
  {
    "tags": {
      "host": "server1",
      "source": "application.log"
    },
    "events": [
      {
        "timestamp": "2016-06-06T12:00:00+02:00",
        "attributes": {
          "key1": "value1",
          "key2": "value2"
        }
      },
      {
        "timestamp": "2016-06-06T12:00:01+02:00",
        "attributes": {
          "key1": "value1"
        }
      }
    ]
  }
]

You can also batch events with different tags into the same request as shown in the following example.

This request contains three events. The first two are tagged with server1 and the third is tagged with server2:

json
[
  {
    "tags": {
      "host": "server1",
      "source": "application.log"
    },
    "events": [
      {
        "timestamp": "2016-06-06T13:00:00+02:00",
        "attributes": {
          "hello": "world"
        }
      },
      {
        "timestamp": "2016-06-06T13:00:01+02:00",
        "attributes": {
          "statuscode": "200",
          "url": "/index.html"
        }
      }
    ]
  },
  {
    "tags": {
      "host": "server2",
      "source": "application.log"
    },
    "events": [
      {
        "timestamp": "2016-06-06T13:00:02+02:00",
        "attributes": {
          "key1": "value1"
        }
      }
    ]
  }
]

Tags

Tags are key-value pairs.

Events are stored in data sources. A repository has a set of data sources, and data sources are defined by their tags. An event is stored in the data source matching its tags; if no data source with the exact tags exists, it is created. Tags are used to optimize searches by filtering out unwanted events. At least one tag must be specified. See the Event Tags documentation for more information.
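Conceptually, events with the exact same tag set share a data source. This hypothetical helper sketches that grouping rule (it is not LogScale code):

```python
from collections import defaultdict

def group_by_tags(events):
    """Group (tags, event) pairs into 'data sources' keyed by the exact tag set."""
    datasources = defaultdict(list)
    for tags, event in events:
        # The key is the full, exact set of tag key-value pairs.
        datasources[tuple(sorted(tags.items()))].append(event)
    return datasources
```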

Events

When sending an Event, you can set the following standard fields:

Name Required Description
timestamp yes You can specify the timestamp in two formats. You can specify a number that sets the time in milliseconds (Unix time). The number must be in UTC time, not local time. Alternatively, you can set the timestamp as an ISO 8601 formatted string, for example, yyyy-MM-dd'T'HH:mm:ss.SSSZ.
timezone no The timezone is only required if you specify the timestamp in milliseconds. The timezone specifies the local timezone for the event. Note that you must still specify the timestamp in UTC time.
attributes no A JSON object representing key-value pairs for the Event. These key-value pairs add structure to Events, making them easier to search. Attributes can be nested JSON objects; however, we recommend limiting the amount of nesting.
rawstring no The raw string representing the Event. The default display for an Event in LogScale is the rawstring. If you do not provide the rawstring field, it defaults to a JSON representation of the attributes field.

Below are some examples of events:

json
{
  "timestamp": "2016-06-06T12:00:00+02:00",
  "attributes": {
    "key1": "value1",
    "key2": "value2"
  }
}
json
{
  "timestamp": 1466105321,
  "attributes": {
    "service": "coordinator"
  },
  "rawstring": "starting service coordinator"
}
json
{
  "timestamp": 1466105321,
  "timezone": "Europe/Copenhagen",
  "attributes": {
    "service": "coordinator"
  },
  "rawstring": "starting service coordinator"
}
json
{
  "timestamp": "2016-06-06T12:00:01+02:00",
  "rawstring": "starting service=coordinator transactionid=42"
}

Example

macOS or Linux
shell
$ curl $YOUR_LOGSCALE_URL/api/v1/ingest/humio-structured \
     -X POST \
     -H "Content-Type: application/json" \
     -H "Authorization: Bearer $INGEST_TOKEN" \
     -d '[{"tags": {"host":"myserver"}, "events" : [{"timestamp": "2016-06-06T12:00:00+02:00", "attributes": {"key1":"value1"}}]}]'
Windows
powershell
C:\> curl $YOUR_LOGSCALE_URL/api/v1/ingest/humio-structured ^
     -X POST ^
     -H "Content-Type: application/json" ^
     -H "Authorization: Bearer $INGEST_TOKEN" ^
     -d '[{"tags": {"host":"myserver"}, "events" : [{"timestamp": "2016-06-06T12:00:00+02:00", "attributes": {"key1":"value1"}}]}]'

Ingest Raw Data

This endpoint should be used when you are not in control of the request body, such as when LogScale is called via a callback from another system.

http
POST /api/v1/ingest/raw

The body of the HTTP request is interpreted as a single event and parsed using the parser attached to the accompanying ingest token. Unless the parser generates a @timestamp field, the @timestamp of the resulting event will equal @ingesttimestamp.

Note

This endpoint is not suited for ingesting a large number of events, and its usage should be restricted to relatively infrequent calls.

Example

When ingesting raw data, you can choose to authenticate by attaching your ingest token to the header:

macOS or Linux
shell
$ curl $YOUR_LOGSCALE_URL/api/v1/ingest/raw \
     -X POST \
     -H "Authorization: Bearer $INGEST_TOKEN" \
     -d 'My raw Message generated at "2016-06-06T12:00:00+02:00"'
Windows
powershell
C:\> curl $YOUR_LOGSCALE_URL/api/v1/ingest/raw ^
     -X POST ^
     -H "Authorization: Bearer $INGEST_TOKEN" ^
     -d 'My raw Message generated at "2016-06-06T12:00:00+02:00"'

Or by adding it to the URL as part of the path:

macOS or Linux
shell
$ curl $YOUR_LOGSCALE_URL/api/v1/ingest/raw/$INGEST_TOKEN \
     -X POST \
     -d 'My raw Message generated at "2016-06-06T12:00:00+02:00"'
Windows
powershell
C:\> curl $YOUR_LOGSCALE_URL/api/v1/ingest/raw/$INGEST_TOKEN ^
     -X POST ^
     -d 'My raw Message generated at "2016-06-06T12:00:00+02:00"'

For this second option, be aware that the URL should not be considered secret, as it may be logged in LogScale or by proxy servers that process the request. We therefore recommend authenticating through the header when possible.

Ingest Raw JSON Data

Like the raw endpoint, this endpoint should be used when you are not in control of the message body, but the request body is interpreted as JSON instead.

http
POST /api/v1/ingest/json

If the received payload is a JSON object, the entire object is treated as a single event. If the received payload is a JSON array, each item in the array is treated as an event.

This is useful for receiving webhooks that can only send JSON arrays of multiple events.

It is especially useful if you have long arrays where the total number of fields exceeds 1000, which is our internal limit (events are usually objects, not strings). When this happens you cannot use the raw endpoint with parseJson() and split(), because there are simply too many fields.

As with the raw endpoint, @ingesttimestamp will be assigned to the events as @timestamp if the parser associated with the ingest token does not assign one.
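The object-versus-array rule described above can be sketched as follows (the helper name is illustrative, not part of the API):

```python
import json

def to_event_list(payload):
    """Mirror the endpoint's rule: a JSON object is one event,
    a JSON array is one event per item."""
    parsed = json.loads(payload)
    return parsed if isinstance(parsed, list) else [parsed]
```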

Example

macOS or Linux
shell
$ curl $YOUR_LOGSCALE_URL/api/v1/ingest/json \
     -X POST \
     -H "Authorization: Bearer $INGEST_TOKEN" \
     -d '["first event", "second event"]'
Windows
powershell
C:\> curl $YOUR_LOGSCALE_URL/api/v1/ingest/json ^
     -X POST ^
     -H "Authorization: Bearer $INGEST_TOKEN" ^
     -d '["first event", "second event"]'