Ingest API
There are different ways of getting data into LogScale. This page shows how to send data using the HTTP API.
The API has two endpoints: one for sending data that needs to be parsed using a specified parser, and one for sending data that is already structured. When parsing text logs, such as syslog, access logs, or application logs, you typically use the endpoint where a parser is specified.
Ingest via API Best Practices
When sending POST requests with logs for ingestion, we recommend batching logs together in single requests, as sending one request per log message does not scale well.
A good starting strategy is to batch log messages in five-second windows and send all log messages from that window in one request.
However, requests can also grow too large. We recommend that ingest requests contain no more than 5,000 events and take up no more than 5 MB of space (uncompressed). If your requests grow larger than this within your batching window, break the logs into multiple requests.
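The batching strategy above can be sketched in client code. The following is a minimal illustration, not part of the LogScale API: it splits a window's worth of accumulated log lines into request-sized batches that respect both the event-count and uncompressed-size limits.

```python
# Client-side batching sketch: split accumulated log lines so that no
# single ingest request exceeds 5000 events or 5 MB uncompressed.

MAX_EVENTS = 5000
MAX_BYTES = 5 * 1024 * 1024  # 5 MB, uncompressed

def split_into_requests(messages):
    """Split a list of log lines into request-sized batches."""
    batches, current, current_bytes = [], [], 0
    for msg in messages:
        size = len(msg.encode("utf-8"))
        # Start a new batch if adding this message would exceed either limit.
        if current and (len(current) >= MAX_EVENTS or current_bytes + size > MAX_BYTES):
            batches.append(current)
            current, current_bytes = [], 0
        current.append(msg)
        current_bytes += size
    if current:
        batches.append(current)
    return batches
```

Each resulting batch would then be sent as one POST request.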
Ingest Unstructured Data Using a Parser
This endpoint should be used when you have unstructured logs in some text format, and you want to give those logs structure to enable easier searching in LogScale. You can either use a built-in parser or create a custom parser.
Note
Another option is to use the Falcon LogScale Collector.
POST /api/v1/ingest/humio-unstructured
Example sending four access log lines to LogScale:
[
  {
    "fields": {
      "host": "webhost1"
    },
    "messages": [
      "192.168.1.21 - user1 [02/Nov/2017:13:48:26 +0000] \"POST /humio/api/v1/ingest/elastic-bulk HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.015 664 0.015",
      "192.168.1.49 - user1 [02/Nov/2017:13:48:33 +0000] \"POST /humio/api/v1/ingest/elastic-bulk HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.014 657 0.014",
      "192.168.1.21 - user2 [02/Nov/2017:13:49:09 +0000] \"POST /humio/api/v1/ingest/elastic-bulk HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.013 565 0.013",
      "192.168.1.54 - user1 [02/Nov/2017:13:49:10 +0000] \"POST /humio/api/v1/ingest/elastic-bulk HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.015 650 0.015"
    ]
  }
]
The example above sends four access log lines to LogScale. In this case, an access log parser is attached to the ingest token being used. See Parsers for details.
The fields section is used to specify fields that should be added to each of the events when they are parsed. In the example, all the access log events will get a host field indicating that the events came from webhost1. It is possible to send events of different types in the same request: each type is added as a new element of the outer array in the example above.
Tags can be specified through the parser.
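To illustrate sending events of different types in one request, the sketch below (with hypothetical host names and log lines) builds a payload whose outer array has one element per event type, each with its own fields:

```python
import json

# Hypothetical example: one request carrying two groups of unstructured
# events, each with its own "fields" added at parse time.
payload = [
    {
        "fields": {"host": "webhost1"},
        "messages": [
            '192.168.1.21 - user1 [02/Nov/2017:13:48:26 +0000] "GET / HTTP/1.1" 200 0',
        ],
    },
    {
        "fields": {"host": "dbhost1"},
        "messages": [
            "2017-11-02T13:48:30Z connection pool exhausted, retrying",
        ],
    },
]

# Request body for POST /api/v1/ingest/humio-unstructured
body = json.dumps(payload)
```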
Events
When sending events, you can set the following standard fields:
Name | Required | Description
---|---|---
messages | yes | The raw strings representing the events. Each string will be parsed by the parser.
type | no | The parser to use if no parser is attached to the ingest token.
fields | no | Annotate each of the messages with these key-value pairs. Values must be strings.
tags | no | Annotate each of the messages with these key-value pairs as tags. Please see the documentation on tags before using them.
Examples
The previous example as a curl command:
curl -v -X POST localhost:8080/api/v1/ingest/humio-unstructured \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $INGEST_TOKEN" \
  -d @- << EOF
[
  {
    "fields": {
      "host": "webhost1"
    },
    "messages": [
      "192.168.1.21 - user1 [02/Nov/2017:13:48:26 +0000] \"POST /humio/api/v1/ingest/elastic-bulk HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.015 664 0.015",
      "192.168.1.49 - user1 [02/Nov/2017:13:48:33 +0000] \"POST /humio/api/v1/ingest/elastic-bulk HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.014 657 0.014",
      "192.168.1.21 - user2 [02/Nov/2017:13:49:09 +0000] \"POST /humio/api/v1/repositories/humio HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.013 565 0.013",
      "192.168.1.54 - user1 [02/Nov/2017:13:49:10 +0000] \"POST /humio/api/v1/repositories/humio/queryjobs HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.015 650 0.015"
    ]
  }
]
EOF
Ingest Structured Data
This API should be used when data is already structured. An extra parsing step is possible by attaching a parser to the ingest token used.
POST /api/v1/ingest/humio-structured
The following example request contains two events. Both these events share the same tags:
[
  {
    "tags": {
      "host": "server1",
      "source": "application.log"
    },
    "events": [
      {
        "timestamp": "2016-06-06T12:00:00+02:00",
        "attributes": {
          "key1": "value1",
          "key2": "value2"
        }
      },
      {
        "timestamp": "2016-06-06T12:00:01+02:00",
        "attributes": {
          "key1": "value1"
        }
      }
    ]
  }
]
You can also batch events with different tags into the same request, as shown in the following example. This request contains three events: the first two are tagged with server1 and the third is tagged with server2:
[
  {
    "tags": {
      "host": "server1",
      "source": "application.log"
    },
    "events": [
      {
        "timestamp": "2016-06-06T13:00:00+02:00",
        "attributes": {
          "hello": "world"
        }
      },
      {
        "timestamp": "2016-06-06T13:00:01+02:00",
        "attributes": {
          "statuscode": "200",
          "url": "/index.html"
        }
      }
    ]
  },
  {
    "tags": {
      "host": "server2",
      "source": "application.log"
    },
    "events": [
      {
        "timestamp": "2016-06-06T13:00:02+02:00",
        "attributes": {
          "key1": "value1"
        }
      }
    ]
  }
]
Tags
Tags are key-value pairs.
Events are stored in data sources. A repository has a set of data sources, and data sources are defined by their tags. An event is stored in the data source matching its tags; if no data source with that exact combination of tags exists, one is created. Tags are used to optimize searches by filtering out unwanted events. At least one tag must be specified. See the Event Tags documentation for more information.
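Because each distinct tag combination maps to one data source, a client that batches events can group them by tags before building the request body, so each element of the outer array corresponds to one data source. A minimal sketch, assuming a hypothetical `(tags, event)` pair shape for the client's in-memory events:

```python
from collections import defaultdict

def group_by_tags(tagged_events):
    """Group (tags_dict, event_dict) pairs by their tag combination.

    Returns a list shaped like the humio-structured request body:
    one {"tags": ..., "events": [...]} element per distinct tag set.
    """
    groups = defaultdict(list)
    for tags, event in tagged_events:
        # A sorted tuple of items is a hashable key for the tag set.
        groups[tuple(sorted(tags.items()))].append(event)
    return [
        {"tags": dict(key), "events": events}
        for key, events in groups.items()
    ]
```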
Events
When sending an Event, you can set the following standard fields:
Name | Required | Description
---|---|---
timestamp | yes | You can specify the timestamp in two formats: as a number giving the time in milliseconds (Unix time), which must be in UTC, not local time; or as an ISO 8601 formatted string, for example yyyy-MM-dd'T'HH:mm:ss.SSSZ.
timezone | no | The timezone is only required if you specify the timestamp in milliseconds. It specifies the local timezone for the event. Note that you must still specify the timestamp in UTC time.
attributes | no | A JSON object representing key-value pairs for the event. These key-value pairs add structure to events, making them easier to search. Attributes can be nested JSON objects; however, we recommend limiting the amount of nesting.
rawstring | no | The raw string representing the event. The default display for an event in LogScale is the rawstring. If you do not provide the rawstring field, the response defaults to a JSON representation of the attributes field.
Below are some examples of events:
{
  "timestamp": "2016-06-06T12:00:00+02:00",
  "attributes": {
    "key1": "value1",
    "key2": "value2"
  }
}
{
  "timestamp": 1466105321,
  "attributes": {
    "service": "coordinator"
  },
  "rawstring": "starting service coordinator"
}
{
  "timestamp": 1466105321,
  "timezone": "Europe/Copenhagen",
  "attributes": {
    "service": "coordinator"
  },
  "rawstring": "starting service coordinator"
}
{
  "timestamp": "2016-06-06T12:00:01+02:00",
  "rawstring": "starting service=coordinator transactionid=42"
}
Response
Standard HTTP response codes.
Example
curl $YOUR_LOGSCALE_URL/api/v1/ingest/humio-structured \
  -X POST \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $INGEST_TOKEN" \
  -d '[{"tags": {"host":"myserver"}, "events" : [{"timestamp": "2016-06-06T12:00:00+02:00", "attributes": {"key1":"value1"}}]}]'
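The same request can be built from code rather than curl. The sketch below uses only the Python standard library and assumes the base URL and ingest token are provided via the environment variables used elsewhere on this page (the fallback values are placeholders):

```python
import json
import os
import urllib.request

# Placeholder defaults: substitute your own LogScale URL and ingest token.
base_url = os.environ.get("YOUR_LOGSCALE_URL", "http://localhost:8080")
token = os.environ.get("INGEST_TOKEN", "dummy-token")

payload = [
    {
        "tags": {"host": "myserver"},
        "events": [
            {
                "timestamp": "2016-06-06T12:00:00+02:00",
                "attributes": {"key1": "value1"},
            }
        ],
    }
]

req = urllib.request.Request(
    base_url + "/api/v1/ingest/humio-structured",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Content-Type": "application/json",
        "Authorization": "Bearer " + token,
    },
    method="POST",
)
# urllib.request.urlopen(req) would send the request; a 2xx status
# indicates the events were accepted.
```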
Ingest Raw Data
This endpoint should be used when you are not in control of the request body, such as when LogScale is called via a callback from another system.
POST /api/v1/ingest/raw
The body of the HTTP request is interpreted as a single event and parsed using the parser attached to the accompanying ingest token. Unless the parser generates a @timestamp field, the @timestamp of the resulting event will equal @ingesttimestamp.
Note
This endpoint is not suited for ingesting a large number of events, and its usage should be restricted to relatively infrequent calls.
Response
Standard HTTP response codes.
Example
When ingesting raw data, you can choose to authenticate by attaching your ingest token to the header:
curl $YOUR_LOGSCALE_URL/api/v1/ingest/raw \
  -X POST \
  -H "Authorization: Bearer $INGEST_TOKEN" \
  -d 'My raw Message generated at "2016-06-06T12:00:00+02:00"'
Or by adding it to the URL as part of the path:
curl $YOUR_LOGSCALE_URL/api/v1/ingest/raw/$INGEST_TOKEN \
  -X POST \
  -d 'My raw Message generated at "2016-06-06T12:00:00+02:00"'
For this second option, be aware that the URL should not be considered secret, as it might be logged by LogScale or by proxy servers that process the request. We therefore recommend authenticating through the header if possible.