Ingesting Unstructured Data
This endpoint should be used when you have unstructured logs in a text format and you want to give those logs structure to enable easier and more efficient searching in LogScale. Examples of sources of unstructured data include:
System error logs: Logs generated by applications, operating systems, or services when errors occur. These might not follow a structured format and can vary in content depending on the error. Monitoring system error logs enables you to track and identify common application errors, server crashes, or misconfigurations that result in error messages. You can search for specific keywords or patterns in these logs to diagnose issues.
Web server logs: Logs generated by web servers such as Apache or Nginx, which may not follow a strict format beyond basic fields such as timestamps and messages. Monitoring these logs lets you track web traffic, response times, and HTTP status codes for performance monitoring, or troubleshoot issues such as 404 errors or slow response times.
User activity logs: Logs generated when users perform various actions within an application or service. These logs are often freeform, reflecting user behaviour in different ways depending on context. Monitoring them lets you track user interactions, login attempts, and activity within the system for auditing purposes, user behaviour analysis, or identifying suspicious actions.
Further examples of unstructured log data include network traffic logs, application debug logs, security incident logs, transaction logs, and email delivery logs.
Queries over unstructured data are more efficient if you parse the data into fields in Falcon LogScale. To do this, you need a suitable parser: you can use a built-in parser, a Marketplace parser, or create a custom parser. You then need to associate this parser with the Ingest Token your client uses to submit data to LogScale; this can be done when you create the token.
You also need to structure the message posted to LogScale in a way that supports parsing and adds useful metadata. For example, if you were sending four Apache access log lines to LogScale, you would use a structure similar to the following:
[
  {
    "fields": {
      "host": "webhost1"
    },
    "messages": [
      "192.168.1.21 - user1 [02/Nov/2017:13:48:26 +0000] \"POST /humio/api/v1/ingest/elastic-bulk HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.015 664 0.015",
      "192.168.1.49 - user1 [02/Nov/2017:13:48:33 +0000] \"POST /humio/api/v1/ingest/elastic-bulk HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.014 657 0.014",
      "192.168.1.21 - user2 [02/Nov/2017:13:49:09 +0000] \"POST /humio/api/v1/ingest/elastic-bulk HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.013 565 0.013",
      "192.168.1.54 - user1 [02/Nov/2017:13:49:10 +0000] \"POST /humio/api/v1/ingest/elastic-bulk HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.015 650 0.015"
    ]
  }
]
In this case you need to associate an accesslog parser with the Ingest Tokens you are using to authenticate this client. The accesslog parser will then parse the lines into suitable fields, which makes querying more efficient and easier, as you can base your queries on fields rather than having to use complex regular expressions to extract the required data. See Parsing Data for further details.
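For example, once the lines are parsed, a search for failed requests can filter on a field directly instead of matching the raw text. The following is only a sketch: the field names statuscode and uripath are assumptions, and the actual names depend on the parser you use:

statuscode = 404
| groupBy(uripath)

Without the parser, the same search would need a regular expression over @rawstring to locate the status code and extract the path.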
The fields object is used to specify fields that should be added to each of the events when they are parsed. In the example, all the access log events will have a host field assigned, indicating that the events came from webhost1.
The messages array contains the individual event messages to be sent to LogScale for ingestion. The format of these is described in the section Events.
It is possible to send events of different types in the same request. That is done by adding a new element to the outer array in the previous example.
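For example, the following sketch sends two groups of events in a single request; the first group reuses an access log line from the example above, while the second host name (webhost2) and its message are assumptions added for illustration:

[
  {
    "fields": {
      "host": "webhost1"
    },
    "messages": [
      "192.168.1.21 - user1 [02/Nov/2017:13:48:26 +0000] \"POST /humio/api/v1/ingest/elastic-bulk HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.015 664 0.015"
    ]
  },
  {
    "fields": {
      "host": "webhost2"
    },
    "messages": [
      "[02/Nov/2017:13:49:12 +0000] Daemon started"
    ]
  }
]

Each element of the outer array is parsed and annotated independently.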
Events
When sending events, you can set the following fields in the request payload:
Name | Required | Description |
---|---|---|
messages | yes | The raw strings representing the events. Each string (line) is stored as the @rawstring field in LogScale, and is parsed by the parser if one is specified. |
type | no | The parser LogScale uses to parse the events if no parser is attached to the ingest token. The parser name is stored in the #type tag. |
fields | no | Annotate each of the messages with metadata using these key-value pairs. Values must be strings. These are translated to user fields. |
tags | no | Annotate each of the messages with metadata using these key-value pairs as tags. Please see the documentation on tags before using them. |
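As an illustration of how these fields fit together, here is a sketch of a payload that sets all four. The parser name accesslog matches the example earlier in this section; the tag key and value (source: apache) are assumptions chosen for illustration:

[
  {
    "type": "accesslog",
    "fields": {
      "host": "webhost1"
    },
    "tags": {
      "source": "apache"
    },
    "messages": [
      "192.168.1.21 - user1 [02/Nov/2017:13:48:26 +0000] \"POST /humio/api/v1/ingest/elastic-bulk HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.015 664 0.015"
    ]
  }
]

Here type is only used if the ingest token has no parser attached, fields become user fields on every event in messages, and tags become tagged fields (prefixed with #) on those events.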
Examples
Ingesting a single string of data, wrapped in the minimal JSON structure the endpoint expects:
curl -v -X POST $YOUR_LOGSCALE_URL/api/v1/ingest/humio-unstructured \
-H "Authorization: Bearer $INGEST_TOKEN" \
-H "Content-Type: application/json" \
-d '[{"messages": ["[02/Nov/2017:13:48:26 +0000] Daemon started"]}]'
curl -v -X POST $YOUR_LOGSCALE_URL/api/v1/ingest/humio-unstructured ^
-H "Authorization: Bearer $INGEST_TOKEN" ^
-H "Content-Type: application/json" ^
-d "[{\"messages\": [\"[02/Nov/2017:13:48:26 +0000] Daemon started\"]}]"
curl.exe -X POST `
-H "Authorization: Bearer $INGEST_TOKEN" `
-H "Content-Type: application/json" `
-d '[{"messages": ["[02/Nov/2017:13:48:26 +0000] Daemon started"]}]' `
"$YOUR_LOGSCALE_URL/api/v1/ingest/humio-unstructured"
#!/usr/bin/perl
use HTTP::Request;
use LWP;

# Replace with your ingest token and LogScale URL
my $INGEST_TOKEN = "TOKEN";
my $uri = '$YOUR_LOGSCALE_URL/api/v1/ingest/humio-unstructured';

# The body is a JSON array of event objects, each with a "messages" array
my $json = '[{"messages": ["[02/Nov/2017:13:48:26 +0000] Daemon started"]}]';

my $req = HTTP::Request->new("POST", $uri);
$req->header("Authorization" => "Bearer $INGEST_TOKEN");
$req->header("Content-Type" => "application/json");
$req->content($json);

my $lwp = LWP::UserAgent->new;
my $result = $lwp->request($req);
print $result->content, "\n";
#! /usr/local/bin/python3
import requests

# Replace with your LogScale URL and ingest token
url = '$YOUR_LOGSCALE_URL/api/v1/ingest/humio-unstructured'

# The body is a JSON array of event objects, each with a "messages" array
mydata = r'''[{"messages": ["[02/Nov/2017:13:48:26 +0000] Daemon started"]}]'''

resp = requests.post(url,
                     data=mydata,
                     headers={
                         "Authorization": "Bearer $INGEST_TOKEN",
                         "Content-Type": "application/json"
                     })
print(resp.text)
const https = require('https');

// The body is a JSON array of event objects, each with a "messages" array
const data = JSON.stringify([
  { messages: ['[02/Nov/2017:13:48:26 +0000] Daemon started'] },
]);

const options = {
  // Replace with the hostname of your LogScale instance
  hostname: '$YOUR_LOGSCALE_URL',
  path: '/api/v1/ingest/humio-unstructured',
  port: 443,
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'Content-Length': Buffer.byteLength(data),
    // Set the INGEST_TOKEN environment variable to your ingest token
    Authorization: 'Bearer ' + process.env.INGEST_TOKEN,
    'User-Agent': 'Node',
  },
};

const req = https.request(options, (res) => {
  let body = '';
  console.log(`statusCode: ${res.statusCode}`);
  res.on('data', (d) => {
    body += d;
  });
  res.on('end', () => {
    console.log(body);
  });
});

req.on('error', (error) => {
  console.error(error);
});

req.write(data);
req.end();
Ingesting four lines of unstructured data:
curl -v -X POST $YOUR_LOGSCALE_URL/api/v1/ingest/humio-unstructured \
-H "Authorization: Bearer $INGEST_TOKEN" \
-H "Content-Type: application/json" \
-d @- << EOF
[
{
"fields": {
"host": "webhost1"
},
"messages": [
"192.168.1.21 - user1 [02/Nov/2017:13:48:26 +0000] \"POST /humio/api/v1/ingest/elastic-bulk HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.015 664 0.015",
"192.168.1.49 - user1 [02/Nov/2017:13:48:33 +0000] \"POST /humio/api/v1/ingest/elastic-bulk HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.014 657 0.014",
"192.168.1.21 - user2 [02/Nov/2017:13:49:09 +0000] \"POST /humio/api/v1/repositories/humio HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.013 565 0.013",
"192.168.1.54 - user1 [02/Nov/2017:13:49:10 +0000] \"POST /humio/api/v1/repositories/humio/queryjobs HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.015 650 0.015"
]
}
]
EOF
curl -v -X POST $YOUR_LOGSCALE_URL/api/v1/ingest/humio-unstructured ^
-H "Authorization: Bearer $INGEST_TOKEN" ^
-H "Content-Type: application/json" ^
-d '[ ^
{ ^
"fields": { ^
"host": "webhost1" ^
}, ^
"messages": [ ^
"192.168.1.21 - user1 [02/Nov/2017:13:48:26 +0000] \"POST /humio/api/v1/ingest/elastic-bulk HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.015 664 0.015", ^
"192.168.1.49 - user1 [02/Nov/2017:13:48:33 +0000] \"POST /humio/api/v1/ingest/elastic-bulk HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.014 657 0.014", ^
"192.168.1.21 - user2 [02/Nov/2017:13:49:09 +0000] \"POST /humio/api/v1/repositories/humio HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.013 565 0.013", ^
"192.168.1.54 - user1 [02/Nov/2017:13:49:10 +0000] \"POST /humio/api/v1/repositories/humio/queryjobs HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.015 650 0.015" ^
] ^
} ^
] '
curl.exe -X POST `
-H "Authorization: Bearer $INGEST_TOKEN" `
-H "Content-Type: application/json" `
-d '[
{
"fields": {
"host": "webhost1"
},
"messages": [
"192.168.1.21 - user1 [02/Nov/2017:13:48:26 +0000] \"POST /humio/api/v1/ingest/elastic-bulk HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.015 664 0.015",
"192.168.1.49 - user1 [02/Nov/2017:13:48:33 +0000] \"POST /humio/api/v1/ingest/elastic-bulk HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.014 657 0.014",
"192.168.1.21 - user2 [02/Nov/2017:13:49:09 +0000] \"POST /humio/api/v1/repositories/humio HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.013 565 0.013",
"192.168.1.54 - user1 [02/Nov/2017:13:49:10 +0000] \"POST /humio/api/v1/repositories/humio/queryjobs HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.015 650 0.015"
]
}
]' `
"$YOUR_LOGSCALE_URL/api/v1/ingest/humio-unstructured"
#!/usr/bin/perl
use HTTP::Request;
use LWP;
# Replace with your ingest token and LogScale URL
my $INGEST_TOKEN = "TOKEN";
my $uri = '$YOUR_LOGSCALE_URL/api/v1/ingest/humio-unstructured';
my $json = '[
{
"fields": {
"host": "webhost1"
},
"messages": [
"192.168.1.21 - user1 [02/Nov/2017:13:48:26 +0000] \"POST /humio/api/v1/ingest/elastic-bulk HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.015 664 0.015",
"192.168.1.49 - user1 [02/Nov/2017:13:48:33 +0000] \"POST /humio/api/v1/ingest/elastic-bulk HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.014 657 0.014",
"192.168.1.21 - user2 [02/Nov/2017:13:49:09 +0000] \"POST /humio/api/v1/repositories/humio HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.013 565 0.013",
"192.168.1.54 - user1 [02/Nov/2017:13:49:10 +0000] \"POST /humio/api/v1/repositories/humio/queryjobs HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.015 650 0.015"
]
}
]';
my $req = HTTP::Request->new("POST", $uri );
$req->header("Authorization" => "Bearer $INGEST_TOKEN");
$req->header("Content-Type" => "application/json");
$req->content( $json );
my $lwp = LWP::UserAgent->new;
my $result = $lwp->request( $req );
print $result->content, "\n";
#! /usr/local/bin/python3
import requests

# Replace with your LogScale URL and ingest token
url = '$YOUR_LOGSCALE_URL/api/v1/ingest/humio-unstructured'
mydata = r'''[
{
"fields": {
"host": "webhost1"
},
"messages": [
"192.168.1.21 - user1 [02/Nov/2017:13:48:26 +0000] \"POST /humio/api/v1/ingest/elastic-bulk HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.015 664 0.015",
"192.168.1.49 - user1 [02/Nov/2017:13:48:33 +0000] \"POST /humio/api/v1/ingest/elastic-bulk HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.014 657 0.014",
"192.168.1.21 - user2 [02/Nov/2017:13:49:09 +0000] \"POST /humio/api/v1/repositories/humio HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.013 565 0.013",
"192.168.1.54 - user1 [02/Nov/2017:13:49:10 +0000] \"POST /humio/api/v1/repositories/humio/queryjobs HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.015 650 0.015"
]
}
]'''
resp = requests.post(url,
                     data=mydata,
                     headers={
                         "Authorization": "Bearer $INGEST_TOKEN",
                         "Content-Type": "application/json"
                     })
print(resp.text)
const https = require('https');
const data = JSON.stringify(
[
{
"fields": {
"host": "webhost1"
},
"messages": [
"192.168.1.21 - user1 [02/Nov/2017:13:48:26 +0000] \"POST /humio/api/v1/ingest/elastic-bulk HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.015 664 0.015",
"192.168.1.49 - user1 [02/Nov/2017:13:48:33 +0000] \"POST /humio/api/v1/ingest/elastic-bulk HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.014 657 0.014",
"192.168.1.21 - user2 [02/Nov/2017:13:49:09 +0000] \"POST /humio/api/v1/repositories/humio HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.013 565 0.013",
"192.168.1.54 - user1 [02/Nov/2017:13:49:10 +0000] \"POST /humio/api/v1/repositories/humio/queryjobs HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.015 650 0.015"
]
}
]
);
const options = {
  // Replace with the hostname of your LogScale instance
  hostname: '$YOUR_LOGSCALE_URL',
  path: '/api/v1/ingest/humio-unstructured',
  port: 443,
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'Content-Length': Buffer.byteLength(data),
    // Set the INGEST_TOKEN environment variable to your ingest token
    Authorization: 'Bearer ' + process.env.INGEST_TOKEN,
    'User-Agent': 'Node',
  },
};

const req = https.request(options, (res) => {
  let body = '';
  console.log(`statusCode: ${res.statusCode}`);
  res.on('data', (d) => {
    body += d;
  });
  res.on('end', () => {
    console.log(body);
  });
});

req.on('error', (error) => {
  console.error(error);
});

req.write(data);
req.end();