Ingesting Unstructured Data

Use this endpoint when you have unstructured logs in a text format and want to give them structure so that searching in LogScale is easier and more efficient. Examples of sources of unstructured data include:

  • System error logs: Logs generated by applications, operating systems, or services when errors occur. These might not follow a structured format and could vary in content depending on the error. Monitoring system error logs enables you to track and identify common application errors, server crashes, or misconfigurations that result in error messages. You can search for specific keywords or patterns in these logs to diagnose issues.

  • Web server logs: Logs generated by web servers such as Apache or Nginx, which may not follow a strict format beyond basic fields such as timestamps and messages. Monitoring these logs lets you track web traffic, response times, and HTTP status codes, either for performance monitoring or for troubleshooting issues such as 404 errors or slow response times.

  • User activity logs: Logs generated when users perform various actions within an application or service. These logs are often freeform, reflecting user behaviour in different ways depending on context. Monitoring these logs lets you track user interactions, login attempts, and activity within the system for auditing purposes, user behaviour analysis, or identifying suspicious actions.

Further examples of unstructured log data include network traffic logs, application debug logs, security incident logs, transaction logs, and email delivery logs.

With unstructured data, queries are more efficient if the data is parsed into fields in Falcon LogScale. To do this, you need a suitable parser: a built-in parser, a Marketplace parser, or a custom parser that you create. You then associate this parser with the Ingest Token your client uses to submit data to LogScale; this can be done when you create the token.

You also need to build the message posted to LogScale in a format that helps parse the data and add useful metadata. For example, to send four Apache access log lines to LogScale, you would use a structure similar to the following:

json
[
  {
    "fields": {
      "host": "webhost1"
    },
    "messages": [
      "192.168.1.21 - user1 [02/Nov/2017:13:48:26 +0000] \"POST /humio/api/v1/ingest/elastic-bulk HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.015 664 0.015",
      "192.168.1.49 - user1 [02/Nov/2017:13:48:33 +0000] \"POST /humio/api/v1/ingest/elastic-bulk HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.014 657 0.014",
      "192.168.1.21 - user2 [02/Nov/2017:13:49:09 +0000] \"POST /humio/api/v1/ingest/elastic-bulk HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.013 565 0.013",
      "192.168.1.54 - user1 [02/Nov/2017:13:49:10 +0000] \"POST /humio/api/v1/ingest/elastic-bulk HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.015 650 0.015"
    ]
  }
]

In this case you need to associate an accesslog parser with the Ingest Token you are using to authenticate this client. The accesslog parser then parses the lines into suitable fields, which makes querying easier and more efficient, because you can base your queries on fields rather than using complex regular expressions to extract the required data. See Parsing Data for further details.

The fields object is used to specify fields that should be added to each of the events when they are parsed. In the example, all the access log events will have a host field assigned indicating that the events came from webhost1.

The messages array contains the individual event messages to be sent to LogScale for ingestion. The format of these is described in the section Events.
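
The escaped quotes in the example above are what a JSON encoder produces when a raw access log line is placed in the messages array. The following is a minimal Python sketch of that step, assuming the raw lines are already in memory. The helper name build_payload is hypothetical and is not part of any LogScale client library.

```python
import json


def build_payload(lines, fields=None):
    """Wrap raw log lines in the request structure shown above.

    build_payload is a hypothetical helper, not a LogScale API.
    """
    group = {"messages": list(lines)}
    if fields:
        group["fields"] = dict(fields)
    # The request body is a JSON array of such groups.
    return json.dumps([group])


# One raw access log line, exactly as the web server wrote it.
raw_line = (
    '192.168.1.21 - user1 [02/Nov/2017:13:48:26 +0000] '
    '"POST /humio/api/v1/ingest/elastic-bulk HTTP/1.1" 200 0 "-" "useragent" '
    '0.015 664 0.015'
)

body = build_payload([raw_line], fields={"host": "webhost1"})
print(body)
```

Letting the encoder add the backslashes avoids hand-escaping each line, which is where quoting mistakes tend to appear.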

It is possible to send events of different types in the same request. That is done by adding a new element to the outer array in the previous example.
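
As a rough illustration of sending two kinds of events in one request, the sketch below builds a body with two elements in the outer array. The type field is described under Events. The accesslog name comes from the example above, while the kv parser name, the second host name, and the second message are assumptions made for illustration only.

```python
import json

# Each element of the outer array is its own group of messages.
# "accesslog" comes from the example above; "kv", "apphost1", and the
# second message are assumptions made for illustration.
payload = [
    {
        "type": "accesslog",
        "fields": {"host": "webhost1"},
        "messages": [
            '192.168.1.21 - user1 [02/Nov/2017:13:48:26 +0000] '
            '"POST /humio/api/v1/ingest/elastic-bulk HTTP/1.1" 200 0 "-" '
            '"useragent" 0.015 664 0.015'
        ],
    },
    {
        "type": "kv",
        "fields": {"host": "apphost1"},
        "messages": ['level=info msg="Daemon started"'],
    },
]

body = json.dumps(payload, indent=2)
print(body)
```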

Events

When sending events, you can set the following fields in the request payload:

| Name | Required | Description |
|------|----------|-------------|
| messages | yes | The raw strings representing the events. Each string (line) is translated to a @rawstring in LogScale and can be parsed by the parser, if one is specified. |
| type | no | The parser LogScale uses if no parser is attached to the ingest token. The parser specified is contained in the #type tag. |
| fields | no | Annotate each of the messages with metadata using these key-values. Values must be strings. These are translated to user fields. |
| tags | no | Annotate each of the messages with metadata using these key-values as Tags. Please see other documentation on tags before using them. |
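
The rules in the table can be checked on the client before a request is sent. The following is a minimal sketch only: validate_group is a hypothetical helper based on the table above, and the server's own validation may differ.

```python
def validate_group(group):
    """Return a list of problems found in one element of the request array.

    validate_group is a hypothetical client-side check; it only covers the
    rules stated in the table (messages required, field values are strings).
    """
    problems = []
    messages = group.get("messages")
    if not isinstance(messages, list):
        problems.append("messages is required and must be a list")
    elif not all(isinstance(m, str) for m in messages):
        problems.append("every message must be a string")
    for key, value in group.get("fields", {}).items():
        if not isinstance(value, str):
            problems.append(
                f"fields[{key!r}] must be a string, got {type(value).__name__}"
            )
    return problems


# A numeric field value is reported; converting it with str() would fix it.
print(validate_group({"messages": ["Daemon started"], "fields": {"port": 8080}}))
```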

Examples

Ingesting a single string of data:

Mac OS or Linux (curl)
shell
curl -v -X POST $YOUR_LOGSCALE_URL/api/v1/ingest/humio-unstructured \
    -H "Authorization: Bearer $INGEST_TOKEN" \
    -H "Content-Type: application/json" \
    -d '[{"messages": ["[02/Nov/2017:13:48:26 +0000] Daemon started"]}]'
Mac OS or Linux (curl) One-line
shell
curl -v -X POST $YOUR_LOGSCALE_URL/api/v1/ingest/humio-unstructured -H "Authorization: Bearer $INGEST_TOKEN" -H "Content-Type: application/json" -d '[{"messages": ["[02/Nov/2017:13:48:26 +0000] Daemon started"]}]'
Windows Cmd and curl
shell
curl -v -X POST %YOUR_LOGSCALE_URL%/api/v1/ingest/humio-unstructured ^
    -H "Authorization: Bearer %INGEST_TOKEN%" ^
    -H "Content-Type: application/json" ^
    -d "[{\"messages\": [\"[02/Nov/2017:13:48:26 +0000] Daemon started\"]}]"
Windows Powershell and curl
powershell
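# Pipe the body through stdin so the embedded double quotes reach curl.exe intact
$body = '[{"messages": ["[02/Nov/2017:13:48:26 +0000] Daemon started"]}]'
$body | curl.exe -X POST `
    -H "Authorization: Bearer $INGEST_TOKEN" `
    -H "Content-Type: application/json" `
    -d '@-' `
    "$YOUR_LOGSCALE_URL/api/v1/ingest/humio-unstructured"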
Perl
perl
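#!/usr/bin/perl

use strict;
use warnings;
use HTTP::Request;
use LWP::UserAgent;

# Placeholder: replace with your ingest token
my $INGEST_TOKEN = "TOKEN";

# Base URL is read from the environment, as in the curl examples
my $uri = "$ENV{YOUR_LOGSCALE_URL}/api/v1/ingest/humio-unstructured";

# Request body: a JSON array of objects, each with a "messages" array
my $json = '[{"messages": ["[02/Nov/2017:13:48:26 +0000] Daemon started"]}]';
my $req = HTTP::Request->new("POST", $uri );

$req->header("Authorization" => "Bearer $INGEST_TOKEN");
$req->header("Content-Type" => "application/json");

$req->content( $json );

my $lwp = LWP::UserAgent->new;

my $result = $lwp->request( $req );

# Print the response body
print $result->decoded_content,"\n";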
Python
python
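#!/usr/bin/env python3

import os

import requests

# Base URL and ingest token are read from the environment, as in the curl examples
url = os.environ["YOUR_LOGSCALE_URL"] + "/api/v1/ingest/humio-unstructured"
token = os.environ["INGEST_TOKEN"]

# Request body: a list of objects, each with a "messages" list of raw lines
mydata = [{"messages": ["[02/Nov/2017:13:48:26 +0000] Daemon started"]}]

resp = requests.post(
    url,
    json=mydata,  # serialized to JSON; also sets Content-Type: application/json
    headers={"Authorization": "Bearer " + token},
)

print(resp.text)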
Node.js
javascript
const https = require('https');

// Request body: an array of objects, each with a "messages" array of raw lines
const data = JSON.stringify([
  { messages: ['[02/Nov/2017:13:48:26 +0000] Daemon started'] },
]);

// Base URL and ingest token are read from the environment, as in the curl examples
const url = new URL(process.env.YOUR_LOGSCALE_URL + '/api/v1/ingest/humio-unstructured');

const options = {
  hostname: url.hostname,
  path: url.pathname,
  port: url.port || 443,
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'Content-Length': Buffer.byteLength(data),
    Authorization: 'Bearer ' + process.env.INGEST_TOKEN,
    'User-Agent': 'Node',
  },
};

const req = https.request(options, (res) => {
  let body = '';
  console.log(`statusCode: ${res.statusCode}`);

  res.on('data', (d) => {
    body += d;
  });
  res.on('end', () => {
    // Print the response body as returned by the server
    console.log(body);
  });
});

req.on('error', (error) => {
  console.error(error);
});

req.write(data);
req.end();

Ingesting four lines of unstructured data:

Mac OS or Linux (curl)
shell
curl -v -X POST $YOUR_LOGSCALE_URL/api/v1/ingest/humio-unstructured \
    -H "Authorization: Bearer $INGEST_TOKEN" \
    -H "Content-Type: application/json" \
    -d @- << EOF
[
  {
    "fields": {
      "host": "webhost1"
    },
    "messages": [
       "192.168.1.21 - user1 [02/Nov/2017:13:48:26 +0000] \"POST /humio/api/v1/ingest/elastic-bulk HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.015 664 0.015",
       "192.168.1.49 - user1 [02/Nov/2017:13:48:33 +0000] \"POST /humio/api/v1/ingest/elastic-bulk HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.014 657 0.014",
       "192.168.1.21 - user2 [02/Nov/2017:13:49:09 +0000] \"POST /humio/api/v1/repositories/humio HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.013 565 0.013",
       "192.168.1.54 - user1 [02/Nov/2017:13:49:10 +0000] \"POST /humio/api/v1/repositories/humio/queryjobs HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.015 650 0.015"
    ]
  }
]
EOF
Mac OS or Linux (curl) One-line
shell
curl -v -X POST $YOUR_LOGSCALE_URL/api/v1/ingest/humio-unstructured -H "Authorization: Bearer $INGEST_TOKEN" -H "Content-Type: application/json" -d '[{"fields": {"host": "webhost1"}, "messages": ["192.168.1.21 - user1 [02/Nov/2017:13:48:26 +0000] \"POST /humio/api/v1/ingest/elastic-bulk HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.015 664 0.015", "192.168.1.49 - user1 [02/Nov/2017:13:48:33 +0000] \"POST /humio/api/v1/ingest/elastic-bulk HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.014 657 0.014", "192.168.1.21 - user2 [02/Nov/2017:13:49:09 +0000] \"POST /humio/api/v1/repositories/humio HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.013 565 0.013", "192.168.1.54 - user1 [02/Nov/2017:13:49:10 +0000] \"POST /humio/api/v1/repositories/humio/queryjobs HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.015 650 0.015"]}]'
Windows Cmd and curl
shell
REM Save the JSON array from the Mac OS or Linux example to a file first.
REM payload.json is an example file name.
curl -v -X POST %YOUR_LOGSCALE_URL%/api/v1/ingest/humio-unstructured ^
    -H "Authorization: Bearer %INGEST_TOKEN%" ^
    -H "Content-Type: application/json" ^
    -d @payload.json
Windows Powershell and curl
powershell
# Pipe the body through stdin so the embedded quotes reach curl.exe intact
$body = @'
[
  {
    "fields": {
      "host": "webhost1"
    },
    "messages": [
       "192.168.1.21 - user1 [02/Nov/2017:13:48:26 +0000] \"POST /humio/api/v1/ingest/elastic-bulk HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.015 664 0.015",
       "192.168.1.49 - user1 [02/Nov/2017:13:48:33 +0000] \"POST /humio/api/v1/ingest/elastic-bulk HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.014 657 0.014",
       "192.168.1.21 - user2 [02/Nov/2017:13:49:09 +0000] \"POST /humio/api/v1/repositories/humio HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.013 565 0.013",
       "192.168.1.54 - user1 [02/Nov/2017:13:49:10 +0000] \"POST /humio/api/v1/repositories/humio/queryjobs HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.015 650 0.015"
    ]
  }
]
'@
$body | curl.exe -X POST `
    -H "Authorization: Bearer $INGEST_TOKEN" `
    -H "Content-Type: application/json" `
    -d '@-' `
    "$YOUR_LOGSCALE_URL/api/v1/ingest/humio-unstructured"
Perl
perl
#!/usr/bin/perl

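use strict;
use warnings;
use HTTP::Request;
use LWP::UserAgent;

# Placeholder: replace with your ingest token
my $INGEST_TOKEN = "TOKEN";

# Base URL is read from the environment, as in the curl examples
my $uri = "$ENV{YOUR_LOGSCALE_URL}/api/v1/ingest/humio-unstructured";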

my $json = '[
  {
    "fields": {
      "host": "webhost1"
    },
    "messages": [
       "192.168.1.21 - user1 [02/Nov/2017:13:48:26 +0000] \"POST /humio/api/v1/ingest/elastic-bulk HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.015 664 0.015",
       "192.168.1.49 - user1 [02/Nov/2017:13:48:33 +0000] \"POST /humio/api/v1/ingest/elastic-bulk HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.014 657 0.014",
       "192.168.1.21 - user2 [02/Nov/2017:13:49:09 +0000] \"POST /humio/api/v1/repositories/humio HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.013 565 0.013",
       "192.168.1.54 - user1 [02/Nov/2017:13:49:10 +0000] \"POST /humio/api/v1/repositories/humio/queryjobs HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.015 650 0.015"
    ]
  }
]';
my $req = HTTP::Request->new("POST", $uri );

$req->header("Authorization" => "Bearer $INGEST_TOKEN");
$req->header("Content-Type" => "application/json");

$req->content( $json );

my $lwp = LWP::UserAgent->new;

my $result = $lwp->request( $req );

print $result->decoded_content,"\n";
Python
python
#!/usr/bin/env python3

import os

import requests

# Base URL and ingest token are read from the environment, as in the curl examples
url = os.environ["YOUR_LOGSCALE_URL"] + "/api/v1/ingest/humio-unstructured"
token = os.environ["INGEST_TOKEN"]

# Raw string, so the \" escapes reach the server unchanged as valid JSON
mydata = r'''[
  {
    "fields": {
      "host": "webhost1"
    },
    "messages": [
       "192.168.1.21 - user1 [02/Nov/2017:13:48:26 +0000] \"POST /humio/api/v1/ingest/elastic-bulk HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.015 664 0.015",
       "192.168.1.49 - user1 [02/Nov/2017:13:48:33 +0000] \"POST /humio/api/v1/ingest/elastic-bulk HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.014 657 0.014",
       "192.168.1.21 - user2 [02/Nov/2017:13:49:09 +0000] \"POST /humio/api/v1/repositories/humio HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.013 565 0.013",
       "192.168.1.54 - user1 [02/Nov/2017:13:49:10 +0000] \"POST /humio/api/v1/repositories/humio/queryjobs HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.015 650 0.015"
    ]
  }
]'''

resp = requests.post(
    url,
    data=mydata,
    headers={
        "Authorization": "Bearer " + token,
        "Content-Type": "application/json",
    },
)

print(resp.text)
Node.js
javascript
const https = require('https');

const data = JSON.stringify(
    [
  {
    "fields": {
      "host": "webhost1"
    },
    "messages": [
       "192.168.1.21 - user1 [02/Nov/2017:13:48:26 +0000] \"POST /humio/api/v1/ingest/elastic-bulk HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.015 664 0.015",
       "192.168.1.49 - user1 [02/Nov/2017:13:48:33 +0000] \"POST /humio/api/v1/ingest/elastic-bulk HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.014 657 0.014",
       "192.168.1.21 - user2 [02/Nov/2017:13:49:09 +0000] \"POST /humio/api/v1/repositories/humio HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.013 565 0.013",
       "192.168.1.54 - user1 [02/Nov/2017:13:49:10 +0000] \"POST /humio/api/v1/repositories/humio/queryjobs HTTP/1.1\" 200 0 \"-\" \"useragent\" 0.015 650 0.015"
    ]
  }
]
);


// Base URL and ingest token are read from the environment, as in the curl examples
const url = new URL(process.env.YOUR_LOGSCALE_URL + '/api/v1/ingest/humio-unstructured');

const options = {
  hostname: url.hostname,
  path: url.pathname,
  port: url.port || 443,
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'Content-Length': Buffer.byteLength(data),
    Authorization: 'Bearer ' + process.env.INGEST_TOKEN,
    'User-Agent': 'Node',
  },
};

const req = https.request(options, (res) => {
  let body = '';
  console.log(`statusCode: ${res.statusCode}`);

  res.on('data', (d) => {
    body += d;
  });
  res.on('end', () => {
    // Print the response body as returned by the server
    console.log(body);
  });
});

req.on('error', (error) => {
  console.error(error);
});

req.write(data);
req.end();