Ingesting Across Multiple Repositories
The HEC endpoint supports using different types of ingest tokens:
Repository ingest tokens - these are the tokens that you create to ingest to a specific repository only.
Organization ingest tokens - these tokens enable you to ingest data into all repositories within an organization (except sandbox and system repositories).
System ingest tokens - these tokens enable you to ingest data into all repositories in a cluster.
Prerequisites
To create organization and system tokens, you need to enable the PermissionTokens feature flag. When running a new instance, you can enable it by setting the INITIAL_FEATURE_FLAGS environment variable as follows:
INITIAL_FEATURE_FLAGS=+PermissionTokens
This works from build 1.36 and later. For older builds, or if you need to enable the flag for an already-running instance, log in as root and navigate to https://$YOUR_HUMIO_URL/docs/api-explorer.
This gives you a GraphQL console where you can run the following mutation:
mutation {
  enableFeature(feature: PermissionTokens)
}
You may need to log out and then back in for the change to take effect.
Ingest Tokens
In previous versions of LogScale, the HEC required you to provide an ingest token that was tied to a particular repository. With it you could write to that specific repository or, if you enabled ALLOW_CHANGE_REPO_ON_EVENTS, any repo.
Now, the HEC accepts two new multi-repository ingest tokens: organization-wide and system-wide. An organization-wide ingest token allows you to ingest into any repository within the organization it belongs to. A system-wide ingest token allows you to ingest into any repository in the cluster. Neither token type can ingest into system or sandbox repositories. You can generate these tokens through the UI using the following methods:
For system tokens: Ingest across all repositories in cluster.
For organization tokens: Ingest across all repos within organization.
Note
This requires the PermissionTokens feature flag to be enabled, as mentioned in Prerequisites.
When using a multi-repository token, you must specify the repository you want to ingest into using the index field in a request to the HEC endpoint. When using a system token, you must also specify the organization the repository belongs to using the organization field.
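As a minimal sketch of the rule above, the following Python helper builds a HEC request body and refuses to build one when a multi-repository token is missing its index or organization field. The function name build_hec_payload and the token_kind labels are hypothetical and exist only for illustration. The check is deliberately stricter than the endpoint, since it ignores the option of supplying these values through event tags.

```python
import json


def build_hec_payload(event, token_kind, index=None, organization=None):
    """Return a JSON request body for the HEC endpoint.

    token_kind is an illustrative label, one of "repository",
    "organization" or "system"; it is not sent to the server.
    """
    if token_kind not in ("repository", "organization", "system"):
        raise ValueError(f"unknown token kind: {token_kind!r}")
    # Multi-repository tokens need an explicit target repository.
    if token_kind in ("organization", "system") and index is None:
        raise ValueError("multi-repository tokens need an 'index' field")
    # System tokens also need the organization that owns the repository.
    if token_kind == "system" and organization is None:
        raise ValueError("system tokens need an 'organization' field")

    payload = {"event": event}
    if index is not None:
        payload["index"] = index
    if organization is not None:
        payload["organization"] = organization
    return json.dumps(payload)
```

For example, build_hec_payload("hello", "system", index="X", organization="$ORGANIZATION_ID") produces a body with all three fields, while the same call without organization raises ValueError before any request is sent.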
Note
Repository-specific ingest tokens provide access to ingesting to that repo, without exceptions. For organization and system tokens there are exceptions: they do not permit ingest into system repositories or users' sandboxes. If you want to experiment with multi-repository tokens, using your sandbox won't work.
HTTP Request Fields
With multi-repository support, the index field takes on additional functionality, and new fields have been added. These fields are shown in the following table:
Field | Description |
---|---|
index | The name or ID of the target repository to ingest into. For repository-specific ingest tokens, this defaults to the token's repository. |
organization | The ID of the organization the target repository belongs to. For repository or organization-specific ingest tokens, this defaults to the repository's organization and the token's organization respectively. |
parserIndex | If specified, the repository with this name or ID will be used to look up the parser to parse the event. If not specified, or invalid, the parser is looked up in the destination repository. It is only possible to specify repositories the ingest token has access to. For an organization-wide ingest token, for instance, you can only specify a repository in that organization. |
parserOrganization | If specified, this gives the ID of the organization in which the parserIndex will be looked up. If not specified, the default value is the same as organization. This is only useful when using a system-wide ingest token and using a parser that belongs to a different organization than the target. As with parserIndex, you can only specify organizations the ingest token has access to. |
Repositories can be specified using either their name or ID. If you need a guarantee that events are delivered to a particular repository, the safest thing is to use only IDs. This is because names can change whereas IDs are fixed, and in the case of ambiguity it's the interpretation as an ID that is preferred. For organizations you can only use their ID, not their name.
All fields are optional, but note that if you leave out a field and it's one that doesn't have a default value, then a value must be specified through the event's tags. This applies to these fields:
Field | Description |
---|---|
index | With organization-wide or system-wide tokens, if you don't specify an index in the request, then the target repository will be taken from each event's #repo tag. |
organization | With system-wide tokens, if you don't specify an organization field in the request, then the target organization will be taken from each event's #organization tag. |
See the Event Tags section for more details.
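The defaulting rules described above can be modelled with a short Python function. This is a reader's sketch of the documented behaviour, not the server's implementation. The function name resolve_target, its parameters, and the use of a plain dictionary keyed by #repo and #organization for the event tags are assumptions made for illustration.

```python
def resolve_target(token_kind, request, tags, token_repo=None, token_org=None):
    """Return (organization, index) for one event, or None where unresolved.

    request: dict of HEC request fields ("index", "organization").
    tags:    dict of event tags, assumed here to use "#repo" and
             "#organization" as keys.
    token_repo / token_org: the repository and organization tied to the
             token, where the token kind has one.
    """
    index = request.get("index")
    organization = request.get("organization")

    if index is None:
        if token_kind == "repository":
            index = token_repo          # defaults to the token's repository
        else:
            index = tags.get("#repo")   # organization and system tokens

    if organization is None:
        if token_kind == "system":
            organization = tags.get("#organization")
        else:
            organization = token_org    # repository's or token's organization

    return organization, index
```

A result containing None corresponds to a target that cannot be resolved from the request fields or the tags.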
Note
This works in single-organization mode as well, though the model is simpler. You can leave out the organization and parserOrganization fields - the index and parserIndex will be looked up within all repositories in the cluster.
Batched Events
There are two ways in which multiple events can be batched together in the same request to the HEC. You can simply include multiple events, separated by newlines, in the same request body. Each event in the sequence is parsed independently, so all fields need to be specified for each event.
The other is for the event field to be a list of values, as outlined in the description of that field. In that case, the other fields apply to all the events listed. Since multi-repository ingest may require metadata to be specified for each event, this mechanism can help you avoid repeating that metadata.
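The two batching styles can be sketched in Python as follows. The helper names newline_batch and list_batch are hypothetical. The sketch only builds request bodies and assumes each event is represented as a dictionary of HEC fields.

```python
import json


def newline_batch(events):
    """Join complete event objects with newlines.

    Every event is parsed independently, so each dict must carry
    its own fields (for example "index" and "organization").
    """
    return "\n".join(json.dumps(e) for e in events)


def list_batch(event_values, **shared_fields):
    """Send several event values in one object.

    The shared fields are written once and apply to all listed events.
    """
    return json.dumps({"event": list(event_values), **shared_fields})
```

With newline_batch the metadata is repeated per event, whereas list_batch(["a", "b"], index="X") states the index once for both events.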
Error Reporting
If the system is unable to resolve the destination repository for an event, the request fails and returns 422, UnprocessableEntity. If the body of the request contains multiple batched events, then none of the events will be ingested in that case. The system will never ingest a subset of the events in a request; it either ingests all events or fails them all. This only affects multi-repository tokens. With a repository-specific token, there is always a repository for a given event to arrive in - the one associated with the token. In that case the system does not fail the request, but falls back on ingesting into the token's repository.
If the system fails to resolve a parser in the requested parserIndex, it will fall back to trying to resolve it in the destination repository, if one has been specified. If that fails as well, the system ingests the event without parsing it. In that case an error is recorded on the event by setting the field @error to true and @error_msg to a string describing the problem.
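The parser fallback order can be illustrated with a small Python model. This is a sketch of the documented order only and not the server's code. The function name choose_parser, the representation of a repository's parsers as a dictionary, the use of a Python boolean for @error, and the wording of the error message are all assumptions.

```python
def choose_parser(name, parser_index_parsers, destination_parsers):
    """Return (parser, error_fields) following the documented fallback.

    parser_index_parsers: parsers in the requested parserIndex, as a dict
        mapping parser name to a callable, or None if none was requested.
    destination_parsers: parsers in the destination repository, or None.
    """
    # Try the requested parserIndex first, then the destination repository.
    for parsers in (parser_index_parsers, destination_parsers):
        if parsers is not None and name in parsers:
            return parsers[name], {}
    # Nothing resolved: the event is kept unparsed and flagged.
    return None, {
        "@error": True,
        "@error_msg": f"parser {name!r} could not be resolved",  # wording is illustrative
    }
```

An empty error dictionary means a parser was found; otherwise the event would be stored unparsed with the two error fields set.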
HTTP Response
The response to an ingest request is a JSON object that contains information about the operation. It is mainly informative and intended for debugging. When reacting to a response, rely on the status code rather than the JSON body. Here is an example of a response:
{
  "text": "Success",
  "code": 0,
  "eventCount": 8,
  "unresolvedSourcetypes": [{
    "index": "myRepo",
    "sourcetype": "htpreq"
  }],
  "unresolvedIndexes": [{
    "index": "myReepo",
    "organization": "aNInS0WHvORBcymQTp0HoLIKYDygBiwo"
  }]
}
This indicates that eight events were successfully ingested. However, the system encountered an event that specified the sourcetype htpreq from repository myRepo, which didn't exist. Also, an event specified that it should be ingested into myReepo, which didn't exist. Since the request as a whole succeeded, it must have used a repository-specific ingest token, so despite failing to resolve myReepo, the event was still ingested, but into the token's default repository.
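A client might handle such a response along the following lines. This Python sketch decides success from the status code alone and uses the JSON body only to collect debugging hints. The function name summarize_response is hypothetical, and the field names are taken from the example response above.

```python
import json


def summarize_response(status_code, body_text):
    """Return (ok, event_count, warnings) for a HEC response."""
    ok = 200 <= status_code < 300  # success is decided by status code only

    try:
        body = json.loads(body_text)
    except ValueError:
        body = {}
    if not isinstance(body, dict):
        body = {}

    warnings = []
    for item in body.get("unresolvedIndexes", []):
        warnings.append(f"unresolved index {item.get('index')!r}")
    for item in body.get("unresolvedSourcetypes", []):
        warnings.append(
            f"unresolved sourcetype {item.get('sourcetype')!r} "
            f"in repository {item.get('index')!r}"
        )

    # A failed request ingests nothing, whatever the body says.
    count = body.get("eventCount", 0) if ok else 0
    return ok, count, warnings
```

Applied to the example response with status 200, this reports eight ingested events and two warnings, one for myReepo and one for htpreq.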
Examples
Here are some examples of complete requests that can be used for testing.
This ingests a single event into a repository X using X's ingest token:
curl -v -X POST $YOUR_LOGSCALE_URL/api/v1/ingest/hec \
-H "Authorization: Bearer $INGEST_TOKEN" \
-H "Content-Type: application/json" \
-d '{ "event": "Repo Ingest Test Event" }'
curl -v -X POST $YOUR_LOGSCALE_URL/api/v1/ingest/hec ^
    -H "Authorization: Bearer $INGEST_TOKEN" ^
    -H "Content-Type: application/json" ^
    -d "{ \"event\": \"Repo Ingest Test Event\" }"
curl.exe -X POST `
    -H "Authorization: Bearer $INGEST_TOKEN" `
    -H "Content-Type: application/json" `
    -d '{ "event": "Repo Ingest Test Event" }' `
    "$YOUR_LOGSCALE_URL/api/v1/ingest/hec"
#!/usr/bin/perl
use strict;
use warnings;
use HTTP::Request;
use LWP::UserAgent;

my $INGEST_TOKEN = "TOKEN";
# Replace $YOUR_LOGSCALE_URL with your LogScale base URL.
my $uri = '$YOUR_LOGSCALE_URL/api/v1/ingest/hec';
my $json = '{ "event": "Repo Ingest Test Event" }';

my $req = HTTP::Request->new("POST", $uri );
$req->header("Authorization" => "Bearer $INGEST_TOKEN");
$req->header("Content-Type" => "application/json");
$req->content( $json );

my $lwp = LWP::UserAgent->new;
my $result = $lwp->request( $req );
# Print the response body through the public accessor.
print $result->decoded_content, "\n";
#! /usr/local/bin/python3
import requests

# Replace $YOUR_LOGSCALE_URL and $INGEST_TOKEN with your own values.
url = '$YOUR_LOGSCALE_URL/api/v1/ingest/hec'
mydata = r'''{ "event": "Repo Ingest Test Event" }'''

resp = requests.post(
    url,
    data=mydata,
    headers={
        "Authorization": "Bearer $INGEST_TOKEN",
        "Content-Type": "application/json",
    },
)
print(resp.text)
const https = require('https');

const data = JSON.stringify(
  { "event": "Repo Ingest Test Event" }
);

// Replace $YOUR_LOGSCALE_URL with your LogScale base URL (https).
const url = new URL('$YOUR_LOGSCALE_URL/api/v1/ingest/hec');

const options = {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'Content-Length': Buffer.byteLength(data),
    Authorization: 'Bearer ' + process.env.TOKEN,
    'User-Agent': 'Node',
  },
};

const req = https.request(url, options, (res) => {
  let body = '';
  console.log(`statusCode: ${res.statusCode}`);
  res.on('data', (d) => {
    body += d;
  });
  res.on('end', () => {
    // Print the response body returned by the HEC endpoint.
    console.log(body);
  });
});

req.on('error', (error) => {
  console.error(error);
});

req.write(data);
req.end();
The repository ingest token will have a format similar to 989c71b6-577c-4387-8db1-04ab6a94fb87. If you're running LogScale locally, your URL might be localhost:3000/humio.
This ingests a single event into repository X using a system-level ingest token, and an organization ID:
curl -v -X POST $YOUR_LOGSCALE_URL/api/v1/ingest/hec \
-H "Authorization: Bearer $INGEST_TOKEN" \
-H "Content-Type: application/json" \
-d '{ "event": "System Ingest Test Event", "index": "X", "organization": "$ORGANIZATION_ID" }'
curl -v -X POST $YOUR_LOGSCALE_URL/api/v1/ingest/hec ^
    -H "Authorization: Bearer $INGEST_TOKEN" ^
    -H "Content-Type: application/json" ^
    -d "{ \"event\": \"System Ingest Test Event\", \"index\": \"X\", \"organization\": \"$ORGANIZATION_ID\" }"
curl.exe -X POST `
    -H "Authorization: Bearer $INGEST_TOKEN" `
    -H "Content-Type: application/json" `
    -d '{ "event": "System Ingest Test Event", "index": "X", "organization": "$ORGANIZATION_ID" }' `
    "$YOUR_LOGSCALE_URL/api/v1/ingest/hec"
#!/usr/bin/perl
use strict;
use warnings;
use HTTP::Request;
use LWP::UserAgent;

my $INGEST_TOKEN = "TOKEN";
# Replace $YOUR_LOGSCALE_URL and $ORGANIZATION_ID with your own values.
my $uri = '$YOUR_LOGSCALE_URL/api/v1/ingest/hec';
my $json = '{ "event": "System Ingest Test Event", "index": "X", "organization": "$ORGANIZATION_ID" }';

my $req = HTTP::Request->new("POST", $uri );
$req->header("Authorization" => "Bearer $INGEST_TOKEN");
$req->header("Content-Type" => "application/json");
$req->content( $json );

my $lwp = LWP::UserAgent->new;
my $result = $lwp->request( $req );
# Print the response body through the public accessor.
print $result->decoded_content, "\n";
#! /usr/local/bin/python3
import requests

# Replace $YOUR_LOGSCALE_URL, $INGEST_TOKEN and $ORGANIZATION_ID with your own values.
url = '$YOUR_LOGSCALE_URL/api/v1/ingest/hec'
mydata = r'''{ "event": "System Ingest Test Event", "index": "X", "organization": "$ORGANIZATION_ID" }'''

resp = requests.post(
    url,
    data=mydata,
    headers={
        "Authorization": "Bearer $INGEST_TOKEN",
        "Content-Type": "application/json",
    },
)
print(resp.text)
const https = require('https');

const data = JSON.stringify(
  { "event": "System Ingest Test Event", "index": "X", "organization": "$ORGANIZATION_ID" }
);

// Replace $YOUR_LOGSCALE_URL with your LogScale base URL (https).
const url = new URL('$YOUR_LOGSCALE_URL/api/v1/ingest/hec');

const options = {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'Content-Length': Buffer.byteLength(data),
    Authorization: 'Bearer ' + process.env.TOKEN,
    'User-Agent': 'Node',
  },
};

const req = https.request(url, options, (res) => {
  let body = '';
  console.log(`statusCode: ${res.statusCode}`);
  res.on('data', (d) => {
    body += d;
  });
  res.on('end', () => {
    // Print the response body returned by the HEC endpoint.
    console.log(body);
  });
});

req.on('error', (error) => {
  console.error(error);
});

req.write(data);
req.end();
Note
Make sure you take steps to set the organization ID if using the code examples provided.
The system ingest token will have a format similar to grBqk3PRbxxaE77UKpqFl4IJm1i4ciGn~TQKczelOIGBo93pGBxugG3wImAB1a3vFKy30G05vxhvu, and the organization ID will have a format similar to aNInS0WHvORBcymQTp0HoLIKYDygBiwo.