Manage Repositories in the Cluster

Resurrect Deleted Segments

This API endpoint undoes the deletion of recently deleted segments, in particular segments that were deleted because the retention settings were lowered. The endpoint resets the "tombstone" on deleted segments internally, and restores all files that are still available in a bucket using Bucket Storage.

By default, LogScale keeps files in bucket storage for seven (7) days longer than the retention settings require. This means that extending retention by seven days and then using this API can restore approximately the most recent seven days' worth of deleted events.

If retention is lowered from the proper value to something very small, this grace period gives you up to 7 days to revert the change to the retention settings and invoke this API endpoint before any events are lost. Invoking this endpoint requires root access.
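The recovery window can be sketched as a small date calculation. This is a minimal illustration that assumes the default seven-day grace period; the helper names `recovery_deadline` and `can_still_resurrect` and the example timestamp are hypothetical and not part of LogScale.

```python
from datetime import datetime, timedelta, timezone

# Extra time files stay in bucket storage by default, per the text above.
GRACE_PERIOD = timedelta(days=7)


def recovery_deadline(deleted_at: datetime, grace: timedelta = GRACE_PERIOD) -> datetime:
    """Latest time at which segments deleted at `deleted_at` may still be restorable."""
    return deleted_at + grace


def can_still_resurrect(deleted_at: datetime, now: datetime) -> bool:
    """True while `now` falls inside the grace period after the deletion."""
    return now <= recovery_deadline(deleted_at)


# Example: retention was lowered by mistake on 1 March 2024 at 12:00 UTC.
mistake = datetime(2024, 3, 1, 12, 0, tzinfo=timezone.utc)
print(recovery_deadline(mistake).isoformat())  # 2024-03-08T12:00:00+00:00
```

Treat the result as an upper bound: the grace period is configurable, and only files still present in the bucket can be restored.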

Description	Restore recently deleted segments.
Method	`POST /api/v1/repositories/viewname/resurrect-deleted-segments`
Request Data
Authentication Required	yes
Path Arguments	Description	Data type	Required?
`viewname`	The repository name	`string`	required
Return Codes
200	Request complete
400	Bad authentication
500	Request failed

Mac OS or Linux (curl)

shell

curl -v -X POST $YOUR_LOGSCALE_URL/api/v1/repositories/$VIEWNAME/resurrect-deleted-segments \
    -H "Authorization: Bearer $TOKEN" \
    -H "Content-Type: application/json"

Mac OS or Linux (curl) One-line

shell

curl -v -X POST $YOUR_LOGSCALE_URL/api/v1/repositories/$VIEWNAME/resurrect-deleted-segments -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json"

Windows Cmd and curl

shell

curl -v -X POST $YOUR_LOGSCALE_URL/api/v1/repositories/$VIEWNAME/resurrect-deleted-segments ^
    -H "Authorization: Bearer $TOKEN" ^
    -H "Content-Type: application/json"

Windows Powershell and curl

powershell

curl.exe -X POST `
    -H "Authorization: Bearer $TOKEN" `
    -H "Content-Type: application/json" `
    "$YOUR_LOGSCALE_URL/api/v1/repositories/$VIEWNAME/resurrect-deleted-segments"

Perl

perl

#!/usr/bin/perl

use HTTP::Request;
use LWP;

my $TOKEN = "TOKEN";

my $uri = '$YOUR_LOGSCALE_URL/api/v1/repositories/$VIEWNAME/resurrect-deleted-segments';

my $json = '';
my $req = HTTP::Request->new("POST", $uri );

$req->header("Authorization" => "Bearer $TOKEN");
$req->header("Content-Type" => "application/json");

$req->content( $json );

my $lwp = LWP::UserAgent->new;

my $result = $lwp->request( $req );

print $result->{"_content"},"\n";

Python

python

#! /usr/local/bin/python3

import requests

url = '$YOUR_LOGSCALE_URL/api/v1/repositories/$VIEWNAME/resurrect-deleted-segments'
mydata = r''

resp = requests.post(url,
                     data = mydata,
                     headers = {
   "Authorization" : "Bearer $TOKEN",
   "Content-Type" : "application/json"
}
)

print(resp.text)

Node.js

javascript

const https = require('https');

// Replace the placeholders with your LogScale URL and repository name.
const url = new URL('$YOUR_LOGSCALE_URL/api/v1/repositories/$VIEWNAME/resurrect-deleted-segments');

// This request has no body.
const data = '';

const options = {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'Content-Length': Buffer.byteLength(data),
    Authorization: 'Bearer ' + process.env.TOKEN,
    'User-Agent': 'Node',
  },
};

const req = https.request(url, options, (res) => {
  let body = '';
  console.log(`statusCode: ${res.statusCode}`);

  res.on('data', (d) => {
    body += d;
  });
  res.on('end', () => {
    // Print the raw response body.
    console.log(body);
  });
});

req.on('error', (error) => {
  console.error(error);
});

req.write(data);
req.end();

Manage Data Sources Limits

LogScale lets you control the default limit on the number of datasources for each repository.

Show Datasources Limit

Description	See the current default limit on the number of datasources.
Method	`GET /api/v1/repositories/repository/max-datasources`
Authentication Required	yes
Path Arguments	Description	Data type	Required?
`repository`	The repository name	`string`	required
Return Codes
200	Request complete
400	Bad authentication
500	Request failed

Mac OS or Linux (curl)

shell

curl -v -X GET $YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/max-datasources \
    -H "Authorization: Bearer $TOKEN" \
    -H "Content-Type: application/json"

Mac OS or Linux (curl) One-line

shell

curl -v -X GET $YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/max-datasources -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json"

Windows Cmd and curl

shell

curl -v -X GET $YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/max-datasources ^
    -H "Authorization: Bearer $TOKEN" ^
    -H "Content-Type: application/json"

Windows Powershell and curl

powershell

curl.exe -X GET `
    -H "Authorization: Bearer $TOKEN" `
    -H "Content-Type: application/json" `
    "$YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/max-datasources"

Perl

perl

#!/usr/bin/perl

use HTTP::Request;
use LWP;

my $TOKEN = "TOKEN";

my $uri = '$YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/max-datasources';

my $req = HTTP::Request->new("GET", $uri );

$req->header("Authorization" => "Bearer $TOKEN");
$req->header("Content-Type" => "application/json");


my $lwp = LWP::UserAgent->new;

my $result = $lwp->request( $req );

print $result->{"_content"},"\n";

Python

python

#! /usr/local/bin/python3

import requests

url = '$YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/max-datasources'

resp = requests.get(url,
                     headers = {
   "Authorization" : "Bearer $TOKEN",
   "Content-Type" : "application/json"
}
)

print(resp.text)

Node.js

javascript

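const https = require('https');

const options = {
  headers: {
    Authorization: 'Bearer ' + process.env.TOKEN,
  },
};

// Replace the placeholders with your LogScale URL and repository name.
let request = https.get('$YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/max-datasources', options, (res) => {
  if (res.statusCode !== 200) {
    console.error(`Error from server. Code: ${res.statusCode}`);
    res.resume();
    return;
  }

  let data = '';

  res.on('data', (chunk) => {
    data += chunk;
  });

  res.on('close', () => {
    console.log('Response:');
    // Print the raw response body.
    console.log(data);
  });
});

request.on('error', (error) => {
  console.error(error);
});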

Update Datasources Limit

The REST API endpoint max-datasources enables setting a new limit, per repository, for the maximum number of datasources.

Important

Do NOT change the maximum number of datasources unless you know your available hardware can support the change. If the limit is raised beyond what the hardware can support, that is, beyond what can be stored in memory, and then a large number of datasources is created, the cluster may fail and become unrecoverable.

For more information on creating tags during parsing, see Parsing Event Tags; for information on tags and datasources, see Tag Fields and Datasources.

Description	Set a new value for the maximum number of allowed datasources.
Method	`POST /api/v1/repositories/repository/max-datasources`
Request Data
Authentication Required	yes
Path Arguments	Description	Data type	Required?
`repository`	The repository name	`string`	required
Query Arguments	Description	Data type	Required?
`number`	Maximum number of datasources	`integer`	required
Return Codes
200	Request complete
400	Bad authentication
500	Request failed

To update the maximum number of datasources, supply the number to the endpoint:

shell

DATASOURCE_MAX=1000

Mac OS or Linux (curl)

shell

curl -v -X POST "$YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/max-datasources?number=$DATASOURCE_MAX" \
    -H "Authorization: Bearer $TOKEN" \
    -H "Content-Type: application/json"

Mac OS or Linux (curl) One-line

shell

curl -v -X POST "$YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/max-datasources?number=$DATASOURCE_MAX" -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json"

Windows Cmd and curl

shell

curl -v -X POST $YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/max-datasources?number=$DATASOURCE_MAX ^
    -H "Authorization: Bearer $TOKEN" ^
    -H "Content-Type: application/json"

Windows Powershell and curl

powershell

curl.exe -X POST `
    -H "Authorization: Bearer $TOKEN" `
    -H "Content-Type: application/json" `
    "$YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/max-datasources?number=$DATASOURCE_MAX"

Perl

perl

#!/usr/bin/perl

use HTTP::Request;
use LWP;

my $TOKEN = "TOKEN";

my $uri = '$YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/max-datasources?number=$DATASOURCE_MAX';

my $json = '';
my $req = HTTP::Request->new("POST", $uri );

$req->header("Authorization" => "Bearer $TOKEN");
$req->header("Content-Type" => "application/json");

$req->content( $json );

my $lwp = LWP::UserAgent->new;

my $result = $lwp->request( $req );

print $result->{"_content"},"\n";

Python

python

#! /usr/local/bin/python3

import requests

url = '$YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/max-datasources?number=$DATASOURCE_MAX'
mydata = r''

resp = requests.post(url,
                     data = mydata,
                     headers = {
   "Authorization" : "Bearer $TOKEN",
   "Content-Type" : "application/json"
}
)

print(resp.text)

Node.js

javascript

const https = require('https');

// Replace the placeholders with your LogScale URL, repository name, and new limit.
const url = new URL('$YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/max-datasources?number=$DATASOURCE_MAX');

// The new limit is passed in the query string, so the request has no body.
const data = '';

const options = {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'Content-Length': Buffer.byteLength(data),
    Authorization: 'Bearer ' + process.env.TOKEN,
    'User-Agent': 'Node',
  },
};

const req = https.request(url, options, (res) => {
  let body = '';
  console.log(`statusCode: ${res.statusCode}`);

  res.on('data', (d) => {
    body += d;
  });
  res.on('end', () => {
    // Print the raw response body.
    console.log(body);
  });
});

req.on('error', (error) => {
  console.error(error);
});

req.write(data);
req.end();

Delete Datasources

This endpoint marks a datasource for deletion, which internally triggers the deletion of all segments in the datasource.

Description	Marks the datasource for deletion, triggering deletion of all segments in the datasource.
Method	`DELETE /api/v1/repositories/repository/datasources/datasourceid`
Authentication Required	yes
Path Arguments	Description	Data type	Required?
`datasourceid`	The datasource ID number.	`integer`	required
`repository`	The repository name	`string`	required
Return Codes
200	Request complete
400	Bad authentication
500	Request failed

Mac OS or Linux (curl)

shell

curl -v -X DELETE $YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/datasources/$DATASOURCEID \
    -H "Authorization: Bearer $TOKEN" \
    -H "Content-Type: application/json"

Mac OS or Linux (curl) One-line

shell

curl -v -X DELETE $YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/datasources/$DATASOURCEID -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json"

Windows Cmd and curl

shell

curl -v -X DELETE $YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/datasources/$DATASOURCEID ^
    -H "Authorization: Bearer $TOKEN" ^
    -H "Content-Type: application/json"

Windows Powershell and curl

powershell

curl.exe -X DELETE `
    -H "Authorization: Bearer $TOKEN" `
    -H "Content-Type: application/json" `
    "$YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/datasources/$DATASOURCEID"

Perl

perl

#!/usr/bin/perl

use HTTP::Request;
use LWP;

my $TOKEN = "TOKEN";

my $uri = '$YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/datasources/$DATASOURCEID';

my $req = HTTP::Request->new("DELETE", $uri );

$req->header("Authorization" => "Bearer $TOKEN");
$req->header("Content-Type" => "application/json");


my $lwp = LWP::UserAgent->new;

my $result = $lwp->request( $req );

print $result->{"_content"},"\n";

Python

python

#! /usr/local/bin/python3

import requests

url = '$YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/datasources/$DATASOURCEID'

resp = requests.delete(url,
                     headers = {
   "Authorization" : "Bearer $TOKEN",
   "Content-Type" : "application/json"
}
)

print(resp.text)

Node.js

javascript

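const https = require('https');

const options = {
  method: 'DELETE',
  headers: {
    Authorization: 'Bearer ' + process.env.TOKEN,
  },
};

// The https module has no delete() helper, so use https.request() with the DELETE method.
let request = https.request('$YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/datasources/$DATASOURCEID', options, (res) => {
  if (res.statusCode !== 200) {
    console.error(`Error from server. Code: ${res.statusCode}`);
    res.resume();
    return;
  }

  let data = '';

  res.on('data', (chunk) => {
    data += chunk;
  });

  res.on('close', () => {
    console.log('Response:');
    // Print the raw response body.
    console.log(data);
  });
});

request.on('error', (error) => {
  console.error(error);
});

request.end();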

Configure Auto-Sharding for High-Volume Data Sources

A data source is ultimately bounded by the volume that one CPU thread can compress and write to the filesystem, which is typically about 190 GB/day. To handle more ingest traffic from a specific data source, you need to provide more variability in the set of tags. In some cases, however, it may not be possible or desirable to adjust the set of tags or tagged fields in the client. For these cases, LogScale supports adding a synthetic tag that is assigned a random number for each small batch of events.

LogScale supports detecting if there is a high load on a data source, and automatically triggers this auto-sharding on the data sources. You will see this happening on "fast" data sources, typically if more than 190 GB/day is delivered to a single data source. The events then get an extra tag, #humioAutoShard that is assigned a random integer value.
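The effect of a random synthetic tag can be illustrated with a short simulation. This is a sketch only: the function names, the batch count, and the use of Python's `random` module are assumptions for illustration and do not describe how LogScale assigns `#humioAutoShard` values internally.

```python
import random
from collections import Counter


def assign_auto_shard(num_shards: int, rng: random.Random) -> int:
    """Pick a random shard number for one small batch of events."""
    return rng.randrange(num_shards)


def simulate(batches: int, num_shards: int, seed: int = 0) -> Counter:
    """Count how many batches land on each synthetic tag value."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(batches):
        tag_value = assign_auto_shard(num_shards, rng)
        counts[f"#humioAutoShard={tag_value}"] += 1
    return counts


counts = simulate(batches=10_000, num_shards=4)
for tag, n in sorted(counts.items()):
    print(tag, n)
```

Because each batch gets an independent random value, the load of one "fast" data source is spread roughly evenly over the shards, and each shard is written as its own data source.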

Starting from LogScale v1.152, auto-sharding is handled through rate monitoring of the ingest flow. This is configured through the dynamic configuration option TargetMaxRateForDatasource, with a default of 2 MB/s (about 190 GB/day). In previous LogScale versions, auto-sharding was triggered by ingest delay and configured through the AUTOSHARDING_TRIGGER_DELAY_MS and AUTOSHARDING_CHECKINTERVAL_MS configuration variables, which are no longer used.

The setting AUTOSHARDING_MAX controls how many different data sources get created this way for each "real" datasource. The default value is 1,024. From version 1.186, the default value is 131,072. From version 1.206, the default value is 12,288.

Configure Sticky Auto-Sharding for High-Volume Data Sources

In some use cases, it makes sense to disable the automatic tuning and manage these settings using the API. Set AUTOSHARDING_MAX to 1 to make the system never increase the number of autoshards of data sources, then use the API to set sticky autosharding settings on the selected data sources that require it. The sticky settings are not limited by the AUTOSHARDING_MAX configuration.
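A minimal sketch of building requests for the sticky autosharding endpoints with Python's standard library follows. The helper name `autosharding_request`, the host `logscale.example.com`, and the repository and datasource values are hypothetical. Passing the shard count as a `number` query parameter is an assumption; check the endpoint reference for the form your LogScale version expects.

```python
import urllib.parse
import urllib.request


def autosharding_request(base_url, repository, datasource_id, token,
                         method="GET", number=None):
    """Build (but do not send) a request for the autosharding endpoint."""
    path = "/api/v1/repositories/{}/datasources/{}/autosharding".format(
        urllib.parse.quote(repository, safe=""),
        urllib.parse.quote(str(datasource_id), safe=""),
    )
    url = base_url.rstrip("/") + path
    if number is not None:
        # Assumption: the shard count is sent as a query parameter.
        url += "?" + urllib.parse.urlencode({"number": number})
    return urllib.request.Request(
        url, method=method, headers={"Authorization": "Bearer " + token}
    )


# Placeholder values; substitute your own URL, repository, datasource, and token.
req = autosharding_request(
    "https://logscale.example.com", "myrepo", "my-datasource-id", "TOKEN",
    method="POST", number=7,
)
print(req.get_method(), req.full_url)
```

The request is only constructed here. Passing it to `urllib.request.urlopen(req)` would send it, and the same helper covers the show and delete operations by using `method="GET"` or `method="DELETE"` without `number`.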

Show Autosharding Settings

Description	Show the autosharding settings for a datasource.
Method	`GET /api/v1/repositories/repository/datasources/datasourceid/autosharding`
Authentication Required	no
Path Arguments	Description	Data type	Required?
`datasourceid`	The datasource ID number.	`integer`	required
`repository`	The repository name	`string`	required
Return Codes
200	Request complete
400	Bad authentication
500	Request failed

To show the autosharding settings for a specific datasource, run:

Mac OS or Linux (curl)

shell

curl -v -X GET $YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/datasources/$DATASOURCEID/autosharding \
    -H "Authorization: Bearer $TOKEN" \
    -H "Content-Type: application/json"

Mac OS or Linux (curl) One-line

shell

curl -v -X GET $YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/datasources/$DATASOURCEID/autosharding -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json"

Windows Cmd and curl

shell

curl -v -X GET $YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/datasources/$DATASOURCEID/autosharding ^
    -H "Authorization: Bearer $TOKEN" ^
    -H "Content-Type: application/json"

Windows Powershell and curl

powershell

curl.exe -X GET `
    -H "Authorization: Bearer $TOKEN" `
    -H "Content-Type: application/json" `
    "$YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/datasources/$DATASOURCEID/autosharding"

Perl

perl

#!/usr/bin/perl

use HTTP::Request;
use LWP;

my $TOKEN = "TOKEN";

my $uri = '$YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/datasources/$DATASOURCEID/autosharding';

my $req = HTTP::Request->new("GET", $uri );

$req->header("Authorization" => "Bearer $TOKEN");
$req->header("Content-Type" => "application/json");


my $lwp = LWP::UserAgent->new;

my $result = $lwp->request( $req );

print $result->{"_content"},"\n";

Python

python

#! /usr/local/bin/python3

import requests

url = '$YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/datasources/$DATASOURCEID/autosharding'

resp = requests.get(url,
                     headers = {
   "Authorization" : "Bearer $TOKEN",
   "Content-Type" : "application/json"
}
)

print(resp.text)

Node.js

javascript

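const https = require('https');

const options = {
  headers: {
    Authorization: 'Bearer ' + process.env.TOKEN,
  },
};

// Replace the placeholders with your LogScale URL, repository name, and datasource ID.
let request = https.get('$YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/datasources/$DATASOURCEID/autosharding', options, (res) => {
  if (res.statusCode !== 200) {
    console.error(`Error from server. Code: ${res.statusCode}`);
    res.resume();
    return;
  }

  let data = '';

  res.on('data', (chunk) => {
    data += chunk;
  });

  res.on('close', () => {
    console.log('Response:');
    // Print the raw response body.
    console.log(data);
  });
});

request.on('error', (error) => {
  console.error(error);
});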

Update Autosharding

To update the autosharding settings for a specific datasource, run:

Description	Update the autosharding settings for a datasource.
Method	`POST /api/v1/repositories/repository/datasources/datasourceid/autosharding/number`
Request Data
Authentication Required	no
Path Arguments	Description	Data type	Required?
`datasourceid`	The datasource ID number.	`integer`	required
`number`	Number of autoshards for a datasource. Not limited by other configurations.	`integer`	required
`repository`	The repository name	`string`	required
Return Codes
200	Request complete
400	Bad authentication
500	Request failed

Mac OS or Linux (curl)

shell

curl -v -X POST $YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/datasources/$DATASOURCEID/autosharding \
    -H "Authorization: Bearer $TOKEN" \
    -H "Content-Type: application/json"

Mac OS or Linux (curl) One-line

shell

curl -v -X POST $YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/datasources/$DATASOURCEID/autosharding -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json"

Windows Cmd and curl

shell

curl -v -X POST $YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/datasources/$DATASOURCEID/autosharding ^
    -H "Authorization: Bearer $TOKEN" ^
    -H "Content-Type: application/json"

Windows Powershell and curl

powershell

curl.exe -X POST `
    -H "Authorization: Bearer $TOKEN" `
    -H "Content-Type: application/json" `
    "$YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/datasources/$DATASOURCEID/autosharding"

Perl

perl

#!/usr/bin/perl

use HTTP::Request;
use LWP;

my $TOKEN = "TOKEN";

my $uri = '$YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/datasources/$DATASOURCEID/autosharding';

my $json = '';
my $req = HTTP::Request->new("POST", $uri );

$req->header("Authorization" => "Bearer $TOKEN");
$req->header("Content-Type" => "application/json");

$req->content( $json );

my $lwp = LWP::UserAgent->new;

my $result = $lwp->request( $req );

print $result->{"_content"},"\n";

Python

python

#! /usr/local/bin/python3

import requests

url = '$YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/datasources/$DATASOURCEID/autosharding'
mydata = r''

resp = requests.post(url,
                     data = mydata,
                     headers = {
   "Authorization" : "Bearer $TOKEN",
   "Content-Type" : "application/json"
}
)

print(resp.text)

Node.js

javascript

const https = require('https');

// Replace the placeholders with your LogScale URL, repository name, and datasource ID.
const url = new URL('$YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/datasources/$DATASOURCEID/autosharding');

// This request has no body.
const data = '';

const options = {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'Content-Length': Buffer.byteLength(data),
    Authorization: 'Bearer ' + process.env.TOKEN,
    'User-Agent': 'Node',
  },
};

const req = https.request(url, options, (res) => {
  let body = '';
  console.log(`statusCode: ${res.statusCode}`);

  res.on('data', (d) => {
    body += d;
  });
  res.on('end', () => {
    // Print the raw response body.
    console.log(body);
  });
});

req.on('error', (error) => {
  console.error(error);
});

req.write(data);
req.end();

To update to a specific number of autoshards, run the request as shown below:

shell

$ curl -XPOST -H "Authorization: Bearer $TOKEN" "$YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY_NAME/datasources/$DATASOURCEID/autosharding?number=7"

Delete Autosharding

Description	Delete the autosharding settings for a datasource.
Method	`DELETE /api/v1/repositories/repository/datasources/datasourceid/autosharding`
Authentication Required	no
Path Arguments	Description	Data type	Required?
`datasourceid`	The datasource ID number.	`integer`	required
`repository`	The repository name	`string`	required
Return Codes
200	Request complete
400	Bad authentication
500	Request failed

To delete the autosharding settings for a specific datasource, run:

Mac OS or Linux (curl)

shell

curl -v -X DELETE $YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/datasources/$DATASOURCEID/autosharding \
    -H "Authorization: Bearer $TOKEN" \
    -H "Content-Type: application/json"

Mac OS or Linux (curl) One-line

shell

curl -v -X DELETE $YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/datasources/$DATASOURCEID/autosharding -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json"

Windows Cmd and curl

shell

curl -v -X DELETE $YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/datasources/$DATASOURCEID/autosharding ^
    -H "Authorization: Bearer $TOKEN" ^
    -H "Content-Type: application/json"

Windows Powershell and curl

powershell

curl.exe -X DELETE `
    -H "Authorization: Bearer $TOKEN" `
    -H "Content-Type: application/json" `
    "$YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/datasources/$DATASOURCEID/autosharding"

Perl

perl

#!/usr/bin/perl

use HTTP::Request;
use LWP;

my $TOKEN = "TOKEN";

my $uri = '$YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/datasources/$DATASOURCEID/autosharding';

my $req = HTTP::Request->new("DELETE", $uri );

$req->header("Authorization" => "Bearer $TOKEN");
$req->header("Content-Type" => "application/json");


my $lwp = LWP::UserAgent->new;

my $result = $lwp->request( $req );

print $result->{"_content"},"\n";

Python

python

#! /usr/local/bin/python3

import requests

url = '$YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/datasources/$DATASOURCEID/autosharding'

resp = requests.delete(url,
                     headers = {
   "Authorization" : "Bearer $TOKEN",
   "Content-Type" : "application/json"
}
)

print(resp.text)

Node.js

javascript

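const https = require('https');

const options = {
  method: 'DELETE',
  headers: {
    Authorization: 'Bearer ' + process.env.TOKEN,
  },
};

// The https module has no delete() helper, so use https.request() with the DELETE method.
let request = https.request('$YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/datasources/$DATASOURCEID/autosharding', options, (res) => {
  if (res.statusCode !== 200) {
    console.error(`Error from server. Code: ${res.statusCode}`);
    res.resume();
    return;
  }

  let data = '';

  res.on('data', (chunk) => {
    data += chunk;
  });

  res.on('close', () => {
    console.log('Response:');
    // Print the raw response body.
    console.log(data);
  });
});

request.on('error', (error) => {
  console.error(error);
});

request.end();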

Setup Grouping of Tags

Important

The GraphQL interface repository() is the preferred method for updating tag groupings.

Note

This is an advanced feature.

Tags are the fields with a prefix of #. They are used internally to do sharding of data into smaller streams. A data source is created for every unique combination of tag values set by the clients (such as log shippers). LogScale will reject ingested events once a certain number of datasources get created. The limit is currently 10,000 datasources per repository.
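Because a data source exists for every unique combination of tag values, the worst-case count grows as the product of the distinct values per tag. The following sketch shows that arithmetic; the tag names and cardinalities are made-up examples, and the helper `worst_case_datasources` is hypothetical.

```python
from math import prod
from typing import Mapping

# Per-repository limit stated in the text above.
DATASOURCE_LIMIT = 10_000


def worst_case_datasources(distinct_values_per_tag: Mapping[str, int]) -> int:
    """Upper bound on data sources: one per combination of tag values."""
    return prod(distinct_values_per_tag.values())


# Hypothetical cardinalities for three tag fields.
tags = {"#type": 5, "#host": 200, "#client_ip": 50}
total = worst_case_datasources(tags)
print(total, total > DATASOURCE_LIMIT)  # 50000 True
```

In practice not every combination occurs, so this is an upper bound, but it shows how a single high-cardinality tag can push a repository past the limit.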

Show Tag Grouping

Description	List repositories grouped by tags in the cluster.
Method	`GET /api/v1/repositories/repository/taggrouping`
Authentication Required	yes
Path Arguments	Description	Data type	Required?
`repository`	The repository name	`string`	required
Return Codes
200	Request complete
400	Bad authentication
500	Request failed

http

GET    /api/v1/repositories/$REPOSITORY_NAME/taggrouping

Mac OS or Linux (curl)

shell

curl -v -X GET $YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/taggrouping \
    -H "Authorization: Bearer $TOKEN" \
    -H "Content-Type: application/json"

Mac OS or Linux (curl) One-line

shell

curl -v -X GET $YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/taggrouping -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json"

Windows Cmd and curl

shell

curl -v -X GET $YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/taggrouping ^
    -H "Authorization: Bearer $TOKEN" ^
    -H "Content-Type: application/json"

Windows Powershell and curl

powershell

curl.exe -X GET `
    -H "Authorization: Bearer $TOKEN" `
    -H "Content-Type: application/json" `
    "$YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/taggrouping"

Perl

perl

#!/usr/bin/perl

use HTTP::Request;
use LWP;

my $TOKEN = "TOKEN";

my $uri = '$YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/taggrouping';

my $req = HTTP::Request->new("GET", $uri );

$req->header("Authorization" => "Bearer $TOKEN");
$req->header("Content-Type" => "application/json");


my $lwp = LWP::UserAgent->new;

my $result = $lwp->request( $req );

print $result->{"_content"},"\n";

Python

python

#! /usr/local/bin/python3

import requests

url = '$YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/taggrouping'

resp = requests.get(url,
                     headers = {
   "Authorization" : "Bearer $TOKEN",
   "Content-Type" : "application/json"
}
)

print(resp.text)

Node.js

javascript

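const https = require('https');

const options = {
  headers: {
    Authorization: 'Bearer ' + process.env.TOKEN,
  },
};

// Replace the placeholders with your LogScale URL and repository name.
let request = https.get('$YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/taggrouping', options, (res) => {
  if (res.statusCode !== 200) {
    console.error(`Error from server. Code: ${res.statusCode}`);
    res.resume();
    return;
  }

  let data = '';

  res.on('data', (chunk) => {
    data += chunk;
  });

  res.on('close', () => {
    console.log('Response:');
    // Print the raw response body.
    console.log(data);
  });
});

request.on('error', (error) => {
  console.error(error);
});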

LogScale recommends that you only use the parser as a tag in the field #type.

Using more tags may speed up queries on large data volumes, but only works on a bounded value-set for the tag fields. The speed-up only affects queries prefixed with #tag=value pairs that significantly filter out input events.

Update Tag Grouping

Note

If you are using a hosted LogScale instance while following this procedure, please contact support if you wish to add grouping rules to your repository.

Adding a new set of rules using POST replaces the current set. The previous sets are kept, and if a previous one matches, then the previous one is reused. The previous rules are kept in the system but may be deleted by LogScale once all data sources referring to them have been deleted (through retention settings).

Description	Apply new tag grouping rules to repositories in the cluster.
Method	`POST /api/v1/repositories/repository/taggrouping`
Request Data
Authentication Required	yes
Path Arguments	Description	Data type	Required?
`repository`	The repository name	`string`	required
Return Codes
200	Request complete
400	Bad authentication
500	Request failed

http

POST   /api/v1/repositories/$REPOSITORY_NAME/taggrouping

For some use cases, such as having the "client IP" from an access log as a tag, too many different tags will arise. For such a case, it is necessary to either stop having the field as a tag, or create a grouping rule on the tag field. Existing data is not rewritten when grouping rules are added or changed. Changing the grouping rules will thus in itself create more data sources.
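To illustrate why a grouping rule bounds the number of data sources, the sketch below hashes many distinct tag values into a fixed number of buckets and builds a rule list as JSON. The CRC32 hash is a stand-in chosen for illustration, since the hash LogScale uses is not described here; the `bucket` helper, the generated IP addresses, and the modulus values are likewise assumptions.

```python
import json
import zlib

# Grouping rules as a JSON array; field names omit the # prefix.
rules = [
    {"field": "host", "modulus": 8},
    {"field": "client_ip", "modulus": 10},
]
payload = json.dumps(rules)


def bucket(value: str, modulus: int) -> int:
    """Map a tag value to one of `modulus` buckets (illustrative hash only)."""
    return zlib.crc32(value.encode("utf-8")) % modulus


# 5,000 distinct client IPs collapse into at most 10 grouped tag values.
ips = [f"10.0.{i // 256}.{i % 256}" for i in range(5000)]
buckets = {bucket(ip, 10) for ip in ips}
print(len(set(ips)), "distinct values ->", len(buckets), "buckets")
print(payload)
```

Building the body with `json.dumps` guarantees a well-formed JSON array, which is easy to get wrong when typing the payload by hand.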

Example: Setting the grouping rules for repository $REPOSITORY_NAME to hash the field #host into 8 buckets, and #client_ip into 10 buckets. Note how the field names do not include the # prefix in the rules.

Mac OS or Linux (curl)

shell

curl -v -X POST $YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/taggrouping \
    -H "Authorization: Bearer $TOKEN" \
    -H "Content-Type: application/json" \
    -d '[ {"field":"host","modulus": 8}, {"field":"client_ip","modulus": 10} ]'

Mac OS or Linux (curl) One-line

shell

curl -v -X POST $YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/taggrouping -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" -d '[ {"field":"host","modulus": 8}, {"field":"client_ip","modulus": 10} ]'

Windows Cmd and curl

shell

curl -v -X POST $YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/taggrouping ^
    -H "Authorization: Bearer $TOKEN" ^
    -H "Content-Type: application/json" ^
    -d "[ {\"field\":\"host\",\"modulus\": 8}, {\"field\":\"client_ip\",\"modulus\": 10} ]"

Windows Powershell and curl

powershell

curl.exe -X POST `
    -H "Authorization: Bearer $TOKEN" `
    -H "Content-Type: application/json" `
    -d '[ {"field":"host","modulus": 8}, {"field":"client_ip","modulus": 10} ]' `
    "$YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/taggrouping"

Perl

perl

#!/usr/bin/perl

use HTTP::Request;
use LWP;

my $TOKEN = "TOKEN";

my $uri = '$YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/taggrouping';

my $json = '[ {"field":"host","modulus": 8}, {"field":"client_ip","modulus": 10} ]';
my $req = HTTP::Request->new("POST", $uri );

$req->header("Authorization" => "Bearer $TOKEN");
$req->header("Content-Type" => "application/json");

$req->content( $json );

my $lwp = LWP::UserAgent->new;

my $result = $lwp->request( $req );

print $result->{"_content"},"\n";

Python

python

#! /usr/local/bin/python3

import requests

url = '$YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/taggrouping'
mydata = r'''[ {"field":"host","modulus": 8}, {"field":"client_ip","modulus": 10} ]'''

resp = requests.post(url,
                     data = mydata,
                     headers = {
   "Authorization" : "Bearer $TOKEN",
   "Content-Type" : "application/json"
}
)

print(resp.text)

Node.js

javascript

const https = require('https');

// Grouping rules are sent as a JSON array.
const data = JSON.stringify(
    [ {"field":"host","modulus": 8}, {"field":"client_ip","modulus": 10} ]
);

// Replace the placeholders with your LogScale URL and repository name.
const url = new URL('$YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/taggrouping');

const options = {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'Content-Length': Buffer.byteLength(data),
    Authorization: 'Bearer ' + process.env.TOKEN,
    'User-Agent': 'Node',
  },
};

const req = https.request(url, options, (res) => {
  let body = '';
  console.log(`statusCode: ${res.statusCode}`);

  res.on('data', (d) => {
    body += d;
  });
  res.on('end', () => {
    // Print the raw response body.
    console.log(body);
  });
});

req.on('error', (error) => {
  console.error(error);
});

req.write(data);
req.end();

When using grouped tags in the query field, you can expect to get a speed-up of approximately the modulus compared to not including the tags in the query, provided you use an exact match on the field. If you use a wildcard (*) in the value for the grouped tag, the implementation currently scans all data sources that have a non-empty value for that field and filters the events to keep only the results that match the wildcard pattern.
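The difference between an exact match and a wildcard on a grouped tag can be modelled in a few lines. This is a simplified model, not LogScale's query planner: the CRC32 hash, the helper names, and the example host names are assumptions for illustration.

```python
import zlib


def bucket(value: str, modulus: int) -> int:
    """Map a tag value to a bucket (CRC32 is a stand-in, not LogScale's hash)."""
    return zlib.crc32(value.encode("utf-8")) % modulus


def buckets_to_scan(query_value: str, modulus: int) -> set:
    """Return the buckets a query on a grouped tag has to read."""
    if "*" in query_value:
        # A wildcard can match values in any bucket, so all are scanned.
        return set(range(modulus))
    # An exact match can only live in the bucket its value hashes to.
    return {bucket(query_value, modulus)}


MODULUS = 8
for query in ("web-01", "web-*"):
    scanned = buckets_to_scan(query, MODULUS)
    print(f"{query}: scans {len(scanned)} of {MODULUS} buckets, "
          f"speed-up about {MODULUS // len(scanned)}x")
```

With a modulus of 8, the exact match reads one bucket in eight, which is where the "approximately the modulus" speed-up comes from, while the wildcard query gains nothing from the grouping.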

For non-grouped tag fields, it is efficient to use a wildcard at either end of the value string to match.

LogScale also supports auto-grouping of tags using the configuration variables MAX_DISTINCT_TAG_VALUES (default is 1000) and TAG_HASHING_BUCKETS (default is 32). LogScale checks the number of distinct values for each key in each tag combination against MAX_DISTINCT_TAG_VALUES at regular intervals. If this threshold is exceeded, a new grouping rule is added with the modulus set to the value set in TAG_HASHING_BUCKETS, but only if there is no rule for that tag key. You can thus configure rules using the API above and decide the number of buckets there. This is preferable to auto-detecting, as the auto-detection works after the fact and thus leaves a large number of unused data sources that will need to get deleted by retention at some point. The auto-grouping support is meant as a safety measure to avoid suddenly creating many data sources by mistake for a single tag key.
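The auto-grouping decision described above can be sketched as follows. This is an illustrative model only: the function `auto_group`, the dictionary shapes, and the observed counts are hypothetical, while the two defaults come from the text.

```python
# Defaults of the two configuration variables, per the text above.
MAX_DISTINCT_TAG_VALUES = 1000
TAG_HASHING_BUCKETS = 32


def auto_group(distinct_counts: dict, existing_rules: dict) -> dict:
    """Return grouping rules (field -> modulus) after one periodic check."""
    rules = dict(existing_rules)
    for field, count in distinct_counts.items():
        # A rule is added only when the threshold is exceeded and the
        # tag key has no rule yet; existing rules are left untouched.
        if count > MAX_DISTINCT_TAG_VALUES and field not in rules:
            rules[field] = TAG_HASHING_BUCKETS
    return rules


# Hypothetical distinct-value counts and one rule configured through the API.
observed = {"host": 40, "client_ip": 25_000, "session": 5_000}
manual = {"session": 16}
print(auto_group(observed, manual))  # {'session': 16, 'client_ip': 32}
```

Here `session` keeps its manually chosen modulus of 16 even though it exceeds the threshold, which is why configuring rules through the API ahead of time lets you decide the number of buckets yourself.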

Mark Segment for Deletion

To mark a segment file for deletion, a DELETE command can be sent to the segment deletion endpoint.

Description	Mark a segment file for deletion.
Method	`DELETE /api/v1/repositories/repository/datasources/datasourceid/segments/segmentid`
Authentication Required	yes
Path Arguments	Description	Data type	Required?
`datasourceid`	The datasource ID number.	`string`	required
`repository`	The repository name	`string`	required
`segmentid`	Segment ID to delete for a datasource.	`string`	required
Return Codes
200	Request complete
400	Bad authentication
500	Request failed

http

DELETE /api/v1/repositories/$REPOSITORY_NAME/datasources/$DATASOURCEID/segments/$SEGMENTID

Mac OS or Linux (curl)

shell

curl -v -X DELETE $YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/datasources/$DATASOURCEID/segments/$SEGMENTID \
    -H "Authorization: Bearer $TOKEN" \
    -H "Content-Type: application/json"

Mac OS or Linux (curl) One-line

shell

curl -v -X DELETE $YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/datasources/$DATASOURCEID/segments/$SEGMENTID -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json"

Windows Cmd and curl

shell

curl -v -X DELETE $YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/datasources/$DATASOURCEID/segments/$SEGMENTID ^
    -H "Authorization: Bearer $TOKEN" ^
    -H "Content-Type: application/json"

Windows Powershell and curl

powershell

curl.exe -X DELETE `
    -H "Authorization: Bearer $TOKEN" `
    -H "Content-Type: application/json" `
    "$YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/datasources/$DATASOURCEID/segments/$SEGMENTID"

Perl

perl

#!/usr/bin/perl

use HTTP::Request;
use LWP;

my $TOKEN = "TOKEN";

my $uri = '$YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/datasources/$DATASOURCEID/segments/$SEGMENTID';

my $req = HTTP::Request->new("DELETE", $uri );

$req->header("Authorization" => "Bearer $TOKEN");
$req->header("Content-Type" => "application/json");


my $lwp = LWP::UserAgent->new;

my $result = $lwp->request( $req );

print $result->{"_content"},"\n";

Python

python

#! /usr/local/bin/python3

import requests

url = '$YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/datasources/$DATASOURCEID/segments/$SEGMENTID'

resp = requests.delete(url,
                     headers = {
   "Authorization" : "Bearer $TOKEN",
   "Content-Type" : "application/json"
}
)

print(resp.text)

Node.js

javascript

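const https = require('https');

const options = {
  method: 'DELETE',
  headers: {
    Authorization: 'Bearer ' + process.env.TOKEN,
  },
};

// The https module has no delete() helper, so use https.request() with the DELETE method.
let request = https.request('$YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/datasources/$DATASOURCEID/segments/$SEGMENTID', options, (res) => {
  if (res.statusCode !== 200) {
    console.error(`Error from server. Code: ${res.statusCode}`);
    res.resume();
    return;
  }

  let data = '';

  res.on('data', (chunk) => {
    data += chunk;
  });

  res.on('close', () => {
    console.log('Response:');
    // Print the raw response body.
    console.log(data);
  });
});

request.on('error', (error) => {
  console.error(error);
});

request.end();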

Doing this marks the segment for deletion, and eventually the metadata for the file is removed.

Important

This is not a typical scenario, but it may be required, for example, after losing files from the bucket that holds the files for the cluster.
