Manage Repositories in the Cluster

Resurrect Deleted Segments

This API endpoint lets you undo the deletion of recently deleted segments, in particular segments that were deleted because retention settings were lowered. The endpoint resets the "tombstone" on deleted segments internally, and restores all files that are still available in a bucket using Bucket Storage.

By default, LogScale keeps files in bucket storage for seven (7) days longer than the retention settings require. This means that extending retention by seven days and then using this API can restore approximately the most recent seven days' worth of deleted events.

If retention is lowered from the proper value to something very small, this grace period gives you up to seven days to revert the change to the retention settings and invoke this API endpoint before any events are lost. Invoking this endpoint requires root access.
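
As a rough illustration of the grace period described above, the following Python sketch computes the latest time at which a restore may still succeed. It assumes the default seven-day period and treats the moment retention was lowered as the start of the window; the function names are hypothetical and not part of LogScale.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the default seven-day grace period for files in bucket storage.
BUCKET_GRACE = timedelta(days=7)


def restore_deadline(retention_lowered_at, grace=BUCKET_GRACE):
    """Latest time at which the deleted segments may still be in bucket storage."""
    return retention_lowered_at + grace


def can_still_restore(retention_lowered_at, now, grace=BUCKET_GRACE):
    """True while the grace period has not yet expired."""
    return now <= restore_deadline(retention_lowered_at, grace)


lowered = datetime(2024, 3, 1, 12, 0, tzinfo=timezone.utc)
print(restore_deadline(lowered).isoformat())  # 2024-03-08T12:00:00+00:00
print(can_still_restore(lowered, datetime(2024, 3, 5, tzinfo=timezone.utc)))  # True
print(can_still_restore(lowered, datetime(2024, 3, 9, tzinfo=timezone.utc)))  # False
```

If the grace period has been configured differently on your cluster, pass that value as the grace argument.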

Description: Restore recently deleted segments.
Method: POST /api/v1/repositories/viewname/resurrect-deleted-segments
Request Data:
Authentication Required: yes
Path Arguments:
  viewname (string, required): The repository name
Return Codes:
  200: Request complete
  400: Bad authentication
  500: Request failed
Mac OS or Linux (curl)
shell
curl -v -X POST $YOUR_LOGSCALE_URL/api/v1/repositories/$VIEWNAME/resurrect-deleted-segments \
    -H "Authorization: Bearer $TOKEN" \
    -H "Content-Type: application/json"
Mac OS or Linux (curl) One-line
shell
curl -v -X POST $YOUR_LOGSCALE_URL/api/v1/repositories/$VIEWNAME/resurrect-deleted-segments -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json"
Windows Cmd and curl
shell
curl -v -X POST $YOUR_LOGSCALE_URL/api/v1/repositories/$VIEWNAME/resurrect-deleted-segments ^
    -H "Authorization: Bearer $TOKEN" ^
    -H "Content-Type: application/json"
Windows Powershell and curl
powershell
curl.exe -X POST `
    -H "Authorization: Bearer $TOKEN" `
    -H "Content-Type: application/json" `
    "$YOUR_LOGSCALE_URL/api/v1/repositories/$VIEWNAME/resurrect-deleted-segments"
Perl
perl
#!/usr/bin/perl

use HTTP::Request;
use LWP;
my $TOKEN = "TOKEN";
my $uri = '$YOUR_LOGSCALE_URL/api/v1/repositories/$VIEWNAME/resurrect-deleted-segments';
my $json = '';
my $req = HTTP::Request->new("POST", $uri );
$req->header("Authorization" => "Bearer $TOKEN");
$req->header("Content-Type" => "application/json");
$req->content( $json );
my $lwp = LWP::UserAgent->new;
my $result = $lwp->request( $req );
print $result->content,"\n";
Python
python
#! /usr/local/bin/python3

import requests

url = '$YOUR_LOGSCALE_URL/api/v1/repositories/$VIEWNAME/resurrect-deleted-segments'
mydata = r''

resp = requests.post(url,
                     data = mydata,
                     headers = {
   "Authorization" : "Bearer $TOKEN",
   "Content-Type" : "application/json"
}
)

print(resp.text)
Node.js
javascript
const https = require('https');

// Replace the placeholders with your LogScale URL and repository name.
const url = new URL('$YOUR_LOGSCALE_URL/api/v1/repositories/$VIEWNAME/resurrect-deleted-segments');

// This request is sent with an empty body.
const data = '';

const options = {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'Content-Length': Buffer.byteLength(data),
    Authorization: 'Bearer ' + process.env.TOKEN,
    'User-Agent': 'Node',
  },
};

const req = https.request(url, options, (res) => {
  let body = '';
  console.log(`statusCode: ${res.statusCode}`);

  res.on('data', (d) => {
    body += d;
  });
  res.on('end', () => {
    console.log(body);
  });
});

req.on('error', (error) => {
  console.error(error);
});

req.write(data);
req.end();

Manage Data Sources Limits

LogScale lets you control the default limit on the number of datasources for each repository. This is configured through the MAX_DATASOURCES environment variable.

Show Datasources Limit

Description: See the current default limit on the number of datasources.
Method: GET /api/v1/repositories/repository/max-datasources
Authentication Required: yes
Path Arguments:
  repository (string, required): The repository name
Return Codes:
  200: Request complete
  400: Bad authentication
  500: Request failed
Mac OS or Linux (curl)
shell
curl -v -X GET $YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/max-datasources \
    -H "Authorization: Bearer $TOKEN" \
    -H "Content-Type: application/json"
Mac OS or Linux (curl) One-line
shell
curl -v -X GET $YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/max-datasources -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json"
Windows Cmd and curl
shell
curl -v -X GET $YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/max-datasources ^
    -H "Authorization: Bearer $TOKEN" ^
    -H "Content-Type: application/json"
Windows Powershell and curl
powershell
curl.exe -X GET `
    -H "Authorization: Bearer $TOKEN" `
    -H "Content-Type: application/json" `
    "$YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/max-datasources"
Perl
perl
#!/usr/bin/perl

use HTTP::Request;
use LWP;
my $TOKEN = "TOKEN";
my $uri = '$YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/max-datasources';
my $req = HTTP::Request->new("GET", $uri );
$req->header("Authorization" => "Bearer $TOKEN");
$req->header("Content-Type" => "application/json");
my $lwp = LWP::UserAgent->new;
my $result = $lwp->request( $req );
print $result->content,"\n";
Python
python
#! /usr/local/bin/python3

import requests

url = '$YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/max-datasources'

resp = requests.get(url,
                     headers = {
   "Authorization" : "Bearer $TOKEN",
   "Content-Type" : "application/json"
}
)

print(resp.text)
Node.js
javascript
const https = require('https');

let request = https.get('$YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/max-datasources', {
  headers: { Authorization: 'Bearer ' + process.env.TOKEN },
}, (res) => {
  if (res.statusCode !== 200) {
    console.error(`Error from server. Code: ${res.statusCode}`);
    res.resume();
    return;
  }

  let data = '';

  res.on('data', (chunk) => {
    data += chunk;
  });

  res.on('close', () => {
    console.log('Response:');
    console.log(JSON.parse(data));
  });
});


Update Datasources Limit

The REST API endpoint max-datasources allows setting a new individual limit for the number of data sources on each repository.

Description: Set a new value for the maximum number of allowed datasources.
Method: POST /api/v1/repositories/repository/max-datasources/number
Request Data:
Authentication Required: yes
Path Arguments:
  number (integer, required): Number of allowed datasources.
  repository (string, required): The repository name
Return Codes:
  200: Request complete
  400: Bad authentication
  500: Request failed
Mac OS or Linux (curl)
shell
curl -v -X POST $YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/max-datasources \
    -H "Authorization: Bearer $TOKEN" \
    -H "Content-Type: application/json"
Mac OS or Linux (curl) One-line
shell
curl -v -X POST $YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/max-datasources -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json"
Windows Cmd and curl
shell
curl -v -X POST $YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/max-datasources ^
    -H "Authorization: Bearer $TOKEN" ^
    -H "Content-Type: application/json"
Windows Powershell and curl
powershell
curl.exe -X POST `
    -H "Authorization: Bearer $TOKEN" `
    -H "Content-Type: application/json" `
    "$YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/max-datasources"
Perl
perl
#!/usr/bin/perl

use HTTP::Request;
use LWP;
my $TOKEN = "TOKEN";
my $uri = '$YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/max-datasources';
my $json = '';
my $req = HTTP::Request->new("POST", $uri );
$req->header("Authorization" => "Bearer $TOKEN");
$req->header("Content-Type" => "application/json");
$req->content( $json );
my $lwp = LWP::UserAgent->new;
my $result = $lwp->request( $req );
print $result->content,"\n";
Python
python
#! /usr/local/bin/python3

import requests

url = '$YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/max-datasources'
mydata = r''

resp = requests.post(url,
                     data = mydata,
                     headers = {
   "Authorization" : "Bearer $TOKEN",
   "Content-Type" : "application/json"
}
)

print(resp.text)
Node.js
javascript
const https = require('https');

// Replace the placeholders with your LogScale URL and repository name.
const url = new URL('$YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/max-datasources');

// This request is sent with an empty body.
const data = '';

const options = {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'Content-Length': Buffer.byteLength(data),
    Authorization: 'Bearer ' + process.env.TOKEN,
    'User-Agent': 'Node',
  },
};

const req = https.request(url, options, (res) => {
  let body = '';
  console.log(`statusCode: ${res.statusCode}`);

  res.on('data', (d) => {
    body += d;
  });
  res.on('end', () => {
    console.log(body);
  });
});

req.on('error', (error) => {
  console.error(error);
});

req.write(data);
req.end();

Delete Datasources

This endpoint marks the datasource for deletion, which internally triggers deletion of all segments in the datasource.

Description: Marks the datasource for deletion, triggering deletion of all segments in the datasource.
Method: DELETE /api/v1/repositories/repository/datasources/datasourceid
Authentication Required: yes
Path Arguments:
  datasourceid (integer, required): The datasource ID number.
  repository (string, required): The repository name
Return Codes:
  200: Request complete
  400: Bad authentication
  500: Request failed
Mac OS or Linux (curl)
shell
curl -v -X DELETE $YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/datasources/$DATASOURCEID \
    -H "Authorization: Bearer $TOKEN" \
    -H "Content-Type: application/json"
Mac OS or Linux (curl) One-line
shell
curl -v -X DELETE $YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/datasources/$DATASOURCEID -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json"
Windows Cmd and curl
shell
curl -v -X DELETE $YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/datasources/$DATASOURCEID ^
    -H "Authorization: Bearer $TOKEN" ^
    -H "Content-Type: application/json"
Windows Powershell and curl
powershell
curl.exe -X DELETE `
    -H "Authorization: Bearer $TOKEN" `
    -H "Content-Type: application/json" `
    "$YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/datasources/$DATASOURCEID"
Perl
perl
#!/usr/bin/perl

use HTTP::Request;
use LWP;
my $TOKEN = "TOKEN";
my $uri = '$YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/datasources/$DATASOURCEID';
my $req = HTTP::Request->new("DELETE", $uri );
$req->header("Authorization" => "Bearer $TOKEN");
$req->header("Content-Type" => "application/json");
my $lwp = LWP::UserAgent->new;
my $result = $lwp->request( $req );
print $result->content,"\n";
Python
python
#! /usr/local/bin/python3

import requests

url = '$YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/datasources/$DATASOURCEID'

resp = requests.delete(url,
                     headers = {
   "Authorization" : "Bearer $TOKEN",
   "Content-Type" : "application/json"
}
)

print(resp.text)
Node.js
javascript
const https = require('https');

// The https module has no delete() helper, so use https.request() with the DELETE method.
let request = https.request('$YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/datasources/$DATASOURCEID', {
  method: 'DELETE',
  headers: { Authorization: 'Bearer ' + process.env.TOKEN },
}, (res) => {
  if (res.statusCode !== 200) {
    console.error(`Error from server. Code: ${res.statusCode}`);
    res.resume();
    return;
  }

  let data = '';

  res.on('data', (chunk) => {
    data += chunk;
  });

  res.on('close', () => {
    console.log('Response:');
    console.log(data);
  });
});

request.on('error', (error) => {
  console.error(error);
});

request.end();

Importing a Repository from Another LogScale Instance (BETA)

Removed: Beta feature removed

This feature is removed starting from LogScale version 1.79.0.

You can import users, dashboards, and segments files from another LogScale instance. You need to get a copy of the /data/humio-data/global-data-snapshot.json from the origin server.

You also need to copy the segments files that you want to import. These must be placed in the folder /data/humio-data/ready_for_import_dataspaces using the following structure: /data/humio-data/ready_for_import_dataspaces/dataspace_$ID

Copy the repository files to the server into a temporary folder first, and move the folder to its proper name only once the copy is complete. Note that the name of the directory uses the internal ID of the repository, which is the directory name in the source system.

The folder /data/humio-data/ready_for_import_dataspaces must be readable and writable by the humio user running the server, as the server moves the files to another directory and deletes the imported files when it is done with them, one at a time.

Example (note that you need both NAME and ID of the repository):

shell
$ NAME="target-repo-name"
$ SRC_NAME="source-repo-name"
$ ID="my-repository-id"
$ sudo mkdir /data/humio-data/ready_for_import_dataspaces
$ sudo mv /data/from-other/dataspace_$ID /data/humio-data/ready_for_import_dataspaces
$ sudo chown -R humio /data/humio-data/ready_for_import_dataspaces/
$ curl -XPOST \
     -H "Content-Type: application/json" \
     -H "Authorization: Bearer $TOKEN" \
     -d @from-other-global-data-snapshot.json \
     "$YOUR_LOGSCALE_URL/api/v1/importrepository/$NAME?importSegmentFilesOnly=true&importFromName=$SRC_NAME"

The POST imports the metadata, such as users and dashboards, and moves the repository folder from /data/humio-data/ready_for_import_dataspaces to /data/humio-data/import. A low-priority background task will then import the actual segments files from that point on.

You can start using the ingest tokens and other data that are not actual log events as soon as the POST has completed.

You can run the POST that starts the import of the same repository more than once. This is useful if you wish to import only a fraction of the data files at first but get all the metadata. When you rerun the POST, the metadata is inserted or updated again only if it no longer matches. The new repository files are copied at that point in time.

If you re-import the same segment files more than once, you get duplicate events in your target repository.

Note

We strongly recommend that you import to a new repository, at least until you have practiced this procedure. Having the newly imported data in a separate repository makes it easy to delete and try again, while deleting data from an existing repository will be very time consuming and error prone.

Configure Auto-Sharding for High-Volume Data Sources

A data source is ultimately bounded by the volume that one CPU thread can compress and write to the filesystem, typically in the 1-4 TB/day range. To handle more ingest traffic from a specific data source, you need to provide more variability in the set of tags. In some cases, however, it may not be possible or desirable to adjust the set of tags or tagged fields in the client. For these cases, LogScale supports adding a synthetic tag that is assigned a random number for each small bulk of events.

LogScale can detect high load on a data source and automatically trigger auto-sharding for it. You will see this happening on "fast" data sources, typically if more than 2 TB/day is delivered to a single data source. The events then get an extra tag, #humioAutoShard, that is assigned a random integer value.

This is configured through the setting AUTOSHARDING_TRIGGER_DELAY_MS, which is compared to the time an event spends in the ingest pipeline inside LogScale. When the delay threshold is exceeded, the number of shards on that data source (a combination of tags) is doubled. The default value for AUTOSHARDING_TRIGGER_DELAY_MS is 3,600,000 ms (3,600 seconds, or one hour). The delay must also be increasing, as observed two times in a row at an interval of AUTOSHARDING_CHECKINTERVAL_MS, which defaults to 20,000 ms (20 seconds).

The setting AUTOSHARDING_MAX controls how many different data sources get created this way for each "real" data source. The default value is 128. Internally, the number of cores and hosts reading from the ingest queue is also taken into consideration, aiming at not creating more shards than the total number of cores in the ingest part of the cluster.
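
The trigger rule described above can be sketched in Python as follows. This is only one reading of the text, not LogScale's implementation: it assumes that "increasing two times in a row" means the delay rose across the last three checks, it ignores the core-count consideration, and the function name is hypothetical.

```python
AUTOSHARDING_TRIGGER_DELAY_MS = 3_600_000  # default quoted in the text
AUTOSHARDING_MAX = 128                     # default quoted in the text


def next_shard_count(shards, delays_ms,
                     trigger_ms=AUTOSHARDING_TRIGGER_DELAY_MS,
                     max_shards=AUTOSHARDING_MAX):
    """Return the shard count after one check.

    delays_ms holds the ingest delay seen at recent checks, oldest first.
    """
    if len(delays_ms) < 3:
        return shards  # not enough history to see two increases in a row
    a, b, c = delays_ms[-3:]
    if c > trigger_ms and a < b < c:
        # Double, but never exceed the cap and never shrink an existing count.
        return max(shards, min(shards * 2, max_shards))
    return shards


print(next_shard_count(1, [3_500_000, 3_700_000, 3_900_000]))    # 2
print(next_shard_count(128, [3_500_000, 3_700_000, 3_900_000]))  # 128
print(next_shard_count(4, [3_900_000, 3_800_000, 3_700_000]))    # 4
```

In the last call the delay is above the threshold but falling, so the shard count is left unchanged.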

Configure Sticky Auto-Sharding for High-Volume Data Sources

In some use cases, it makes sense to disable the automatic tuning and manage these settings using the API. Set AUTOSHARDING_MAX to 1 to make the system never increase the number of autoshards of data sources, then use the API to set sticky autosharding settings on the selected data sources that require it. The sticky settings are not limited by the AUTOSHARDING_MAX configuration.

Show Autosharding Settings

Description: Show the autosharding settings for a datasource.
Method: GET /api/v1/repositories/repository/datasources/datasourceid/autosharding
Authentication Required: no
Path Arguments:
  datasourceid (integer, required): The datasource ID number.
  repository (string, required): The repository name
Return Codes:
  200: Request complete
  400: Bad authentication
  500: Request failed

To show the autosharding settings for a specific datasource, run:

Mac OS or Linux (curl)
shell
curl -v -X GET $YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/datasources/$DATASOURCEID/autosharding \
    -H "Authorization: Bearer $TOKEN" \
    -H "Content-Type: application/json"
Mac OS or Linux (curl) One-line
shell
curl -v -X GET $YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/datasources/$DATASOURCEID/autosharding -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json"
Windows Cmd and curl
shell
curl -v -X GET $YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/datasources/$DATASOURCEID/autosharding ^
    -H "Authorization: Bearer $TOKEN" ^
    -H "Content-Type: application/json"
Windows Powershell and curl
powershell
curl.exe -X GET `
    -H "Authorization: Bearer $TOKEN" `
    -H "Content-Type: application/json" `
    "$YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/datasources/$DATASOURCEID/autosharding"
Perl
perl
#!/usr/bin/perl

use HTTP::Request;
use LWP;
my $TOKEN = "TOKEN";
my $uri = '$YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/datasources/$DATASOURCEID/autosharding';
my $req = HTTP::Request->new("GET", $uri );
$req->header("Authorization" => "Bearer $TOKEN");
$req->header("Content-Type" => "application/json");
my $lwp = LWP::UserAgent->new;
my $result = $lwp->request( $req );
print $result->content,"\n";
Python
python
#! /usr/local/bin/python3

import requests

url = '$YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/datasources/$DATASOURCEID/autosharding'

resp = requests.get(url,
                     headers = {
   "Authorization" : "Bearer $TOKEN",
   "Content-Type" : "application/json"
}
)

print(resp.text)
Node.js
javascript
const https = require('https');

let request = https.get('$YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/datasources/$DATASOURCEID/autosharding', {
  headers: { Authorization: 'Bearer ' + process.env.TOKEN },
}, (res) => {
  if (res.statusCode !== 200) {
    console.error(`Error from server. Code: ${res.statusCode}`);
    res.resume();
    return;
  }

  let data = '';

  res.on('data', (chunk) => {
    data += chunk;
  });

  res.on('close', () => {
    console.log('Response:');
    console.log(JSON.parse(data));
  });
});


Update Autosharding

To update the autosharding settings for a specific datasource, run:

Description: Update the autosharding settings for a datasource.
Method: POST /api/v1/repositories/repository/datasources/datasourceid/autosharding/number
Request Data:
Authentication Required: no
Path Arguments:
  datasourceid (integer, required): The datasource ID number.
  number (integer, required): Number of autoshards for a datasource. Not limited by other configurations.
  repository (string, required): The repository name
Return Codes:
  200: Request complete
  400: Bad authentication
  500: Request failed
Mac OS or Linux (curl)
shell
curl -v -X POST $YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/datasources/$DATASOURCEID/autosharding \
    -H "Authorization: Bearer $TOKEN" \
    -H "Content-Type: application/json"
Mac OS or Linux (curl) One-line
shell
curl -v -X POST $YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/datasources/$DATASOURCEID/autosharding -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json"
Windows Cmd and curl
shell
curl -v -X POST $YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/datasources/$DATASOURCEID/autosharding ^
    -H "Authorization: Bearer $TOKEN" ^
    -H "Content-Type: application/json"
Windows Powershell and curl
powershell
curl.exe -X POST `
    -H "Authorization: Bearer $TOKEN" `
    -H "Content-Type: application/json" `
    "$YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/datasources/$DATASOURCEID/autosharding"
Perl
perl
#!/usr/bin/perl

use HTTP::Request;
use LWP;
my $TOKEN = "TOKEN";
my $uri = '$YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/datasources/$DATASOURCEID/autosharding';
my $json = '';
my $req = HTTP::Request->new("POST", $uri );
$req->header("Authorization" => "Bearer $TOKEN");
$req->header("Content-Type" => "application/json");
$req->content( $json );
my $lwp = LWP::UserAgent->new;
my $result = $lwp->request( $req );
print $result->content,"\n";
Python
python
#! /usr/local/bin/python3

import requests

url = '$YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/datasources/$DATASOURCEID/autosharding'
mydata = r''

resp = requests.post(url,
                     data = mydata,
                     headers = {
   "Authorization" : "Bearer $TOKEN",
   "Content-Type" : "application/json"
}
)

print(resp.text)
Node.js
javascript
const https = require('https');

// Replace the placeholders with your LogScale URL, repository name, and datasource ID.
const url = new URL('$YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/datasources/$DATASOURCEID/autosharding');

// This request is sent with an empty body.
const data = '';

const options = {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'Content-Length': Buffer.byteLength(data),
    Authorization: 'Bearer ' + process.env.TOKEN,
    'User-Agent': 'Node',
  },
};

const req = https.request(url, options, (res) => {
  let body = '';
  console.log(`statusCode: ${res.statusCode}`);

  res.on('data', (d) => {
    body += d;
  });
  res.on('end', () => {
    console.log(body);
  });
});

req.on('error', (error) => {
  console.error(error);
});

req.write(data);
req.end();

To update to a specific number of autoshards run the query as shown below:

shell
$ curl -XPOST -H "Authorization: Bearer $TOKEN" "$YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY_NAME/datasources/$DATASOURCEID/autosharding?number=7"
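
When scripting this call, it can help to build the URL in one place. The Python sketch below assembles the same URL shape as the curl command above, with the shard count passed as the number query parameter. The helper name and the example hostname are hypothetical.

```python
from urllib.parse import quote, urlencode


def autosharding_url(base_url, repository, datasource_id, number=None):
    """Build the autosharding URL for a datasource, optionally with ?number=N."""
    path = "/api/v1/repositories/{}/datasources/{}/autosharding".format(
        quote(str(repository), safe=""),      # escape characters unsafe in a path segment
        quote(str(datasource_id), safe=""),
    )
    url = base_url.rstrip("/") + path
    if number is not None:
        url += "?" + urlencode({"number": int(number)})
    return url


# Hypothetical host, repository, and datasource ID, for illustration only.
print(autosharding_url("https://logscale.example.com", "my-repo", 42, number=7))
# https://logscale.example.com/api/v1/repositories/my-repo/datasources/42/autosharding?number=7
```

Calling the helper without number gives the URL used by the show and delete requests.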

Delete Autosharding

Description: Delete the autosharding settings for a datasource.
Method: DELETE /api/v1/repositories/repository/datasources/datasourceid/autosharding
Authentication Required: no
Path Arguments:
  datasourceid (integer, required): The datasource ID number.
  repository (string, required): The repository name
Return Codes:
  200: Request complete
  400: Bad authentication
  500: Request failed

To delete the autosharding settings for a specific datasource, run:

Mac OS or Linux (curl)
shell
curl -v -X DELETE $YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/datasources/$DATASOURCEID/autosharding \
    -H "Authorization: Bearer $TOKEN" \
    -H "Content-Type: application/json"
Mac OS or Linux (curl) One-line
shell
curl -v -X DELETE $YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/datasources/$DATASOURCEID/autosharding -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json"
Windows Cmd and curl
shell
curl -v -X DELETE $YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/datasources/$DATASOURCEID/autosharding ^
    -H "Authorization: Bearer $TOKEN" ^
    -H "Content-Type: application/json"
Windows Powershell and curl
powershell
curl.exe -X DELETE `
    -H "Authorization: Bearer $TOKEN" `
    -H "Content-Type: application/json" `
    "$YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/datasources/$DATASOURCEID/autosharding"
Perl
perl
#!/usr/bin/perl

use HTTP::Request;
use LWP;
my $TOKEN = "TOKEN";
my $uri = '$YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/datasources/$DATASOURCEID/autosharding';
my $req = HTTP::Request->new("DELETE", $uri );
$req->header("Authorization" => "Bearer $TOKEN");
$req->header("Content-Type" => "application/json");
my $lwp = LWP::UserAgent->new;
my $result = $lwp->request( $req );
print $result->content,"\n";
Python
python
#! /usr/local/bin/python3

import requests

url = '$YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/datasources/$DATASOURCEID/autosharding'

resp = requests.delete(url,
                     headers = {
   "Authorization" : "Bearer $TOKEN",
   "Content-Type" : "application/json"
}
)

print(resp.text)
Node.js
javascript
const https = require('https');

// The https module has no delete() helper, so use https.request() with the DELETE method.
let request = https.request('$YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/datasources/$DATASOURCEID/autosharding', {
  method: 'DELETE',
  headers: { Authorization: 'Bearer ' + process.env.TOKEN },
}, (res) => {
  if (res.statusCode !== 200) {
    console.error(`Error from server. Code: ${res.statusCode}`);
    res.resume();
    return;
  }

  let data = '';

  res.on('data', (chunk) => {
    data += chunk;
  });

  res.on('close', () => {
    console.log('Response:');
    console.log(data);
  });
});

request.on('error', (error) => {
  console.error(error);
});

request.end();

Setup Grouping of Tags

Important

The GraphQL interface repository() is the preferred method for updating tag groupings.

Note

This is an advanced feature.

Tags are the fields with a prefix of #. They are used internally to do sharding of data into smaller streams. A data source is created for every unique combination of tag values set by the clients (such as log shippers). LogScale will reject ingested events once a certain number of datasources get created. The limit is currently 10,000 datasources per repository.
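
To make the relationship between tags and data sources concrete, the Python sketch below counts the data sources that a small batch of events would create, treating every unique combination of tag values as one data source. The event layout and the helper name are illustrative assumptions, not LogScale's internal representation.

```python
MAX_DATASOURCES = 10_000  # per-repository limit quoted in the text


def datasource_key(event_fields):
    """Tags are the fields prefixed with '#'; sort them so field order does not matter."""
    return tuple(sorted((k, v) for k, v in event_fields.items() if k.startswith("#")))


events = [
    {"#type": "accesslog", "#host": "web-1", "status": "200"},
    {"#type": "accesslog", "#host": "web-2", "status": "200"},
    {"#host": "web-1", "#type": "accesslog", "status": "500"},  # same tags as the first event
]

sources = {datasource_key(e) for e in events}
print(len(sources))                     # 2
print(len(sources) <= MAX_DATASOURCES)  # True
```

Only the tag fields contribute to the key, so the differing status values do not create additional data sources.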

Show Tag Grouping

Description: List repositories grouped by tags in the cluster.
Method: GET /api/v1/repositories/repository/taggrouping
Authentication Required: yes
Path Arguments:
  repository (string, required): The repository name
Return Codes:
  200: Request complete
  400: Bad authentication
  500: Request failed
http
GET    /api/v1/repositories/$REPOSITORY_NAME/taggrouping
Mac OS or Linux (curl)
shell
curl -v -X GET $YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/taggrouping \
    -H "Authorization: Bearer $TOKEN" \
    -H "Content-Type: application/json"
Mac OS or Linux (curl) One-line
shell
curl -v -X GET $YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/taggrouping -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json"
Windows Cmd and curl
shell
curl -v -X GET $YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/taggrouping ^
    -H "Authorization: Bearer $TOKEN" ^
    -H "Content-Type: application/json"
Windows Powershell and curl
powershell
curl.exe -X GET `
    -H "Authorization: Bearer $TOKEN" `
    -H "Content-Type: application/json" `
    "$YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/taggrouping"
Perl
perl
#!/usr/bin/perl

use HTTP::Request;
use LWP;
my $TOKEN = "TOKEN";
my $uri = '$YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/taggrouping';
my $req = HTTP::Request->new("GET", $uri );
$req->header("Authorization" => "Bearer $TOKEN");
$req->header("Content-Type" => "application/json");
my $lwp = LWP::UserAgent->new;
my $result = $lwp->request( $req );
print $result->content,"\n";
Python
python
#! /usr/local/bin/python3

import requests

url = '$YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/taggrouping'

resp = requests.get(url,
                     headers = {
   "Authorization" : "Bearer $TOKEN",
   "Content-Type" : "application/json"
}
)

print(resp.text)
Node.js
javascript
const https = require('https');

let request = https.get('$YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/taggrouping', {
  headers: { Authorization: 'Bearer ' + process.env.TOKEN },
}, (res) => {
  if (res.statusCode !== 200) {
    console.error(`Error from server. Code: ${res.statusCode}`);
    res.resume();
    return;
  }

  let data = '';

  res.on('data', (chunk) => {
    data += chunk;
  });

  res.on('close', () => {
    console.log('Response:');
    console.log(JSON.parse(data));
  });
});


LogScale recommends that you only use the parser as a tag in the field #type.

Using more tags may speed up queries on large data volumes, but only works on a bounded value-set for the tag fields. The speed-up only affects queries prefixed with #tag=value pairs that significantly filter out input events.

Update Tag Grouping

Note

If you are using a hosted LogScale instance while following this procedure, please contact support if you wish to add grouping rules to your repository.

Adding a new set of rules using POST replaces the current set. The previous sets are kept, and if a previous one matches, then the previous one is reused. The previous rules are kept in the system but may be deleted by LogScale once all data sources referring to them have been deleted (through retention settings).

Description: Apply new tag grouping rules to repositories in the cluster.
Method: POST /api/v1/repositories/repository/taggrouping
Request Data:
Authentication Required: yes
Path Arguments:
  repository (string, required): The repository name
Return Codes:
  200: Request complete
  400: Bad authentication
  500: Request failed
http
POST   /api/v1/repositories/$REPOSITORY_NAME/taggrouping

For some use cases, such as having the "client IP" from an access log as a tag, too many different tags will arise. For such a case, it is necessary to either stop having the field as a tag, or create a grouping rule on the tag field. Existing data is not rewritten when grouping rules are added or changed. Changing the grouping rules will thus in itself create more data sources.

Example: Setting the grouping rules for repository $REPOSITORY_NAME to hash the field #host into 8 buckets, and #client_ip into 10 buckets. Note how the field names do not include the # prefix in the rules.

Mac OS or Linux (curl)
shell
curl -v -X POST $YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/taggrouping \
    -H "Authorization: Bearer $TOKEN" \
    -H "Content-Type: application/json" \
    -d '[ {"field":"host","modulus": 8}, {"field":"client_ip","modulus": 10} ]'
Mac OS or Linux (curl) One-line
shell
curl -v -X POST $YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/taggrouping -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" -d '[ {"field":"host","modulus": 8}, {"field":"client_ip","modulus": 10} ]'
Windows Cmd and curl
shell
curl -v -X POST $YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/taggrouping ^
    -H "Authorization: Bearer $TOKEN" ^
    -H "Content-Type: application/json" ^
    -d "[ {\"field\":\"host\",\"modulus\": 8}, {\"field\":\"client_ip\",\"modulus\": 10} ]"
Windows Powershell and curl
powershell
curl.exe -X POST `
    -H "Authorization: Bearer $TOKEN" `
    -H "Content-Type: application/json" `
    -d '[ {"field":"host","modulus": 8}, {"field":"client_ip","modulus": 10} ]' `
    "$YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/taggrouping"
Perl
perl
#!/usr/bin/perl

use HTTP::Request;
use LWP;
my $TOKEN = "TOKEN";
my $uri = '$YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/taggrouping';
my $json = '[ {"field":"host","modulus": 8}, {"field":"client_ip","modulus": 10} ]';
my $req = HTTP::Request->new("POST", $uri );
$req->header("Authorization" => "Bearer $TOKEN");
$req->header("Content-Type" => "application/json");
$req->content( $json );
my $lwp = LWP::UserAgent->new;
my $result = $lwp->request( $req );
print $result->content,"\n";
Python
python
#! /usr/local/bin/python3

import requests

url = '$YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/taggrouping'
mydata = r'''[ {"field":"host","modulus": 8}, {"field":"client_ip","modulus": 10} ]'''

resp = requests.post(url,
                     data = mydata,
                     headers = {
   "Authorization" : "Bearer $TOKEN",
   "Content-Type" : "application/json"
}
)

print(resp.text)
Node.js
javascript
const https = require('https');

// The grouping rules are sent as a JSON array.
const data = JSON.stringify(
    [ {"field":"host","modulus": 8}, {"field":"client_ip","modulus": 10} ]
);

// Replace the placeholders with your LogScale URL and repository name.
const url = new URL('$YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/taggrouping');

const options = {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'Content-Length': Buffer.byteLength(data),
    Authorization: 'Bearer ' + process.env.TOKEN,
    'User-Agent': 'Node',
  },
};

const req = https.request(url, options, (res) => {
  let body = '';
  console.log(`statusCode: ${res.statusCode}`);

  res.on('data', (d) => {
    body += d;
  });
  res.on('end', () => {
    console.log(body);
  });
});

req.on('error', (error) => {
  console.error(error);
});

req.write(data);
req.end();

When you use a grouped tag in the query field with an exact match on its value, you can expect a speed-up roughly equal to the modulus compared to not including the tag in the query. If you use a wildcard (*) in the value for the grouped tag, the implementation currently scans all data sources that have a non-empty value for that field and filters the events so that only results matching the wildcard pattern are returned.

For non-grouped tag fields, it is efficient to use a wildcard at either end of the value string to match.
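
The effect of the modulus can be illustrated with a small sketch. It is not LogScale's implementation: the hash function LogScale uses is not described here, so CRC32 stands in for it, and the helper names `group_for` and `groups_to_scan` are hypothetical. The sketch only shows why an exact match on a grouped tag needs to read one group, while a wildcard has to read all of them.

```python
import zlib


def group_for(value: str, modulus: int) -> int:
    """Map a tag value to one of `modulus` groups (CRC32 is an assumed stand-in hash)."""
    return zlib.crc32(value.encode("utf-8")) % modulus


def groups_to_scan(query_value: str, modulus: int) -> set[int]:
    """Return the group numbers a query on this tag value would have to read."""
    if "*" in query_value:
        # A wildcard could match a value stored in any group, so none can be skipped.
        return set(range(modulus))
    # An exact value can only live in the single group its hash selects.
    return {group_for(query_value, modulus)}


modulus = 8
hosts = [f"host-{i}" for i in range(1000)]
used_groups = {group_for(h, modulus) for h in hosts}

print(len(used_groups))                         # never more than the modulus
print(len(groups_to_scan("host-42", modulus)))  # 1
print(len(groups_to_scan("host-*", modulus)))   # 8
```

With a modulus of 8, the 1000 distinct host values share at most 8 groups, and an exact-match query reads only one of them, which is where the speed-up of roughly the modulus comes from.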

LogScale also supports auto-grouping of tags using the configuration variables MAX_DISTINCT_TAG_VALUES (default is 1000) and TAG_HASHING_BUCKETS (default is 32). LogScale checks the number of distinct values for each key in each tag combination against MAX_DISTINCT_TAG_VALUES at regular intervals. If this threshold is exceeded, a new grouping rule is added with the modulus set to the value set in TAG_HASHING_BUCKETS, but only if there is no rule for that tag key. You can thus configure rules using the API above and decide the number of buckets there. This is preferable to auto-detecting, as the auto-detection works after the fact and thus leaves a large number of unused data sources that will need to get deleted by retention at some point. The auto-grouping support is meant as a safety measure to avoid suddenly creating many data sources by mistake for a single tag key.
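
The auto-grouping decision described above can be sketched as follows. This is a simplified model under stated assumptions, not LogScale code: the function name `auto_group` and the dictionary shapes are hypothetical, and only the two defaults are taken from the text.

```python
MAX_DISTINCT_TAG_VALUES = 1000  # default stated above
TAG_HASHING_BUCKETS = 32        # default stated above


def auto_group(distinct_counts: dict[str, int], rules: dict[str, int]) -> dict[str, int]:
    """Return the grouping rules after one periodic check (hypothetical model).

    distinct_counts maps a tag key to its observed number of distinct values;
    rules maps a tag key to the modulus of its existing grouping rule.
    """
    updated = dict(rules)
    for key, count in distinct_counts.items():
        # Add a rule only when the threshold is exceeded and no rule exists yet.
        if count > MAX_DISTINCT_TAG_VALUES and key not in updated:
            updated[key] = TAG_HASHING_BUCKETS
    return updated


rules = {"host": 8}  # configured in advance through the API
counts = {"host": 5000, "client_ip": 20000, "type": 12}
print(auto_group(counts, rules))
# {'host': 8, 'client_ip': 32}
```

The `host` rule set through the API keeps its modulus of 8, while `client_ip` only gets a rule after it has already exceeded the threshold, which is why configuring rules in advance is preferable.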

Delete Segment from Metadata

When a segment file has been deleted or lost and no copies of it remain, the record of the file needs to be removed from LogScale's metadata.

To delete the segment file metadata, send a DELETE request to the segment deletion endpoint.

Description Delete a segment file from metadata for a datasource.
Method DELETE /api/v1/repositories/repository/datasources/datasourceid/segments/segmentid
Authentication Required yes
Path Arguments Description Data type Required?
datasourceid The datasource ID number. integer required
repository The repository name. string required
segmentid Segment ID to delete for a datasource. integer required
Return Codes
200 Request complete
400 Bad authentication
500 Request failed
http
/api/v1/repositories/$REPOSITORY_NAME/datasources/$DATASOURCEID/segments/$SEGMENTID
Mac OS or Linux (curl)
shell
curl -v -X DELETE $YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/datasources/$DATASOURCEID/segments/$SEGMENTID \
    -H "Authorization: Bearer $TOKEN" \
    -H "Content-Type: application/json"
Mac OS or Linux (curl) One-line
shell
curl -v -X DELETE $YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/datasources/$DATASOURCEID/segments/$SEGMENTID -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json"
Windows Cmd and curl
shell
curl -v -X DELETE $YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/datasources/$DATASOURCEID/segments/$SEGMENTID ^
    -H "Authorization: Bearer $TOKEN" ^
    -H "Content-Type: application/json"
Windows Powershell and curl
powershell
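curl.exe -X DELETE `
    -H "Authorization: Bearer $TOKEN" `
    -H "Content-Type: application/json" `
    "$YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/datasources/$DATASOURCEID/segments/$SEGMENTID"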
Perl
perl
#!/usr/bin/perl

use HTTP::Request;
use LWP;
my $TOKEN = "TOKEN";
my $uri = '$YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/datasources/$DATASOURCEID/segments/$SEGMENTID';
my $req = HTTP::Request->new("DELETE", $uri );
$req->header("Authorization" => "Bearer $TOKEN");
$req->header("Content-Type" => "application/json");
my $lwp = LWP::UserAgent->new;
my $result = $lwp->request( $req );
print $result->decoded_content, "\n";
Python
python
#! /usr/local/bin/python3

import requests

url = '$YOUR_LOGSCALE_URL/api/v1/repositories/$REPOSITORY/datasources/$DATASOURCEID/segments/$SEGMENTID'

resp = requests.delete(url,
                     headers = {
   "Authorization" : "Bearer $TOKEN",
   "Content-Type" : "application/json"
}
)

print(resp.text)
Node.js
javascript
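const https = require('https');

// https has no delete() helper, so use https.request() with method: 'DELETE'
const options = {
  hostname: '$YOUR_LOGSCALE_URL', // host name only, without https://
  path: '/api/v1/repositories/$REPOSITORY/datasources/$DATASOURCEID/segments/$SEGMENTID',
  port: 443,
  method: 'DELETE',
  headers: {
    Authorization: 'Bearer ' + process.env.TOKEN,
  },
};

const request = https.request(options, (res) => {
  if (res.statusCode !== 200) {
    console.error(`Error from server. Code: ${res.statusCode}`);
    res.resume();
    return;
  }

  let data = '';

  res.on('data', (chunk) => {
    data += chunk;
  });

  res.on('end', () => {
    console.log('Response:');
    console.log(data);
  });
});

request.on('error', (error) => {
  console.error(error);
});

request.end();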

Doing this removes the metadata for that file, so that the system stops raising an error about the missing data when running queries.

Important

This is not a typical scenario, but may be required after, e.g., losing files from the bucket trusted with the files for the cluster.
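
If several segments of one datasource were lost together, the endpoint has to be called once per segment. The following is a minimal sketch of preparing those calls using only the Python standard library. The helper name `build_delete_requests`, the host `logscale.example.com`, and the IDs are hypothetical placeholders, and the sketch deliberately builds the requests without sending them so that they can be reviewed first.

```python
from urllib.parse import quote
from urllib.request import Request


def build_delete_requests(base_url, repository, datasource_id, segment_ids, token):
    """Build (but do not send) one DELETE request per segment ID."""
    prepared = []
    for segment_id in segment_ids:
        url = (
            f"{base_url}/api/v1/repositories/{quote(repository, safe='')}"
            f"/datasources/{datasource_id}/segments/{segment_id}"
        )
        prepared.append(
            Request(url, method="DELETE",
                    headers={"Authorization": f"Bearer {token}"})
        )
    return prepared


# Hypothetical values for illustration only.
reqs = build_delete_requests(
    "https://logscale.example.com", "myrepo", 12, [101, 102], "TOKEN"
)
for r in reqs:
    print(r.get_method(), r.full_url)
# DELETE https://logscale.example.com/api/v1/repositories/myrepo/datasources/12/segments/101
# DELETE https://logscale.example.com/api/v1/repositories/myrepo/datasources/12/segments/102
```

After checking the printed URLs, each request could be sent with `urllib.request.urlopen(r)`. Because the deletion removes metadata, reviewing the list before sending is the safer order.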