How-To: Delete Data in Bulk
Last Updated: 2021-07-05
You may find that you need to delete or redact sensitive information in your stored data. If the information has already been parsed and stored, there are a few methods available: you can adjust the retention settings of LogScale so that old data is deleted automatically, or you can delete the data manually, either using GraphQL or from the command-line.
Solution: Adjust Retention

Figure 307. Data Retention
You can configure LogScale to expire old data, which causes the data to be removed automatically. This is done by adjusting data retention, and the adjustments are simple to make in the User Interface. Data can be retained based on compressed file sizes, uncompressed file sizes, or the age of the data.
Setting data retention is available when you have LogScale installed on-premise, on your own server or instance. It has not been available, though, on LogScale Cloud accounts. However, it's now a new beta feature that's available to only a few LogScale Cloud accounts.
Although you can set this when running LogScale locally, Cloud accounts are limited to the number of days requested when the account was established, unless you've asked Support to change it.
This feature allows you to set data retention to a maximum of 365 days (see Figure 1). You can change this value yourself, later. However, if you change it to a lower amount, older data will be deleted.

Figure 308. Organization Statistics
Please note that setting a long retention time means more usage of storage, and therefore will affect the amount you're charged for LogScale Cloud services. You may monitor your usage on the main page of the LogScale User Interface, in the bottom right corner. There you'll see your Organization Statistics, very much like what's shown and highlighted in Figure 2 here.
See the Data Retention documentation for more information.
Solution: Delete with GraphQL API
For a more targeted method of removing data, you can use the Redact Events API. This isn't as efficient as setting Data Retention, but it works well enough for one-time, manual deletions.
The Redact Events API is available through GraphQL. The GraphQL mutation is redactEvents.
To perform the deletion, log in to LogScale and open the GraphQL API Explorer associated with your LogScale instance. It can be found by clicking on the question mark icon near the top right of the User Interface; one of the choices in its pull-down menu opens the explorer.
The mutation accepts the following fields:
repositoryName
The name of the repository where the data you want to redact is stored.
start
The start timestamp of the data to be deleted.
end
The end timestamp of the data to be deleted.
query
An optional query to filter the data.
For example, you could enter the following mutation:
mutation {
  redactEvents(
    repositoryName: "apache"
    start: "2021-04-10T10:15:30.00Z"
    end: "2021-04-15T10:15:30.00Z"
    query: ""
    userMessage: "Testing"
  )
}
In this example, the mutation will delete all data between the specified timestamps in the specified repository. The query field is empty, so all data in that time range will be selected.
The userMessage field is optional; it's a message to record in the audit log for the action.
When the mutation is executed, its results will appear in the right panel of the API Explorer.
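If you run such deletions more than once, it can help to generate the mutation text rather than edit it by hand. The following is a minimal Python sketch, not an official client: the function name build_redact_mutation is hypothetical, the field names are taken from the example above, and it assumes that JSON string escaping is acceptable for the string values (JSON escapes are also valid GraphQL string escapes).

```python
import json


def build_redact_mutation(repository_name, start, end, query="", user_message=None):
    """Return a redactEvents mutation shaped like the example in this section.

    Hypothetical helper: only the field names come from the documentation.
    """
    fields = [
        ("repositoryName", repository_name),
        ("start", start),
        ("end", end),
        ("query", query),
    ]
    # userMessage is optional, so it is only emitted when one is supplied.
    if user_message is not None:
        fields.append(("userMessage", user_message))
    # json.dumps produces a double-quoted string with quotes and
    # backslashes escaped, so values cannot break out of the literal.
    body = "\n".join(f"    {name}: {json.dumps(value)}" for name, value in fields)
    return "mutation {\n  redactEvents(\n" + body + "\n  )\n}"
```

Calling build_redact_mutation("apache", "2021-04-10T10:15:30.00Z", "2021-04-15T10:15:30.00Z", user_message="Testing") yields text equivalent to the example mutation, which you can paste into the API Explorer.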
Solution: Redact from Command-Line
Instead of using GraphQL to redact manually, you can do the same from the command-line. The equivalent of the example above would look something like the following:
curl -v https://$HUMIO_URL/api/v1/repositories/$REPO_NAME/redactevents \
-X POST \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{"repositoryName": "Testeroo", "start": 1612219536721, "end": 1618820157526, "queryString": ""}'
In this example, besides adjusting the repository name, start and end times, and other parameters in the -d payload, you would replace the URL and token variables. First, though, notice that the dates and times are numeric UTC values: Unix epoch timestamps in milliseconds. You may use normal dates and times, formatted as shown in the previous section, but only with GraphQL. For deleting from the command-line, you have to use the numeric UTC values.
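Converting between the two timestamp styles by hand is error-prone. The sketch below shows one way to do it in Python, assuming the numeric values are Unix epoch milliseconds in UTC and that the date strings use the format from the GraphQL example, with or without fractional seconds. The function names to_epoch_ms and from_epoch_ms are hypothetical.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_ms(timestamp):
    """Convert a UTC string such as 2021-04-10T10:15:30.00Z to epoch milliseconds."""
    for fmt in ("%Y-%m-%dT%H:%M:%S.%fZ", "%Y-%m-%dT%H:%M:%SZ"):
        try:
            parsed = datetime.strptime(timestamp, fmt)
        except ValueError:
            continue  # try the next accepted format
        # Integer arithmetic on timedeltas avoids floating-point rounding.
        return (parsed.replace(tzinfo=timezone.utc) - EPOCH) // timedelta(milliseconds=1)
    raise ValueError(f"unsupported timestamp format: {timestamp!r}")


def from_epoch_ms(ms):
    """Convert epoch milliseconds back to a UTC string with millisecond precision."""
    moment = EPOCH + timedelta(milliseconds=ms)
    # %f prints six digits of microseconds; drop the last three to keep milliseconds.
    return moment.strftime("%Y-%m-%dT%H:%M:%S.%f")[:-3] + "Z"
```

Under these assumptions, the start value 1612219536721 used in the curl example corresponds to 2021-02-01T22:45:36.721Z.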

Figure 309. API Tokens
In the example here, replace $HUMIO_URL with either the URL to your own server, or the URL to the LogScale Cloud environment you're using. You would use https://cloud.humio.com:443/ for EU Cloud accounts. For US Cloud accounts, you would use https://cloud.us.humio.com:443/.
You would also replace the variable $TOKEN with the default API token for the repository. To find this token, go to the Settings tab in the LogScale User Interface. Click on API Tokens to see a list of your tokens (see Figure 5 here). You can copy the default one from the panel there and paste it into the example above, or create an environment variable by which you would access it.
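If you prefer a script to a one-off curl command, the same request can be sketched in Python using only the standard library. This is an illustration under stated assumptions rather than an official client: the function names are hypothetical, the endpoint path, headers, and payload keys mirror the curl example, base_url is assumed to be a full URL including the https:// scheme, and the same repository name is assumed for both the path and the payload.

```python
import json
import urllib.request


def build_redact_request(base_url, repo_name, token, start_ms, end_ms, query_string=""):
    """Build, but do not send, a POST request mirroring the curl example."""
    # Strip any trailing slash so the joined URL has no double slash.
    url = f"{base_url.rstrip('/')}/api/v1/repositories/{repo_name}/redactevents"
    payload = {
        "repositoryName": repo_name,
        "start": start_ms,  # epoch milliseconds
        "end": end_ms,      # epoch milliseconds
        "queryString": query_string,
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send_redact_request(request):
    """Send the request and return the HTTP status code."""
    with urllib.request.urlopen(request) as response:
        return response.status
```

Keeping the build step separate from the send step lets you inspect the URL and payload before anything is deleted. The token can be read from an environment variable, for example with os.environ["TOKEN"], rather than pasted into the script.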