How-To: Redacting Data from a Repository

There may be data that you don't want stored in a LogScale repository: not necessarily whole events, but specific text contained in events. For example, someone's password might have been inadvertently logged and stored in plain text in a repository. Or someone may have requested, under the European GDPR, that no information about them be kept.

The best practice regarding these situations is either not to send the data to LogScale, or to have LogScale not store the data. For the first preventive measure, you might configure your log shipper to filter out passwords and other sensitive data.
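As a rough illustration of the first measure, the sketch below masks password values in log lines before they reach a log shipper. It is not tied to any particular shipper: the function name `mask_passwords` and the `password=value` line format are assumptions made for this example, and it only matches the lowercase key `password`.

```shell
# Hypothetical pre-shipping filter (not part of LogScale or any shipper):
# replace the value of every password=... pair with a fixed mask.
mask_passwords() {
  # The value is taken to run until whitespace, an ampersand, or a double quote.
  sed -E 's/(password=)[^[:space:]&"]+/\1XXXXXX/g'
}

# Example: pass a log line through the filter before handing it to the shipper.
printf '%s\n' 'user=alice password=hunter2 action=login' | mask_passwords
```

Most log shippers have their own filtering or processing settings, which are usually a better place for this kind of rule than a separate script.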

For the second measure, you could configure the parser you assign to a datasource so as not to record specific data. You might configure a parser like so:

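logscale
parseJson()
| case {
    // Drop events flagged as sensitive.
    data=sensitive | dropEvent();
    // Mask the value of any password field.
    password=* | replace(regex="^.*$", with="XXXXXX", field=password);
    // Keep all other events; without a default clause they would be dropped.
    *;
  }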

These measures should help greatly to reduce the amount of sensitive data that is recorded. However, there may still be data that makes it through and is stored in a repository. For those, you'll have to redact the specific text.

Solution

You can't use the LogScale User Interface to delete text contained in an event entry in a repository. Instead, you'll have to do this from the command line, using the Redact Events API.

Below is an example of how to do this using the curl command:

shell
$ curl -v "https://$YOUR_LOGSCALE_URL/api/v1/repositories/$REPO_NAME/deleteevents" \
   -X POST \
   -H "Authorization: Bearer $TOKEN" \
   -H "Content-Type: application/json" \
   -d '{"queryString": "password=*", "startTime": 1551074900671, "endTime": 1612219536721}'

In this example, you would replace $YOUR_LOGSCALE_URL with either the URL of your own server or the URL of the LogScale Cloud environment you're using. See LogScale URLs & Endpoints.

You will also need to replace $REPO_NAME in the example above with the name of the repository from which you want to delete data.

Figure 13. API Tokens

The last variable you would replace is $TOKEN. Replace it with the default API token for the repository. To find this token, go to the Settings tab in the LogScale User Interface. Click on API Tokens to see a list of your tokens (see Figure 13). You can copy the default one from the panel there and paste it into the example above, or store it in an environment variable and reference that variable in the command.
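One way to handle the three variables is to export them once in the shell session and check that none is empty before running the curl command. The sketch below uses placeholder values, and `require_vars` is a hypothetical helper written for this example, not a LogScale tool.

```shell
# Placeholder values (assumptions); substitute your own.
export YOUR_LOGSCALE_URL="logscale.example.com"
export REPO_NAME="my-repo"
export TOKEN="paste-your-api-token-here"

# Hypothetical helper: return non-zero and name the first variable
# that is unset or empty. Variables must be exported for printenv to see them.
require_vars() {
  for name in "$@"; do
    if [ -z "$(printenv "$name")" ]; then
      echo "missing environment variable: $name" >&2
      return 1
    fi
  done
}

require_vars YOUR_LOGSCALE_URL REPO_NAME TOKEN && echo "ready to call the API"
```

Keeping the token in a variable avoids pasting it into the command itself, although the export line may still end up in your shell history.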

As for the rest of the example, you will need to adjust the last line, the one that starts with the data option (i.e., -d), which supplies the JSON body of the request. Change password=* to the query that should match the data to remove. Notice the wildcard (i.e., *): it matches any value of the password field. This will result in both the key and value being deleted. Be sure to change the start and end times, given in milliseconds since the Unix epoch, to the range of time over which to search the repository for that query string.
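The start and end times in the example are 13-digit values, that is, milliseconds since the Unix epoch. The sketch below shows one way to derive such values from readable timestamps and assemble the request body. The helper names `to_epoch_ms` and `build_payload` are made up for this example, and the conversion assumes GNU date (its -d option); BSD and macOS date use different flags.

```shell
# Convert an ISO 8601 UTC timestamp to milliseconds since the Unix epoch.
# Assumes GNU date, which accepts the timestamp through the -d option.
to_epoch_ms() {
  echo "$(( $(date -u -d "$1" +%s) * 1000 ))"
}

# Build the JSON body from a query string and a start and end timestamp.
# Assumes the query contains no double quotes or backslashes that would
# need JSON escaping.
build_payload() {
  printf '{"queryString": "%s", "startTime": %s, "endTime": %s}' \
    "$1" "$(to_epoch_ms "$2")" "$(to_epoch_ms "$3")"
}

build_payload 'password=*' '2019-02-25T06:08:20Z' '2021-02-01T22:45:36Z'
```

The printed JSON can then be passed to the curl command in place of the literal string, for instance as `-d "$(build_payload 'password=*' '2019-02-25T06:08:20Z' '2021-02-01T22:45:36Z')"`.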