Repositories & Views

Organize your data into Repositories and combine them using Views

Repositories in LogScale are where you store server logs and metrics. A repository is an organized collection of data with associated storage, which enables you to search and monitor your data. Often, there will be one physical repository per project or system, each with its own set of users, parsers, saved queries, and dashboards. However, this may vary based on data volume, user permissions, and other factors.

The following diagram provides an overview of the configuration flow to ingest data using LogScale:

graph LR;
  A["Install & Configure LogScale"] --> B
  B["Create a Repository"] --> C
  C["Configure Data Ingest"] --> D
  D["Parse & Filter Data"] --> E
  E["Enrich Data"] --> F
  F["Query Data"]
  style B fill:#A6A0D2

Figure 18. Flow


Creating & Configuring Repositories and Views

To start working with LogScale, you'll need to create and configure repositories to manage your data. With views, you can also provide alternative representations of server data, whether to limit access to specific users, to filter out events, or to enable searching across multiple repositories. For details on how to do that, see Create Repository or View and Repository and View Settings.

Enriching Event Text via Files

Although it may not be a common method, you can add text to events for search purposes using a CSV file. See the Lookup Files section for more information on this.

Deleting Repositories & Views

To save space or to remove data that is no longer needed, you can delete a repository and its related views. For details on how to do this, see Delete Repositories & Views.

If you're new to LogScale, you may want to look at the LogScale Training section of the LogScale Library. Start by reading Getting Started with LogScale. You might also read through the LogScale Overview pages, and bookmark them, as you may want to refer back to them until you're comfortable with the various concepts of LogScale.

Views

Views in LogScale allow you to group together specific events from one or more repositories. Views can also be used to limit which data some users can access. This is useful because, within a repository, you can't hide data from users or restrict them to specific data.

Filtering

By default, views contain all data from their connected repositories. This is not always what you want, which is why you can apply a filter to each connection.

A filter reduces or transforms the data from a connection before the final search result is produced.

A filter is a normal query expression, and you can use the same functions that you use when writing queries. Because the query prefix is a filter, you cannot use aggregate functions such as groupBy() or count() in it.

As an example of using a view with filters, consider a view configured with two connections, each with its own filter:

Repository    Filter
accesslogs    method=GET
analytics     loglevel=INFO

Now, if you run the following query within the view:

logscale
ip = 158.191.19.12 
| groupBy(url)

The filters work as a query prefix: for each repository in the view, the connection's filter is executed before the statements of the query run within the view. The result is:

  • In the accesslogs repository the query executed is:

    logscale
    method=GET
    | ip = 158.191.19.12

    This is equivalent to filtering for events where method equals GET and ip is 158.191.19.12.

  • In the analytics repository the query executed is:

    logscale
    loglevel=INFO
    | ip = 158.191.19.12

    This selects events where loglevel is INFO and the IP address matches.

  • Within the view, the events are then summarized by the url field, showing a count of the entries for each unique URL.

By using a view, we can find the URLs accessed by a specific IP address across both the raw access logs and the analytics data:

graph LR;
  A[Repo: accesslogs] -->|"method = GET | ip = 158.191.19.12"| B("View")
  C[Repo: analytics] -->|"loglevel = INFO | ip = 158.191.19.12"| B
  B -->|"groupBy(url)"| D{Client}
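
To make the order of operations concrete, the following Python sketch models the view example above. It is only an illustration of the semantics, not LogScale code or a LogScale API: the sample events, the `search_view` helper, and the URL values are hypothetical, and filters are modelled as plain predicates on a single event.

```python
from collections import Counter

# Hypothetical sample events; field names follow the example above.
REPOSITORIES = {
    "accesslogs": [
        {"method": "GET", "ip": "158.191.19.12", "url": "/home"},
        {"method": "POST", "ip": "158.191.19.12", "url": "/login"},
        {"method": "GET", "ip": "10.0.0.7", "url": "/home"},
    ],
    "analytics": [
        {"loglevel": "INFO", "ip": "158.191.19.12", "url": "/home"},
        {"loglevel": "DEBUG", "ip": "158.191.19.12", "url": "/cart"},
        {"loglevel": "INFO", "ip": "158.191.19.12", "url": "/cart"},
    ],
}

# One filter per connection, modelled as a predicate on a single event.
VIEW_CONNECTIONS = {
    "accesslogs": lambda e: e.get("method") == "GET",
    "analytics": lambda e: e.get("loglevel") == "INFO",
}


def search_view(connections, repositories, query_filter, group_field):
    """Apply each connection filter, then the user's filter, then count per group."""
    counts = Counter()
    for repo_name, connection_filter in connections.items():
        for event in repositories[repo_name]:
            # The connection filter acts as a prefix to the user's query.
            if connection_filter(event) and query_filter(event):
                counts[event[group_field]] += 1
    return dict(counts)


result = search_view(
    VIEW_CONNECTIONS,
    REPOSITORIES,
    query_filter=lambda e: e.get("ip") == "158.191.19.12",
    group_field="url",
)
print(result)
```

Events that fail a connection's filter (the POST request and the DEBUG entry in this sample data) never reach the aggregation step, whatever query the user writes within the view.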

One repository per Service

Say you have a micro-service setup and you store all logs from all applications in a single repository, let's call it acme-project. It can become cumbersome to examine logs from each individual service, and their log formats may be very diverse.

First you would have to filter your results down to only include logs from your target service and write something like:

logscale
#service=login-service 
| ...

And you would have to do it at the beginning of every single query. Instead you can create a specialized view for each service:

Log Type            Repository      Filter
Nginx Logs          acme-project    #service=nginx
PostgreSQL Logs     acme-project    #service=postgres
iOS App Analytics   acme-project    #service=app and eventType=analytics

In this example we create three views that all draw their data from a single repository.
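
As a rough illustration of what these views save you from typing, the Python sketch below builds the query you would otherwise write by hand against acme-project. The `VIEWS` mapping and the `effective_query` helper are hypothetical names introduced for this example and are not part of LogScale. The sketch assumes the view's filter is simply placed in front of the user's query as a prefix.

```python
# Hypothetical view definitions mirroring the table above:
# view name -> (repository, filter).
VIEWS = {
    "Nginx Logs": ("acme-project", "#service=nginx"),
    "PostgreSQL Logs": ("acme-project", "#service=postgres"),
    "iOS App Analytics": ("acme-project", "#service=app and eventType=analytics"),
}


def effective_query(view_name: str, user_query: str) -> str:
    """Return the query as it would read if the view's filter were typed by hand."""
    _repository, view_filter = VIEWS[view_name]
    return f"{view_filter} | {user_query}"


for name in VIEWS:
    print(f"{name}: {effective_query(name, 'count()')}")
```

With the views in place, a user searching Nginx Logs only writes `count()`, and the `#service=nginx` prefix is applied by the view.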

Restricting Access to Repository Subsets

Say your system produces logs in several regions, but some of the people who have search access should only be able to see logs for their respective region.

It is easy to select a subset of the logs by filtering the results before they reach the user, in this case limiting access to logs from Germany:

Repository    Filter
website       country = "DE"
db            ip.geo = "DE"

In this example, we're dealing with two repositories.