Using the join() Function

Using the `join()` Function

`join()` Syntax

The join() function's signature looks like this

logscale Syntax

join({subquery}, field=arg1, key=arg2)

The unnamed subquery argument, and the named field and key arguments, are required. The named repo, include, and mode parameters are optional. There are other options available. For more information, see join().

First, let's consider a join query against a single repository. In that scenario, there is no need to define the optional repo argument. The query will look like this

logscale Syntax

primary query
| join({subquery}, field=arg1, key=arg2)

To summarize the join() function operation:

The primary query is any valid LogScale query. This provides the first set of results you wish to work with.
The arguments you pass to the join() function define what results should be combined and how.
The subquery is any valid LogScale query. This provides the second set of results you wish to combine with those from your primary query.
The field parameter value (arg1) defines the field name in the primary query. This value will be used to match data in the subquery.
The key parameter's value (arg2) defines the field name in the subquery. This value will be used to match data in the primary query.

If the field names being used in field and key are identical, you can use the simplified syntax:

logscale Syntax

primary query
| field =~ join({subquery})

In the above example, the field argument would be the actual field name. For example, if both queries share a field named host that you wish to join on, you would use:

logscale Syntax

primary query
| host =~ join({subquery})

Using `start` and `end` Times

When running a join() query, the start and end times for the span of the subquery in the join can be be specified using the start and end parameters. This bounds the subquery within the time range given.

Using these two options, the parameters are relative to the start and end time of the primary query. For example, if the primary query is executed with a timespan of a the current day, setting start to 2h will be 2 hours before the start of the current day.

The following query, if run with the default time of 1 day until now (12:20 on 23rd October), will not show users not known about 5 hours before the query was executed:

logscale

type = "authentication"
| join({type != "authentication"},field=user,key=authuser,include=[port],start=24h,end=5h)
| groupBy(user,function=[max("@timestamp"),min("@timestamp")])
| formatTime(format="%m/%d/%Y %H:%M:%S %a", field=_max,as="maxtime")
| formatTime(format="%m/%d/%Y %H:%M:%S %a", field=_min,as="mintime")

This creates this output:

user	_max	_min	maxtime	mintime
bsmith	1729682282587	1729663215893	10/23/2024 11:18:02 Wed	10/23/2024 06:00:15 Wed
tjones	1729682282587	1729663215893	10/23/2024 11:18:02 Wed	10/23/2024 06:00:15 Wed

Removing the start and end parameters gives the full set:

user	_max	_min	maxtime	mintime
bsmith	1729682282587	1729663215893	10/23/2024 11:18:02 Wed	10/23/2024 06:00:15 Wed
tjones	1729682282587	1729663215893	10/23/2024 11:18:02 Wed	10/23/2024 06:00:15 Wed
tmcdonald	1729682282587	1729663215893	10/23/2024 11:18:02 Wed	10/23/2024 06:00:15 Wed

Joining Two Repositories

In addition to joining two queries against a single repository, the join() function can also be used to return a result set from more than two repositories. To do so, use the optional repo parameter to define the repository (as arg3) you want the subquery to run against.

logscale Syntax

primary query
| join({subquery}, field=arg1, key=arg2, repo=arg3)

Note

The user running the join() query must have permissions on both repositories.

Data Analysis Overview

LogScale User Interface

Repositories & Views

Parsing Data

Searching Data

Writing Queries

Query Language Syntax

Query Joins and Lookups

Query Functions

Dashboards & Widgets

Automation

Template Language

Keyboard Shortcuts