Joins two LogScale searches. When joining two searches,
you need to define the keys/fields that are used to match up
results. This is done using the
field=name
or
field=[name,name,...]
parameter. If you want to join on a single field name, you can
use the syntax:
fieldName =~ join(...)
to specify the field.
If the subquery has a different field that you want to match
against, then use the parameter
key=[name1,name2,...]
to designate the names of keys inside the subquery. The value of
keys defaults to the value of field.
join()
is a filter function which in the
default mode=inner
lets the events through that match on the join keys. If you
specify mode=left
then
events that do not match the join key(s) will also be let
through.
If you specify include=[field,
field, ...]
then those fields are extracted from the
result of the subquery, and added to matching events. For events
in the subquery that do not have one or more of the named
include fields, the output will be the empty string.
Using the parameter
max=N
(which defaults
to max=1
) you can specify how
many rows/events are picked up in the subquery. If a subquery
has multiple events with the same join key, then up to max rows
are emitted.
You can use the parameters
start
and
end
to specify an
alternative time interval for the query. The parameter view can
be used to direct the subquery to run in a different repository
or view, and the
live=true|false
parameter can be used to control if the subquery runs as a live
query. The defaults for all these parameters are inherited from
the primary query containing the
join(...)
usage.
The join()
function also has a concept of a
maximum size of the resultset of the inner query specified with
the limit=100000
parameter.
Warning
The join()
function does two passes over
the data and should not be used as part of a live query. The
two passes consist of the primary query and the subquery; as
two separate queries the sets of data on which they are
executed may be different leading to inconsistent results.
When used in a live query, the query will be run in a repeated mode instead, in which the server chooses the repetition interval based on the resources used by the function.
This can impact the liveness of the query, in that long-running repeated queries can be throttled, and thus be less live than expected.
Parameter | Type | Required | Default Value | Description |
---|---|---|---|---|
end | string | optional[a] | End of main query | Specifies either the timestamp relative to the main query's end (for example, end=2h will be two hours before the end of the main query) or an absolute timestamp in milliseconds since UTC. |
field | array of strings | required | Specifies which field in the event (log line) must match the given column value. | |
include | array of strings | optional[a] | none | Specifies columns to include from the subquery. |
key | array of strings | optional[a] | Specifies which fields of the subquery to join on. Defaults to the value of the field parameter. | |
limit | number | optional[a] | 100000 | Specifies the maximum number of rows in the subquery. |
Minimum | 1 | |||
Maximum | 200000 | |||
live | boolean | optional[a] | Same as main query | Control if the subquery runs as live or static query. |
max | integer | optional[a] | 1 | Maximum number of events found in subquery if several share join key. |
mode | string | optional[a] | inner | Specifies the mode (inner or left) of the join. |
Values | ||||
inner | Perform an inner join; return only results that match in both queries | |||
left | Perform a left join, all values from the parent query are included, matched to any corresponding events in subquery. | |||
query [b] | function | required | The subquery to execute producing the values to join with. | |
repo | string | optional[a] | Repo of main query | Specify which view/repo in which to perform the subquery. |
start | string | optional[a] | Start of main query | Specifies either the timestamp relative to the main query's end (for example, start=2h will be two hours before the end of the main query) or an absolute timestamp in milliseconds since UTC. |
view | string | optional[a] | View of main query | Specify which view/repo in which to perform the subquery. |
[a] Optional parameters use their default value unless explicitly set. |
Hide omitted argument names for this function
Omitted Argument NamesThe argument name for
query
can be omitted; the following forms of this function are equivalent:logscale Syntaxjoin("value",field=["value"])
and:
logscale Syntaxjoin(query="value",field=["value"])
These examples show basic structure only.
Hide negatable operation for this function
Negatable Function OperationThis function is negatable, implying the inverse of the result. For example:
logscale Syntax!join()
Or:
logscale Syntaxnot join()
For more information, see Negating the Result of Filter Functions.
Find some examples at join()
Syntax section.
join()
Examples
Click
next to an example below to get the full details.Filter For Items Not Part of Data Set Using !join()
Find the set difference using the join()
function with negation
Query
#repo=A session_id=*
| !join(query={#repo=B session_id=*}, field=session_id, key=session_id)
Introduction
In this example, the join()
function is used with a
negation to search and find all session IDs from data set A that are not
found in data set B.
![]() |
Example incoming data from repository A might look like this:
timestamp | session_id | user_name | action | status |
---|---|---|---|---|
2025-04-01T07:00:00Z | 123456 | john.doe | login | success |
2025-04-01T07:05:00Z | 123457 | jane.smith | download | success |
2025-04-01T07:10:00Z | 123458 | mike.jones | upload | failed |
2025-04-01T07:15:00Z | 123459 | sara.wilson | login | success |
2025-04-01T07:20:00Z | 123460 | bob.brown | logout | success |
Example incoming data from repository B might look like this:
timestamp | session_id | user_name | action | status |
---|---|---|---|---|
2025-04-01T07:00:00Z | 123456 | john.doe | login | success |
2025-04-01T07:05:00Z | 123457 | jane.smith | download | success |
2025-04-01T07:20:00Z | 123460 | bob.brown | logout | success |
Step-by-Step
Starting with the source repository events.
- logscale
#repo=A session_id=*
Filters for all events from repository
A
, that have a session_id field. - logscale
| !join(query={#repo=B session_id=*}, field=session_id, key=session_id)
Performs a negated join with repository
B
, and returns sessions that exist in repositoryA
but not in repositoryB
. The negation operator is used to make it an anti-join operation.LogScale recommends using the
defineTable()
function with!match()
instead of negated join. See example Filter For Items Not Part of Data Set UsingdefineTable()
Event Result set.
Summary and Results
The query is used to find the set difference between two repositories. This is, for example, useful for identifying sync issues or performing data consistency checks. Or just to make a cross-repository comparison.
For more information, see also Query Joins and Lookups
Sample output from the incoming example data:
timestamp | session_id | user_name | action | status |
---|---|---|---|---|
2025-04-01T07:10:00Z | 123458 | mike.jones | upload | failed |
2025-04-01T07:15:00Z | 123459 | sara.wilson | login | success |
Preview Content in a Lookup File With readFile()
and Filter With !join()
Preview content in a lookup file in the search portion of a repo
and filter for specific data with the !join()
function
Query
readFile("host_names.csv")
| !join(query={groupBy(host_name)}, field=host_name, key=host_name, include=[host_name, id])
Introduction
In this example, the readFile()
function is used to
look up a host_names.csv file, and then filter for host names that do
not send any logs.
Example incoming data might look like this:
|--------------------|
| host_name, host_id |
| DESKTOP-VSKPBK8, 1 |
| FINANCE, 2 |
| homer-xubuntu, 3 |
| logger, 4 |
| DESKTOP-1, 5 |
| DESKTOP-2, 6 |
| DESKTOP-3, 7 |
|--------------------|
Step-by-Step
Starting with the source repository events.
- logscale
readFile("host_names.csv")
Displays the content of the .csv file.
- logscale
| !join(query={groupBy(host_name)}, field=host_name, key=host_name, include=[host_name, id])
Filters for host names that do not send any logs.
Event Result set.
Summary and Results
The query is used to preview content in CSV Lookup Files, and then filter for host names that do not send any logs.
Sample output from the incoming example data:
host_id | host_name |
---|---|
5 | DESKTOP-1 |
6 | DESKTOP-2 |
7 | DESKTOP-3 |