Filter For Items Not Part of Data Set Using !join()
Find the set difference using the join()
function with negation
Query
#repo=A session_id=*
| !join(query={#repo=B session_id=*}, field=session_id, key=session_id)
Introduction
The join()
function can be used with a negation to
filter for items that are not part of a data set, for example, items in
data set A not found in data set B.
In this example, the join()
function is used with a
negation to search and find all session IDs from data set A that are not
found in data set B.
Example incoming data from repository A might look like this:
timestamp | session_id | user_name | action | status |
---|---|---|---|---|
2025-04-01T07:00:00Z | 123456 | john.doe | login | success |
2025-04-01T07:05:00Z | 123457 | jane.smith | download | success |
2025-04-01T07:10:00Z | 123458 | mike.jones | upload | failed |
2025-04-01T07:15:00Z | 123459 | sara.wilson | login | success |
2025-04-01T07:20:00Z | 123460 | bob.brown | logout | success |
Example incoming data from repository B might look like this:
timestamp | session_id | user_name | action | status |
---|---|---|---|---|
2025-04-01T07:00:00Z | 123456 | john.doe | login | success |
2025-04-01T07:05:00Z | 123457 | jane.smith | download | success |
2025-04-01T07:20:00Z | 123460 | bob.brown | logout | success |
Step-by-Step
Starting with the source repository events.
- flowchart LR; %%{init: {"flowchart": {"defaultRenderer": "elk"}} }%% repo{{Events}} 0["Expression"] 1["Expression"] result{{Result Set}} repo --> 0 0 --> 1 1 --> result style 0 fill:#ff0000,stroke-width:4px,stroke:#000;logscale
#repo=A session_id=*
Filters for all events from repository
A
, that have a session_id field. - flowchart LR; %%{init: {"flowchart": {"defaultRenderer": "elk"}} }%% repo{{Events}} 0["Expression"] 1["Expression"] result{{Result Set}} repo --> 0 0 --> 1 1 --> result style 1 fill:#ff0000,stroke-width:4px,stroke:#000;logscale
| !join(query={#repo=B session_id=*}, field=session_id, key=session_id)
Performs a negated join with repository
B
, and returns sessions that exist in repositoryA
but not in repositoryB
. The negation operator is used to make it an anti-join operation.LogScale recommends using the
defineTable()
function with!match()
instead of negated join. See example Filter For Items Not Part of Data Set UsingdefineTable()
Event Result set.
Summary and Results
The query is used to find the set difference between two repositories. This is, for example, useful for identifying sync issues or performing data consistency checks. Or just to make a cross-repository comparison.
For more information, see also Query Joins and Lookups
Sample output from the incoming example data:
timestamp | session_id | user_name | action | status |
---|---|---|---|---|
2025-04-01T07:10:00Z | 123458 | mike.jones | upload | failed |
2025-04-01T07:15:00Z | 123459 | sara.wilson | login | success |