The defineTable() function is available
from version 1.163
Executes a subquery that generates an in-memory, ad-hoc table
based on its results. The ad-hoc table can be joined with the
results of the primary query using the
match() function. For more information on
using Ad-hoc tables, see
Using Ad-hoc Tables.
defineTable() is the recommended
alternative to the join() function,
allowing for easier query writing of complex joins. For more
explanations on the benefits of using ad-hoc tables with
defineTable() instead of
join(), see
Ad-hoc Tables vs. join().
End of time interval of subquery: milliseconds since UNIX epoch or a timestamp relative to the primary query's end time using Relative Time Syntax. For example: if start=7d and the main query's end time is 2024-03-258T14:00:00, then the defineTable() subquery will start at 2024-03-18T14:00:00.
Start of time interval of subquery: milliseconds since UNIX epoch or a timestamp relative to the primary query's end time using Relative Time Syntax. For example: if start=7d and the main query's end time is 2024-03-25T14:00:00, then the defineTable() subquery will start at 2024-03-18T14:00:00.
The first query in the pipeline is the subquery used for table
definition. The second query in the pipeline is the primary
query that uses match().
The following example query combines information about the
ProcessRollUp2 and
NetworkListenIP4 to find
processes that have created listeners on a port.
Known Issue:Inconsistencies in timestamps and time zones v1.163-1.165
Up until v1.165, the following wrong behaviors have been
observed in defineTable() and have both
been fixed in v1.166:
The function did not use the ingest timestamp for time
range specification provided by the primary query, but
used the event timestamp instead. This issue only
affected queries where the primary query used ingest
timestamps.
The function incorrectly used UTC time zone for query
start and end timestamps, regardless of the primary
query's time zone. This issue only affected queries
where the primary query used a non-UTC time zone, and
either of the following:
the primary query's time interval used
calendar-based presets (like
calendar:2d, or now@week),
or:
the sub-query used any query function that uses the
time zone, for example
timeChart(),
bucket(), and any
time:* function.
In this example, the defineTable() function is used
with a !match() to search and find all session IDs
from data set A that are not found in data set B.
Example incoming data from repository A might look like this:
timestamp
session_id
user_name
action
status
2025-04-01T07:00:00Z
123456
john.doe
login
success
2025-04-01T07:05:00Z
123457
jane.smith
download
success
2025-04-01T07:10:00Z
123458
mike.jones
upload
failed
2025-04-01T07:15:00Z
123459
sara.wilson
login
success
2025-04-01T07:20:00Z
123460
bob.brown
logout
success
Example incoming data from repository B might look like this:
Generates an ad-hoc table from repository B named
session_ids and filters on all events in
repository B that contain the field
session_id.
logscale
#repo=Asession_id=*
Filters for all events from repository A, that
contain a session_id field.
logscale
|!match(table=session_ids, field=session_id)
Performs a negated match with repository B, and
returns sessions that exist in repository A but not
in repository B. The negation operator is used to
return non-matching results.
Event Result set.
Summary and Results
The query is used to find the set difference between two repositories.
This is, for example, useful for identifying sync issues or performing
data consistency checks. Or just to make a cross-repository comparison.
In this example, the defineTable() function is used
as a left join query to extract and combine information from two
different datasets.
The event set for the query is in one repository, but the event set for
each query is shown separately to identify the two sets of information.
The first event set is:
Similar to the inner join
example, defineTable() first creates a separate
table for organizations belonging to a different view, which is then
matched against a users' table as a nested-like join.
The event set for the query is in one repository, but the event set for
each query is shown separately to identify the two sets of information.
The first event set is:
Generates an ad-hoc table named
organization_table that has the fields
username and
orgId and includes users where
orgId field equals
1 from the
organizations view.
In this example, the defineTable() function is used
as a right join query to extract and combine information from two
different datasets.
The event set for the query is in one repository, but the event set for
each query is shown separately to identify the two sets of information.
The first event set is:
Defines a new ad-hoc table that uses all the fields
(username and
operation) in a table named
operations.
logscale
|readFile(users)
Reads the users ad-hoc table as events
using readFile().
logscale
|match(operations,field=username,strict=false)
Matches the events that have a matching operation from the
operations table with the
users table using the
username as the common field.
Events are not filtered if the events do not match, (implying a right
join), by using strict=false
logscale
|select([username,operation])
Selects the username and
operation fields to be displayed
from the event set.
Event Result set.
Summary and Results
The result will output two events:
username
operation
user1
createdFile
user2
no value
Note that in the event set all operations have been included even when
there is no match between the
username field, resulting in the
no value for
user2. If
strict=true had been used to
the match() function, then the event for
user2 would not have been outputted.
Perform an Inner Join Query to Combine Two Datasets
In this example, the defineTable() function is used
as an inner join query to extract and combine information from two
different datasets.
The event set for the query is in one repository, but the event set for
each query is shown separately to identify the two sets of information.
The first event set is:
In this example, the setTimeInterval() function is
used with the defineTable() function to define a
new time interval for the subqueries, before running this.
Generates an ad-hoc table named
ended_queries and computes the relative
time points to the primary query's time end time. This means that the
subquery will start at 7d+30min,
and will end at 1d+30min
In this example, the defineTable() is used to
extract and combine information from two different datasets, with the
mapping between username and
orgId stored in a CSV file. The
example file organizations.csv
has the following content:
CSV
username,orgId
user1,1
user2,1
user3,2
The event set for the query is in one repository, but the event set for
each query is shown separately to identify the two sets of information.
The first event set is:
Generates an ad-hoc table named
users_table that has the fields
username and
name and includes users where
orgId field equals
1. Then uses
match() to enrich rows with
orgId from
organizations.csv file.
logscale
|operation=createdFile
Filters on the field operation
for users who performed the action of creating a file by looking for the
value createdFile.
logscale
|match(table=users_table, field=username)
Joins the username field with
the users_table table, to filter out
users who are not from orgId=1
and to enrich with the users' names.
logscale
|select([username,name])
Selects the username and
name fields to be displayed from
the event set.