Collect and Group Events by Specified Field - Example 1
Collect and group events by specified field using collect() as part of a groupBy() operation
Query
groupBy(client_ip, function=session(maxpause=1m, collect([url])))
Introduction
The collect() function can be used to collect fields from multiple
events into one event as part of a groupBy() operation. The groupBy()
function groups events by one or more specified fields. By default it
counts the events in each group using the count() function; other
aggregate functions can be supplied through its function parameter.
In this example, the collect() function is used to collect the URLs
requested by each visitor, where a visitor's session is considered
ended after one minute of inactivity.
Step-by-Step
Starting with the source repository events.
- Flowchart: Events -> Aggregate -> Result Set (the Aggregate step is highlighted)
groupBy(client_ip, function=session(maxpause=1m, collect([url])))
Groups events by client_ip, splits each client's events into sessions that end after one minute of inactivity, and collects the URLs of each session into the url field. A count of the events is returned in a _count field.
Event Result set.
Summary and Results
The query is used to collect fields from multiple events into one event.
This query analyzes user behavior by grouping events into sessions for
each unique client IP address. It then collects all URLs accessed during
each session. Collecting should be used on smaller data sets to create a
list (or a set or map) when you actually need an explicit list object,
for example to pass it on to another API. This analysis is valuable for
understanding user engagement and for identifying potential security
issues based on unusual browsing patterns. Using collect() on larger
data sets may cause out-of-memory errors, because it returns the entire
data set.
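
To make the session-and-collect behavior described above concrete, the following is a minimal Python sketch of the same idea outside LogScale. It is an illustration only, not how LogScale implements the query. The function name collect_sessions, the sample IP addresses and the sample URLs are hypothetical. It also assumes that a new session starts when the gap between two events from the same client is strictly greater than the pause, and that each URL is kept once per session.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def collect_sessions(events, maxpause=timedelta(minutes=1)):
    """Group events by client_ip, split each group into sessions, collect URLs.

    `events` is an iterable of (timestamp, client_ip, url) tuples.
    Assumption: a gap strictly greater than `maxpause` starts a new session.
    Assumption: each URL is stored once per session.
    """
    # Step 1: group by client_ip (the groupBy() key).
    by_client = defaultdict(list)
    for ts, client_ip, url in events:
        by_client[client_ip].append((ts, url))

    sessions = []
    for client_ip, hits in by_client.items():
        hits.sort(key=lambda hit: hit[0])  # order each client's events by time
        urls, last_ts = [], None
        for ts, url in hits:
            # Step 2: close the current session after a long enough pause.
            if last_ts is not None and ts - last_ts > maxpause:
                sessions.append({"client_ip": client_ip, "url": urls})
                urls = []
            # Step 3: collect the URL into the current session.
            if url not in urls:
                urls.append(url)
            last_ts = ts
        sessions.append({"client_ip": client_ip, "url": urls})
    return sessions


# Hypothetical sample data: two clients, one of which returns after 5 minutes.
t0 = datetime(2024, 1, 1, 12, 0, 0)
events = [
    (t0, "192.0.2.1", "/index.html"),
    (t0 + timedelta(seconds=20), "192.0.2.1", "/login"),
    (t0 + timedelta(seconds=30), "192.0.2.2", "/index.html"),
    (t0 + timedelta(minutes=5), "192.0.2.1", "/account"),
]
result = collect_sessions(events)
for session in result:
    print(session)
```

Every collected URL is held in memory until the function returns, which mirrors the reason collecting is better suited to smaller data sets.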