The partition() function is available
from version 1.174.0.
The partition() function splits a sequence
of events into multiple partitions (subsequences) based on a
condition. It allows you to apply a sub-aggregation to each
partition separately, useful for grouping related events and
performing calculations within these groups, while keeping the
order.
[b] Optional parameters use their default value unless explicitly set.
Omitted Argument Names
The argument name for function can be omitted; the following forms of this function are equivalent:
logscale Syntax
partition("value",condition=false)
and:
logscale Syntax
partition(function="value",condition=false)
These examples show basic structure only.
Events can also be separated by group using the
groupBy() function. In that case, the
events will be sorted by group, whereas
partition() keeps the order of events and
splits the groups when the condition is true. An advantage of
using partition() is that it generally
has a lower memory footprint than
groupBy().
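The ordering difference can be illustrated outside LogScale. The following Python sketch is only a model of the behaviour described above, not LogScale code: the helper names `partition_counts` and `group_counts` and the fields `user` and `login` are hypothetical, and it assumes that a new partition starts at the event where the condition is true.

```python
from collections import Counter

def partition_counts(events, condition):
    """Count events per partition, opening a new partition at each event
    for which condition(event) is true. Event order is preserved."""
    counts = []
    for event in events:
        if condition(event) or not counts:
            counts.append(0)  # open a new partition
        counts[-1] += 1
    return counts

def group_counts(events, key):
    """Count events per key value. Groups come back sorted by key,
    so the original event order is lost."""
    return sorted(Counter(key(e) for e in events).items())

# Hypothetical events, already in the desired order.
events = [
    {"user": "x", "login": True},
    {"user": "y", "login": False},
    {"user": "x", "login": False},
    {"user": "y", "login": True},
    {"user": "x", "login": False},
]

by_partition = partition_counts(events, lambda e: e["login"])
by_group = group_counts(events, lambda e: e["user"])
print(by_partition)  # [3, 2]
print(by_group)      # [('x', 3), ('y', 2)]
```

The partitioned result follows the event sequence (three events, then two), whereas the grouped result is organised by key regardless of where the events occurred.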
Note
The partition() function must be
used after an aggregator function (for example,
head(),
sort(),
bucket(),
groupBy(), or
timeChart()) to ensure event ordering, as the
partition() function requires a
specific order to calculate cumulative values correctly.
Accumulations can be partitioned based on a condition, such as a change
in value. This is achieved by combining the three functions
partition(), neighbor() and
accumulate(). In this example, the combination of
the 3 sequence functions is used to count events within partitions
defined by changes in a key field.
Note that sequence functions must be used after an aggregator function
to ensure event ordering.
Example incoming data might look like this:
key
a
a
a
b
a
b
b
Step-by-Step
Starting with the source repository events.
logscale
head()
Selects the oldest events ordered by time.
logscale
|neighbor(key, prefix=prev)
Accesses the value in the field key from the
previous event.
logscale
|partition(accumulate(count()), condition=test(key != prev.key))
Splits the sequence of events based on the specified condition. A new
partition starts when the current key value is different from the
previous key value. Within each partition, accumulate() keeps a running
count of the events, and returns the results in a field named
_count.
Event Result set.
Summary and Results
The query is used to compute an accumulated count of events within
partitions based on a specific condition, in this example a change in
value of the field key.
Sample output from the incoming example data:
| key | _count | prev.key |
|-----|--------|----------|
| a | 1 | <no value> |
| a | 2 | a |
| a | 3 | a |
| b | 1 | a |
| a | 1 | b |
| b | 1 | a |
| b | 2 | b |
The query is useful for analyzing sequences of events, especially when
you want to count consecutive occurrences of a particular attribute
to identify patterns within your data.
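The sample output can be reproduced with a small Python model of the neighbor(), partition() and accumulate() steps described above. This is a sketch of the semantics only, not LogScale code; the helper names `neighbor` and `partition_running_count` are hypothetical stand-ins for the corresponding functions.

```python
def neighbor(events, field, prefix="prev"):
    """Copy `field` from the previous event into `<prefix>.<field>`.
    The first event has no predecessor, so it gets no such field."""
    out, previous = [], None
    for event in events:
        enriched = dict(event)
        if previous is not None:
            enriched[f"{prefix}.{field}"] = previous[field]
        out.append(enriched)
        previous = event
    return out

def partition_running_count(events, condition):
    """Running count of events that restarts at each event
    for which condition(event) is true."""
    out, count = [], 0
    for event in events:
        if condition(event):
            count = 0  # a new partition starts at this event
        count += 1
        out.append({**event, "_count": count})
    return out

keys = ["a", "a", "a", "b", "a", "b", "b"]
events = neighbor([{"key": k} for k in keys], "key")
result = partition_running_count(
    events,
    lambda e: "prev.key" in e and e["key"] != e["prev.key"],
)

counts = [e["_count"] for e in result]
prev_keys = [e.get("prev.key") for e in result]
print(counts)  # [1, 2, 3, 1, 1, 1, 2]
```

Each change in key restarts the count, which is why the single a between the two b runs receives a count of 1.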
Detect All Occurrences of Event A Before Event B
Detect all occurrences of event A before event B (brute force attack) using the partition() function combined with groupBy()
Query
logscale
head()
| groupBy(key, function=partition(condition=test(status=="success"), split="after",
    [{ status="failure" | count(as=failures) }, range(@timestamp, as=timespan), selectLast(status)]))
| failures>=3
| status="success"
Introduction
In this example, the partition() function is used
with the groupBy() function to detect all
occurrences of event A before event B (brute force attack).
The query detects instances where 3 or more failed attempts are
followed by a successful attempt for the same key. The query itself
applies no time window; the timespan field reports how long each
sequence lasted.
Note that the partition() function must be used
after an aggregator function to ensure event ordering. Also note that
the events must be sorted in order by timestamp to prevent errors when
running the query. It is possible to select any field to use as a
timestamp.
Example incoming data might look like this:
| @timestamp | key | status |
|------------|-----|--------|
| 1451606300200 | c | failure |
| 1451606300400 | c | failure |
| 1451606300600 | c | failure |
| 1451606301000 | a | failure |
| 1451606302000 | a | failure |
| 1451606302200 | a | failure |
| 1451606302300 | a | failure |
| 1451606302400 | b | failure |
| 1451606302500 | a | failure |
| 1451606302600 | a | success |
| 1451606303200 | b | failure |
| 1451606303300 | c | success |
| 1451606303400 | b | failure |
| 1451606304500 | a | failure |
| 1451606304600 | a | failure |
| 1451606304700 | a | failure |
| 1451606304800 | a | success |
Step-by-Step
Starting with the source repository events.
logscale
head()
Selects the oldest events ordered by time.
logscale
| groupBy(key, function=partition(condition=test(status=="success"), split="after",
    [{ status="failure" | count(as=failures) }, range(@timestamp, as=timespan), selectLast(status)]))
Groups the events by a specified key (for example, a user ID or IP
address). Within each group, the condition tests whether the field
status contains the value success, and split="after" ends the
partition after each successful event. Notice how the condition is
provided as a non-aggregate subquery.
For each partition, the subquery filters for the failed attempts
(events where the field status contains the value failure), counts
them, and returns the result in a field named failures. The range()
function calculates the timespan of the partition, and selectLast()
selects the status of the last event. Calculating the timespan of
each sequence is useful for analysis.
logscale
|failures>=3
Filters for partitions with 3 or more failures.
logscale
|status="success"
Filters for partitions containing the value success
in the status field.
Event Result set.
Summary and Results
The query is used to detect all occurrences of potential brute force
attack patterns. It looks for instances where there were 3 or more
failed attempts (event A) followed by a successful attempt (event B),
regardless of the time between failures. The timespan of each
sequence, from its first event to the final successful attempt, is
reported and can help identify brute force attacks.
Sample output from the incoming example data:
| key | failures | timespan | status |
|-----|----------|----------|--------|
| a | 5 | 1600 | success |
| a | 3 | 300 | success |
| c | 3 | 3100 | success |
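To make the logic easier to follow, the Python sketch below models the query on the example data: it groups events by key, splits each group after every success, and keeps partitions with at least 3 failures that end in a success. It is a model under those assumptions, not LogScale code; the function name `detect` and the tuple layout are hypothetical.

```python
from collections import defaultdict

def detect(events, min_failures=3):
    """events: (timestamp, key, status) tuples.
    Returns (key, failures, timespan, status) per matching partition."""
    by_key = defaultdict(list)
    for event in sorted(events):  # order by timestamp first
        by_key[event[1]].append(event)

    hits = []
    for key in sorted(by_key):
        # Split the key's events after each success (split="after").
        partitions, current = [], []
        for event in by_key[key]:
            current.append(event)
            if event[2] == "success":
                partitions.append(current)
                current = []
        if current:  # trailing events that never reached a success
            partitions.append(current)

        for part in partitions:
            failures = sum(1 for e in part if e[2] == "failure")
            timespan = part[-1][0] - part[0][0]  # newest minus oldest
            last_status = part[-1][2]
            if failures >= min_failures and last_status == "success":
                hits.append((key, failures, timespan, last_status))
    return hits

events = [
    (1451606300200, "c", "failure"), (1451606300400, "c", "failure"),
    (1451606300600, "c", "failure"), (1451606301000, "a", "failure"),
    (1451606302000, "a", "failure"), (1451606302200, "a", "failure"),
    (1451606302300, "a", "failure"), (1451606302400, "b", "failure"),
    (1451606302500, "a", "failure"), (1451606302600, "a", "success"),
    (1451606303200, "b", "failure"), (1451606303300, "c", "success"),
    (1451606303400, "b", "failure"), (1451606304500, "a", "failure"),
    (1451606304600, "a", "failure"), (1451606304700, "a", "failure"),
    (1451606304800, "a", "success"),
]

hits = detect(events)
for row in hits:
    print(row)
```

Key b has three failures but never succeeds, so its partition is dropped by the final status filter.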
Detect Event A Happening X Times Before Event B
Detect event A happening X times before event B (brute force attack) using the partition() function combined with groupBy()
Query
logscale
head()
| groupBy(key, function=partition(condition=test(status=="success"), split="after",
    [{ status="failure" | count(as=failures) }, range(@timestamp, as=timespan), max(@timestamp), selectLast(status)]))
| failures>=3
| status="success"
Introduction
In this example, the partition() function is used
with the groupBy() function to detect event A
happening X times before event B (brute force attack).
The query detects instances where 3 or more failed attempts are
followed by a successful attempt for the same key. The query itself
applies no time window; the timespan field reports how long each
sequence lasted.
Note that the partition() function must be used
after an aggregator function to ensure event ordering. Also note that
the events must be sorted in order by timestamp to prevent errors when
running the query. It is possible to select any field to use as a
timestamp.
Example incoming data might look like this:
| @timestamp | key | status |
|------------|-----|--------|
| 1451606300200 | c | failure |
| 1451606300400 | c | failure |
| 1451606300600 | c | failure |
| 1451606301000 | a | failure |
| 1451606302000 | a | failure |
| 1451606302200 | a | failure |
| 1451606302300 | a | failure |
| 1451606302400 | b | failure |
| 1451606302500 | a | failure |
| 1451606302600 | a | success |
| 1451606303200 | b | failure |
| 1451606303300 | c | success |
| 1451606303400 | b | failure |
| 1451606304500 | a | <no value> |
| 1451606304600 | c | failure |
| 1451606304700 | c | failure |
| 1451606304800 | c | failure |
Step-by-Step
Starting with the source repository events.
logscale
head()
Selects the oldest events ordered by time.
logscale
| groupBy(key, function=partition(condition=test(status=="success"), split="after",
    [{ status="failure" | count(as=failures) }, range(@timestamp, as=timespan), max(@timestamp), selectLast(status)]))
Groups the events by a specified key (for example, a user ID or IP
address), then splits the sequence of events after each successful
event (where the condition status=="success" is true).
For each partition, it counts the number of events where
status is failure and stores the count in the field
failures, finds the range of
timestamps in the partition, finds the newest timestamp, and selects the
last status to show whether the partition ended with a success.
logscale
|failures>=3
Filters for partitions that contained 3 or more failures.
logscale
|status="success"
Filters for partitions with the value success in
the status field to ensure that the final status
is a success.
Event Result set.
Summary and Results
The query is used to detect instances where there are 3 or more failed
attempts followed by a successful attempt. The query can be used to
detect a brute force attack where an attacker tries multiple times
before succeeding. Note that the effectiveness of this query depends on
the nature of your data and the typical patterns in your system.
Sample output from the incoming example data:
| key | failures | timespan | status |
|-----|----------|----------|--------|
| a | 5 | 1600 | success |
| a | 3 | 300 | success |
| c | 3 | 3100 | success |
Divide Data Into Separate Partitions
Divide data into separate partitions based on a specific condition using the partition() function
In this example, the partition() function is used
with the head() function to count the number of
events in each group, starting a new count whenever the field
splitHere is
true.
Note that the partition() function must be used
after an aggregator function to ensure event ordering.
Example incoming data might look like this:
splitHere
false
false
false
true
false
Step-by-Step
Starting with the source repository events.
logscale
head()
Selects the oldest events ordered by time.
logscale
|partition(count(),condition=test(splitHere))
Splits the events (creates a new subsequence) whenever the
splitHere field is
true and counts the number of events in each
partition (group), returning the results in a field named
_count.
For example, in the five sample events above, the fourth event has the field
splitHere set to
true, so two groups are created:
Group 1: 3 events (the events before the
"true" point)
Group 2: 2 events (the "true" event and the event after it)
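The effect of the split position on the group sizes can be modelled with a short Python sketch. This is not LogScale code: the helper `partition_counts` and its `split` argument are hypothetical, and the sketch assumes that no empty partition is produced when the very first event matches the condition.

```python
def partition_counts(flags, split="before"):
    """Count events per partition. `flags` holds the condition result for
    each event; `split` chooses whether the matching event starts a new
    partition ("before") or ends the current one ("after")."""
    counts, current = [], 0
    for flag in flags:
        if flag and split == "before" and current:
            counts.append(current)  # close the partition before this event
            current = 0
        current += 1
        if flag and split == "after":
            counts.append(current)  # close the partition after this event
            current = 0
    if current:
        counts.append(current)  # remaining events form the last partition
    return counts

# The splitHere values from the example data.
flags = [False, False, False, True, False]

before = partition_counts(flags)
after = partition_counts(flags, split="after")
print(before)  # [3, 2]
print(after)   # [4, 1]
```

Splitting before the matching event puts it at the start of the second group, while splitting after it keeps it at the end of the first group.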
Note that it is possible to split after the event, where the condition
is true: