accumulate()

The accumulate() function is available from version 1.174.0.

The accumulate() function applies an aggregation function cumulatively to a sequence of events. It is useful for calculating running totals, running averages, or other cumulative metrics over time or across a series of events.

For more information about sequence functions and combined usage, see Sequence Query Functions.

Parameter | Type | Required | Default Value | Description
current | enum | optional[a] | include | Controls whether to include the current event in the accumulation. Values: exclude (exclude the current event from the accumulation), include (include the current event in the accumulation).
function[b] | array of aggregate functions | required | | The aggregator function to accumulate (for example, sum(), avg(), count()). Only functions that output at most a single event are accepted.

[a] Optional parameters use their default value unless explicitly set.

[b] The parameter name function can be omitted.
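
To illustrate the current parameter, the following minimal sketch (assuming events carry a value field, with head() used first to ensure ordering) accumulates a running sum over only the events preceding the current one:

logscale
head()
| accumulate(function=sum(value), current=exclude)

With current=exclude, each event's running sum covers only earlier events rather than including its own value.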

Note

  • The accumulate() function must be used after an aggregator function (for example, head(), sort(), bucket(), groupBy(), timeChart()) to ensure event ordering, as the accumulate() function requires a specific order to calculate cumulative values correctly.

  • Only functions (for example, sum(), avg(), count()) that output a single event can be used in the sub-aggregation because the accumulate() function needs a single value to add to its running total for each event.
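
For instance, a running event count satisfies this single-output requirement, since count() emits exactly one value per invocation. A minimal sketch (after head() to ensure ordering):

logscale
head()
| accumulate(count())

Each event is annotated with the running number of events seen so far, returned in the _count field.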

accumulate() Examples

Calculate Running Average of Field Values

Calculate a running average of values in a dataset using the accumulate() function

Query
logscale
head()
| accumulate(avg(value))
Introduction

In this example, the accumulate() function is used with the avg() function to calculate a running average of the field value.

Note that the accumulate() function must be used after an aggregator function, in this example the head() function, to ensure event ordering.

Example incoming data might look like this:

key | value
a | 5
b | 6
c | 1
d | 2
Step-by-Step
  1. Starting with the source repository events.

  2. logscale
    head()

    Ensures that the events are ordered by time, selecting the oldest events.

  3. logscale
    | accumulate(avg(value))

    Computes the running average of all values, including the current one, using the accumulate() function with the avg() aggregator.

  4. Event Result set.

Summary and Results

The query calculates the running average of the value field, producing a moving average that updates as each new value arrives.

Sample output from the incoming example data:

_avg | key | value
5 | a | 5
5.5 | b | 6
4 | c | 1
3.5 | d | 2

Compute Cumulative Aggregation Across Buckets

Compute a cumulative aggregation across buckets using the accumulate() function with timeChart()

Query
logscale
timeChart(span=1000ms, function=sum(value))
| accumulate(sum(_sum, as=_accumulated_sum))
Introduction

In this example, the accumulate() function is used with timeChart() to accumulate values across time intervals.

Note that the accumulate() function must be used after an aggregator function to ensure event ordering.

Example incoming data might look like this:

@timestamp | key | value
1451606301001 | a | 5
1451606301500 | b | 6
1451606301701 | a | 1
1451606302001 | c | 2
1451606302201 | b | 6
Step-by-Step
  1. Starting with the source repository events.

  2. logscale
    timeChart(span=1000ms, function=sum(value))

    Groups data into 1-second buckets over a 4-second period, sums the value field for each bucket, and returns the results in a field named _sum. The result is displayed in a timechart.

  3. logscale
    | accumulate(sum(_sum, as=_accumulated_sum))

    Calculates a running total of the sums in the _sum field, and returns the results in a field named _accumulated_sum.

  4. Event Result set.

Summary and Results

The query is used to accumulate values across time buckets. It is useful for tracking cumulative metrics or identifying trends in the data.

Sample output from the incoming example data:

_bucket | _sum | _accumulated_sum
1451606300000 | 0 | 0
1451606301000 | 12 | 12
1451606302000 | 8 | 20
1451606303000 | 0 | 20

The timechart looks like this:

Timechart displaying accumulated aggregation across buckets

Compute Cumulative Aggregation For Specific Group

Compute a cumulative aggregation for a specific group using the accumulate() function with groupBy()

Query
logscale
head()
| groupBy(key, function = accumulate(sum(value)))
Introduction

In this example, to compute a cumulative aggregation for a specific group (for example, by user), the accumulate() function is used inside the groupBy() function.

Note that the accumulate() function must be used after an aggregator function to ensure event ordering.

Example incoming data might look like this:

key | value
a | 5
b | 6
a | 1
c | 2
b | 6
Step-by-Step
  1. Starting with the source repository events.

  2. logscale
    head()

    Selects the oldest events ordered by time.

  3. logscale
    | groupBy(key, function = accumulate(sum(value)))

    Groups the data by the key field and, within each group, accumulates the running sum of the value field, returning the results in a field named _sum.

  4. Event Result set.

Summary and Results

The query is used to compute a cumulative aggregation for a specific group, in this example using the value field.

Sample output from the incoming example data:

key | _sum | value
a | 5 | 5
a | 6 | 1
b | 6 | 6
b | 12 | 6
c | 2 | 2
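
The same pattern works with other single-output aggregators. As a hedged sketch, a per-group running average over the same key and value fields might look like this:

logscale
head()
| groupBy(key, function = accumulate(avg(value)))

Each group then carries its own running average, returned in the _avg field.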

Count Events Within Partitions Based on Condition

Count events within partitions based on a specific condition using the partition() function combined with neighbor() and accumulate()

Query
logscale
head()
| neighbor(key, prefix=prev)
| partition(accumulate(count()), condition=test(key != prev.key))
Introduction

Accumulations can be partitioned based on a condition, such as a change in value. This is achieved by combining the three functions partition(), neighbor(), and accumulate(). In this example, the combination of these three sequence functions is used to count events within partitions defined by changes in a key field.

Note that sequence functions must be used after an aggregator function to ensure event ordering.

Example incoming data might look like this:

key
a
a
a
b
a
b
b
Step-by-Step
  1. Starting with the source repository events.

  2. logscale
    head()

    Selects the oldest events ordered by time.

  3. logscale
    | neighbor(key, prefix=prev)

    Accesses the value in the field key from the previous event.

  4. logscale
    | partition(accumulate(count()), condition=test(key != prev.key))

    The partition() function splits the sequence of events based on the specified condition. A new partition starts when the current key value is different from the previous key value. Within each partition, it counts the number of events, and returns the results in a field named _count.

  5. Event Result set.

Summary and Results

The query is used to compute an accumulated count of events within partitions based on a specific condition, in this example change in value for the field key.

Sample output from the incoming example data:

key | _count | prev.key
a | 1 | <no value>
a | 2 | a
a | 3 | a
b | 1 | a
a | 1 | b
b | 1 | a
b | 2 | b

The query is useful for analyzing sequences of events, especially when you want to identify and count consecutive occurrences of a particular attribute in order to identify and analyze patterns or sequences within your data.

Detect Changes And Compute Differences Between Events - Example 2

Detect changes and compute differences between events using the neighbor() function combined with accumulate()

Query
logscale
head()
| neighbor(start, prefix=prev)
| duration := start - prev.start
| accumulate(sum(duration, as=accumulated_duration))
Introduction

In this example, the neighbor() function is used with accumulate() to calculate a running total of durations.

Note that the neighbor() function must be used after an aggregator function to ensure event ordering.

Example incoming data might look like this:

start
1100
1233
3002
4324
Step-by-Step
  1. Starting with the source repository events.

  2. logscale
    head()

    Selects the oldest events ordered by time.

  3. logscale
    | neighbor(start, prefix=prev)

    Retrieves the value of the start field from the preceding event and assigns it to the current event in a new field named prev.start.

  4. logscale
    | duration := start - prev.start

    Calculates the time difference between current and previous start values, and returns the results - the calculated difference - in a field named duration.

  5. logscale
    | accumulate(sum(duration, as=accumulated_duration))

    Calculates a running total sum of the values in the field duration and returns the results in a field named accumulated_duration. Each event will show its individual duration and the total accumulated duration up to that point.

  6. Event Result set.

Summary and Results

The query is used to calculate the time difference between consecutive events. This query pattern is useful when analyzing temporal data, providing insights into process efficiency, system performance over time, and similar metrics. In this example, the accumulated_duration field provides a value that can be compared against the duration field within each event.

Sample output from the incoming example data:

start | accumulated_duration | duration | prev.start
1100 | 0 | <no value> | <no value>
1233 | 133 | 133 | 1100
3002 | 1902 | 1769 | 1233
4324 | 3224 | 1322 | 3002

For example, in the results, the third event shows a large increase in duration against the accumulated_duration and the start time of the previous event (in prev.start). If analyzing execution times of a process, this could indicate a fault or delay compared to previous executions.