Compute Cumulative Aggregation For Specific Group

Compute a cumulative aggregation for a specific group using the accumulate() function with groupBy().

Query

logscale
head()
| groupBy(key, function = accumulate(sum(value)))

Introduction

The accumulate() function applies an aggregation function cumulatively to a sequence of events. It can be used to calculate running totals, running averages, or other cumulative metrics over time or across a series of events.
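To make "cumulative" concrete, the following Python sketch computes a running sum and a running average over a short list of numbers. It only illustrates the idea and is not how LogScale implements accumulate(); the list `values` and the variable names are assumptions chosen for the illustration.

```python
from itertools import accumulate

# Illustrative input values, in the order the events arrive (an assumption).
values = [5, 6, 1, 2, 6]

# Running sum: each element is the sum of all values seen so far.
running_sum = list(accumulate(values))

# Running average: each running sum divided by the number of values seen so far.
running_avg = [total / count for count, total in enumerate(running_sum, start=1)]

print(running_sum)  # [5, 11, 12, 14, 20]
print(running_avg)  # [5.0, 5.5, 4.0, 3.5, 4.0]
```

Each output element depends on every input up to that position, so the result is only meaningful when the order of the inputs is well defined.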

In this example, the accumulate() function is used inside the groupBy() function so that the cumulative aggregation is computed separately for each group (for example, per user).

Note that the accumulate() function must be used after an aggregator function, such as head() in this query, to ensure event ordering.

Example incoming data might look like this:

key  value
a    5
b    6
a    1
c    2
b    6

Step-by-Step

  1. Starting with the source repository events.

  2. [Diagram: Events -> Aggregate (highlighted) -> Aggregate -> Result Set]
    logscale
    head()

    Selects the oldest events ordered by time.

  3. [Diagram: Events -> Aggregate -> Aggregate (highlighted) -> Result Set]
    logscale
    | groupBy(key, function = accumulate(sum(value)))

    Groups the events by the key field and, within each group, accumulates the sum of the value field. The running total is returned in a field named _sum.

  4. Event Result set.

Summary and Results

The query computes a cumulative aggregation within each group; in this example, a running sum of the value field for each key.

Sample output from the incoming example data:

key  _sum  value
a    5     5
a    6     1
b    6     6
b    12    6
c    2     2
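The sample output can be checked by hand with a small Python emulation of a per-group running sum. This is a sketch of the semantics only, not LogScale's implementation: the function name `grouped_running_sum` is hypothetical, and emitting the groups in key order is an assumption made to match the sample output.

```python
from collections import defaultdict

# Example events in arrival order (the same key/value pairs as the sample data).
events = [
    {"key": "a", "value": 5},
    {"key": "b", "value": 6},
    {"key": "a", "value": 1},
    {"key": "c", "value": 2},
    {"key": "b", "value": 6},
]

def grouped_running_sum(events):
    """Return one row per event, with a per-key running total in "_sum"."""
    totals = defaultdict(int)         # running total seen so far for each key
    rows_by_key = defaultdict(list)   # output rows collected per key
    for event in events:
        key, value = event["key"], event["value"]
        totals[key] += value
        rows_by_key[key].append({"key": key, "_sum": totals[key], "value": value})
    # Assumption: groups are emitted in key order, as in the sample output.
    return [row for key in sorted(rows_by_key) for row in rows_by_key[key]]

rows = grouped_running_sum(events)
for row in rows:
    print(row["key"], row["_sum"], row["value"])
```

Each key keeps its own running total, so the second b event yields 12 (6 + 6) even though an a event and a c event arrive between the two b events.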