Compute Cumulative Aggregation For Specific Group
Compute a cumulative aggregation for a specific group using the accumulate()
function with groupBy()
Query
head()
| groupBy(key, function = accumulate(sum(value)))
Introduction
The accumulate() function can be used to calculate running totals, averages, or other cumulative metrics over time or across a series of events. The accumulate() function applies an aggregation function cumulatively to a sequence of events.
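To make "cumulative" concrete, here is a minimal Python sketch of a running sum. It is not LogScale code. It only models the behavior described above, using Python's standard itertools.accumulate on the value column of the example data and ignoring grouping.

```python
from itertools import accumulate

# Values from the example data, in event order (grouping ignored here).
values = [5, 6, 1, 2, 6]

# itertools.accumulate defaults to addition, so each output element is
# the sum of all input elements up to and including that position.
running = list(accumulate(values))

print(running)  # [5, 11, 12, 14, 20]
```

Each output element depends on the order of the input, which is why event ordering matters for a cumulative aggregation.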
In this example, to compute a cumulative aggregation for a specific group (for example, by user), the accumulate() function is used inside the groupBy() function.
Note that the accumulate() function must be used after an aggregator function (in this query, head()) to ensure event ordering.
Example incoming data might look like this:
key | value |
---|---|
a | 5 |
b | 6 |
a | 1 |
c | 2 |
b | 6 |
Step-by-Step
Starting with the source repository events.
[Flowchart: Events → Aggregate (highlighted) → Aggregate → Result Set]
head()
Selects the oldest events ordered by time.
[Flowchart: Events → Aggregate → Aggregate (highlighted) → Result Set]
| groupBy(key, function = accumulate(sum(value)))
Groups the events by the key field and, within each group, accumulates the sum of the value field, returning the running total in a field named _sum.
Event Result set.
Summary and Results
The query computes a cumulative aggregation per group: in this example, a running sum of the value field for each key.
Sample output from the incoming example data:
key | _sum | value |
---|---|---|
a | 5 | 5 |
a | 6 | 1 |
b | 6 | 6 |
b | 12 | 6 |
c | 2 | 2 |
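The per-group behavior can be modelled with a short Python sketch. This is an illustration only, not how LogScale implements groupBy() or accumulate(). The function name grouped_running_sum is hypothetical, and sorting the rows by key is an assumption made so that the result lines up with the sample output above.

```python
from collections import defaultdict

# Example incoming data, oldest event first (the order head() provides).
events = [
    {"key": "a", "value": 5},
    {"key": "b", "value": 6},
    {"key": "a", "value": 1},
    {"key": "c", "value": 2},
    {"key": "b", "value": 6},
]

def grouped_running_sum(events):
    """Hypothetical helper: running sum of 'value' kept separately per 'key'."""
    totals = defaultdict(int)  # one running total per key
    rows = []
    for event in events:
        totals[event["key"]] += event["value"]
        rows.append({
            "key": event["key"],
            "_sum": totals[event["key"]],
            "value": event["value"],
        })
    # Assumption: order rows by key for display. sorted() is stable, so
    # events keep their original order inside each group.
    return sorted(rows, key=lambda row: row["key"])

rows = grouped_running_sum(events)
for row in rows:
    print(row["key"], row["_sum"], row["value"])
```

The running total restarts for every key, so the second b event yields 12 (6 + 6) rather than continuing from the total for a.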