Group Events by Single Field Without Count

Group events by the status_code field using the groupBy() function with an explicitly empty function parameter

Query

[Flowchart: Events --> Aggregate --> Result Set]
logscale
groupBy(status_code, function=[])

Introduction

The groupBy() function can be used to group events based on unique values in specified fields, with an optional function parameter to specify aggregate calculations.

The groupBy() function can also run aggregate functions on each group. Each aggregate function returns its result in the field set by its field parameter; for example, the count() function writes to the _count field. If no aggregate function is given, groupBy() uses count(), since the most common use case is to count the events for each distinct value of a field.

To group events without counting them, pass an empty list as the aggregate function (function=[]), which overrides the default count().

In this example, groupBy() is used to group events by their status codes without calculating a count.

Example incoming data might look like this:

@timestamp     status_code  endpoint       response_time
1686837825000  200          /api/users     145
1686837826000  404          /api/products  89
1686837827000  200          /api/orders    167
1686837828000  500          /api/payment   890
1686837829000  200          /api/users     156
1686837830000  404          /api/items     78
1686837831000  200          /api/orders    178
1686837832000  500          /api/checkout  923
1686837833000  200          /api/products  134
1686837834000  404          /api/users     92

Step-by-Step

  1. Starting with the source repository events.

  2. [Flowchart: Events --> Aggregate (highlighted) --> Result Set]
    logscale
    groupBy(status_code, function=[])

    Groups events by unique values in the status_code field. The empty function array (function=[]) prevents automatic counting.

    This approach helps conserve memory while identifying the unique status codes in the events.

  3. Event Result set.

Summary and Results

The query is used to identify unique field values (in this case different status codes) while minimizing memory usage.

This query is useful, for example, to quickly discover unique values in large event sets and support initial data exploration before detailed analysis.

For other examples with groupBy(), see groupBy() Syntax Examples.

Sample output from the incoming example data:

status_code
200
404
500
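
To make the difference between the default count and the empty function list concrete, the following Python sketch emulates the grouping on the sample rows above. It is only an illustration of the result shape, not how LogScale implements groupBy(): the names EVENTS and group_by_status are hypothetical, and sorting the output by status code is an assumption of the sketch rather than documented query behavior.

```python
from collections import Counter

# Sample rows from the example input above:
# (timestamp, status_code, endpoint, response_time)
EVENTS = [
    (1686837825000, "200", "/api/users", 145),
    (1686837826000, "404", "/api/products", 89),
    (1686837827000, "200", "/api/orders", 167),
    (1686837828000, "500", "/api/payment", 890),
    (1686837829000, "200", "/api/users", 156),
    (1686837830000, "404", "/api/items", 78),
    (1686837831000, "200", "/api/orders", 178),
    (1686837832000, "500", "/api/checkout", 923),
    (1686837833000, "200", "/api/products", 134),
    (1686837834000, "404", "/api/users", 92),
]


def group_by_status(events, with_count=True):
    """Hypothetical helper that mimics the shape of the grouped result.

    with_count=True  resembles the default behavior (a _count per group).
    with_count=False resembles function=[] (group keys only).
    """
    if with_count:
        counts = Counter(status for _, status, _, _ in events)
        return [
            {"status_code": status, "_count": n}
            for status, n in sorted(counts.items())
        ]
    # No aggregate: only the set of distinct keys is tracked.
    unique = {status for _, status, _, _ in events}
    return [{"status_code": status} for status in sorted(unique)]


if __name__ == "__main__":
    print("status_code")
    for row in group_by_status(EVENTS, with_count=False):
        print(row["status_code"])
```

Calling group_by_status(EVENTS) instead returns the same three groups with an added _count value for each, which corresponds to what the query would produce if function=[] were left out.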