Group Events by Single Field

Basic grouping of events by the status_code field using the groupBy() function

Query

Query flow: Events -> Aggregate -> Result Set
logscale
groupBy(status_code)

Introduction

The groupBy() function can be used to group events based on unique values in specified fields, automatically providing a count of events for each group.

This is similar to the GROUP BY clause in SQL.

The groupBy() function can also execute aggregate functions on each group. Each aggregate function returns its result in its own output field, for example the _count field when count() is used. If no aggregate function is given, count() is used by default, since the most common use case is counting the events for each distinct value of a field.

To group without counting, pass an empty list as the aggregate function, for example groupBy(status_code, function=[]). The result then contains only the distinct values of the grouping field, with no _count field.
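The three behaviors described above (the default count, a different aggregate, and an empty list of aggregates) can be sketched outside LogScale in Python. This is only an emulation of the semantics, not how LogScale implements groupBy(). The helper names group_by and avg_response_time, the illustrative events, and the use of _avg as the output field name are assumptions made for this sketch. Rows are sorted here only to make the output deterministic.

```python
from collections import defaultdict
from statistics import mean

# Illustrative events; field names follow the example data in this article.
events = [
    {"status_code": "200", "response_time": 145},
    {"status_code": "404", "response_time": 89},
    {"status_code": "200", "response_time": 167},
]

def group_by(events, field, functions=None):
    """Bucket events by `field`, then apply each aggregate to every bucket.

    `functions` maps an output field name to a callable that takes the list
    of events in one bucket. None means "use the default count".
    """
    if functions is None:
        functions = {"_count": len}  # default: count events per group
    buckets = defaultdict(list)
    for event in events:
        buckets[event[field]].append(event)
    rows = []
    for key in sorted(buckets):  # sorted only for deterministic output
        row = {field: key}
        for out_field, fn in functions.items():
            row[out_field] = fn(buckets[key])
        rows.append(row)
    return rows

def avg_response_time(bucket):
    """Hypothetical aggregate: mean response_time of the events in a bucket."""
    return mean(e["response_time"] for e in bucket)

print(group_by(events, "status_code"))                               # default count
print(group_by(events, "status_code", {"_avg": avg_response_time}))  # other aggregate
print(group_by(events, "status_code", {}))                           # empty list: keys only
```

Passing an empty mapping plays the role of the empty aggregate list: each row carries only the grouping field.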

In this example, groupBy() is used to group events by their status codes to analyze the distribution of different response statuses.

Example incoming data might look like this:

@timestamp     status_code  endpoint       response_time
1686837825000  200          /api/users     145
1686837826000  404          /api/products  89
1686837827000  200          /api/orders    167
1686837828000  500          /api/payment   890
1686837829000  200          /api/users     156
1686837830000  404          /api/items     78
1686837831000  200          /api/orders    178
1686837832000  500          /api/checkout  923
1686837833000  200          /api/products  134
1686837834000  404          /api/users     92

Step-by-Step

  1. Start with the events in the source repository.

  2. Query flow (current step: Aggregate): Events -> Aggregate -> Result Set
    logscale
    groupBy(status_code)

    Groups events by unique values in the status_code field. When used without any aggregate functions, groupBy() automatically creates a field named _count showing the number of events for each unique value.

    It is the same as: groupBy(status_code, function=count())

  3. The result set is returned.

Summary and Results

The query is used to analyze the distribution of status codes across all events.

The _count field is automatically added to show the number of events in each group.

This query is useful, for example, to monitor system health, identify error patterns, or track the frequency of different response types in a service.

For other examples with groupBy(), see groupBy() Syntax Examples.

Sample output from the incoming example data:

status_code  _count
200          5
404          3
500          2
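The counts above can be checked against the example data. As a minimal sketch in Python (an emulation outside LogScale; the variable names events and counts are hypothetical), tallying the status_code values of the ten example events yields the same three groups:

```python
from collections import Counter

# status_code values of the ten example events, in order of arrival.
events = [200, 404, 200, 500, 200, 404, 200, 500, 200, 404]

# Count events per distinct status code, which is what groupBy(status_code)
# does by default.
counts = Counter(events)

for status_code in sorted(counts):
    print(status_code, counts[status_code])
```

The three counts add up to 10, the number of events in the example data.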