Sort Timestamps With groupBy()

Sorting fields based on aggregated field values

Query

Search Repository: humio

logscale
timestamp := formatTime(format="%H:%M")
| groupBy([thread],
function=[{sort("timestamp")
| collect("timestamp")}])

Introduction

When using aggregation, you may want to sort on a field that is part of the aggregated set but not the main feature of the aggregated value. For example, sorting the values by their timestamp rather than the embedded value. To achieve this, you should use a function that sorts the field to be used as the sort field, and then use collect() so that the value from before the aggregaion can be displayed in the generated event set. This query can be executed in the humio respository.

Step-by-Step

  1. Starting with the source repository events.

  2. flowchart LR; %%{init: {"flowchart": {"defaultRenderer": "elk"}} }%% repo{{Events}} 0[\Add Field/] 1{{Aggregate}} 2>Augment Data] result{{Result Set}} repo --> 0 0 --> 1 1 --> 2 2 --> result style 0 fill:#ff0000,stroke-width:4px,stroke:#000;
    logscale
    timestamp := formatTime(format="%H:%M")

    Creates a new field, timestamp formatted as HH:MM.

  3. flowchart LR; %%{init: {"flowchart": {"defaultRenderer": "elk"}} }%% repo{{Events}} 0[\Add Field/] 1{{Aggregate}} 2>Augment Data] result{{Result Set}} repo --> 0 0 --> 1 1 --> 2 2 --> result style 1 fill:#ff0000,stroke-width:4px,stroke:#000;
    logscale
    | groupBy([thread],

    Groups the events, first by the name of the thread and then the formatted timestamp.

  4. flowchart LR; %%{init: {"flowchart": {"defaultRenderer": "elk"}} }%% repo{{Events}} 0[\Add Field/] 1{{Aggregate}} 2>Augment Data] result{{Result Set}} repo --> 0 0 --> 1 1 --> 2 2 --> result style 2 fill:#ff0000,stroke-width:4px,stroke:#000;
    logscale
    function=[{sort("timestamp")
    | collect("timestamp")}])

    Uses the sort() combined with collect() as the method fo aggregation. As an embedded expression for the function, this will sort the events on the timestamp field and then retrieve the field as it would normally be removed as part of the aggregation process.

  5. Event Result set.

Summary and Results

The result set will contain a list of the aggregated thread names sorted by the timestamp:

threadtimestamp
BootstrapInfoJob10:09
DataSynchJob10:09
Global event loop10:10
LocalLivequeryMonitor10:09
LogCollectorManifestUpdate10:09
TransientChatter event loop10:10
aggregate-alert-job10:09
alert-job10:09
block-processing-monitor-job10:09
bloom-scheduler10:09
bucket-entity-config10:09
bucket-overcommit-metrics-job10:09
bucket-storage-download10:09
bucket-storage-prefetch10:09
chatter-runningqueries-logger10:09
chatter-runningqueries-stats10:09