Sort Timestamps With groupBy()
Sorting fields based on aggregated field values
Query
Search Repository: humio
timestamp := formatTime(format="%H:%M")
| groupBy([thread],
function=[{sort("timestamp")
| collect("timestamp")}])
Introduction
When using aggregation, you may want to sort on a field that is
part of the aggregated set but is not the main feature of the
aggregated value, for example, sorting the values by their
timestamp rather than the embedded value. To achieve this, use a
function that sorts on the required field, and then use collect()
so that the value from before the aggregation can be displayed in
the generated event set. This query can be executed in the humio
repository.
Step-by-Step
Starting with the source repository events.
timestamp := formatTime(format="%H:%M")
Creates a new field, timestamp, formatted as HH:MM.
| groupBy([thread],
Groups the events by the name of the thread. The formatted timestamp is not a grouping key; it is handled by the aggregate function supplied to groupBy().
function=[{sort("timestamp") | collect("timestamp")}])
Uses sort() combined with collect() as the method of aggregation. As an embedded expression for the function, this sorts the events on the timestamp field and then retrieves the field, which would otherwise be removed as part of the aggregation process.
Event Result set.
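The three steps above can be emulated outside LogScale to make the data flow easier to follow. The following is a minimal Python sketch, not LogScale code: the sample events, the `time` field name, and the `sort_and_collect` helper are hypothetical. It also assumes that sort() orders values in descending order by default and that collect() keeps each distinct value once, so treat it as an illustration of the idea rather than a specification of the functions.

```python
from collections import defaultdict
from datetime import datetime, timezone

# Hypothetical sample events; thread names are taken from the result table,
# the times are invented for illustration.
events = [
    {"thread": "alert-job", "time": datetime(2024, 1, 1, 10, 9, 30, tzinfo=timezone.utc)},
    {"thread": "alert-job", "time": datetime(2024, 1, 1, 10, 10, 5, tzinfo=timezone.utc)},
    {"thread": "alert-job", "time": datetime(2024, 1, 1, 10, 9, 50, tzinfo=timezone.utc)},
    {"thread": "bloom-scheduler", "time": datetime(2024, 1, 1, 10, 9, 10, tzinfo=timezone.utc)},
]


def sort_and_collect(events):
    """Emulate the query: add a formatted field, group by thread, sort, collect."""
    groups = defaultdict(list)
    for event in events:
        # Step 1: add the formatted field, analogous to formatTime(format="%H:%M").
        formatted = event["time"].strftime("%H:%M")
        # Step 2: bucket the formatted value by thread, analogous to groupBy([thread], ...).
        groups[event["thread"]].append(formatted)

    result = {}
    for thread, stamps in groups.items():
        # Step 3: sort inside each group (assumed descending), then keep each
        # distinct value once in that order (assumed collect() behaviour).
        ordered = sorted(stamps, reverse=True)
        result[thread] = list(dict.fromkeys(ordered))
    return result


print(sort_and_collect(events))
# {'alert-job': ['10:10', '10:09'], 'bloom-scheduler': ['10:09']}
```

The point of the sketch is that the grouping key is only the thread, while the timestamp survives the aggregation because it is explicitly collected after sorting.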
Summary and Results
The result set contains a list of the aggregated thread names, each with its timestamp values sorted within the group:
thread | timestamp |
---|---|
BootstrapInfoJob | 10:09 |
DataSynchJob | 10:09 |
Global event loop | 10:10 |
LocalLivequeryMonitor | 10:09 |
LogCollectorManifestUpdate | 10:09 |
TransientChatter event loop | 10:10 |
aggregate-alert-job | 10:09 |
alert-job | 10:09 |
block-processing-monitor-job | 10:09 |
bloom-scheduler | 10:09 |
bucket-entity-config | 10:09 |
bucket-overcommit-metrics-job | 10:09 |
bucket-storage-download | 10:09 |
bucket-storage-prefetch | 10:09 |
chatter-runningqueries-logger | 10:09 |
chatter-runningqueries-stats | 10:09 |