Collects fields from multiple events into one event. When used as part of a groupBy() operation, collect() has a limit of 1Kb per key, which restricts the number of values that can be collected during the aggregation.

Parameter   Type              Required      Default   Description
fields[a]   Array of strings  required                Names of the fields to keep.
limit       integer           optional[b]   2000      Limit on the number of distinct values to collect.
multival    boolean           optional[b]   true      Collects the resulting value as multivalue (a single field value using separator).
separator   string            optional[b]   \n        Separator used for multiple values.

[a] The argument name fields can be omitted.

[b] Optional parameters use their default value unless explicitly set.
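
As a minimal sketch of footnote [a], the two calls below should be equivalent: the first omits the fields argument name and the second states it explicitly. The field name url is only a placeholder borrowed from the examples further down.

```logscale
// Argument name omitted: the first unnamed argument is taken as fields
groupBy(client_ip, function=collect([url]))
```

```logscale
// Same aggregation with the argument name written out
groupBy(client_ip, function=collect(fields=[url]))
```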

The collect() function is limited in the amount of memory it can use while collecting data before the data is aggregated. The limit depends on whether collect() runs as a top-level function, in which case its limit is 10 MiB:

logscale
#type=humio #kind=logs
| collect(myField)

or whether it runs in a subquery or as a sub-aggregator to another function, in which case its limit is 1 MiB:

logscale
#type=humio #kind=logs
| groupBy(myField, function=collect(myOtherField))

collect() Examples

Collect the URLs requested by each visitor, where a visitor is considered no longer active after a pause of one minute:

logscale
groupBy(client_ip, function=session(maxpause=1m, collect([url])))

Collect fields from multiple events, counting the collected field:

logscale
LocalAddressIP4 = * RemoteAddressIP4 = * aip = *
| groupBy([LocalAddressIP4, RemoteAddressIP4], function=([count(aip, as=aipCount, distinct=true), collect([aip])]))
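
The optional parameters from the table above can be set explicitly when the defaults do not fit. The following is a sketch only, reusing the client_ip and url fields from the earlier example; the values 100 and ", " are arbitrary illustrations, not recommended settings.

```logscale
// Collect at most 100 distinct url values per client_ip,
// joined into a single field value with ", " instead of the default \n
groupBy(client_ip, function=collect([url], limit=100, separator=", "))
```

Lowering limit may help keep a sub-aggregation within the memory limits described above, at the cost of dropping values beyond the limit.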