Collects fields from multiple events into one event. It has a limit of 1Kb per key when used as part of a groupBy() operation. This limits the number of values you can index during the aggregation.

Parameter   Type              Required     Default  Description
fields[a]   Array of strings  required              Names of the fields to keep.
limit       integer           optional[b]  2000     Limit to number of distinct values in collect.
multival    boolean           optional[b]  true     Collects the resulting value as multivalue (a single field value using separator).
separator   string            optional[b]  ;        Separator used for multiple values.

[a] The argument name fields can be omitted.

[b] Optional parameters use their default value unless explicitly set.
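
As a sketch of how these defaults apply, the following call writes out every optional parameter with the default value listed in the table, so it should behave the same as collect(myField). The field name myField is a placeholder used only for illustration:

logscale
collect(fields=myField, limit=2000, multival=true, separator=";")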

Omitted Argument Names

The argument name for fields can be omitted; the following forms of this function are equivalent:

logscale
collect("value")

and:

logscale
collect(fields="value")

The collect() function is limited in the amount of memory it can use while collecting data, before the data is aggregated. The limit depends on whether collect() runs as a top-level function, in which case its limit is 10 MiB:

logscale
#type=humio #kind=logs
| collect(myField)

or whether it runs in a subquery or as a sub-aggregator to another function, in which case its limit is 1 MiB:

logscale
#type=humio #kind=logs
| groupBy(myField, function=collect(myOtherField))

collect() Examples

Collect the URLs requested by each visitor, where a visitor's session is considered ended after one minute of inactivity:

logscale
groupBy(client_ip, function=session(maxpause=1m, collect([url])))

Collect a field from multiple events and count its distinct values for each group:

logscale
LocalAddressIP4 = * RemoteAddressIP4 = * aip = *
| groupBy([LocalAddressIP4, RemoteAddressIP4], function=([count(aip, as=aipCount, distinct=true), collect([aip])]))
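
The optional parameters can also be set when collect() runs as a sub-aggregator. The following sketch assumes events that carry client_ip and url fields, as in the session example above; the values chosen for limit and separator are illustrative only. It limits the collection to 50 distinct URLs per client and joins them with a comma followed by a space:

logscale
groupBy(client_ip, function=collect([url], limit=50, separator=", "))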