Compound fields hold multiple pieces of information within a single
field that you may want to report or search on individually.
Alternatively, the data may already be parsed into an array field within
events and must then be summarized.
For example, User Agent data in
logs contains space-separated identifiers that describe the
browser and the toolkits used to support it:
Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/116.0.0.0 Safari/537.36
The following solutions use a variety of methods to extract and
aggregate the information.
splitString(field=userAgent,by=" ",as=agents)
|array:filter(array="agents[]", function={bname=/\//}, var="bname")
|array:union(array=agents,as=browsers)
| split(browsers)
Deduplicating information where a value occurs multiple times in a
single field, perhaps separated by a single character, can be achieved
in a variety of ways. This solution uses array:union() and split() to
create a unique array and then split its content out into a unique list.
For example, when examining the humio repository and looking for the
browsers or user agents that have used your instance, the UserAgent
data will contain the browser and the toolkits used to support it, as in
the sample string shown earlier.
The actual names are the Name/Version pairs showing
compatibility with different browser standards. Resolving this into a
simplified list requires splitting up the list, simplifying (to remove
duplicates), filtering, and then summarizing the final list.
Starting with the source repository events.
splitString(field=userAgent,by=" ",as=agents)
First we split up the userAgent field using a call to splitString()
and place the output into the array field agents. This creates
individual entries in the agents array (agents[0], agents[1], and so
on) for each event.
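To make the shape of the split data concrete, the following is a minimal Python sketch that emulates this step outside LogScale. The helper name split_string and the dictionary representation of an event are assumptions made for illustration; they are not part of LogScale.

```python
def split_string(event, field, by, as_name):
    """Emulate the split step: copy the event and add one indexed
    array field (as_name[0], as_name[1], ...) per token."""
    result = dict(event)
    for index, part in enumerate(event[field].split(by)):
        result[f"{as_name}[{index}]"] = part
    return result


# Sample user agent string taken from the example in this section.
event = {
    "userAgent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
                 "AppleWebKit/537.36 (KHTML, like Gecko) "
                 "Chrome/116.0.0.0 Safari/537.36"
}

split_event = split_string(event, "userAgent", " ", "agents")

# Print only the generated array fields.
for name, value in split_event.items():
    if name.startswith("agents["):
        print(name, "=", value)
```

Note that splitting on spaces also produces fragments such as "(Macintosh;" and "like", which is why a filtering step is needed afterwards.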
|array:filter(array="agents[]", function={bname=/\//}, var="bname")
|array:union(array=agents,as=browsers)
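The array:filter() call keeps only the array entries that contain a
forward slash, that is, the Name/Version pairs. Then, using
array:union(), we aggregate the list of user agents across all the
events to create a list of unique entries. This eliminates duplicates
where the same user agent value appears more than once.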
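The filter and union steps can be sketched in Python as follows. This is an emulation under stated assumptions, not LogScale code: the helper names filter_entries and union are hypothetical, the second user agent string is an invented sample, and the first-seen ordering of the result is a choice made here rather than a documented property of array:union().

```python
import re


def filter_entries(values, pattern):
    """Keep only the array entries that match the regular expression."""
    return [value for value in values if re.search(pattern, value)]


def union(arrays):
    """Merge the arrays from many events into one list of unique values."""
    unique = {}  # dict keys preserve first-seen order
    for values in arrays:
        for value in values:
            unique.setdefault(value, True)
    return list(unique)


user_agents = [
    # Sample string from this section.
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/116.0.0.0 Safari/537.36",
    # Hypothetical second event, not taken from the source data.
    "Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/117.0",
]

# Split each event, then keep only the entries containing "/".
per_event = [filter_entries(ua.split(" "), r"/") for ua in user_agents]

# Aggregate across events, dropping the duplicate "Mozilla/5.0".
browsers = union(per_event)
print(browsers)
```

Both events contain Mozilla/5.0, but it appears only once in the aggregated list.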
The event data now contains a single array of the individual values.
Using split() then splits the array into individual events, turning
the single event that holds the browsers array into one event per
array entry.
Event Result set.
The resulting output from the query is a list of events with each event
containing a matching _index and
browser. This can be useful if you want to perform further processing on
a list of events rather than an array of values.
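As an illustration of that final shape, here is a small Python sketch of the split step. It is an emulation only: the helper name split_array is hypothetical, and both the zero-based _index and the use of the array name as the output field name are assumptions made for this example.

```python
def split_array(event, array_name):
    """Turn one event holding an array into one event per array entry."""
    return [
        {"_index": index, array_name: value}
        for index, value in enumerate(event[array_name])
    ]


# Aggregated event as it might look after deduplication.
aggregated = {
    "browsers": [
        "Mozilla/5.0",
        "AppleWebKit/537.36",
        "Chrome/116.0.0.0",
        "Safari/537.36",
    ]
}

rows = split_array(aggregated, "browsers")
for row in rows:
    print(row)
```

Each resulting row can then be filtered, counted, or joined like any other event.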