Filters the elements of the input array using the function provided in the function parameter.
The order of the elements is maintained in the output array.
Parameter | Type | Required | Default Value | Description |
---|---|---|---|---|
array [a] | string | required | | The array name in LogScale Array Syntax. For example, for events with fields incidents[0], incidents[1], ... this would be incidents[], as in array:filter(array="incidents[]", ...). |
asArray | string | optional [b] | | The output array. Defaults to the value passed to the array parameter. |
function | Non-aggregate function | required | | The function to use for filtering the elements of the array. |
var | string | required | | Name of the variable to be used in the function argument. |
[a] The argument name array can be omitted.
[b] Optional parameters use their default value unless explicitly set.
Omitted Argument Names
The argument name for array can be omitted; the following forms of this function are equivalent:
array:filter("value[]",var="value",function="value")
and:
array:filter(array="value[]",var="value",function="value")
These examples show basic structure only.
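To make the array syntax and the asArray default more concrete, the following Python sketch models an event as a flat dictionary of fields. It is not LogScale code: the helpers get_array and array_filter, the host field and the incidents values are hypothetical, and the sketch only mimics the behaviour described in the parameter table.

```python
import re

def get_array(event, name):
    """Collect name[0], name[1], ... from a flat event, in index order."""
    prefix = name[:-2]  # "incidents[]" -> "incidents"
    values = []
    while f"{prefix}[{len(values)}]" in event:
        values.append(event[f"{prefix}[{len(values)}]"])
    return values

def array_filter(event, array, predicate, as_array=None):
    """Hypothetical stand-in for array:filter(); returns a new event dict."""
    # Assumption taken from the table: asArray defaults to the input array name.
    target = as_array or array
    kept = [v for v in get_array(event, array) if predicate(v)]
    # Remove any existing entries of the target array before writing the result.
    pattern = re.compile(re.escape(target[:-2]) + r"\[\d+\]")
    result = {k: v for k, v in event.items() if not pattern.fullmatch(k)}
    for i, v in enumerate(kept):
        result[f"{target[:-2]}[{i}]"] = v
    return result

# Hypothetical sample event.
event = {
    "host": "web-1",
    "incidents[0]": "low",
    "incidents[1]": "high",
    "incidents[2]": "low",
}

# Without as_array the input array is rewritten with the filtered elements.
in_place = array_filter(event, "incidents[]", lambda v: v == "low")

# With as_array the filtered elements go to a separate array.
copied = array_filter(event, "incidents[]", lambda v: v == "low", as_array="low_only[]")
```

In this sketch, in_place keeps only incidents[0] and incidents[1] (both "low"), while copied keeps the three original incidents entries and adds low_only[0] and low_only[1].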
array:filter() Examples
Deduplicate Compound Field Data
Query
splitString(field=userAgent,by=" ",as=agents)
|array:filter(array="agents[]", function={bname=/\//}, var="bname")
|array:union(array=agents,as=browsers)
|transpose()
|drop(column)
|groupBy(row[1])
Introduction
Deduplicating fields of information where there are multiple occurrences of a value in a single field, perhaps separated by a single character, can be achieved in a variety of ways. This solution makes use of array:union(), transpose() and groupBy() to collate and aggregate the information into the final list of values.
For example, when examining the humio repository and looking for the browsers or user agents that have used your instance, the userAgent data will contain the browser and the toolkits used to support it, for example:
Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/116.0.0.0 Safari/537.36
The actual names are the Name/Version pairs showing compatibility with different browser standards. Resolving this into a simplified list requires splitting up the list, filtering, simplifying (to remove duplicates), and then summarizing the final list.
The process is to first extract the information into an array of values that we can then simplify and aggregate. It is possible that the information in your raw data is already stored in an array that only needs to be summarized.
Step-by-Step
Starting with the source repository events.
- logscale
splitString(field=userAgent,by=" ",as=agents)
Splits up the userAgent field using a call to splitString() and places the output into the array field agents. This creates individual entries in the agents array for each event:
agents[0]="Mozilla/5.0"
agents[1]="(Macintosh;"
agents[2]="Intel"
agents[3]="Mac"
agents[4]="OS"
agents[5]="X"
agents[6]="10_15_7)"
agents[7]="AppleWebKit/537.36"
agents[8]="(KHTML,"
agents[9]="like"
agents[10]="Gecko)"
agents[11]="Chrome/116.0.0.0"
agents[12]="Safari/537.36"
- logscale
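|array:filter(array="agents[]", function={bname=/\//}, var="bname")
Filters the agents[] array using array:filter(), keeping only the elements that contain a forward slash, which are the Name/Version pairs.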
- logscale
|array:union(array=agents,as=browsers)
Using array:union(), we aggregate the list of user agents across all the events to create a list of unique entries. This eliminates duplicates where the user agent value is the same.
- logscale
|transpose()
Using the transpose() function, we transpose the rows and columns for the list of matching field names and values, turning:
browsers[0] | browsers[1] | browsers[2] |
---|---|---|
Gecko/20100101 | Safari/537.36 | AppleWebKit/605.1.15 |
into:
column | row[1] |
---|---|
browsers[0] | Gecko/20100101 |
browsers[1] | Safari/537.36 |
browsers[2] | AppleWebKit/605.1.15 |
- logscale
|drop(column)
We do not need the column information, just the unique list of browsers in the row[1] column, so we drop the field from the event list.
- logscale
|groupBy(row[1])
Now we can aggregate the list by the remaining field, row[1], to provide the unique list of values. The aim is not to count how often each value has occurred, but to list all the different possible values. The resulting list is shown in the summary below.
Event Result set.
Summary and Results
The resulting output from the query is a summarized list of the unique possible values from the original source fields, even though the source information was originally contained within a single field in the source events.
row[1] | _count |
---|---|
AppleWebKit/537.36 | 1 |
AppleWebKit/605.1.15 | 1 |
CFNetwork/1410.0.3 | 1 |
Chrome/116.0.0.0 | 1 |
Darwin/22.6.0 | 1 |
Firefox/116.0 | 1 |
Gecko/20100101 | 1 |
Mobile/15E148 | 1 |
Mozilla/5.0 | 1 |
Safari/18615.3.12.11.2 | 1 |
Safari/537.36 | 1 |
Safari/604.1 | 1 |
Safari/605.1.15 | 1 |
Version/16.4 | 1 |
Version/16.6 | 1 |
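The steps of this query can be traced outside LogScale with a short Python sketch. This is an illustration only, not how LogScale executes the query: the second user agent string is a hypothetical sample, the variable names are invented, and the first-seen ordering of the deduplicated list is an assumption of the sketch.

```python
import re
from collections import Counter

# Sample userAgent values; the second one is a hypothetical extra event.
user_agents = [
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/116.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:109.0) "
    "Gecko/20100101 Firefox/116.0",
]

# splitString(): one agents array per event, split on a single space.
agents = [ua.split(" ") for ua in user_agents]

# array:filter(): keep only elements containing "/", preserving order.
filtered = [[token for token in tokens if re.search(r"/", token)] for tokens in agents]

# array:union(): unique values across all events (first-seen order is assumed here).
browsers = list(dict.fromkeys(token for tokens in filtered for token in tokens))

# transpose(): one row per array entry, with the field name in "column".
rows = [{"column": f"browsers[{i}]", "row[1]": value} for i, value in enumerate(browsers)]

# drop(column): remove the field name, keeping only the value.
rows = [{key: value for key, value in row.items() if key != "column"} for row in rows]

# groupBy(row[1]): each unique value appears once.
counts = Counter(row["row[1]"] for row in rows)

for value in sorted(counts):
    print(value, counts[value])
```

Because the deduplication has already happened in the union step, every value in counts has a count of 1, which matches the _count column in the table above.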
Deduplicate Compound Field Data With array:union() and split()
Query
splitString(field=userAgent,by=" ",as=agents)
|array:filter(array="agents[]", function={bname=/\//}, var="bname")
|array:union(array=agents,as=browsers)
|split(browsers)
Introduction
Deduplicating fields of information where there are multiple occurrences of a value in a single field, perhaps separated by a single character, can be achieved in a variety of ways. This solution uses array:union() and split() to create a unique array and then split the content out to a unique list.
For example, when examining the humio repository and looking for the browsers or user agents that have used your instance, the userAgent data will contain the browser and the toolkits used to support it, for example:
Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/116.0.0.0 Safari/537.36
The actual names are the Name/Version pairs showing compatibility with different browser standards. Resolving this into a simplified list requires splitting up the list, filtering, simplifying (to remove duplicates), and then summarizing the final list.
Step-by-Step
Starting with the source repository events.
- logscale
splitString(field=userAgent,by=" ",as=agents)
First we split up the userAgent field using a call to splitString() and place the output into the array field agents. This creates individual entries in the agents array for each event:
agents[0]="Mozilla/5.0"
agents[1]="(Macintosh;"
agents[2]="Intel"
agents[3]="Mac"
agents[4]="OS"
agents[5]="X"
agents[6]="10_15_7)"
agents[7]="AppleWebKit/537.36"
agents[8]="(KHTML,"
agents[9]="like"
agents[10]="Gecko)"
agents[11]="Chrome/116.0.0.0"
agents[12]="Safari/537.36"
- logscale
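|array:filter(array="agents[]", function={bname=/\//}, var="bname")
Filters the agents[] array using array:filter(), keeping only the elements that contain a forward slash, which are the Name/Version pairs.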
- logscale
|array:union(array=agents,as=browsers)
Using array:union(), we aggregate the list of user agents across all the events to create a list of unique entries. This eliminates duplicates where the user agent value is the same. The event data now looks like this:
browsers[0] | browsers[1] | browsers[2] |
---|---|---|
Gecko/20100101 | Safari/537.36 | AppleWebKit/605.1.15 |
An array of the individual values.
- logscale
|split(browsers)
Using split(), we split the array into individual events, turning:
browsers[0] | browsers[1] | browsers[2] |
---|---|---|
Gecko/20100101 | Safari/537.36 | AppleWebKit/605.1.15 |
into:
_index | row[1] |
---|---|
0 | Gecko/20100101 |
1 | Safari/537.36 |
2 | AppleWebKit/605.1.15 |
Event Result set.
Summary and Results
The resulting output from the query is a list of events with each event containing a matching _index and browser. This can be useful if you want to perform further processing on a list of events rather than an array of values.
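As a rough illustration of the final step, the Python sketch below turns one event holding an array into one event per element, each tagged with its index. It only mimics the behaviour described here: split_array is a hypothetical helper, and naming the output field browsers is an assumption of the sketch rather than a statement about the field names LogScale produces.

```python
def split_array(event, field):
    """Hypothetical stand-in for split(): one output event per array element."""
    events = []
    index = 0
    while f"{field}[{index}]" in event:
        # Each new event carries the element position in _index.
        events.append({"_index": index, field: event[f"{field}[{index}]"]})
        index += 1
    return events

# The deduplicated array from the array:union() step.
event = {
    "browsers[0]": "Gecko/20100101",
    "browsers[1]": "Safari/537.36",
    "browsers[2]": "AppleWebKit/605.1.15",
}

events = split_array(event, "browsers")
for item in events:
    print(item["_index"], item["browsers"])
```

Working with a list of events like this makes it straightforward to apply further per-event processing, which is the use case the summary describes.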
Filter an Array on a Given Condition
Filter the elements of a flat array on a given condition using the array filter function array:filter().
Query
array:filter(array="mailto[]", var="addr", function={addr=ba*@example.com}, asArray="out[]")
Introduction
It is possible to filter an array on a given condition using the array filter function array:filter(). When asArray is set, as in this example, array:filter() creates a new array with the elements matching the specified conditions and does not change the original array. The new array retains the original order.
Example incoming data might look like this:
mailto[0]=foo@example.com
mailto[1]=bar@example.com
mailto[2]=baz@example.com
Step-by-Step
Starting with the source repository events.
- logscale
array:filter(array="mailto[]", var="addr", function={addr=ba*@example.com}, asArray="out[]")
Filters the mailto[] array to include only elements that match the pattern ba*@example.com. This is achieved by testing the value of each element of the array, exposed under the name set in the var parameter (addr), and returning a new array that only contains the elements that meet the specified condition. The expression in the function argument should contain the field declared in the var parameter.
Event Result set.
Summary and Results
The query filters values from the input array using the function provided in the function parameter and returns a new array with the elements that meet the specified condition.
Sample output from the incoming example data:
out[0]=bar@example.com
out[1]=baz@example.com
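The same filtering can be reproduced in a few lines of Python as a sanity check of the expected result. This is a sketch, not LogScale: it assumes the wildcard in ba*@example.com behaves like a shell-style, case-sensitive pattern matched against the whole element, which is what fnmatchcase implements.

```python
from fnmatch import fnmatchcase

# The incoming array from the example data.
mailto = ["foo@example.com", "bar@example.com", "baz@example.com"]

# Keep elements matching the wildcard pattern; order is preserved and
# the input list is left untouched, mirroring asArray="out[]".
out = [addr for addr in mailto if fnmatchcase(addr, "ba*@example.com")]

print(out)  # ['bar@example.com', 'baz@example.com']
```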