match()

`match()`

Matches or joins data from query results with a table. The table can be provided either as a LookUp file — CSV file or through a limited form of JSON file, uploaded using Lookup Files — or, from LogScale 1.163, as an ad-hoc table Using Ad-hoc Tables.

If you are looking for match expressions, see Match Statements.

Parameter	Type	Required	Default Value	Description
`column`	string or array	optional^[a]	`field parameter`	Which column in the file to use for the match. A single column or an array or columns can be specified.
`field`	string or array	required		Which field in the event (log line) must match the given column value. A single field or an array of fields can be specified. Field and column must have the same length, are matched in order and must all match.
`file`^[b]	string	required		Specifies the source file (when using Look Up files) or the name of the ad-hoc table. The file name should be specified with `.csv` or `.json` suffix.
`glob` (deprecated)	boolean	optional^[a]	`false`	This parameter is deprecated. Use `mode` parameter with the `glob` option instead. (deprecated in 1.23.0)
`ignoreCase`	boolean	optional^[a]	`false`	If true, ignore case when matching against the CSV data.
`include`	string or array	optional^[a]		The columns to include. If no argument is given, include all columns from the corresponding row in the output event.
`mode`	string	optional^[a]	`string`	The function to use when matching against keys.
			Valid Values
			`cidr`	The key is interpreted as a CIDR subnet and the event is matched if the field contains an IP within the subnet. If multiple subnets match, the most specific one is selected or an arbitrary one if there are multiple equally specific subnets.
			`glob`	The key is interpreted as a globbing pattern with `` and matched accordingly, for example, a CSV key value of `thisMatch*` would match the field value of `123thisMatch456`.
			`string`	The matching is done using exact string matching.
`nrows`	string	optional^[a]	`1`	The maximum number of rows an event can match with in a CSV file. Valid inputs are any positive number up to 500 or `max` where `max` is equivalent to 500.
`strict`	boolean	optional^[a]	`true`	If `true` (the default) selects only the fields that match a key in the file; if `false` lets all events through (works like the deprecated `lookup()`).
^[a]Optional parameters use their default value unless explicitly set. ^[b]The parameter name `file` can be omitted.

Hide omitted argument names for this function

Show omitted argument names for this function

Hide negatable operation for this function

Show negatable operation for this function

When lookup information from files are loaded from a package, the package name should be specified in addition to the filename. For example:

logscale

match("falcon/investigate/logoninfo.csv",field="loookupname")

For more information on referring to package resources, see Referencing Package Assets.

The default behavior of this function — when strict is set to true — works like an INNER JOIN. When strict is set to false the function enriches events.

When using mode=glob, the underlying CSV is limited to 20,000 rows/lines.

For self-hosted customers, the maximum value for glob matches is configurable using GLOB_MATCH_LIMIT.

Using `match()` with Ad-hoc Tables

match() can be used to perform a join using ad-hoc tables. See Using Ad-hoc Tables for more information.

Using `match()` with Lookup Files

When using match() for joining with Look Up files, two file formats are supported:

CSV. The recommended format, ensures better performance, allows for additional functionalities (using parameters) and is compatible with related functions, like readFile(). See CSV File Formats for more information.
JSON. Supports object and array-based formats. See JSON File Formats for more information.

CSV File Formats

For Comma Separated Values (CSV) files, whitespace gets included in the keys and values. To include the separator "," in a value, quote using the " character. The following file is a valid CSV file:

csv

userid,name
1,chr
2,krab
"4","p,m"
7,mgr

The first line is interpreted as the column title. When querying, the column in the field should be used to identify which column to match against.

When using match() with a single column:

logscale

match(test.csv, field=somefield, column=column1)

the last matching row in test.csv is used.

Since the function supports the selection of an array of columns, you can match multiple pairs of fields and columns against a CSV file. For example, with these events and fields:

json

{
field1: c,
field2: f
},
{
field1: c,
field2: e
}

and the following test.csv match file:

csv

column1, column2, column3
a,        b,      d
c,        d,      a
c,        e,      f

The example query:

logscale

match(test.csv, field=[field1, field2], column=[column1, column2])

will produce the following output:

item	value
field1	c
field2	e
column3	f

Similar to when matching against a single column, in the case of multiple columns, the function will use the last matching row in the file. This behavior is applied starting from version 1.145.

match() supports matching on multiple rows, meaning that you can match a single event with multiple rows. Each matching row will generate a separate event.

Example 1 with nrows=2. With event:

json

{
field1: c
}

and the following test.csv match file:

csv

column1, column2, column3
c,        b,      a
a,        b,      d
c,        d,      a
c,        e,      f

The example query:

logscale

match(test.csv, field=field1, column=column1, nrows=2)

will generate the following output:

item	value
field1	c
column2	e
column3	f
field1	c
column2	d
column3	a

Example 2 with nrows=max. With event:

json

{
field1: c
}

and the following test.csv match file:

csv

column1, column2, column3
c,        b,      a
a,        b,      d
c,        d,      a
c,        e,      f

The example query:

logscale

match(some.csv, field=field1, column=column1, nrows=max)

will generate the following output:

item	value
field1	c
column2	e
column3	f
field1	c
column2	d
column3	a
field1	c
column2	b
column3	a

Example 3 with nrows=2. Given weblog data, HTTP methods could be matched to multiple rows for the type of output:

logscale

method=POST
| match(file="methods.csv",nrows=2,field=method,column="method")

The following methods.csv match file:

csv

method,description
POST,Send Data
POST,Transfer Data in
GET,Retrieve Data
GET,Get Data

could be matched with a longer description, showing two different rows for each matching source row:

method	description	url
POST	Send Data	/humio/api/v1/ingest/elastic-bulk
POST	Transfer Data in	/humio/api/v1/ingest/elastic-bulk
POST	Transfer Data in	/humio/api/v1/ingest/elastic-bulk
POST	Send Data	/humio/api/v1/ingest/elastic-bulk
POST	Transfer Data in	/humio/api/v1/ingest/elastic-bulk
POST	Send Data	/humio/api/v1/ingest/elastic-bulk

When matching multiple rows, multiple events, matching the number of matched nrows, will be generated for each corresponding event and matched lookup entry.

JSON File Formats

For JSON files, two formats are supported:

Object-based, where the lookup field does not have an explicit name
Array-based, where the information is an array of objects

In the Object-based variant, the lookup values are declared as an object with a key and embedded fields, the key field does not have a name.

json

{
  "1": { "name": "chr" },
  "2": { "name": "krab" },
  "4": { "name": "pmm" },
  "7": { "name": "mgr" }
}

When matching against a file in this case, the name of the field in the JSON object does not need to be used; the key for each value is used instead. For example:

logscale

groupBy(@timezone)
| count(@timezone)
| match(file="short.json",field=_count)

In the above, the value of _count will be matched, outputting the match value:

_count	name
2	krab

In the array-based variant, the lookup values are declared as an array of objects, you select which field is the key using the field parameter in match().

json

[
  { "userid": "1", "name": "chr" },
  { "userid": "2", "name": "krab" },
  { "userid": "4", "name": "pmm" },
  { "userid": "7", "name": "mgr" }
]

When using this version, the name of the column to be matched must be specified using the column argument to match():

logscale

groupBy(@timezone)
| count(@timezone)
| match(file="long.json",field=_count,column="userid")

This behavior also means that any field in the JSON file can be used as the match value. For example:

logscale Syntax

...
| match(file="long.json",field=codename,column="name")

This can be useful if you have a JSON file that contains multiple possible lookup values for given records.

For

Important

The match() does not report an error if the file format cannot be parsed.

`match()` Examples

Matches events for which the id field matches the value of the column in the table "users.csv". Does not add any columns.

logscale

match(file="users.csv", column=userid, field=id, include=[])

Matches events for which the id field is matched case-insensitive by the glob-pattern in the column userid in the table users.csv, and add all other columns of the first matching row to those events.

logscale

id =~ match(file="users.csv", column=userid, mode=glob, ignoreCase=true)

Let all events pass through, but events for which the id field matches the value of the userid column in the table users.csv will be enriched with all other columns of the matching row.

logscale

id =~ match(file="users.csv", column=userid, strict=false)

Matches events for which the ip field matches the CIDR subnet of the cidr-block column in the table cidr-file.csv. Only adds the columns info and type from the first matching row.

logscale

match(file="cidr-file.csv", column="cidr-block", field=ip, mode=cidr, include=["info","type"])

The function allows for matching multiple pairs of fields and columns against a CSV file.

Given an event with the following fields:

json

[
   {
      "field1" : "c",
      "field2" : "f"
   },
   {
      "field2" : "e",
      "field1" : "c"
   }
]

and a test.csv file:

csv

column1, column2, column3
a,        b,      d
c,        d,      a
c,        e,      f

The query:

logscale

match(test.csv, field=[field1, field2], column=[column1, column2])

will produce the following output:

column3	field1	field2
f	c	e

Data Analysis Overview

LogScale User Interface

Repositories & Views

Parsing Data

Searching Data

Writing Queries

Query Language Syntax

Query Functions

Dashboards & Widgets

Automation

Template Language

Keyboard Shortcuts

`match()`

Using `match()` with Ad-hoc Tables

Using `match()` with Lookup Files

CSV File Formats

JSON File Formats

Important

`match()` Examples

Enter search term

Data Analysis Overview

LogScale User Interface

Repositories & Views

Parsing Data

Searching Data

Writing Queries

Query Language Syntax

Query Functions

Dashboards & Widgets

Automation

Template Language

Keyboard Shortcuts

Using match() with Ad-hoc Tables

Using match() with Lookup Files

CSV File Formats

JSON File Formats

Important

match() Examples

Enter search term

`match()`

Using `match()` with Ad-hoc Tables

Using `match()` with Lookup Files

`match()` Examples