Using the match() Function

The match() function enables users to join query data with lookup files containing reference information, commonly used for matching against known security issues, attack vectors, or network types. The documentation explains how to implement basic joins using match() and describes a workflow for automatically updating lookup files through alerts and file uploads, making it particularly useful for maintaining current protocol information and cross-repository data matching.

Image showing the intersection of two datasets, using the match() function as the subquery

The match() function provides a basic join from a query into a lookup file that contains supplementary information. The most common use case for this method is joining event data to lookup reference information, for example matching against a list of known security issues, attack vectors, or network types. For example, given an uploaded file, services.csv, that lists the port numbers for different network services, the query:

logscale
port=*
| groupBy([port,type],function=[])
| match("services.csv",field=port)

Matches the port number against known protocols:

port  type            baseproto  protocol
22    authentication  tcp        ssh
25    email           tcp        smtp
443   install         tcp        https
443   weblog          tcp        https
80    weblog          tcp        http
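
The query above relies on the default behavior of match(). As a sketch, the same lookup can be written with its parameters spelled out. It assumes that services.csv has columns named port, baseproto, and protocol, as the output table suggests; the column names in your own file may differ.

```logscale
port=*
| groupBy([port, type], function=[])
// Assumption: services.csv has the columns port, baseproto and protocol
| match(file="services.csv", field=port, column=port, include=[baseproto, protocol], strict=false)
```

With strict=false, events whose port has no entry in the file are kept without the added fields instead of being filtered out, which can help when checking how complete the lookup file is.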

Updating Lookup Files

A common use case for match() is to combine it with an automation that creates the lookup file used by the query.

In the following diagram, an alert is used to generate results that are uploaded using the upload file action:

%%{init: {"flowchart": {"defaultRenderer": "elk"}} }%%
graph LR
  r["Lookup Repository"]
  ss("Alert Triggered")
  uf[["Upload File"]]
  pq["Query"]
  j("match()")
  r --> ss
  ss --> uf
  uf --> j
  pq <--> j

The use case operates as follows:

  • An alert identifies when protocol types have been added or updated.

  • When the alert runs, it generates an output list, and the data is uploaded as a file using the Upload File action.

  • A dashboard query executes that uses the uploaded file to match against protocol information.
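
As a rough sketch of the first step, the alert's query might reduce the events in the lookup repository to one row per port and protocol, so that the result has the columns expected in the lookup file. The field names baseproto and protocol are assumptions taken from the output table earlier in this section; the result is then written to the lookup file by the Upload File action.

```logscale
// Hypothetical alert query, run against the lookup repository
// Assumption: events carry port, baseproto and protocol fields
port=* protocol=*
| groupBy([port, baseproto, protocol], function=[])
```

Using groupBy() with an empty function list returns only the distinct combinations of the listed fields, which keeps the uploaded file small.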

Using this method enables a join between a regularly executed query and a dataset. The two datasets can be in the same repository or in different repositories.