Filter For Items Not Part of Data Set Using !match()

Find the set difference using the match() function with negation

Query

logscale
src_ip=*
| !match("known_ips.csv", field=src_ip)

Introduction

The match() function can be used with a negation to filter for items that are not part of a data set.

In this example, the match() function is used with a negation to find source IP addresses that are not part of the known list in the file known_ips.csv.

Example incoming data might look like this:

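| timestamp | src_ip | dst_ip | src_port | dst_port | protocol | bytes_sent | bytes_received |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 2025-04-01T07:00:00Z | 192.168.1.101 | 10.0.0.50 | 52431 | 443 | TCP | 1024 | 2048 |
| 2025-04-01T07:00:01Z | 172.16.0.24 | 8.8.8.8 | 33221 | 53 | UDP | 64 | 512 |
| 2025-04-01T07:00:02Z | 192.168.1.150 | 172.16.0.100 | 49223 | 80 | TCP | 2048 | 4096 |
| 2025-04-01T07:00:03Z | 10.0.0.75 | 192.168.1.1 | 55678 | 22 | TCP | 512 | 1024 |
| 2025-04-01T07:00:04Z | 192.168.1.200 | 1.1.1.1 | 44556 | 53 | UDP | 64 | 512 |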

Step-by-Step

  1. Starting with the source repository events.

  2. Query flow: Events -> Expression (highlighted) -> Expression -> Result Set
    logscale
    src_ip=*

    Filters for all events that have a src_ip field.

  3. Query flow: Events -> Expression -> Expression (highlighted) -> Result Set
    logscale
    | !match("known_ips.csv", field=src_ip)

    Excludes (filters out) any events where the value of the src_ip field matches an entry in the file known_ips.csv, so that only events with source IP addresses not found in the file are returned. The negation operator (!) inverts the match and returns the non-matching events.

  4. Event Result set.

Summary and Results

The query is used to find unknown or unexpected source IP addresses by comparing them against a known list. This is useful for detecting potential security threats and monitoring for unauthorized network access.

Sample output from the incoming example data:

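| timestamp | src_ip | dst_ip | src_port | dst_port | protocol | bytes_sent | bytes_received |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 2025-04-01T07:00:00Z | 192.168.1.101 | 10.0.0.50 | 52431 | 443 | TCP | 1024 | 2048 |
| 2025-04-01T07:00:01Z | 172.16.0.24 | 8.8.8.8 | 33221 | 53 | UDP | 64 | 512 |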
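
To make the set-difference behavior concrete, the filter can be emulated outside LogScale. The following Python snippet is only an illustrative sketch, not how LogScale implements match(). The contents of known_ips.csv are not shown in this example, so the three entries and the src_ip column name used here are assumptions chosen to be consistent with the sample output; the helper names load_known and not_in_lookup are hypothetical.

```python
import csv
import io

# ASSUMPTION: hypothetical contents of known_ips.csv. The real file is not
# shown in this example, and the column name "src_ip" is also assumed.
KNOWN_IPS_CSV = """src_ip
192.168.1.150
10.0.0.75
192.168.1.200
"""

# Timestamp and src_ip taken from the example incoming data.
EVENTS = [
    {"timestamp": "2025-04-01T07:00:00Z", "src_ip": "192.168.1.101"},
    {"timestamp": "2025-04-01T07:00:01Z", "src_ip": "172.16.0.24"},
    {"timestamp": "2025-04-01T07:00:02Z", "src_ip": "192.168.1.150"},
    {"timestamp": "2025-04-01T07:00:03Z", "src_ip": "10.0.0.75"},
    {"timestamp": "2025-04-01T07:00:04Z", "src_ip": "192.168.1.200"},
]


def load_known(csv_text, column):
    """Read one column of the lookup file into a set for fast membership tests."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row[column] for row in reader}


def not_in_lookup(events, known, field):
    """Keep events that have the field (like src_ip=*) and whose value is
    absent from the lookup set (like the negated match)."""
    return [e for e in events if field in e and e[field] not in known]


known = load_known(KNOWN_IPS_CSV, "src_ip")
unknown_events = not_in_lookup(EVENTS, known, "src_ip")

for event in unknown_events:
    print(event["timestamp"], event["src_ip"])
```

Under these assumptions, the two remaining events are the ones with source addresses 192.168.1.101 and 172.16.0.24, which agrees with the sample output.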