Detect Event A Happening X Times Before Event B Within a Specific Timespan

Detect event A happening X times before event B within a specific timespan using the slidingTimeWindow() function combined with groupBy().

Query

logscale
head()
| groupBy(
    key,
    function=slidingTimeWindow(
        [{status="failure" | count(as=failures)}, selectLast(status)],
        span=3s
    )
  )
| failures >= 3
| status = "success"

Introduction

In this example, the slidingTimeWindow() function is used with the groupBy() function to detect event A happening X times before event B within a specific timespan.

The query will detect instances where there are 3 or more failed attempts followed by a successful attempt, all occurring within a 3-second window.

Note that the slidingTimeWindow() function must be used after an aggregator function to ensure event ordering. Also note that the events must be sorted in order by timestamp to prevent errors when running the query. It is possible to select any field to use as a timestamp.
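The ordering requirement can be illustrated outside LogScale. The following is a minimal Python sketch, not LogScale code: the three events are a hypothetical out-of-order excerpt, and the helper name is_time_ordered is an assumption made for this illustration. It checks whether rows are ordered by a chosen timestamp column and sorts them when they are not.

```python
# Hypothetical out-of-order events: (timestamp in ms, key, status).
events = [
    (1451606302600, "a", "success"),
    (1451606301000, "a", "failure"),
    (1451606302000, "a", "failure"),
]


def is_time_ordered(rows, ts_index=0):
    """Return True if rows are in non-decreasing order of the chosen timestamp column."""
    stamps = [row[ts_index] for row in rows]
    return all(a <= b for a, b in zip(stamps, stamps[1:]))


# Sort by the timestamp column so that a window ending at each event
# only has to look backwards in time.
ordered = sorted(events, key=lambda row: row[0])

print(is_time_ordered(events))   # the raw list is not ordered
print(is_time_ordered(ordered))  # the sorted list is ordered
```

The ts_index parameter stands in for the idea that any field may serve as the timestamp; in the query itself the ordering comes from the aggregator that precedes slidingTimeWindow().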

Example incoming data might look like this:

@timestamp      key  status
1451606300200   c    failure
1451606300400   c    failure
1451606300600   c    failure
1451606301000   a    failure
1451606302000   a    failure
1451606302200   a    failure
1451606302300   a    failure
1451606302400   b    failure
1451606302500   a    failure
1451606302600   a    success
1451606303200   b    failure
1451606303300   c    success
1451606303400   b    failure
1451606304500   a    failure
1451606304600   a    failure
1451606304700   a    failure
1451606304800   a    success

Step-by-Step

  1. Starting with the source repository events.

  2. [Flowchart: Events -> Aggregate (highlighted) -> Aggregate -> Filter -> Filter -> Result Set]
    logscale
    head()

    Selects the oldest events ordered by time.

  3. [Flowchart: Events -> Aggregate -> Aggregate (highlighted) -> Filter -> Filter -> Result Set]
    logscale
    | groupBy(
        key,
        function=slidingTimeWindow(
            [{status="failure" | count(as=failures)}, selectLast(status)],
            span=3s
        )
      )

    Groups the events by a specified key (for example, a user ID or IP address), then creates a sliding time window with a span of 3 seconds for each group.

    Within each window, it filters for the failed attempts (events where the field status has the value failure), counts them, and returns the count in a field named failures. It also selects the status of the last event in the window.

  4. [Flowchart: Events -> Aggregate -> Aggregate -> Filter (highlighted) -> Filter -> Result Set]
    logscale
    | failures >= 3

    Filters for windows with 3 or more failures.

  5. [Flowchart: Events -> Aggregate -> Aggregate -> Filter -> Filter (highlighted) -> Result Set]
    logscale
    | status = "success"

    Filters for windows in which the last event has the value success in the status field.

  6. Event Result set.
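The steps above can be mimicked outside LogScale to check the logic by hand. The following is a minimal Python sketch, not LogScale code and not a description of how LogScale implements slidingTimeWindow(). It assumes that each event yields one window ending at that event, that the window covers the preceding 3000 ms with both boundaries included, and that the window includes the current event. The names detect, SPAN_MS and MIN_FAILURES are hypothetical.

```python
from collections import defaultdict

# Sample events from the incoming data: (timestamp in ms, key, status).
EVENTS = [
    (1451606300200, "c", "failure"),
    (1451606300400, "c", "failure"),
    (1451606300600, "c", "failure"),
    (1451606301000, "a", "failure"),
    (1451606302000, "a", "failure"),
    (1451606302200, "a", "failure"),
    (1451606302300, "a", "failure"),
    (1451606302400, "b", "failure"),
    (1451606302500, "a", "failure"),
    (1451606302600, "a", "success"),
    (1451606303200, "b", "failure"),
    (1451606303300, "c", "success"),
    (1451606303400, "b", "failure"),
    (1451606304500, "a", "failure"),
    (1451606304600, "a", "failure"),
    (1451606304700, "a", "failure"),
    (1451606304800, "a", "success"),
]

SPAN_MS = 3000     # mirrors span=3s
MIN_FAILURES = 3   # mirrors failures >= 3


def detect(events, span_ms=SPAN_MS, min_failures=MIN_FAILURES):
    """Return (timestamp, key, failures, status) for every matching window."""
    # Group by key, oldest event first (the role of head() and groupBy()).
    by_key = defaultdict(list)
    for ts, key, status in sorted(events):
        by_key[key].append((ts, status))

    hits = []
    for key, rows in by_key.items():
        for ts, status in rows:
            # Window ending at the current event (assumed inclusive bounds).
            window = [s for t, s in rows if ts - span_ms <= t <= ts]
            failures = window.count("failure")
            # The two filters: enough failures, and the last status is success.
            if failures >= min_failures and status == "success":
                hits.append((ts, key, failures, status))
    return sorted(hits)


results = detect(EVENTS)
for ts, key, failures, status in results:
    print(key, failures, status)
```

Under these assumptions, key c is not reported: only two of its failures fall within 3 seconds of its success. Key b never has a successful attempt, so it is not reported either.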

Summary and Results

The query is used to detect event A happening X times before event B within a specific timespan. It looks for instances where there were 3 or more failed attempts followed by a successful attempt, all occurring within a 3-second window. Using a sliding time window of 3 seconds provides a more precise time constraint than the use of partition() in Detect Event A Happening X Times Before Event B.

The query can be used to detect potential brute force attack patterns within a specific timeframe. Note that the effectiveness of this query depends on the nature of your data and the typical patterns in your system.

Sample output from the incoming example data:

key  failures  status
a    5         success
a    7         success