Find Processes with Low Execution Count

Group processes by hash and name to identify rarely executed ones using the groupBy() function

Query

Query pipeline: Events → Filter → Filter → Aggregate → Filter → Aggregate → Result Set
logscale
#event_simpleName=ProcessRollup2 OR #event_simpleName=SyntheticProcessRollup2
| aid=?aid
| groupBy([SHA256HashData, ImageFileName], limit=max)
| _count < 5
| sort(_count, order=asc, limit=1000)

Introduction

The groupBy() function can be used to group events by specified fields and perform aggregate calculations on the grouped data.
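As a rough illustration of what grouping with a count aggregate does, the following Python sketch models the behavior on a few hypothetical events. It is not LogScale code: the `group_by_count` helper and the sample field values are assumptions made for this example, and only the `_count` output field name is taken from groupBy().

```python
from collections import Counter

def group_by_count(events, fields):
    """Group events by the given fields and count the events in each group."""
    counts = Counter(tuple(event.get(f) for f in fields) for event in events)
    # Emit one row per group, with the group size in a field named _count.
    return [dict(zip(fields, key), _count=n) for key, n in counts.items()]

# Hypothetical events, used only for illustration.
events = [
    {"ImageFileName": "chrome.exe", "SHA256HashData": "aaa"},
    {"ImageFileName": "chrome.exe", "SHA256HashData": "aaa"},
    {"ImageFileName": "calc.exe", "SHA256HashData": "bbb"},
]

rows = group_by_count(events, ["SHA256HashData", "ImageFileName"])
```

Each row in `rows` corresponds to one distinct combination of the grouping fields, which mirrors how each group becomes one result event.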

In this example, the groupBy() function is used to identify processes that have been executed only a few times on a specific host, which could be useful for detecting unusual or potentially suspicious activity.

Example incoming data might look like this:

| @timestamp | event_simpleName | aid | SHA256HashData | ImageFileName | CommandLine |
| --- | --- | --- | --- | --- | --- |
| 2025-10-06T10:00:00Z | ProcessRollup2 | 12345abc | a1b2c3d4e5f6... | chrome.exe | C:\Program Files\Google\Chrome\Application\chrome.exe |
| 2025-10-06T10:05:00Z | ProcessRollup2 | 12345abc | a1b2c3d4e5f6... | chrome.exe | C:\Program Files\Google\Chrome\Application\chrome.exe |
| 2025-10-06T10:10:00Z | SyntheticProcessRollup2 | 12345abc | f6e5d4c3b2a1... | suspicious.exe | C:\Users\Admin\Downloads\suspicious.exe |
| 2025-10-06T10:15:00Z | ProcessRollup2 | 12345abc | 98765432dcba... | notepad.exe | C:\Windows\System32\notepad.exe |
| 2025-10-06T10:20:00Z | ProcessRollup2 | 12345abc | 98765432dcba... | notepad.exe | C:\Windows\System32\notepad.exe |
| 2025-10-06T10:25:00Z | ProcessRollup2 | 12345abc | 11223344aabb... | calc.exe | C:\Windows\System32\calc.exe |

Step-by-Step

  1. Starting with the source repository events.

  2. Stage 1 (Filter):
    logscale
    #event_simpleName=ProcessRollup2 OR #event_simpleName=SyntheticProcessRollup2

    Filters events to include only process execution events with event_simpleName equal to ProcessRollup2 or SyntheticProcessRollup2.

  3. Stage 2 (Filter):
    logscale
    aid=?aid

    Restricts events to a single host by matching the aid (agent ID) field against the ?aid query parameter, whose value is supplied when the query is run.

  4. Stage 3 (Aggregate):
    logscale
    groupBy([SHA256HashData, ImageFileName], limit=max)

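    Groups events by both the SHA256HashData and ImageFileName fields, so that each distinct combination of hash and name forms its own group. Setting the limit parameter to max raises the number of groups returned to the maximum the deployment allows, so that groups are not dropped by the lower default limit.

    By default, the count() function is used and the count for each group is returned in a field named _count.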

  5. Stage 4 (Filter):
    logscale
    _count < 5

    Filters the groups to show only those with fewer than 5 executions, using the built-in _count field that is automatically created by groupBy().

  6. Stage 5 (Aggregate):
    logscale
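    sort(_count, order=asc, limit=1000)

    Sorts the results by execution count in ascending order, limiting the output to 1000 results. The order=asc argument is set explicitly because sort() returns results in descending order by default.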

  7. Event Result set.
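The steps above can be modeled outside LogScale to check the logic. The following Python sketch applies the same filter, group, threshold, and sort stages to the six sample events. The `rare_processes` function, its parameter names, and the in-memory event list are assumptions made for illustration; this is not how LogScale evaluates the query internally.

```python
from collections import Counter

PROCESS_EVENTS = {"ProcessRollup2", "SyntheticProcessRollup2"}

def event(name, sha256, image):
    """Build one sample event; every sample row shares the same aid."""
    return {"event_simpleName": name, "aid": "12345abc",
            "SHA256HashData": sha256, "ImageFileName": image}

# The six sample events from this example (hashes are truncated in the source).
SAMPLE_EVENTS = [
    event("ProcessRollup2", "a1b2c3d4e5f6...", "chrome.exe"),
    event("ProcessRollup2", "a1b2c3d4e5f6...", "chrome.exe"),
    event("SyntheticProcessRollup2", "f6e5d4c3b2a1...", "suspicious.exe"),
    event("ProcessRollup2", "98765432dcba...", "notepad.exe"),
    event("ProcessRollup2", "98765432dcba...", "notepad.exe"),
    event("ProcessRollup2", "11223344aabb...", "calc.exe"),
]

def rare_processes(events, aid, threshold=5, limit=1000):
    # Filters: keep process execution events for the requested host only.
    selected = [e for e in events
                if e["event_simpleName"] in PROCESS_EVENTS and e["aid"] == aid]
    # Aggregate: count events per (hash, name) combination.
    counts = Counter((e["SHA256HashData"], e["ImageFileName"]) for e in selected)
    # Filter: keep groups with fewer than `threshold` executions.
    rare = [(sha256, name, n) for (sha256, name), n in counts.items()
            if n < threshold]
    # Sort ascending by count, then cap the number of rows returned.
    rare.sort(key=lambda row: row[2])
    return rare[:limit]

result = rare_processes(SAMPLE_EVENTS, aid="12345abc")
```

Because every group in the sample data has fewer than five executions, no group is removed by the threshold here; in real data the threshold is what discards commonly executed processes.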

Summary and Results

The query is used to identify processes that have been executed infrequently on a specific host by grouping them based on their hash value and image name.

This query is useful, for example, to detect potentially suspicious or unusual processes that do not run often, which could indicate malicious activity or unauthorized software installations.

Sample output from the incoming example data:

| SHA256HashData | ImageFileName | _count |
| --- | --- | --- |
| f6e5d4c3b2a1... | suspicious.exe | 1 |
| 11223344aabb... | calc.exe | 1 |
| 98765432dcba... | notepad.exe | 2 |
| a1b2c3d4e5f6... | chrome.exe | 2 |

The results are sorted by execution count, showing the least frequently executed processes first. Each row represents a unique combination of process hash and name, along with how many times it was executed.

Processes with the same name but different hashes are treated as separate entries, helping identify potentially malicious files masquerading as legitimate processes.
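To illustrate why the hash is part of the grouping key, the following Python sketch groups a few hypothetical executions in which two different binaries share one file name. The hash placeholders, the file names, and the `names_with_multiple_hashes` helper are all assumptions made for this example and are not part of the query.

```python
from collections import Counter, defaultdict

# Hypothetical (hash, name) pairs: two different binaries named svchost.exe.
executions = [
    ("hash-A", "svchost.exe"),
    ("hash-A", "svchost.exe"),
    ("hash-A", "svchost.exe"),
    ("hash-B", "svchost.exe"),
    ("hash-C", "calc.exe"),
]

# Grouping on (hash, name) keeps the two svchost.exe binaries apart,
# so the rarely seen one is not hidden by the frequently seen one.
counts = Counter(executions)

def names_with_multiple_hashes(groups):
    """Return the file names that appear with more than one distinct hash."""
    hashes_by_name = defaultdict(set)
    for sha256, name in groups:
        hashes_by_name[name].add(sha256)
    return {name: sorted(hashes)
            for name, hashes in hashes_by_name.items() if len(hashes) > 1}

suspects = names_with_multiple_hashes(counts)
```

Grouping by name alone would merge both svchost.exe binaries into a single group of four executions, and the single execution of the second binary would no longer stand out.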