Search For Events by Size in Repository

Search for events of a certain size in a repository using the eventSize() function

Query

Flowchart: Events -> Add Field -> Augment Data -> Result Set
logscale
eventSize()
| _eventSize > 10000

Introduction

The eventSize() function is used to search for events based on their internal disk storage usage. The function augments each event with its size information.

Example incoming data might look like this:

| @timestamp | message | user | ip_address |
|---|---|---|---|
| 2025-10-31T10:00:00.000Z | Short log message | alice | 192.168.1.100 |
| 2025-10-31T10:01:00.000Z | Very long detailed error message with stack trace: Error at line 1234\nStack trace:\ncom.example.Class.method(Class.java:100)\ncom.example.OtherClass.otherMethod(OtherClass.java:200)\ncom.example.MainClass.main(MainClass.java:300)\nCaused by: java.lang.NullPointerException\nat com.example.Class.method(Class.java:100) | bob | 192.168.1.101 |
| 2025-10-31T10:02:00.000Z | Medium length message with some details about user activity and system status | charlie | 192.168.1.102 |
| 2025-10-31T10:03:00.000Z | Another very long message containing detailed system metrics: CPU usage: 85%, Memory: 16GB used of 32GB total, Disk usage: 75% on /dev/sda1, Network: IN=1.2GB/s OUT=800MB/s, Active connections: 1250, Thread count: 500, Active users: 3500, Cache hit ratio: 95%, Database connections: 100/150 | david | 192.168.1.103 |
| 2025-10-31T10:04:00.000Z | Brief status update | eve | 192.168.1.104 |

Step-by-Step

  1. Starting with the source repository events.

  2. Flowchart: Events -> Add Field (highlighted) -> Augment Data -> Result Set
    logscale
    eventSize()

    Determines the number of bytes that events internally use in disk storage for the values (not counting the bytes for storing the field names), and returns the results in a field named _eventSize.

  3. Flowchart: Events -> Add Field -> Augment Data (highlighted) -> Result Set
    logscale
    | _eventSize > 10000

    Filters for events that use more than 10000 bytes of internal disk storage. Note that you cannot compare against the function directly: eventSize() augments each event with the _eventSize field rather than returning a value, so the comparison must be made against that field in a separate step.

  4. Event Result set.
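
As a variation on the steps above, the size can be written to a differently named field. This is a minimal sketch that assumes eventSize() accepts an as parameter for naming its output field; check the function reference before relying on it. The field name size is a hypothetical choice and is not part of the original query.

```logscale
// Write the event size to a field named "size" (hypothetical name)
// instead of the default _eventSize, then filter on that field.
eventSize(as=size)
| size > 10000
```

The two-step shape is unchanged: the function first adds the field, and the comparison is then applied to that field.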

Summary and Results

The query is used to get an overview of the disk storage usage of individual events and, in this example, to filter for the largest ones. High disk storage usage can cause performance issues, depending on the time range being searched.
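
When a suitable threshold is not known in advance, the overview can be obtained by ranking events by size instead. The following is a sketch only, assuming the default _eventSize field name; the limit of 10 is an arbitrary illustrative value.

```logscale
// Add the _eventSize field to every event.
eventSize()
// Rank events by size, largest first, and keep the top 10.
| sort(_eventSize, order=desc, limit=10)
```

Ranking first can help in choosing a sensible cut-off before switching to a fixed filter such as _eventSize > 10000.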

Sample output from the incoming example data:

| message | user | ip_address | _eventSize |
|---|---|---|---|
| Very long detailed error message with stack trace: Error at line 1234\nStack trace:\ncom.example.Class.method(Class.java:100)\ncom.example.OtherClass.otherMethod(OtherClass.java:200)\ncom.example.MainClass.main(MainClass.java:300)\nCaused by: java.lang.NullPointerException\nat com.example.Class.method(Class.java:100) | bob | 192.168.1.101 | 12500 |
| Another very long message containing detailed system metrics: CPU usage: 85%, Memory: 16GB used of 32GB total, Disk usage: 75% on /dev/sda1, Network: IN=1.2GB/s OUT=800MB/s, Active connections: 1250, Thread count: 500, Active users: 3500, Cache hit ratio: 95%, Database connections: 100/150 | david | 192.168.1.103 | 11200 |

Note that only events with an _eventSize greater than 10000 bytes are included in the results. The _eventSize field shows the internal storage size in bytes for each event.