Alert Query for Parser Issues

Reporting errors

Query

flowchart LR; %%{init: {"flowchart": {"defaultRenderer": "elk"}} }%% repo{{Events}} 1[/Filter/] 2[/Filter/] 3[/Filter/] 4[/Filter/] 5{{Aggregate}} 6["Expression"] 7["Expression"] result{{Result Set}} repo --> 1 1 --> 2 2 --> 3 3 --> 4 4 --> 5 5 --> 6 6 --> 7 7 --> result
logscale
#type=humio #kind=logs
| loglevel=WARN
| class = c.h.d.ParserLimitingJob
| "Setting reject ingest for"
| groupBy(id, function=[count(), min(@timestamp), max(@timestamp)] )
| timeDiff:=_max-_min
| timeDiff > 300000 and _count > 10

Introduction

The groupBy() function can be used to aggregate events across one or more fields and apply multiple aggregate functions simultaneously to each group. By combining groupBy() with count(), min(), and max(), it is possible to produce a compact summary of event activity per group, including the total number of occurrences and the time boundaries within which those occurrences were observed. This makes groupBy() well suited for alert queries where both the frequency and the duration of an issue must exceed a threshold before an alert is triggered.

This alert query tries to strike a balance: it reacts when there are problems with parsers, without being too restrictive.

In this example, the groupBy() function is used to group parser warning events by their id field, and aggregate counts and timestamp boundaries to identify sustained periods of parser rejection activity.
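To make the aggregation concrete, the following is a minimal Python sketch of what grouping with a count and timestamp boundaries computes. It is not LogScale code and does not claim to reflect how groupBy() is implemented; the function name summarize_by and the toy events are assumptions made purely for illustration.

```python
from collections import defaultdict

def summarize_by(events, key):
    """Return {group: {"_count", "_min", "_max"}} for the given key field."""
    groups = defaultdict(list)
    for event in events:
        groups[event[key]].append(event["@timestamp"])
    return {
        group: {"_count": len(stamps), "_min": min(stamps), "_max": max(stamps)}
        for group, stamps in groups.items()
    }

# Hypothetical events; timestamps are epoch milliseconds.
events = [
    {"id": "repo-a", "@timestamp": 1000},
    {"id": "repo-b", "@timestamp": 1500},
    {"id": "repo-a", "@timestamp": 4000},
    {"id": "repo-a", "@timestamp": 2500},
]

summary = summarize_by(events, "id")
print(summary["repo-a"])  # {'_count': 3, '_min': 1000, '_max': 4000}
```

Each group yields a single summary row, which is why the frequency and the duration of an issue can later be tested together in one expression.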

Example incoming data might look like this:

| @timestamp | #type | #kind | loglevel | class | id | @rawstring |
|---|---|---|---|---|---|---|
| 1742032800000 | humio | logs | WARN | c.h.d.ParserLimitingJob | repo-alpha | Setting reject ingest for repo-alpha due to parser limit exceeded |
| 1742032830000 | humio | logs | INFO | c.h.d.ParserLimitingJob | repo-alpha | Parser job started for repo-alpha |
| 1742032860000 | humio | logs | WARN | c.h.d.ParserLimitingJob | repo-alpha | Setting reject ingest for repo-alpha due to parser limit exceeded |
| 1742032920000 | humio | logs | WARN | c.h.d.ParserLimitingJob | repo-alpha | Setting reject ingest for repo-alpha due to parser limit exceeded |
| 1742032980000 | humio | logs | WARN | c.h.d.ParserLimitingJob | repo-alpha | Setting reject ingest for repo-alpha due to parser limit exceeded |
| 1742033040000 | humio | logs | ERROR | c.h.d.IngestPipeline | repo-alpha | Ingest pipeline error for repo-alpha |
| 1742033100000 | humio | logs | WARN | c.h.d.ParserLimitingJob | repo-alpha | Setting reject ingest for repo-alpha due to parser limit exceeded |
| 1742033160000 | humio | logs | WARN | c.h.d.ParserLimitingJob | repo-alpha | Setting reject ingest for repo-alpha due to parser limit exceeded |
| 1742033220000 | humio | logs | WARN | c.h.d.ParserLimitingJob | repo-alpha | Setting reject ingest for repo-alpha due to parser limit exceeded |
| 1742033280000 | humio | logs | WARN | c.h.d.ParserLimitingJob | repo-alpha | Setting reject ingest for repo-alpha due to parser limit exceeded |
| 1742033340000 | humio | logs | INFO | c.h.d.ParserLimitingJob | repo-alpha | Parser throughput nominal for repo-alpha |
| 1742033400000 | humio | logs | WARN | c.h.d.ParserLimitingJob | repo-alpha | Setting reject ingest for repo-alpha due to parser limit exceeded |
| 1742033460000 | humio | logs | WARN | c.h.d.ParserLimitingJob | repo-alpha | Setting reject ingest for repo-alpha due to parser limit exceeded |
| 1742033520000 | humio | logs | WARN | c.h.d.ParserLimitingJob | repo-alpha | Setting reject ingest for repo-alpha due to parser limit exceeded |
| 1742033580000 | humio | logs | WARN | c.h.d.ParserLimitingJob | repo-alpha | Setting reject ingest for repo-alpha due to parser limit exceeded |
| 1742033640000 | humio | logs | WARN | c.h.d.ParserLimitingJob | repo-alpha | Setting reject ingest for repo-alpha due to parser limit exceeded |
| 1742032900000 | humio | logs | WARN | c.h.d.ParserLimitingJob | repo-beta | Setting reject ingest for repo-beta due to parser limit exceeded |
| 1742032960000 | humio | logs | WARN | c.h.d.ParserLimitingJob | repo-beta | Setting reject ingest for repo-beta due to parser limit exceeded |
| 1742033020000 | humio | logs | DEBUG | c.h.d.ParserLimitingJob | repo-beta | Parser debug trace for repo-beta |
| 1742033080000 | humio | logs | WARN | c.h.d.ParserLimitingJob | repo-beta | Setting reject ingest for repo-beta due to parser limit exceeded |
| 1742033140000 | humio | logs | WARN | c.h.d.ParserLimitingJob | repo-beta | Setting reject ingest for repo-beta due to parser limit exceeded |
| 1742033200000 | humio | logs | WARN | c.h.d.ParserLimitingJob | repo-beta | Setting reject ingest for repo-beta due to parser limit exceeded |
| 1742033260000 | humio | logs | WARN | c.h.d.ParserLimitingJob | repo-beta | Setting reject ingest for repo-beta due to parser limit exceeded |
| 1742033320000 | humio | logs | INFO | c.h.d.IngestPipeline | repo-beta | Ingest pipeline resumed for repo-beta |
| 1742033380000 | humio | logs | WARN | c.h.d.ParserLimitingJob | repo-beta | Setting reject ingest for repo-beta due to parser limit exceeded |
| 1742033440000 | humio | logs | WARN | c.h.d.ParserLimitingJob | repo-beta | Setting reject ingest for repo-beta due to parser limit exceeded |
| 1742033500000 | humio | logs | WARN | c.h.d.ParserLimitingJob | repo-beta | Setting reject ingest for repo-beta due to parser limit exceeded |
| 1742033560000 | humio | logs | WARN | c.h.d.ParserLimitingJob | repo-beta | Setting reject ingest for repo-beta due to parser limit exceeded |
| 1742033620000 | humio | logs | WARN | c.h.d.ParserLimitingJob | repo-beta | Setting reject ingest for repo-beta due to parser limit exceeded |
| 1742033680000 | humio | logs | WARN | c.h.d.ParserLimitingJob | repo-beta | Setting reject ingest for repo-beta due to parser limit exceeded |
| 1742032950000 | humio | logs | WARN | c.h.d.ParserLimitingJob | repo-gamma | Setting reject ingest for repo-gamma due to parser limit exceeded |
| 1742033150000 | humio | logs | WARN | c.h.d.ParserLimitingJob | repo-gamma | Setting reject ingest for repo-gamma due to parser limit exceeded |
| 1742033350000 | humio | logs | ERROR | c.h.d.ParserLimitingJob | repo-gamma | Parser error encountered for repo-gamma |
| 1742033550000 | humio | logs | WARN | c.h.d.ParserLimitingJob | repo-gamma | Setting reject ingest for repo-gamma due to parser limit exceeded |
| 1742033750000 | humio | logs | WARN | c.h.d.ParserLimitingJob | repo-gamma | Setting reject ingest for repo-gamma due to parser limit exceeded |
| 1742033100000 | humio | logs | WARN | c.h.d.ParserLimitingJob | repo-delta | Setting reject ingest for repo-delta due to parser limit exceeded |
| 1742033115000 | humio | logs | WARN | c.h.d.ParserLimitingJob | repo-delta | Setting reject ingest for repo-delta due to parser limit exceeded |
| 1742033130000 | humio | logs | WARN | c.h.d.ParserLimitingJob | repo-delta | Setting reject ingest for repo-delta due to parser limit exceeded |
| 1742033145000 | humio | logs | INFO | c.h.d.ParserLimitingJob | repo-delta | Parser job heartbeat for repo-delta |
| 1742033160000 | humio | logs | WARN | c.h.d.ParserLimitingJob | repo-delta | Setting reject ingest for repo-delta due to parser limit exceeded |
| 1742033175000 | humio | logs | WARN | c.h.d.ParserLimitingJob | repo-delta | Setting reject ingest for repo-delta due to parser limit exceeded |
| 1742033190000 | humio | logs | WARN | c.h.d.ParserLimitingJob | repo-delta | Setting reject ingest for repo-delta due to parser limit exceeded |
| 1742033205000 | humio | logs | WARN | c.h.d.ParserLimitingJob | repo-delta | Setting reject ingest for repo-delta due to parser limit exceeded |
| 1742033220000 | humio | logs | WARN | c.h.d.ParserLimitingJob | repo-delta | Setting reject ingest for repo-delta due to parser limit exceeded |
| 1742033235000 | humio | logs | WARN | c.h.d.ParserLimitingJob | repo-delta | Setting reject ingest for repo-delta due to parser limit exceeded |
| 1742033250000 | humio | logs | WARN | c.h.d.ParserLimitingJob | repo-delta | Setting reject ingest for repo-delta due to parser limit exceeded |
| 1742033265000 | humio | logs | WARN | c.h.d.ParserLimitingJob | repo-delta | Setting reject ingest for repo-delta due to parser limit exceeded |

Step-by-Step

  1. Starting with the source repository events.

  2. flowchart LR; %%{init: {"flowchart": {"defaultRenderer": "elk"}} }%% repo{{Events}} 1[/Filter/] 2[/Filter/] 3[/Filter/] 4[/Filter/] 5{{Aggregate}} 6["Expression"] 7["Expression"] result{{Result Set}} repo --> 1 1 --> 2 2 --> 3 3 --> 4 4 --> 5 5 --> 6 6 --> 7 7 --> result style 1 fill:#ff0000,stroke-width:4px,stroke:#000;
    logscale
    #type=humio #kind=logs

    Filters for events tagged with #type=humio and #kind=logs, which selects all logs across all hosts in the cluster.

  3. flowchart LR; %%{init: {"flowchart": {"defaultRenderer": "elk"}} }%% repo{{Events}} 1[/Filter/] 2[/Filter/] 3[/Filter/] 4[/Filter/] 5{{Aggregate}} 6["Expression"] 7["Expression"] result{{Result Set}} repo --> 1 1 --> 2 2 --> 3 3 --> 4 4 --> 5 5 --> 6 6 --> 7 7 --> result style 2 fill:#ff0000,stroke-width:4px,stroke:#000;
    logscale
    | loglevel=WARN

    Filters for all events where the loglevel is equal to WARN.

  4. flowchart LR; %%{init: {"flowchart": {"defaultRenderer": "elk"}} }%% repo{{Events}} 1[/Filter/] 2[/Filter/] 3[/Filter/] 4[/Filter/] 5{{Aggregate}} 6["Expression"] 7["Expression"] result{{Result Set}} repo --> 1 1 --> 2 2 --> 3 3 --> 4 4 --> 5 5 --> 6 6 --> 7 7 --> result style 3 fill:#ff0000,stroke-width:4px,stroke:#000;
    logscale
    | class = c.h.d.ParserLimitingJob

    Filters for events where the class field equals c.h.d.ParserLimitingJob. Events originating from other classes, such as c.h.d.IngestPipeline, are discarded at this step.

  5. flowchart LR; %%{init: {"flowchart": {"defaultRenderer": "elk"}} }%% repo{{Events}} 1[/Filter/] 2[/Filter/] 3[/Filter/] 4[/Filter/] 5{{Aggregate}} 6["Expression"] 7["Expression"] result{{Result Set}} repo --> 1 1 --> 2 2 --> 3 3 --> 4 4 --> 5 5 --> 6 6 --> 7 7 --> result style 4 fill:#ff0000,stroke-width:4px,stroke:#000;
    logscale
    | "Setting reject ingest for"

    Filters for events containing the string "Setting reject ingest for". This is the message logged when ingested events are rejected.

    Events that match the class and loglevel filters but do not contain this string are discarded at this step.

  6. flowchart LR; %%{init: {"flowchart": {"defaultRenderer": "elk"}} }%% repo{{Events}} 1[/Filter/] 2[/Filter/] 3[/Filter/] 4[/Filter/] 5{{Aggregate}} 6["Expression"] 7["Expression"] result{{Result Set}} repo --> 1 1 --> 2 2 --> 3 3 --> 4 4 --> 5 5 --> 6 6 --> 7 7 --> result style 5 fill:#ff0000,stroke-width:4px,stroke:#000;
    logscale
    | groupBy(id, function=[count(), min(@timestamp), max(@timestamp)] )

    Groups the filtered events by the id field, counts the events in each group, and returns the minimum and maximum timestamp of each group. This returns a new event set with the fields id, _count, _min, and _max.

  7. flowchart LR; %%{init: {"flowchart": {"defaultRenderer": "elk"}} }%% repo{{Events}} 1[/Filter/] 2[/Filter/] 3[/Filter/] 4[/Filter/] 5{{Aggregate}} 6["Expression"] 7["Expression"] result{{Result Set}} repo --> 1 1 --> 2 2 --> 3 3 --> 4 4 --> 5 5 --> 6 6 --> 7 7 --> result style 6 fill:#ff0000,stroke-width:4px,stroke:#000;
    logscale
    | timeDiff:=_max-_min

    Calculates the difference between the maximum and minimum timestamp values and stores the result in a new field named timeDiff. Because @timestamp is expressed in milliseconds, timeDiff is also in milliseconds.

  8. flowchart LR; %%{init: {"flowchart": {"defaultRenderer": "elk"}} }%% repo{{Events}} 1[/Filter/] 2[/Filter/] 3[/Filter/] 4[/Filter/] 5{{Aggregate}} 6["Expression"] 7["Expression"] result{{Result Set}} repo --> 1 1 --> 2 2 --> 3 3 --> 4 4 --> 5 5 --> 6 6 --> 7 7 --> result style 7 fill:#ff0000,stroke-width:4px,stroke:#000;
    logscale
    | timeDiff > 300000 and _count > 10

    Returns all groups where the value of timeDiff is greater than 300000 (5 minutes, in milliseconds) and where there are more than 10 occurrences.

  9. Event Result set.
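The four filter steps above can be read as a single predicate that every event must satisfy before it reaches the aggregation. The Python sketch below is an illustration only, not LogScale code: the names is_reject_warning and make_event are hypothetical, and the substring test is a simplification that is case-sensitive and inspects only @rawstring. The three events are taken from the example data.

```python
def is_reject_warning(event):
    """Mirror the four filter steps: tags, loglevel, class, message text."""
    return (
        event["#type"] == "humio"
        and event["#kind"] == "logs"
        and event["loglevel"] == "WARN"
        and event["class"] == "c.h.d.ParserLimitingJob"
        and "Setting reject ingest for" in event["@rawstring"]
    )

def make_event(loglevel, cls, raw):
    # Hypothetical helper for building events; not part of LogScale.
    return {
        "#type": "humio",
        "#kind": "logs",
        "loglevel": loglevel,
        "class": cls,
        "@rawstring": raw,
    }

kept = make_event(
    "WARN",
    "c.h.d.ParserLimitingJob",
    "Setting reject ingest for repo-alpha due to parser limit exceeded",
)
wrong_level = make_event(
    "INFO", "c.h.d.ParserLimitingJob", "Parser job started for repo-alpha"
)
wrong_class = make_event(
    "ERROR", "c.h.d.IngestPipeline", "Ingest pipeline error for repo-alpha"
)

results = [is_reject_warning(e) for e in (kept, wrong_level, wrong_class)]
print(results)  # [True, False, False]
```

Only the first event survives; the INFO event fails the loglevel filter and the c.h.d.IngestPipeline event fails both the loglevel and class filters.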

Summary and Results

The query is used to set up alerts for parser issues. Such alerts allow proactive identification of repositories where ingest is being throttled due to sustained parser rejection activity.

This query is useful, for example, to identify repositories that have been experiencing more than 10 parser rejection events over a period exceeding 5 minutes, enabling support teams to reach out proactively to affected customers.

Sample output from the incoming example data:

| id | _count | _min | _max | timeDiff |
|---|---|---|---|---|
| repo-alpha | 13 | 1742032800000 | 1742033640000 | 840000 |
| repo-beta | 12 | 1742032900000 | 1742033680000 | 780000 |
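To see why only two of the four repositories appear in the output, the threshold test can be replayed by hand. The Python sketch below is an illustration, not LogScale code. The counts and timestamp boundaries for repo-gamma and repo-delta are not part of the documented output; they were tallied from the matching WARN events in the example data, and the names MIN_DURATION_MS, MIN_COUNT, and evaluate are hypothetical.

```python
MIN_DURATION_MS = 300_000  # 5 minutes expressed in milliseconds
MIN_COUNT = 10

# (_count, _min, _max) per id, tallied from the matching events in the example data.
groups = {
    "repo-alpha": (13, 1742032800000, 1742033640000),
    "repo-beta": (12, 1742032900000, 1742033680000),
    "repo-gamma": (4, 1742032950000, 1742033750000),
    "repo-delta": (11, 1742033100000, 1742033265000),
}

def evaluate(count, first, last):
    """Return (timeDiff, alert) using the same test as the final query step."""
    time_diff = last - first
    return time_diff, time_diff > MIN_DURATION_MS and count > MIN_COUNT

report = {repo: evaluate(*stats) for repo, stats in groups.items()}
for repo, (time_diff, alert) in report.items():
    print(repo, time_diff, alert)
# repo-alpha 840000 True
# repo-beta 780000 True
# repo-gamma 800000 False
# repo-delta 165000 False
```

repo-gamma spans more than 5 minutes but has only 4 matching events, while repo-delta has 11 matching events packed into 165 seconds, so neither satisfies both conditions.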