Aggregate Alerts
An aggregate alert is the recommended alert type for any query containing aggregate functions. The alert is triggered when the query returns an aggregate result with one or more rows. Aggregate alerts guarantee at-least-once delivery to the actions, which makes results more robust even with ingest delays of up to 24 hours or errors in the cluster or the query.
Aggregate alerts have the following attributes and behaviors.
Aggregate alerts execute a live query within a search interval and return the results from an aggregate query to act as the content (and data) for the alert.
An alert is triggered against the query only when the query returns one or more results.
All the values within the result set from the query are available when triggering an action.
Events matching an aggregate query can also be sent to Actions. See Sending Aggregate Results to Actions.
Aggregate alerts can be throttled to prevent the query triggering a configured action too frequently. See Setting Alert Throttle Period.
If the configured throttle period matches the search interval, queries are run back-to-back, meaning that each interval starts right where the previous one ended. This way, no time interval is missed or considered twice during query execution.
If there are instabilities in the system, failed queries are rerun for up to 24 hours so that alerts can catch up if the system has been down. This behavior guarantees reliability of the alert in case of an infrastructure failure.
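The back-to-back behavior and the 24-hour catch-up limit can be illustrated with a minimal Python sketch. This is not LogScale code: the function names are hypothetical, and measuring the 24 hours from the end of the failed interval is an assumption made only for illustration.

```python
from datetime import datetime, timedelta, timezone

def back_to_back_windows(first_start, interval, count):
    """Yield (start, end) pairs where each window starts at the previous end,
    so no time range is skipped or covered twice."""
    start = first_start
    for _ in range(count):
        end = start + interval
        yield start, end
        start = end

def windows_to_retry(failed_windows, now, max_age=timedelta(hours=24)):
    """Keep only failed windows that are still inside the catch-up period.
    Assumption: the age is measured from the end of the window."""
    return [(s, e) for (s, e) in failed_windows if now - e <= max_age]

t0 = datetime(2024, 1, 1, tzinfo=timezone.utc)
windows = list(back_to_back_windows(t0, timedelta(minutes=5), 3))
```

Here each window's start equals the previous window's end, which is what prevents an interval from being missed or counted twice.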
Aggregate alerts DO NOT support: Join Query Functions, explicit bucketing functions like bucket() and timeChart(), start(), end(), and now() functions.
Aggregate alerts may contain Filtering Query Functions, but the whole query must contain at least one aggregate function.
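As a rough illustration, a query string could be checked for the unsupported functions before it is saved as an aggregate alert. The sketch below is a hypothetical helper, not part of LogScale. It is a crude text match that does not parse the query language, so it can report false positives (for example, a function name inside a string literal). Join function names are not listed in the text, so they are left as a caller-supplied parameter.

```python
import re

# Function names taken from the restriction described above.
UNSUPPORTED = ("bucket", "timeChart", "start", "end", "now")

def find_unsupported(query, extra=()):
    """Return the unsupported function names that appear as calls in `query`.

    `extra` lets the caller add further names (e.g. join functions);
    which names to add is left to the caller.
    """
    found = []
    for name in UNSUPPORTED + tuple(extra):
        # Match `name(` only when it is not the tail of a longer identifier.
        if re.search(r"(?<!\w)" + re.escape(name) + r"\s*\(", query):
            found.append(name)
    return found

bad = find_unsupported("#type=accesslog | timeChart(function=count())")
ok = find_unsupported("status>=500 | groupBy(host, function=count())")
```

In this sketch, `bad` contains `timeChart` while `ok` is empty.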
In case of query warnings, aggregate alerts will not fail. Query warnings are of four different types:
Warnings that are ignored, since they cannot happen or do not matter.
Warnings that are logged and reported in the status of the alert, but do not stop the alert from running.
Warnings that are treated as errors, since they mean that the result can be completely wrong.
Warnings about missing data. If they happen in a live query, they are treated based on the triggering mode, similar to how ingest delays are handled (see FAQ: How Does LogScale Handle Ingest Delays in Aggregate Alerts). If they happen when running a historic query, the query is retried for up to 10 minutes, after which the result that was returned is used as-is.
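The four warning categories can be summarized as a small decision function. This is a Python sketch of the rules described above, not LogScale's implementation: the enum, the function, and the returned labels are all hypothetical names chosen for illustration.

```python
from enum import Enum

class WarningKind(Enum):
    IGNORED = "ignored"            # cannot happen or does not matter
    REPORTED = "reported"          # logged and shown in the alert status
    ERROR = "error"                # result could be completely wrong
    MISSING_DATA = "missing_data"  # some data was not available

HISTORIC_RETRY_LIMIT_SECONDS = 10 * 60  # "up to 10 minutes" from the text

def handle_warning(kind, *, live, seconds_retrying=0):
    """Return a label describing how a warning of this kind is treated."""
    if kind is WarningKind.IGNORED:
        return "continue"
    if kind is WarningKind.REPORTED:
        return "log_and_continue"
    if kind is WarningKind.ERROR:
        return "treat_as_error"
    # Missing data: the treatment depends on live vs. historic query.
    if live:
        return "apply_triggering_mode"
    if seconds_retrying < HISTORIC_RETRY_LIMIT_SECONDS:
        return "retry"
    return "use_result_as_is"
```

For example, a missing-data warning on a historic query that has already been retried for 10 minutes falls through to `use_result_as_is`.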
The humio-activity repository lists all activity logs, each with a suggestion of how to resolve the issue depending on its level of severity. See Monitor Alerts with humio-activity Repository for more information.
The environment variable ENABLE_AGGREGATE_ALERTS must be set to true on every host in the cluster.
If an alert query is not executing, it is advised to wait for 20 minutes, after which it should get restarted. If restarting the query is not possible, or if you cannot wait that long, it is recommended to disable the alert, wait for one minute and then enable the alert again. This will "reset" the alert to only run from now on and not retry any missed data.