Diagnosing Alerts

When errors occur while managing alerts, warnings and errors are reported in the Notifications area of the user interface and, in more detail, in the humio-activity repository (see Monitor Alerts with humio-activity Repository).

Whenever an alert fails, whether due to errors in the query that triggers it or in the way an action is configured, error notifications are sent. There will be at most one notification per alert.

Errors or warnings are generated at different points in the execution depending on the alert type, and cleared using different criteria:

| Condition | Aggregate Alert | Filter Alert | Legacy Alert |
|---|---|---|---|
| Query start | Yes, per alert | Yes, per alert | Yes, per alert |
| Query poll | Yes, per alert | Yes, per alert | Yes, per alert |
| Trigger action | Yes, per action against the aggregate result | Yes, per action and event. Individual events track both whether they were triggered and whether an action was started successfully. | Yes, per action against the aggregate result |

Because aggregate and filter alerts by default retry sending events to actions for up to 24 hours, failure notifications keep reappearing in the UI Notifications area for every failed alert, for as long as the error stays on the alert.

An error with an action in a filter alert is only shown in the UI if the alert has not successfully triggered on the failing event or on a later event. If a later action succeeds, the error is cleared and no indication is given.
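The clearing rule for filter alerts can be illustrated with a small model. This is a sketch of the behavior described above, not LogScale code: the `EventOutcome` type and the `ui_error` function are hypothetical names introduced for illustration, and each list entry is assumed to hold the latest action outcome for one event, in event order.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class EventOutcome:
    """Latest action outcome for one event (hypothetical type)."""
    event_id: str
    succeeded: bool
    error: Optional[str] = None


def ui_error(outcomes: List[EventOutcome]) -> Optional[str]:
    """Return the error the UI would show under this model, or None.

    An error stays visible only if no event at or after the failing
    one has triggered successfully.
    """
    visible = None
    for outcome in outcomes:
        if outcome.succeeded:
            visible = None  # a later success clears any earlier error
        else:
            visible = outcome.error
    return visible


history = [
    EventOutcome("e1", succeeded=False, error="webhook timeout"),
    EventOutcome("e2", succeeded=True),
]
print(ui_error(history[:1]))  # webhook timeout
print(ui_error(history))      # None (e2 succeeded after e1 failed)
```

In this model the failure on `e1` leaves no trace in the UI once `e2` has triggered successfully, which is why the humio-activity repository is the place to look for the full history.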

When analyzing errors and warnings for alerts, the following additional factors should be taken into consideration:

  • Errors when running an alert will be stored and also set on the alert as an error, so that they can be seen on the properties' overview page.

  • Errors in legacy and aggregate alerts with multiple actions attached. If only some of the actions fail to run, the failure is logged but no error is set on the alert. The alert is considered to have fired and is throttled as normal. An error is set on the alert only if all actions fail.

    For filter alerts, this information is tracked for each event.

  • Warnings aimed at discouraging queries that include a live join() function (legacy alerts only). For more information, see Errors when Using Live join() Functions.

  • Warnings when running an alert will be stored so that they can be seen in the Alerts overview page.

  • The alert status might show warnings on alert queries at startup. This may indicate that LogScale is trying to catch up on ingested data. By default, alerts trigger despite most of these query warnings.

  • Errors that require some user interaction, for instance an error in the alert query or the user having lost permission to run the query.

  • Warnings that require some user interaction, for instance a warning on too many groups in a groupBy() function invocation in the alert query. In such a case, the alert will still trigger, if relevant, but you should consider rewriting the query.

  • Warnings due to the alert query only returning partial results, which may trigger the alert when it should not have been triggered, or make the alert only return some of the events it would otherwise have returned.
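
The rule for legacy and aggregate alerts with several actions attached (an error is set on the alert only when all actions fail) can be sketched as follows. This is a minimal illustration under stated assumptions, not LogScale's implementation: the function names and the action names in the example are hypothetical. For filter alerts the same bookkeeping would be kept per event.

```python
from typing import Dict, List


def alert_has_error(action_ok: Dict[str, bool]) -> bool:
    """True only when at least one action ran and every action failed."""
    return bool(action_ok) and not any(action_ok.values())


def failed_actions(action_ok: Dict[str, bool]) -> List[str]:
    """Names of failed actions; these are logged even without an alert error."""
    return [name for name, ok in action_ok.items() if not ok]


# Hypothetical run: one of two attached actions fails.
partial = {"email": True, "slack": False}
print(alert_has_error(partial))   # False: the alert counts as fired
print(failed_actions(partial))    # ['slack']

# Hypothetical run: every attached action fails.
total = {"email": False, "slack": False}
print(alert_has_error(total))     # True: an error is set on the alert
```

A partial failure therefore never surfaces as an alert error in this model, so individual failed actions have to be found in the logs.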