Monitoring Alert Execution through the humio-activity Repository
The humio/activity package provides a wealth of information about activity within LogScale and should be installed to help monitor alerts.
Examine the category field in the humio-activity repo to track progress and any errors generated when executing alerts.
Alert marks standard alerts
FilterAlert marks filter alerts
The status field indicates either Success or Failure. Repeated entries with a Failure status indicate an error that should be investigated.
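As a minimal sketch of spotting repeated failures, the snippet below counts Failure entries per alert in a list of events that have already been exported from the humio-activity repository as dictionaries. The function name `repeated_failures`, the `threshold` parameter, and the sample events are assumptions made for illustration; only the field names category, status, and alertName come from this page.

```python
from collections import Counter


def repeated_failures(events, threshold=3):
    """Return {alertName: failure_count} for alerts with at least `threshold` failures.

    `events` is assumed to be a list of dicts exported from humio-activity.
    The threshold of 3 is an arbitrary illustrative default.
    """
    failures = Counter(
        event.get("alertName", "<unknown>")
        for event in events
        if event.get("category") in ("Alert", "FilterAlert")
        and event.get("status") == "Failure"
    )
    return {name: count for name, count in failures.items() if count >= threshold}


# Hypothetical sample data; real events carry many more fields.
sample_events = [
    {"category": "Alert", "alertName": "example-alert-a", "status": "Failure"},
    {"category": "Alert", "alertName": "example-alert-a", "status": "Failure"},
    {"category": "Alert", "alertName": "example-alert-a", "status": "Success"},
    {"category": "Alert", "alertName": "example-alert-a", "status": "Failure"},
    {"category": "FilterAlert", "alertName": "example-alert-b", "status": "Failure"},
]

print(repeated_failures(sample_events))
```

With the sample data, only example-alert-a reaches the threshold, so a single failure of example-alert-b is not flagged.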
There are three different Success scenarios for Legacy Alerts:
LogScale successfully polled the alert query, found events to trigger on, and successfully triggered at least one of the associated actions
LogScale successfully polled the alert query, found events to trigger on, but the alert was throttled
LogScale successfully polled the alert query, but found no events to trigger on
For filter alerts, the following scenarios indicate Success:
LogScale successfully polled the alert query, found events to trigger on, and successfully triggered at least one of the associated actions
LogScale successfully polled the alert query, but found no events to trigger on
The subCategory field indicates whether the event relates to the execution of the Alert, Query, or Action.
Checking the severity field will indicate the level of the event:
Info entries indicate that an alert has been triggered, or carry other informational messages. No action is required.
Warning indicates an issue with the alert, with reading the result, or with triggering actions, or that an alert has not been triggered due to throttling. In some cases the warning resolves on its own, but if the message persists, it may require action.
Error indicates an error, for example when running the query or triggering an action. Requires action.
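The severity guidance above can be sketched as a small triage helper. The function name `triage`, the return labels, and the `persist_threshold` parameter are hypothetical; the page only states that Info needs no action, that a persistent Warning may need action, and that Error requires action.

```python
def triage(severity, consecutive_occurrences=1, persist_threshold=3):
    """Map a severity value to a suggested next step.

    `consecutive_occurrences` is how many times in a row the same message
    has been seen; `persist_threshold` is an assumed cut-off for treating
    a Warning as persistent.
    """
    if severity == "Info":
        return "no-action"
    if severity == "Warning":
        # Warnings may resolve on their own; escalate only if they persist.
        if consecutive_occurrences >= persist_threshold:
            return "investigate"
        return "watch"
    if severity == "Error":
        return "investigate"
    raise ValueError(f"unexpected severity: {severity!r}")


print(triage("Warning", consecutive_occurrences=1))
print(triage("Warning", consecutive_occurrences=5))
```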
The following additional fields in each event contain more detailed information for each alert invocation or error; for a full example event, see Alert Raw Event Example:
Field | Description |
---|---|
actionId | ID of the triggered action; only set for the invocation of a specific action |
actionIds | List of action IDs for when an alert has been triggered |
actionInvocationId | Unique ID for the invocation of an action; can be used to correlate logs. As with actionId, only set for the invocation of a specific action |
actionInvocationIds | List of action invocation IDs for when an alert has been triggered |
actionName | Name of the action that generated an error or would have been triggered |
actionName | Name of the triggered action; only set for the invocation of a specific action |
alertId | ID of the alert |
alertName | Name of the alert |
alertTime | The timestamp when the alert was triggered. |
dataspace | Name of the repository or view |
eventId | The ID of the event when an alert triggers on an event |
events | The number of events returned by the query; by default, all queries return a maximum of 200 events. Where no events were returned by the query, the value will be 0. |
eventsToTriggerOn | When polling a filter alert query |
exceptionMessage | A detailed error message that will include errors at the cluster-level that may have contributed; for example permission, API, or network issues |
externalQueryId | The external id of the running query |
lastAlertTime | The timestamp of the last time the alert triggered |
message | The error or warning message for the alert |
query | The alert query executed |
queryProcessedEvents | The number of events processed to return the final result set. |
queryTimeMillis | Time in milliseconds since the underlying live query of the alert was started. |
status | Indicates whether the alert was successful (value Success) or failed (value Failure). An individual failure may be triggered for multiple reasons, but repeated failures over a period of time may indicate a problem that needs investigation. |
suggestion | A guide to the warning or error and how to resolve or identify more information |
user | The user the query runs on behalf of (run-as-user) |
viewId | ID of the view for the alert |
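As an illustration of how the fields above might be combined when reviewing a single event, the following sketch builds a short text summary from an event represented as a dictionary. The function name `summarize_alert_event` and the sample values are placeholders invented for this example; they are not real LogScale output.

```python
def summarize_alert_event(event):
    """Build a short, human-readable summary of one alert activity event."""
    header = "{name} [{status}] severity={severity}".format(
        name=event.get("alertName", "<unknown alert>"),
        status=event.get("status", "<unknown>"),
        severity=event.get("severity", "<unknown>"),
    )
    # Only include diagnostic fields that are present and non-empty.
    details = [
        f"  {field}: {event[field]}"
        for field in ("message", "suggestion", "exceptionMessage")
        if event.get(field)
    ]
    return "\n".join([header] + details)


# Hypothetical event with placeholder values.
sample_event = {
    "alertName": "example-alert",
    "status": "Failure",
    "severity": "Error",
    "message": "placeholder message",
    "suggestion": "placeholder suggestion",
}

print(summarize_alert_event(sample_event))
```

Fields that are absent from an event, such as exceptionMessage in the sample, are simply omitted from the summary.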
Alert Errors and Resolutions
When investigating errors, use the message and suggestion fields for guidance on why an issue has occurred. The pages below describe each message type.