FDR Ingest Problems by Repository |
Number of distinct FDR feeds with problems over time, per repository.
#category=Fdr | #severity =~ in(values=["Warning", "Error"]) | dataspace=?{repository=*} fdrFeedName=?{fdrFeedName=*}
| timechart(dataspace, limit=50, function=count(field=fdrFeedName, distinct=true))
| Time Chart |
Errors due to Action Invocation |
Overview of errors from invoking actions when a filter alert
triggers.
#category=FilterAlert #severity="Error" subCategory=Action dataspace=?{repository=*} alertName=?{alertName=*}
| groupby([dataspace, alertName, actionName], function=[tail(1), count(as="No. of times failed")])
| rename(@timestamp, as="Last failed")
| rename("dataspace", as="Repository/view")
| rename("alertName", as="Alert name")
| rename("suggestion", as="Last suggestion")
| rename("message", as="Last message")
| rename("exceptionMessage", as="Last exceptionMessage")
| table(["Repository/view", "Alert name", "No. of times failed", "Last failed", "Last message", "Last suggestion", "Last exceptionMessage", "alertId", "#category"], sortby=["No. of times failed", "Repository/view","Alert name"], order=[desc,asc,asc], limit=1000)
| Table |
Errors with User |
Overview of errors with running filter alerts caused by either the
user having been deleted or the user lacking permission to run
the filter alert. Fix this by granting the user the missing
permissions, changing the alert to run as another user, or changing
the alert to run on behalf of the organization.
#category=FilterAlert #severity="Error" subCategory=User dataspace=?{repository=*} alertName=?{alertName=*}
| groupby([dataspace, alertName], function=tail(1))
| rename(@timestamp, as="Last failed")
| rename("dataspace", as="Repository/view")
| rename("alertName", as="Alert name")
| rename("message", as="Message")
| rename("suggestion", as="Suggestion")
| table(["Repository/view", "Alert name", "Last failed", "Message", "Suggestion", "alertId", "#category"], sortby=["Repository/view", "Alert name"], order=[asc,asc], limit=1000)
| Table |
Other Errors |
Overview of errors with running filter alerts other than those
covered by the three lists above.
#category=FilterAlert #severity="Error" subCategory=Alert dataspace=?{repository=*} alertName=?{alertName=*}
| groupby([dataspace, alertName, actionName], function=[tail(1), count(as="No. of times failed")])
| rename(@timestamp, as="Last failed")
| rename("dataspace", as="Repository/view")
| rename("alertName", as="Alert name")
| rename("message", as="Last message")
| rename("warnings", as="Last warnings")
| rename("exceptionMessage", as="Last exceptionMessage")
| table(["Repository/view", "Alert name", "No. of times failed", "Last failed", "Last message", "Last suggestion", "Last exceptionMessage", "alertId", "#category"], sortby=["No. of times failed", "Repository/view","Alert name"], order=[desc,asc,asc], limit=1000)
| Table |
Action Invocation Warnings |
Overview of warnings from invoking actions when a filter alert
triggers. Note that if the filter alert has multiple actions
attached and at least one succeeds, the alert is considered to have
triggered.
#category=FilterAlert #severity="Warning" subCategory=Action dataspace=?{repository=*} alertName=?{alertName=*}
| groupby([dataspace, alertName, actionName], function=[tail(1), count(as="No. of times failed")])
| rename(@timestamp, as="Last failed")
| rename("dataspace", as="Repository/view")
| rename("alertName", as="Alert name")
| rename("actionName", as="Action name")
| rename("message", as="Last message")
| rename("warnings", as="Last warnings")
| rename("exceptionMessage", as="Last exceptionMessage")
| table(["Repository/view", "Alert name", "Action name", "No. of times failed", "Last failed", "Last message", "Last suggestion", "Last exceptionMessage", "alertId", "#category"], sortby=["No. of times failed", "Repository/view","Alert name"], order=[desc,asc,asc], limit=1000)
| Table |
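The note in the description above (an alert with several actions counts as triggered when at least one action succeeds) can be sketched in Python. This only illustrates the stated rule and is not LogScale's implementation; the function name `alert_triggered` and the one-boolean-per-action representation are assumptions made for the example.

```python
def alert_triggered(action_results):
    """Return True if at least one attached action succeeded.

    action_results: list of booleans, one per attached action
    (a hypothetical representation chosen for this sketch).
    """
    return any(action_results)


# One of three actions succeeded: the alert counts as triggered.
mixed = alert_triggered([False, True, False])
# Every action failed: the alert does not count as triggered.
all_failed = alert_triggered([False, False])
```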
Successful Alert Triggers |
This chart displays when alerts triggered successfully.
#category=FilterAlert dataspace=?repository alertName=?alertName
| message="Alert triggered on event and invoked at least one action"
| timechart(alertName)
| Time Chart |
Problems |
Number of error or warning logs per feed as well as the number of
restarts. Unless the feed configuration is changed, a restart
suggests some sort of problem with the feed. Also shows
information about the last problem.
#category=Fdr | #severity =~ in(values=["Warning", "Error"]) | dataspace=?{repository=*} fdrFeedName=?{fdrFeedName=*}
| groupby([dataspace, fdrFeedName], function=[tail(1), count(as="No. of problems"), count(field=streamId, distinct=true, as="restarts")])
| restarts := restarts - 1
| rename(@timestamp, as="Last problem at")
| rename("dataspace", as="Repository")
| rename("fdrFeedName", as="FDR feed name")
| rename("restarts", as="No. of restarts")
| rename("suggestion", as="Last suggestion")
| rename("message", as="Last message")
| rename("exceptionMessage", as="Last exceptionMessage")
| table(["Repository", "FDR feed name", "No. of problems", "No. of restarts", "Last problem at", "Last message", "Last suggestion", "Last exceptionMessage"], sortby=["No. of problems", "Repository", "FDR feed name"], order=[desc,asc,asc], limit=1000)
| Table |
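The restart figure in the query above is the number of distinct `streamId` values per feed minus one, presumably because the first stream is the initial start rather than a restart. A minimal Python sketch of that arithmetic follows. The field names come from the query, but the helper `count_restarts` and the sample events are hypothetical.

```python
from collections import defaultdict


def count_restarts(events):
    """Return {(dataspace, fdrFeedName): distinct streamIds - 1}."""
    streams = defaultdict(set)
    for event in events:
        key = (event["dataspace"], event["fdrFeedName"])
        streams[key].add(event["streamId"])
    return {key: len(ids) - 1 for key, ids in streams.items()}


# Hypothetical sample events, not real log data.
sample = [
    {"dataspace": "repo-a", "fdrFeedName": "feed-1", "streamId": "s1"},
    {"dataspace": "repo-a", "fdrFeedName": "feed-1", "streamId": "s2"},
    {"dataspace": "repo-a", "fdrFeedName": "feed-1", "streamId": "s2"},
    {"dataspace": "repo-b", "fdrFeedName": "feed-2", "streamId": "s9"},
]
restarts = count_restarts(sample)
```

Because the query first filters to warning and error logs, only streams that logged at least one problem contribute to the count.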
Filter Alerts Lagging Behind by Repository/View |
This chart displays how many distinct filter alerts over time per
repository/view are running historic queries to catch up and not
reacting to new events in the meantime.
#category=FilterAlert isLiveQuery=false dataspace=?{repository=*} alertName=?{alertName=*}
| timechart(dataspace, limit=50, function=count(field=alertName, distinct=true), span=10m)
| Time Chart |
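As a rough illustration of what the chart above counts, the following Python sketch groups events into 10-minute buckets (matching `span=10m`) and counts distinct alert names per repository among events where `isLiveQuery` is false. The function name, the sample events, and the use of plain epoch seconds for timestamps are assumptions for the example, not part of LogScale.

```python
from collections import defaultdict

SPAN_SECONDS = 600  # corresponds to span=10m in the query


def lagging_alerts_per_bucket(events):
    """Count distinct lagging alerts per (bucket start, dataspace)."""
    seen = defaultdict(set)
    for event in events:
        if event["isLiveQuery"]:
            continue  # only historic (catch-up) queries count as lagging
        bucket = event["timestamp"] - event["timestamp"] % SPAN_SECONDS
        seen[(bucket, event["dataspace"])].add(event["alertName"])
    return {key: len(names) for key, names in seen.items()}


# Hypothetical sample events with timestamps in epoch seconds.
sample = [
    {"timestamp": 30, "dataspace": "repo-a", "alertName": "a1", "isLiveQuery": False},
    {"timestamp": 90, "dataspace": "repo-a", "alertName": "a1", "isLiveQuery": False},
    {"timestamp": 120, "dataspace": "repo-a", "alertName": "a2", "isLiveQuery": False},
    {"timestamp": 700, "dataspace": "repo-a", "alertName": "a1", "isLiveQuery": True},
    {"timestamp": 650, "dataspace": "repo-b", "alertName": "b1", "isLiveQuery": False},
]
counts = lagging_alerts_per_bucket(sample)
```

An alert that logs several lagging events within one bucket is counted once for that bucket, which mirrors `count(field=alertName, distinct=true)`.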
Filter Alerts Lagging Behind |
Overview of filter alerts that are running historic queries to
catch up and not reacting to new events in the meantime.
#category=FilterAlert isLiveQuery=false dataspace=?{repository=*} alertName=?{alertName=*}
| groupby([dataspace, alertName], function=tail(1))
| rename(@timestamp, as="Last lagging behind")
| rename("dataspace", as="Repository/view")
| rename("alertName", as="Alert name")
| table(["Repository/view", "Alert name", "Last lagging behind", "alertId"], sortby=["Last lagging behind", "Repository/view", "Alert name"], order=[desc,asc,asc], limit=1000)
| Table |
Filter Alert Warnings by Repository/View |
This chart displays how many distinct filter alerts had warnings
over time per repository or view.
#category=FilterAlert #severity="Warning" dataspace=?{repository=*} alertName=?{alertName=*}
| timechart(dataspace, limit=50, function=count(field=alertName, distinct=true), span=10m)
| Time Chart |
Filter Alerts Triggered |
Overview of filter alerts that triggered and successfully invoked
at least one action.
#category=FilterAlert message="Alert triggered on event and invoked at least one action" dataspace=?{repository=*} alertName=?{alertName=*}
| groupby([dataspace, alertName], function=[selectLast(["@timestamp", actionIds, alertId]), count(as="No. of times triggered")])
| rename(@timestamp, as="Last triggered")
| rename("dataspace", as="Repository/view")
| rename("alertName", as="Alert name")
| rename("actionIds", as="Action ids")
| table(["Repository/view", "Alert name", "Action ids", "No. of times triggered", "Last triggered", "alertId"], sortby=["No. of times triggered", "Repository/view", "Alert name"], order=[desc,asc,asc], limit=1000)
| Table |
Query Warnings |
Overview of warnings from running filter alert queries.
#category=FilterAlert #severity="Warning" subCategory=Query dataspace=?{repository=*} alertName=?{alertName=*}
| groupby([dataspace, alertName], function=[tail(1), count(as="No. of times failed")])
| rename(@timestamp, as="Last failed")
| rename("dataspace", as="Repository/view")
| rename("alertName", as="Alert name")
| rename("suggestion", as="Last suggestion")
| rename("message", as="Last message")
| rename("queryWarning", as="Last query warning")
| rename("exceptionMessage", as="Last exceptionMessage")
| table(["Repository/view", "Alert name", "No. of times failed", "Last failed", "Last message", "Last suggestion", "Last exceptionMessage", "Last query warning", "alertId", "#category"], sortby=["No. of times failed", "Repository/view", "Alert name"], order=[desc,asc,asc], limit=1000)
| Table |
Errors with Query |
Overview of errors with running filter alert queries. This can
either be due to an error in the query or due to problems in the
cluster causing errors when trying to run the query.
#category=FilterAlert #severity="Error" subCategory=Query dataspace=?{repository=*} alertName=?{alertName=*}
| groupby([dataspace, alertName, actionName], function=[tail(1), count(as="No. of times failed")])
| rename(@timestamp, as="Last failed")
| rename("dataspace", as="Repository/view")
| rename("alertName", as="Alert name")
| rename("suggestion", as="Last suggestion")
| rename("message", as="Last message")
| rename("exceptionMessage", as="Last exceptionMessage")
| table(["Repository/view", "Alert name", "No. of times failed", "Last failed", "Last message", "Last suggestion", "Last exceptionMessage", "alertId", "#category"], sortby=["No. of times failed", "Repository/view","Alert name"], order=[desc,asc,asc], limit=1000)
| Table |