Best Practice: Query Monitoring - Blocking and Termination

Query monitoring streamlines the process of maintaining your desired checks by letting you view live queries as they execute within your organization in LogScale. How well you understand the mechanics of query monitoring is important to long-term success with the tool.

Query Work

The query monitoring page is the primary location in the UI for viewing ongoing queries, along with their resource usage and detailed information about processes. Each query running in LogScale uses CPU and I/O resources at varying levels; this is often referred to as ‘query work’. The cost of a query can be used to compare queries against one another, but not to rate or identify work for a single query. Using the Organization Query Monitor, administrators can determine which queries have the largest impact, whether positive or negative.

Considerations

LogScale recommends weighing a few basic considerations when deciding whether to block or terminate a query. The two main ones are cost and duration. Queries that use too many resources can be terminated, which stops the running instance, or blocked, which prevents the query from running again until the block is removed.

Cost: Cost should not be abnormally high. Any query that costs 10x what other queries do should be evaluated for potential block/termination.

Duration: Query duration exceeding 6 hours should be evaluated for potential block/termination. Note that any query with a duration greater than 12 hours is likely a stuck query.
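
The cost and duration thresholds above can be expressed as a small check. The following is a minimal sketch in Python, not a LogScale API: the function name, the input values, and the choice of the median cost of the other running queries as the comparison baseline are all assumptions made for illustration.

```python
from statistics import median

COST_MULTIPLIER = 10     # "10x what other queries do"
REVIEW_AGE_HOURS = 6     # evaluate for potential block/termination
STUCK_AGE_HOURS = 12     # likely a stuck query


def review_reasons(cost, age_hours, other_costs):
    """Return the reasons a query should be evaluated (empty list if none).

    Assumption: "what other queries do" is approximated by the median
    cost of the other running queries; pick a baseline that suits you.
    """
    reasons = []
    if other_costs:
        baseline = median(other_costs)
        if baseline > 0 and cost >= COST_MULTIPLIER * baseline:
            reasons.append("cost is 10x or more the typical query cost")
    if age_hours > STUCK_AGE_HOURS:
        reasons.append("running longer than 12 hours, likely stuck")
    elif age_hours > REVIEW_AGE_HOURS:
        reasons.append("running longer than 6 hours")
    return reasons
```

An empty result means neither consideration applies. A non-empty result is only a prompt to evaluate the query, not a decision to block or terminate it.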

Query Syntax

The way a query is structured greatly impacts its performance. It's important to be thoughtful when constructing queries, particularly when they run on a schedule. For more information, visit our LogScale Query Language best practices: Writing Better Queries

The Query Monitor UI

Several components of the Organization Query Monitor UI are worth knowing in order to use it properly, including:

  • The ‘Query details’ tab

    • This tab displays the syntax of the query, which helps you judge the query's feasibility based on how it is constructed.

  • The ‘Initiated by’ field (query owner)

    • This field tells you who initiated the query.

  • The ‘Age’ field (duration)

    • This field shows the age of the query, i.e. how long it has been running.

  • The ‘Total cost’ field

    • This field tells you how costly the query is.

  • The ‘Block and kill’ tab

    • This tab houses the buttons to block and kill a query on demand. You can block a specific query from running until it's removed from the blocklist.

    • Alternatively, you can terminate only the running instance of a specific query without blocking it.

Note

The ‘Block and kill’ tab is user-specific: its availability depends on role-based access.

For more details on the UI and its components, see our documentation: Organization Query Monitor

Query Monitoring Best Practices

Mitigating problematic queries is a matter of evaluation and choosing the appropriate response. To evaluate queries, you may look at several factors, including the total cost of a query, the age of a query, and a query's status. These are all located in their respective columns, which can be adjusted to give you better insights.

Total Cost, Age, and Status columns

The Total Cost, Age, and Status columns can be sorted in descending order, which helps identify runaway queries. The ‘Age’ column in particular can help identify hanging or ‘stuck’ queries when sorted in descending order. By checking the ‘Status’ column together with the ‘Age’ column, you can spot these queries: look for ones that are nearing completion yet have an age of 6 hours or more. Similarly, a query with a ‘Total Cost’ of zero that is at or over 6 hours old is likely stuck and unlikely to complete.

Note

A good baseline for defining ‘nearing completion’ is 85% or more complete.
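
The two stuck-query patterns described above can be written as a single predicate. This is a minimal sketch in Python rather than anything LogScale provides: the function name and its three parameters are hypothetical stand-ins for the values you read from the Status, Age, and Total Cost columns.

```python
NEAR_COMPLETION_PERCENT = 85   # baseline for "nearing completion"
STUCK_AGE_HOURS = 6            # age at which a query becomes suspect


def looks_stuck(status_percent, age_hours, total_cost):
    """Heuristic: True if a query matches either stuck-query pattern.

    Pattern 1: nearing completion (85% or more) but 6 or more hours old.
    Pattern 2: Total Cost of zero and 6 or more hours old.
    """
    if age_hours < STUCK_AGE_HOURS:
        return False
    nearly_done = status_percent >= NEAR_COMPLETION_PERCENT
    no_work = total_cost == 0
    return nearly_done or no_work
```

A query that is 7 hours old and 90% complete matches the first pattern, while one that is 20 minutes old and 73% complete matches neither and is still progressing.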

General Rules

A few general rules for engaging in query monitoring:

  • Any query with a Total Cost above 350K should be watched.

  • Any query with a Total Cost above 1M is likely preventing other users from using LogScale.
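
These two rules amount to a simple tiering of queries by Total Cost. The sketch below is illustrative Python, not a LogScale feature: the tier labels, the `triage` helper, and the row dictionaries are hypothetical, and it assumes K and M mean thousand and million, with the upper threshold read as 1M.

```python
WATCH_COST = 350_000        # above this, the query should be watched
CRITICAL_COST = 1_000_000   # above this, other users are likely affected


def cost_tier(total_cost):
    """Map a Total Cost value to a hypothetical triage label."""
    if total_cost > CRITICAL_COST:
        return "critical"
    if total_cost > WATCH_COST:
        return "watch"
    return "normal"


def triage(rows):
    """Return (query id, tier) pairs, most expensive first, skipping normal rows."""
    ranked = sorted(rows, key=lambda row: row["total_cost"], reverse=True)
    return [(row["id"], cost_tier(row["total_cost"]))
            for row in ranked
            if cost_tier(row["total_cost"]) != "normal"]


# Hypothetical rows, as if copied by hand from the query monitor table.
rows = [
    {"id": "q1", "total_cost": 120_000},
    {"id": "q2", "total_cost": 2_400_000},
    {"id": "q3", "total_cost": 400_000},
]
print(triage(rows))  # [('q2', 'critical'), ('q3', 'watch')]
```

Sorting in descending order mirrors sorting the Total Cost column in the UI, so the queries most likely to need attention appear first.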

Query Examples

Let's look at three scenarios that illustrate both the queries that are most likely to need resolving and the best practices for doing so.

Example 1 - Expensive, Long, But Not Stuck
Figure: An example of a query that is expensive and long, but not "stuck"

In this example, an age of 20 minutes indicates a long-running query, but the status is 73%, so the query is progressing. This is an expensive, long-running query. However, it is not stuck and isn't a candidate to be killed or blocked.

Example 2 - Total Cost is Too High
Figure: An example of a query where the total cost to run it is much too high

In this example, the total cost is well over 1M. This implies that the query work involved is using too many resources. Any query above 1M is likely preventing other users from using LogScale.

Example 3 - The Query Age is Old

In this example, the Age of the query is 17 days and the Total Cost is 0. The query is no longer making progress and should already have completed and dropped off the list. This query instance is suitable for termination.

Figure: An example of a query where the age of the query is old and should be terminated

Query Monitoring and Using the Blocklist

When a query is creating an issue, remediation might involve the use of LogScale's blocklist. The blocklist prevents queries from executing, which is particularly useful when a query is using a large amount of resources.

Figure: The blocklist menu

Note

A block is permanent until it's removed. Expensive or stuck queries may still be useful, so if a block is meant to be temporary, remember to remove it.

For more information about the blocklist and blocking queries, see our documentation here: Blocking Queries