Query Scheduling
LogScale uses query scheduling internally to manage the order in which queries execute.
When a query is submitted, LogScale divides the work the query needs to perform and distributes it across the available worker nodes in the cluster.
To provide quality of service, LogScale must divide the available compute capacity on each worker between the running queries.
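As a purely illustrative sketch (not LogScale's actual assignment logic), the following shows one way a query's work could be split into per-segment tasks and spread across worker nodes; the segment and worker names are hypothetical:

```python
# Hypothetical illustration only: split a query's work into per-segment tasks
# and hand them out round-robin to the cluster's worker nodes.
from itertools import cycle

def distribute_work(segments, workers):
    """Return a mapping of worker node -> list of segments it should scan."""
    assignment = {worker: [] for worker in workers}
    for segment, worker in zip(segments, cycle(workers)):
        assignment[worker].append(segment)
    return assignment

# Example: five segments spread across two worker nodes.
print(distribute_work(["seg-1", "seg-2", "seg-3", "seg-4", "seg-5"],
                      ["worker-a", "worker-b"]))
# {'worker-a': ['seg-1', 'seg-3', 'seg-5'], 'worker-b': ['seg-2', 'seg-4']}
```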
Each query belongs to a known user and user group ("organization"). LogScale uses a three-layer structure for prioritization:
1. The compute time is divided evenly between all organizations.
2. The compute time assigned to an organization is divided evenly among the users running queries in that organization.
3. The time assigned to a user is divided evenly between the queries that user is running.
As an example, if the worker is running 3 queries, and query 1 belongs to organization X and user X1, and queries 2 and 3 belong to organization Y and user Y1, then query 1 should get 50% of the worker's capacity, while queries 2 and 3 each get 25% of the capacity.
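The arithmetic behind this three-level split can be sketched as follows. This is only an illustration of the division described above, not the scheduler's actual code, and the query, user, and organization identifiers are hypothetical:

```python
from collections import defaultdict

def compute_shares(queries):
    """queries: list of (query_id, org, user) tuples.

    Returns query_id -> fraction of the worker's capacity. Capacity is split
    evenly across organizations, then across the users active within each
    organization, then across each user's running queries.
    """
    orgs = defaultdict(lambda: defaultdict(list))
    for qid, org, user in queries:
        orgs[org][user].append(qid)

    shares = {}
    org_share = 1.0 / len(orgs)
    for users in orgs.values():
        user_share = org_share / len(users)
        for qids in users.values():
            query_share = user_share / len(qids)
            for qid in qids:
                shares[qid] = query_share
    return shares

# The example from the text: query 1 gets 50%, queries 2 and 3 get 25% each.
print(compute_shares([("q1", "X", "X1"), ("q2", "Y", "Y1"), ("q3", "Y", "Y1")]))
# {'q1': 0.5, 'q2': 0.25, 'q3': 0.25}
```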
This method of prioritization reduces the impact of "noisy neighbors" when multiple users are running queries concurrently, and ensures that the users penalized most heavily are the ones most responsible for the load on the cluster.
As part of executing queries, the worker may need to fetch data from bucket storage or (much less commonly) from other nodes. When prioritizing such fetches, workers try to prevent any query from running out of "runway" on local data, in the sense that the query should always have some local data available to search. The worker estimates how much compute cost each query can still accrue based on local data, and prioritizes fetching data for the queries with the lowest amount of estimated cost available in local segments. This aims to prevent any query from being bottlenecked by segment fetching, to the extent allowed by the download capacity of the node.
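A minimal sketch of this prioritization, assuming each query exposes an estimate of the compute cost still available in its local segments (the function and field names here are hypothetical, not LogScale internals):

```python
import heapq

def pick_fetches(queries, download_slots):
    """Choose which queries get their next segment download.

    queries: list of (query_id, local_runway) pairs, where local_runway is the
    estimated compute cost the query can still accrue from segments already
    present on the node. Queries with the least runway are fetched for first,
    up to the node's download capacity.
    """
    return heapq.nsmallest(download_slots, queries, key=lambda q: q[1])

# Example: with two download slots, q3 and q1 (least local runway) are fetched for first.
print(pick_fetches([("q1", 40.0), ("q2", 120.0), ("q3", 5.0)], download_slots=2))
# [('q3', 5.0), ('q1', 40.0)]
```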