Query Writing Best Practices
Specificity matters when writing LogScale queries. The more specific you can be when writing a query, the fewer results you will have to sort through and the faster the query will run. Writing queries that run efficiently and reduce the cost of CPU and memory usage must follow an ordered flow through the query pipeline, starting with the repository.
Find a reduced set of events by filtering first, and then only perform the formatting and modification of the found data as the formatting has to be performed on every event, which can be a CPU intensive process.
When Writing Queries it is best to follow the following process in order:
Narrow down the search timeframe as much as possible
Narrow down the result set starting with tagged fields (those starting with
#
)Continue to filter the dataset with remaining Field Values that exist
After you have filtered what you want to see, filter what you don't want to see
Utilize any regex needed to filter down the data further, or for unparsed fields
Transform your data how you would like with functions like Math, Evaluation, format, etc
Aggregate your data utilizing any aggregate functions, such as a
sum()
,top()
, orgroupBy()
Perform any final visualization processing such as sorting or table functions.
Note
In some situations, it may be necessary to perform steps in a different order. For example, transforming data after it has been aggregated, or transforming and visualizing in the same step.
When optimizing queries, you want to lower the total number of hits and events returned, then focus on lowering the total work cost of the query, and then handle the formatting and output. Monitoring the work and performance and making small changes as you optimized the query will help to get the best overall performance.
This process in code format looks like this:
tag-filters
| field filters
| transformations
| aggregate function
| post processing of aggregate result
Remember that the order of operations for queries is important. Let's look at an example of a resource-costly query:
| eval(fStart = @timestamp - 600)
| eval(fEnd = @timestamp + 600)
| formattime("%F %T.%Q", as=Timestamp, field=@timestamp, timezone=UTC)
| table([fStart, Timestamp, fEnd])
This query is particularly costly because it formats timestamps for all events in the defined timespan.
Specifying a limit of 200 events prior to output transformation will make the query run much faster because it displays only 200 events, thus performing fewer format operations:
| table([@timestamp],limit=200) // selects the last 200 timestamps by timestamp
| eval(fStart = @timestamp - 600)
| eval(fEnd = @timestamp + 600)
| format("%1$TF %1$TT.%1$TQ", field=[@timestamp], as=Timestamp), timezone=UTC)
| table([fStart, Timestamp, fEnd])
Tip: To check which query format is less expensive, run them both using the same data set. Then check LogScale's measure of their cost by looking at the 'Work' report, found below the query result. The lower the number, the better the result!
Capitalization and Queries
When specifying fields and values, LogScale is case-sensitive- be aware
of case-sensitivity in commonly used fields like
event_platform. However, this
feature can be overridden using regex delimiters in the form of forward
slashes and a case insensitivity modifier
i
.
For example:
// This is case-sensitive
event_platform=Lin
// This is case-insensitive
event_platform=/Lin/i
Without the flag, only exact match entries will be returned. With it, entries containing all variations of capitalization are returned. This work-around resolves the problem for a single query. For a more permanent solution:
Using the query panel, complete a search similar to the example above. Note that the results of the query are not important here.
Click Save as near the top right corner of the user interface.
Name the query (preferably something like
Text Insensitive
) and click .Go to the
Settings
tab. In the Basic Information section, set the 'Default Query' to your newly saved query.
Now when you open the Search
tab the
query panel will be set with your default query, which is set to
case-insensitive. Double click the default text within the query and
change it to your needs. To remove case-sensitivity, simply remove the
case-sensitivity modifier (the letter 'i'). To reset your default query,
go back to the 'Settings' tab and select another query, or no query by
selecting the 'blank' choice.