Appendix A. Quirks

Below, we elaborate on most of the quirks in the LogScale Query Language in the hopes that this will discourage anyone from attempting to write their own parser.

Slashes

The slash (/) is used for several features, which makes tokenization hard: comments, regular expression literals, and division. We recommend using LSP to tokenize queries rather than having clients attempt to duplicate these quirks. In particular, we don't believe that regular expressions can be used to identify comments or regular expression literals.

Examples:

https://www.example.com/ is a valid query that is equivalent to searching for "*https://www.example.com/*".

https:// is a syntax error.

https: //www.example.com/ is a valid query that contains a comment. It is equivalent to searching for "*https:*".

/fisk/i is a valid query that is similar to the query regex("fisk", flags="i").

a:=m/fisk/i is a valid query that assigns to the field a the result of dividing the value of the field m by the values of the fields fisk and i.

a:=/fisk/i is a syntax error.

a:=foo==/fisk/i is a syntax error.

foo=m/fisk/i is a query that's equivalent to searching for foo="m/fisk/i".
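
The first and third examples above already show why a regular expression is a poor fit for finding comments. The following Python sketch is an illustration only: the pattern and the helper name strip_comments_naively are hypothetical and are not part of LogScale. It applies the naive rule "everything from // to the end of the line is a comment" and produces the same result for both queries, although only the second one contains a comment.

```python
import re

# Naive (and wrong) assumption: "//" always starts a comment.
NAIVE_COMMENT = re.compile(r"//.*$", re.MULTILINE)


def strip_comments_naively(query: str) -> str:
    """Remove whatever the naive pattern considers a comment."""
    return NAIVE_COMMENT.sub("", query).rstrip()


url_query = "https://www.example.com/"       # no comment according to the examples above
comment_query = "https: //www.example.com/"  # contains a comment

# Both calls yield "https:", so the naive pattern cannot tell the two queries apart.
print(strip_comments_naively(url_query))
print(strip_comments_naively(comment_query))
```

The naive rule corrupts the first query, which is why tokenizing through LSP is the safer route.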

Irregularity of Comparison

A common source of confusion is that the left-hand side of a comparison in the PrimaryFilter is always a field name, and the right-hand side never is. The operators = and != are particularly easy to misread. For example:

logscale
myField = myOtherField

This checks if myField holds the value myOtherField. It doesn't check if the two fields hold the same value.

Using test() to Compare Fields

Fields can be compared using test(), for example:

logscale
test(myField == myOtherField)

This checks that the fields contain the same value. But test() can also be used to check if a field contains a particular value, for example:

logscale
test(myField == "myOtherField")

This checks if myField holds the value myOtherField, just as myField = myOtherField does.

Unfortunately, this syntax doesn't support field names with spaces.

Unconventional Precedence of AND/OR Operators

The precedence of AND versus OR is reversed compared to general programming languages such as C and Java.

Presumably, this is due to Implicit AND.
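
To make the difference concrete, the following Python sketch parenthesizes a flat expression under both conventions. It assumes that "reversed" means OR binds more tightly than AND. The helper parenthesize is hypothetical and only handles a flat list of operands and operators.

```python
def parenthesize(tokens, loosest):
    """Group a flat operand/operator list so that `loosest` is applied last."""
    groups, current = [], []
    for tok in tokens:
        if tok == loosest:
            # The loosest operator separates the groups.
            groups.append(current)
            current = []
        else:
            current.append(tok)
    groups.append(current)
    parts = [f"({' '.join(g)})" if len(g) > 1 else g[0] for g in groups]
    return f" {loosest} ".join(parts)


expr = ["a", "OR", "b", "AND", "c"]

# C and Java convention: AND binds tighter, so OR is applied last.
c_style = parenthesize(expr, loosest="OR")

# Assumed reversed convention: OR binds tighter, so AND is applied last.
reversed_style = parenthesize(expr, loosest="AND")

print(c_style)         # a OR (b AND c)
print(reversed_style)  # (a OR b) AND c
```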

Implicit AND

The combination of implicit AND between filter expressions and parenthesized filter expressions means that one can write a filter expression that looks like a Function Call. For example:

logscale
hest fisk(field)

is a valid query that's equivalent to:

logscale
"*hest*" AND "*fisk*" AND "*field*"

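It also means that a query with a forgotten pipe (|) could be silently read as a pure filter expression. You could write something like this:

logscale
ERROR // Search for errors
groupBy(host) // Then grouped by host

Under implicit AND alone, groupBy(host) would be treated as free-text search terms rather than as a function call. To catch this mistake, function names were turned into reserved words.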

Function Calls and Reserved Words

Function names are reserved words. For example, when the function test() was introduced, some previously legal queries became syntax errors.

However, reserved words aren't enforced consistently.

Examples:

test is a syntax error.

test=fisk is a valid query.

fisk(field) is a syntax error (fisk is not a known function).

Alternatives

Making function names reserved words is a source of breaking changes each time a new function is added. An alternative would have been to have a rule that if it looks like a Function Call, it is a function call. In this case, an error like this could be correctly diagnosed:

logscale
ERROR // Search for errors
groupedBy(host) // Then grouped by host

No such function as groupedBy.

With this approach, new functions can be added without breaking existing queries.
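
As a rough illustration of the alternative rule, the Python sketch below works on an already tokenized query and treats any name directly followed by an opening parenthesis as a function call. The token list, the KNOWN_FUNCTIONS set (limited to functions mentioned in this appendix), and the helper unknown_calls are all assumptions made for the example. They do not describe how LogScale is implemented.

```python
# Only functions mentioned in this appendix; the real list is much longer.
KNOWN_FUNCTIONS = {"groupBy", "test", "eval", "regex", "rename", "drop"}


def unknown_calls(tokens):
    """Return an error message for each call-shaped name that is not a known function."""
    errors = []
    for name, following in zip(tokens, tokens[1:]):
        # Rule: if it looks like a function call, it is a function call.
        if following == "(" and name.isidentifier() and name not in KNOWN_FUNCTIONS:
            errors.append(f"No such function as {name}.")
    return errors


misspelled = ["ERROR", "groupedBy", "(", "host", ")"]
correct = ["ERROR", "groupBy", "(", "host", ")"]

print(unknown_calls(misspelled))  # ['No such function as groupedBy.']
print(unknown_calls(correct))     # []
```

Under such a rule, adding a new name to the set of known functions only changes the outcome for queries that already use that name in call position.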

Regarding slashes (/), there's probably little value in allowing them in unquoted strings.

Using ' or ` for field names could probably mitigate some of the confusion around the comparison operators. However, to completely eliminate the confusion, a name without any quotes should always be interpreted as a field name, as is the case for eval() and test().

Recommendations for Generating Queries

Avoid using Implicit AND.

Prefer | over AND to avoid confusion around precedence.

Prefer parentheses around logical expressions to avoid confusion around precedence.

Avoid unquoted strings. In filter expressions, all unquoted strings can be quoted without changing the meaning. However, in eval() and test() unquoted strings are interpreted as field names, so care must be taken. One approach is to use:

logscale
tmp := rename("sus field name")
  | test(host==tmp)
  | drop(tmp)
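
A query generator that follows these recommendations could look roughly like the Python sketch below. All helper names are hypothetical. The escaping in quote assumes that backslash escapes for \ and " are accepted inside double-quoted strings, which should be checked against the grammar before relying on it. Field names are assumed to contain no spaces.

```python
def quote(text: str) -> str:
    """Wrap text in double quotes (assumption: backslash escapes are supported)."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def field_equals_literal(field: str, value: str) -> str:
    """Filter expression: the right-hand side is always a quoted literal."""
    return f"{field} = {quote(value)}"


def fields_equal(left: str, right: str) -> str:
    """test() expression: unquoted names on both sides are field names."""
    return f"test({left} == {right})"


def pipeline(*stages: str) -> str:
    """Join stages with an explicit pipe instead of implicit AND."""
    return " | ".join(stages)


query = pipeline(
    quote("ERROR"),
    field_equals_literal("myField", "myOtherField"),
    fields_equal("myField", "myOtherField"),
)
print(query)
# "ERROR" | myField = "myOtherField" | test(myField == myOtherField)
```

Keeping literals and field references in separate helpers makes it harder to mix up the two interpretations of an unquoted name.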