Best Practice: Format query output using groupBy()
One of the more powerful aggregate functions in LogScale is groupBy().
It is akin to stats() in Event Search. One thing to keep in mind when
using groupBy() is the distinction between parentheses and square
brackets. To invoke an aggregate function, you open with parentheses.
To perform that aggregation on multiple fields, you enclose the fields
or conditions in square brackets.
#event_simpleName=ProcessRollup2 event_platform=Win ImageFileName=/\\powershell\.exe/i
| groupBy(SHA256HashData, function=([count(aid, distinct=true, as=uniqueEndpoints), count(aid, as=totalExecutions), collect(CommandLine)]))
To make the clustering a little easier to understand, here is the
groupBy() statement above in isolation:
| groupBy(SHA256HashData, function=([count(aid, distinct=true, as=uniqueEndpoints), count(aid, as=totalExecutions), collect(CommandLine)]))
Note the square brackets after the function parameter. They are needed
because this groupBy() query uses multiple aggregations.
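For contrast, here is a minimal sketch of the single-aggregation case, assuming only the execution count per hash is wanted. With one aggregate function there is no list to build, so the square brackets after function= can be dropped:

```
// Sketch: one aggregation only, so no square brackets after function=
#event_simpleName=ProcessRollup2 event_platform=Win ImageFileName=/\\powershell\.exe/i
| groupBy(SHA256HashData, function=count(aid, as=totalExecutions))
```

As soon as a second aggregation is added, the functions go back inside square brackets as a comma-separated list.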
If you wanted groupBy() to group by multiple fields, you would also use
square brackets. As an example:
| groupBy([SHA256HashData, FileName], function=([count(aid, distinct=true, as=uniqueEndpoints), count(aid, as=totalExecutions), collect(CommandLine)]))
Note that the first two fields, specified immediately after the opening
parenthesis of groupBy(), are enclosed in square brackets.
The same principle applies if we want to collect multiple fields:
| groupBy([SHA256HashData, FileName], function=([count(aid, distinct=true, as=uniqueEndpoints), count(aid, as=totalExecutions), collect([CommandLine, UserSid])]))
Note how:
collect(CommandLine)
becomes:
collect([CommandLine, UserSid])
This takes a little practice, but once mastered, the syntax is logical and easy to interpret. To assist, LogScale inserts the matching closing parenthesis or square bracket when you type an opening one.
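One way to keep the nesting easy to follow is to spread the final query above across several lines, with one grouping field list and one aggregation per line. The sketch below is the same query, only re-indented. It assumes that line breaks and indentation inside a function call are not significant, so verify it in your own environment before relying on it:

```
// Sketch: same query as above, laid out so each bracket pair is visible
#event_simpleName=ProcessRollup2 event_platform=Win ImageFileName=/\\powershell\.exe/i
| groupBy(
    [SHA256HashData, FileName],
    function=([
      count(aid, distinct=true, as=uniqueEndpoints),
      count(aid, as=totalExecutions),
      collect([CommandLine, UserSid])
    ])
  )
```

Laid out this way, the outer parentheses belong to groupBy(), the first pair of square brackets holds the grouping fields, and the second pair holds the list of aggregations.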