Use this query function to find the most common values of a field in a set of events, that is, the top of an ordered list of results. It's also possible to rank the values of a field by the value of another field, using the sum or max parameter.
The top() query function is a more succinct and powerful way to execute the groupBy() query in conjunction with count() and sort():
groupby([*fields*], function=count())
| sort(_count)
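To make the equivalence concrete, here is a minimal Python sketch of what the group, count, and sort pipeline computes once the result is cut off at a limit. It only illustrates the semantics and is not how LogScale implements the function; the name top_counts and the sample events are hypothetical.

```python
from collections import Counter

def top_counts(events, field, limit=10):
    """Count events per value of `field` and return the `limit` most common.

    Events that lack the field are skipped, mirroring the rule that an
    event is not counted if the field isn't present.
    """
    counts = Counter(e[field] for e in events if field in e)
    # most_common() orders by count, highest first
    return counts.most_common(limit)

# Hypothetical sample events
events = [
    {"url": "/a.page"},
    {"url": "/b.page"},
    {"url": "/a.page"},
    {"method": "GET"},  # no url field, so not counted
]
print(top_counts(events, "url", limit=2))
```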
Parameter | Type | Required | Default | Description |
---|---|---|---|---|
as | string | false | _count or _sum | The optional name of the output field. |
error | number | false | 5 | The error threshold, in percent, for displaying a warning message when the results are not precise enough. |
field | [string] | true | | The fields on which to group and count. An event is not counted if the fields aren't present. [a] |
limit | number | false | 10 | Sets the number of results to return. |
max | string | false | | Changes the function used from count() to finding the maximum value of the field named by max (i.e., groupby([*fields*], function=max(*max*)) \| sort(_max)). |
percent | boolean | false | false | Adds a column named percent containing the count as a percentage of the total. |
rest | string | false | | Adds an extra row containing the count of all the other values not included. |
sum | string | false | | Changes the function used from count() to sum() (i.e., groupby([*fields*], function=sum(*sum*)) \| sort(_sum)). |
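The sum and max parameters replace the per-group count with an aggregate over another field. The Python sketch below shows that idea under stated assumptions: the helper top_by, the bytes field, and the sample events are hypothetical, and skipping events that lack the aggregated field is an assumption rather than documented behavior.

```python
import operator

def top_by(events, field, value_field, combine, limit=10):
    """Group by `field`, fold `value_field` with `combine`, sort descending."""
    groups = {}
    for e in events:
        # Assumption: events missing either field are ignored
        if field not in e or value_field not in e:
            continue
        key, value = e[field], e[value_field]
        groups[key] = value if key not in groups else combine(groups[key], value)
    return sorted(groups.items(), key=lambda kv: kv[1], reverse=True)[:limit]

# Hypothetical sample events with a numeric "bytes" field
events = [
    {"url": "/a.page", "bytes": 100},
    {"url": "/b.page", "bytes": 700},
    {"url": "/a.page", "bytes": 300},
]
by_sum = top_by(events, "url", "bytes", operator.add)  # analogous to sum=bytes
by_max = top_by(events, "url", "bytes", max)           # analogous to max=bytes
print(by_sum)
print(by_max)
```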
When the top() query function is executed and there are more values than those included in the top results, the rest parameter adds an extra row containing the count of all the other values. To enable it, set it to whatever you want the row to be labeled.
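As a rough illustration of the rest row, the following Python sketch counts values, keeps the top entries, and appends one row with the combined count of everything else. The function name top_with_rest is hypothetical, and always appending the row when a label is given is an assumption, not confirmed LogScale behavior.

```python
from collections import Counter

def top_with_rest(values, limit=10, rest=None):
    """Return the `limit` most common values, plus an optional rest row."""
    counts = Counter(values)
    rows = counts.most_common(limit)
    if rest is not None:
        shown = sum(count for _, count in rows)
        # Assumption: the row is appended whenever a label is given
        rows.append((rest, sum(counts.values()) - shown))
    return rows

values = ["a", "a", "a", "b", "b", "c", "d"]
print(top_with_rest(values, limit=2, rest="others"))
```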
A warning message is displayed if the results returned are not precise enough. The error parameter specifies the error threshold in percent; the default is five percent. Lower that value if you want to be warned about smaller imprecisions.
When the data set becomes huge, the top() function uses a streaming approximation algorithm, implemented with datasketches. By default, a warning is issued if the error exceeds five percent; this threshold can be changed using the error parameter. The implementation uses a maxMapSize of 32768 for historical queries and 8192 for live queries. See Frequent Items, Error Threshold Table for more information. Only results falling within the threshold are returned.
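To see why counting with a bounded map yields approximate results, consider the classic Misra-Gries summary sketched below in Python. This is a stand-in chosen for illustration and is not the datasketches algorithm that LogScale uses; the function name and the max_size parameter are hypothetical, with max_size only loosely analogous to maxMapSize. With this scheme, each reported count is never above the true count and undercounts it by at most n / (max_size + 1), where n is the number of items seen.

```python
def misra_gries(stream, max_size):
    """Approximate item counts using at most `max_size` counters."""
    counters = {}
    for item in stream:
        if item in counters:
            counters[item] += 1
        elif len(counters) < max_size:
            counters[item] = 1
        else:
            # Map is full: decrement every counter, drop those reaching zero.
            # The incoming item is discarded as part of the same step.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters

stream = ["a"] * 6 + ["b"] * 3 + ["c", "d", "e"]
estimates = misra_gries(stream, max_size=2)
print(estimates)
```

A larger map means fewer decrement steps and therefore a smaller worst-case error, which is the general trade-off behind a map-size setting.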
top() Examples
There are many ways in which the top() function may be used. As an example, suppose you have a LogScale repository that's ingesting log entries from a web server for a photography site. The site hosts several articles about photography, and the URLs for these articles end with the extension .page instead of .html. Based on this, you can use the regex() query function to extract the page users viewed and then use the top() function to list the most viewed pages. You could do that like this:
regex(regex="/.*/(?<url_page>\S+\.page)", field=url)
| top(url_page, limit=12, rest=others)
The first line uses the regex() function. Since this reference page is about the top() function, we won't discuss its details, other than to note that it extracts the name of the file from the url field and stores the result in a field named url_page.
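For readers who want to check what the pattern captures, here is a small Python sketch that applies the same expression to a few made-up URLs. Python spells named groups as (?P<name>...) rather than (?<name>...), and the sketch assumes an unanchored search; the helper name and sample URLs are hypothetical.

```python
import re

# Same pattern as in the query, with Python's named-group syntax
PAGE_RE = re.compile(r"/.*/(?P<url_page>\S+\.page)")

def extract_url_page(url):
    """Return the file name of a .page URL, or None if the URL doesn't match."""
    match = PAGE_RE.search(url)
    return match.group("url_page") if match else None

for url in ["/articles/lighting/softbox.page", "/articles/intro.page", "/index.html"]:
    print(url, "->", extract_url_page(url))
```

Because the pattern requires two slashes before the file name, a top-level URL such as /about.page would not match.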
The second line of the query shows how you might use the top() function. The first parameter is the url_page field produced by the first line of the query. The second parameter limits the results to the top twelve instead of the default limit of ten. Because we're curious about how many page views during the selected period were for pages not listed in the top twelve, the rest parameter is specified with the label to use. In the screenshot in Figure 405, “top() Example”, you can see that the last line of the results reads others.

Figure 405. top() Example
You can see in the screenshot that the matches are displayed from the most viewed page during the selected period to the least, limited to the top twelve. The thirteenth line is the total for all other pages.
As another example, suppose you want to get a list of URLs that users attempted to view but that the web server could not find. You could use a query like this:
statuscode = "404"
| top(url, limit=20)
In this query, we first select only the events in which the statuscode is 404, the HTTP status code indicating that the requested URL was not found. Those events are then piped to the top() function on the second line of the query. For this function, we want to group the results on the value of the url field and list the top twenty. The results will look something like the screenshot in Figure 406, “top() Example”.

Figure 406. top() Example
Looking at the screenshot, we can see that there are a few attempts to access wp-login.php and similar pages. These are attempts to log into WordPress, Drupal, and other content management systems. Since this particular web server does not use a CMS, these pages don't exist on the server, and the requests indicate failed intrusion attempts.