count()

Counts the number of events in the repository, or the number of events streaming through the function. The result is returned in a field named `_count` by default; you can use this field name to pipe the result to other query functions or for general use.

If you specify a field, only events containing that field are counted. It is also possible to do a distinct count. When there are many distinct values, LogScale does not try to keep them all in memory. An estimate is used instead, so the result may not be exact.

Parameter	Type	Required	Default Value	Description
`as`	string	optional^[a]	`_count`	The name of the output field.
`distinct`	boolean	optional^[a]		When specified, counts only distinct values. When this parameter is set to `true`, LogScale always uses an estimate, so the result may be inexact.
`field`^[b]	string	optional^[a]		The field to count. Only events containing this field are counted.
^[a] Optional parameters use their default value unless explicitly set.
^[b] The parameter name `field` can be omitted.
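
To make the three counting modes concrete, the following Python sketch mimics them on a handful of in-memory events. It is an analogy rather than LogScale code: the sample events and variable names are invented for illustration, and the distinct count here is exact, whereas LogScale uses an estimate when `distinct` is `true`.

```python
# Hypothetical sample events; the third one has no "status" field.
events = [
    {"method": "GET", "status": "200"},
    {"method": "GET", "status": "404"},
    {"method": "POST"},
    {"method": "GET", "status": "200"},
]

# Count every event, regardless of which fields it has.
total = len(events)

# Count only the events that contain the "status" field.
with_status = sum(1 for e in events if "status" in e)

# Count the distinct values of "status" (exact here, estimated in LogScale).
distinct_status = len({e["status"] for e in events if "status" in e})

print(total, with_status, distinct_status)  # prints: 4 3 2
```

In LogScale terms, these roughly correspond to `count()`, `count(field=status)` and `count(field=status, distinct=true)`, using the parameters listed in the table above.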

Accuracy When Counting Distinct Values

When counting distinct values in a data stream, the accuracy of the count is limited in order to avoid consuming too much memory, particularly when the stream contains repeated elements and memory is constrained. For example, consider counting 1,000,000 (one million) events. If each event contains a different value, memory is required to track each of those million entries. Even if the field is only 10 bytes long, that is approximately 9.5 MiB (10,000,000 bytes) of memory just to store the state. In LogScale, this affects the limits as outlined in State Sizes and Limits. As noted in that section, LogScale uses an estimation algorithm that produces an estimate of the number of distinct values while keeping memory usage to a minimum.
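
The arithmetic above, and the general idea of trading exactness for bounded memory, can be sketched in Python. The source does not name the estimation algorithm LogScale uses, so the estimator below is a swapped-in technique (K-minimum values) chosen only because it is short. The function names, the use of SHA-256, and `k=1024` are all assumptions made for illustration.

```python
import hashlib
import heapq

# Memory needed to remember one million distinct 10-byte values verbatim.
raw_bytes = 1_000_000 * 10
raw_mib = raw_bytes / 2**20
print(f"{raw_mib:.2f} MiB")  # prints: 9.54 MiB


def hash01(value: str) -> float:
    """Map a value to a pseudo-random float in the unit interval."""
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


def estimate_distinct(values, k=1024):
    """K-minimum-values estimate of the number of distinct values.

    Illustrative only; not necessarily what LogScale does. Only the k
    smallest hashes are kept, so memory is bounded by k no matter how
    many distinct values the stream contains.
    """
    heap = []        # negated hashes: heap[0] is the largest hash kept
    members = set()  # hashes currently kept, used to skip repeats
    for v in values:
        h = hash01(v)
        if h in members:
            continue
        if len(heap) < k:
            heapq.heappush(heap, -h)
            members.add(h)
        elif h < -heap[0]:
            evicted = -heapq.heappushpop(heap, -h)
            members.discard(evicted)
            members.add(h)
    if len(heap) < k:
        return len(heap)  # fewer than k distinct values seen: exact
    return int((k - 1) / -heap[0])  # estimate from the k-th smallest hash


# 100,000 events carrying 20,000 distinct values (hypothetical data).
stream = (f"user-{i % 20_000}" for i in range(100_000))
print(estimate_distinct(stream))  # close to 20000, usually not exact
```

The state held by this sketch is at most `k` hashes, compared with the millions of bytes needed to store every value, which is the trade-off the paragraph above describes.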

Although the algorithm gives no guarantee on the relative error of the reported result, the typical accuracy (standard error) is less than 2%, with two-thirds of all results being within 1%. In tests with up to 10^7 distinct values, the worst result deviated by less than 2% (the deviations in the table are given as fractions, so 0.02 corresponds to 2%). The worst results for each test can be seen in the table below:

Distinct Values	Result of distinct count	Deviation (fraction of result)
10	10	0
100	100	0
1000	995	-0.005025125628
10000	10039	0.003884849089
100000	100917	0.009086675189
1000000	984780	-0.01545522858
10000000	10121302	0.01198482172
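
The deviation figures in the table above can be reproduced with a few lines of Python. The formula used here, the difference divided by the reported result, is inferred from the numbers themselves rather than stated in the text, so treat it as an assumption. The values come out as fractions, so `-0.0155` means roughly -1.5%.

```python
# (actual distinct values, result of distinct count) pairs from the table.
rows = [
    (10, 10),
    (100, 100),
    (1000, 995),
    (10000, 10039),
    (100000, 100917),
    (1000000, 984780),
    (10000000, 10121302),
]


def deviation(actual: int, result: int) -> float:
    """Deviation of the reported result, as a fraction of that result."""
    return (result - actual) / result


for actual, result in rows:
    d = deviation(actual, result)
    # Show the fraction and the same number expressed as a percentage.
    print(f"{actual:>10} {result:>10} {d:+.6f} ({d:+.2%})")

# Largest deviation in either direction, as a fraction.
worst = max(abs(deviation(a, r)) for a, r in rows)
```

With these rows, `worst` is the 1,000,000-value test, at a little over 0.015 (about 1.5%).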

Important

For fewer than 100 distinct values, even a small absolute error produces a large deviation percentage. For example, if there are only 10 distinct values, a deviation of 1 is 10%, even though it is the smallest possible deviation from the actual number of distinct values.

More typically, the fields used for aggregations or distinct counts have low cardinality (that is, a small number of distinct values relative to the overall set).

`count()` Examples

Below are several examples using the count() function. Some are simple and others are more complex, with functions embedded within others.

Aggregate Status Codes by `count()` per Minute

Aggregate Status Codes by `count()` Per Minute

Time series aggregate status codes by count() per minute into buckets

Alert Query for Parsers Issues

Reporting errors

Bucket Events Summarized by `count()`

Calculate a Percentage of Successful Status Codes Over Time

Starting with the source repository events.
logscale
```
| success := if(status >= 500, then=0, else=1)
```
Adds a success field under the following condition:
- If the value of the status field is greater than or equal to 500, the value of success is set to 0; otherwise it is set to 1.
logscale
```
| timeChart(series=customer,function=
[
  {
    [sum(success,as=success),count(as=total)]
```
Creates a new timechart with one series per value of customer, using a compound function. In this example, the embedded function generates an array of values, and the array values are themselves generated by an embedded aggregate. The embedded aggregate (defined using the {} syntax) computes a sum() and a count() across the events, using the success field generated in the previous step. Because success is the 1 or 0 generated by the if() function, count() counts all the values while sum() adds up the ones for the successful events. These values are assigned to the success and total fields. Note that at this point we are still within the aggregate, so the two new fields exist only within the context of the aggregate.
logscale
```
| pct_successful := (success/total)*100
```
Calculates the percentage of events that are successful. We are still within the aggregate, so the output of this step is an embedded set of events with the total and success values grouped by each original HTTP response code.
logscale
```
| drop([success,total])}],span=15m,limit=100)
```
Still within the embedded aggregate, drops the total and success fields from the array generated by the aggregate. These fields were only needed temporarily to calculate the percentage of successful results and are not needed in the array used to generate the result set. The query then closes the aggregate, sets a span of 15 minutes for the event buckets, and limits the output to 100 results overall.
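
The calculation performed by the steps above can be sketched in plain Python. This is an analogy under simplifying assumptions, not LogScale code: the sample events and the helper name `pct_successful` are invented, and the 15-minute time bucketing done by timeChart() is left out so that only the sum/count/percentage logic remains.

```python
from collections import defaultdict


def pct_successful(events):
    """Return the success percentage per customer.

    `events` is an iterable of (customer, status) pairs, where status
    is a numeric HTTP status code.
    """
    sums = defaultdict(int)    # plays the role of sum(success, as=success)
    totals = defaultdict(int)  # plays the role of count(as=total)
    for customer, status in events:
        success = 0 if status >= 500 else 1  # same rule as the if() step
        sums[customer] += success
        totals[customer] += 1
    # pct_successful := (success/total)*100, evaluated per customer
    return {c: sums[c] / totals[c] * 100 for c in totals}


# Hypothetical events: (customer, HTTP status)
sample = [
    ("acme", 200), ("acme", 503), ("acme", 204), ("acme", 200),
    ("globex", 500), ("globex", 200),
]
result = pct_successful(sample)
print(result)  # prints: {'acme': 75.0, 'globex': 50.0}
```

Only the final percentage is returned, which mirrors the drop() step removing the temporary success and total fields.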

Collect and Group Events by Specified Field - Example 2

Collect and group events by specified field using collect() as part of a groupBy() operation

LocalAddressIP4	RemoteAddressIP4	aipCount	aip
192.168.1.100	203.0.113.50	3	[10.0.0.1, 10.0.0.2, 10.0.0.3]
10.0.0.5	198.51.100.75	1	[172.16.0.1]
172.16.0.10	8.8.8.8	5	[192.0.2.1, 192.0.2.2, 192.0.2.3, 192.0.2.4, 192.0.2.5]

Count Events per Repository

Count of the events received by repository

Count Total of Malware and Nonmalware Events

Count the total of malware and nonmalware events as a percentage

Create Time Chart Widget for Different Events

Create Timechart Widget for All Events

Get List of Status Codes

Get list of status codes returned and a count of each for a given period using the groupBy() function with count()

status	_count
101	17
200	46183
204	3
307	1
400	2893
401	4
Failure	1
Success	8633

Count All Events

This is a simple example using the count() function. The query counts the number of events found in the repository for the selected time period:

logscale

count()

The result is just a single number, the total count.

_count
3886817

To format the result with a thousands separator:

logscale

count()
| format("%,i", field=_count, as=_count)

Produces

_count
3,886,817
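
For comparison, the same digit grouping can be reproduced in Python, which is a quick way to check what the formatted value should look like. This only illustrates the expected output; it says nothing about how LogScale's format() is implemented.

```python
count = 3886817

# The "," format specifier inserts a comma between groups of three digits.
formatted = f"{count:,}"
print(formatted)  # prints: 3,886,817
```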

Group & Count

In this example, the query uses the count() function within the groupBy() function. The first parameter is the field by which to group the data. In this case, it's the HTTP method (for example, GET, PUT, POST). The second parameter tells groupBy() to use the count() function to count the number of occurrences of each method found.

logscale

groupBy(field=method, function=count())

The result is a table with the column headings, method and _count, with the values for each:

method	_count
DELETE	7375
GET	153493
POST	31654
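
As a rough analogy, grouping by a field and counting each group behaves like a `collections.Counter` in Python. The events below are invented for illustration and are unrelated to the sample output above.

```python
from collections import Counter

# Hypothetical events, each carrying an HTTP method.
events = [
    {"method": "GET"},
    {"method": "POST"},
    {"method": "GET"},
    {"method": "DELETE"},
    {"method": "GET"},
]

# One counter entry per distinct method, like one output row per group.
counts = Counter(e["method"] for e in events)

for method in sorted(counts):
    print(method, counts[method])
# prints:
# DELETE 1
# GET 3
# POST 1
```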

Chart of Daily Counts

Figure 187. count() Chart of Daily Counts

You can use the count() function in conjunction with the timeChart() function to count the number of occurrences of events or other factors. By default, the timeChart() function will aggregate the data by day. The results will look something like the screenshot in Figure 187, “count() Chart of Daily Counts”.

logscale

timeChart(function=count())

Table of Daily Counts

When a user accesses a web site, the event is logged with a status. For instance, the status code 200 is returned when the request is successful, and 404 when the page is not found. To get a list of status codes returned and a count of each for a given period, you would enter the following query in the Search box:

logscale

groupBy(field=status, function=count())

The sample output is shown below:

status	_count
101	9
200	55258
204	137834
307	2
400	2
401	4
403	57
404	265
504	62
stopping	6
success	6
