percentile()
is a function that estimates percentiles over a given collection of numbers.
Parameter | Type | Required | Default Value | Description |
---|---|---|---|---|
accuracy | double | optional[a] | 0.01 | Provided as a relative error threshold. Must be greater than 0 and less than 1: values closer to 1 mean lower accuracy, values closer to 0 mean higher accuracy. |
as | string | optional[a] | | Prefix of output fields. |
field [b] | string | required | | Specifies the field for which to calculate percentiles. The field must contain numbers. |
percentiles | array of numbers | optional[a] | [50, 75, 99] | Specifies which percentiles to calculate. |
[a] Optional parameters use their default value unless explicitly set. |
A percentile expresses how a particular value compares with the other values in a group: it indicates the percentage of values in the group that the particular value surpasses. For example, if a value of 75 ranks in the 85th percentile, then 75 is higher than 85% of the values in the entire group. Percentiles can be used to determine thresholds and limits for triggering events, or for scoring probabilities and threats.
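The relationship between a value and its percentile rank can be illustrated with a minimal Python sketch. This is not how LogScale computes percentiles; the function name percent_below and the sample scores are hypothetical and exist only to reproduce the "75 ranks in the 85th percentile" example.

```python
def percent_below(value, group):
    """Return the percentage of values in group that are strictly less than value."""
    if not group:
        raise ValueError("group must not be empty")
    below = sum(1 for v in group if v < value)
    return 100.0 * below / len(group)


# Hypothetical group of 20 scores: 17 of them are lower than 75.
scores = list(range(41, 75, 2)) + [75, 80, 90]

print(percent_below(75, scores))  # 85.0
```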
For example, given the values 12, 25, 50 and 99, the 50th percentile would be any value between 25 and 50; in this case the percentile() function returns 25.79. Note that, in order to reduce resource usage, LogScale's percentile function returns any valid value rather than the mean of the valid values, which is what percentile algorithms often return.
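To see why more than one value can be a valid 50th percentile, the following Python sketch checks candidates against the four values from the example. The validity criterion used here (at least p% of the values at or below the candidate, and at least (100 - p)% at or above it) is a common textbook definition and is an assumption of this sketch, not a description of LogScale's internal algorithm; is_valid_percentile is a hypothetical helper.

```python
def is_valid_percentile(candidate, values, p):
    """Check whether candidate is a valid p-th percentile of values.

    Assumed criterion: at least p% of the values are <= candidate and
    at least (100 - p)% of the values are >= candidate.
    """
    n = len(values)
    at_or_below = sum(1 for v in values if v <= candidate)
    at_or_above = sum(1 for v in values if v >= candidate)
    return at_or_below * 100 >= p * n and at_or_above * 100 >= (100 - p) * n


values = [12, 25, 50, 99]

print(is_valid_percentile(25.79, values, 50))  # True: the value returned in the example
print(is_valid_percentile(37.5, values, 50))   # True: the mean of 25 and 50
print(is_valid_percentile(60, values, 50))     # False: only one value is >= 60
```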
Note
LogScale uses an approximate algorithm to calculate percentiles in order to achieve a good balance of speed, memory usage and accuracy.
The function returns one event with a field for each of the percentiles specified in the percentiles parameter. Fields are named by prepending _ to the values specified in the percentiles parameter. For example, the event could contain the fields _50, _75 and _99.
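The naming rule can be sketched in Python as follows. The helper output_field_names is hypothetical and only mirrors the rule as described on this page: the prefix given in the as parameter (empty by default), then _, then the percentile value. Only whole-number percentiles are shown, because this page does not state how fractional values such as 99.9 appear in field names.

```python
def output_field_names(percentiles, prefix=""):
    """Build output field names as <prefix>_<percentile>, per the rule described above."""
    return [f"{prefix}_{p}" for p in percentiles]


print(output_field_names([50, 75, 99]))           # ['_50', '_75', '_99']
print(output_field_names([50], prefix="median"))  # ['median_50']
```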
The following conditions apply when using this function:
- The function only works on non-negative input values.
- The accuracy argument specifies the accuracy of the percentile relative to the number estimated and is intended as a relative error tolerance (lower values imply better accuracy). Some examples:
  - An accuracy of 0.001 means accuracy to 1/1000 of the original value (note that specifying accuracy=0.001 actually implies that the accuracy is 0.999). The number estimated depends on the accuracy argument and the amount of data available. A larger amount of data returns better estimations. For example, with an original value of 1000 the value would be between 999 and 1001 (1000-1000/1000 and 1000+1000/1000).
  - An accuracy of 0.01 means accuracy to 1/100 of the original value. For example, with an original value of 1000 the value would be between 990 and 1010 (1000-1000/100 and 1000+1000/100). With an original value of 500 the value would be between 495 and 505 (500-500/100 and 500+500/100).
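The arithmetic behind these ranges can be reproduced with a short Python sketch. It only restates the relative error bounds described above (value minus and plus value times accuracy); relative_error_bounds is a hypothetical helper, not part of LogScale.

```python
def relative_error_bounds(value, accuracy):
    """Return the (low, high) range implied by a relative error tolerance."""
    if not 0 < accuracy < 1:
        raise ValueError("accuracy must be greater than 0 and less than 1")
    margin = value * accuracy
    return value - margin, value + margin


# The three cases worked through in the text above.
for value, accuracy in [(1000, 0.001), (1000, 0.01), (500, 0.01)]:
    low, high = relative_error_bounds(value, accuracy)
    print(f"value={value} accuracy={accuracy}: between {low:g} and {high:g}")
```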
Important
Higher accuracy implies higher memory usage. Choose the accuracy carefully, according to the precision you need from the expected output value. Lower percentiles are discarded if the memory usage becomes too high. If your percentiles seem off, try reducing the accuracy.
percentile() Examples
Calculate the 50th, 75th, 99th and 99.9th percentiles for events with the field responsetime:
percentile(field=responsetime, percentiles=[50, 75, 99, 99.9])
In a timechart, calculate percentiles for both of the fields r1 and r2.
timechart(function=[percentile(field=r1,as=r1),percentile(field=r2,as=r2)])
To calculate the median for a given field, use percentile() with percentiles set to 50:
percentile(field=allocBytes,percentiles=[50],as=median)
This creates the field median_50 with the 50th percentile value.