The text:editDistanceAsArray() function returns the edit distance (Levenshtein distance) between a target string and a list of reference strings as an object array. The Levenshtein distance represents the minimum number of single-character edits required to transform one string into another.
The text:editDistanceAsArray() function calculates the number of edit operations (addition/deletion/substitution) needed to convert the target string to each reference string, returning a numeric value (double) representing the edit distance for each pair.
The text:editDistanceAsArray() function is the array variant of the text:editDistance() function, providing similar functionality but handling multiple reference strings simultaneously.
Note
The behavior of the text:editDistanceAsArray() function is the same as text:editDistance(), with the difference that text:editDistanceAsArray() takes a list of references as input (instead of a single reference) and outputs an object array (instead of a double).
Parameter | Type | Required | Default Value | Description |
---|---|---|---|---|
allowTranspositions | boolean | optional[a] | true | Counts a transposition of two adjacent characters as a single edit operation during distance calculation (instead of two separate operations). |
asArray | string | optional[a] | _distance | The name of the output array. |
caseInsensitive | boolean | optional[a] | false | Converts both strings to lowercase before calculation when set to true (enabled). |
maxDistance | integer | required | | The maximum edit distance (Levenshtein distance) to calculate. If the Levenshtein distance exceeds this value, maxDistance is returned. Minimum: 1. Maximum: 100. |
references | array | required | | The array of strings to which the distances from the target string are calculated. Minimum: 1. Maximum: 100. |
target | expression | required | | The source string for distance calculations. |
[a] Optional parameters use their default value unless explicitly set.
The argument name for array
can be omitted.
A specific syntax applies for this query function, see Array Syntax for details.
text:editDistanceAsArray() Function Operation
The text:editDistanceAsArray() function has specific implementation and operational considerations, outlined below.
Input Processing
The text:editDistanceAsArray() function processes input expressions in the following manner:
Accepts a target string expression and an array of reference string expressions.
Evaluates and coalesces all expressions into strings.
Produces no output if the target string is null or invalid.
Omits specific (target, reference) pairs where the reference string is null or invalid.
Distance Calculation
The text:editDistanceAsArray() function performs these core operations:
Calculates Levenshtein distance between target and each reference string.
Returns results as double values in an object array.
Includes both calculated distances and associated references in the output.
Parameter Usage
Each parameter influences the calculation in specific ways:
maxDistance
Returns maxDistance if the actual distance exceeds this value.
Terminates calculation early when the threshold is reached.
Lower values improve computation performance.
caseInsensitive
When enabled (set to true), treats uppercase and lowercase characters as identical.
May produce incorrect distances in special cases. For more information, see Special Considerations and Limitations.
allowTranspositions
When enabled (set to true), counts adjacent character swaps as a single edit operation.
When disabled (set to false), counts adjacent character swaps as two separate operations.
Defaults to true.
For more information, see Special Considerations and Limitations.
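The effect of maxDistance and caseInsensitive can be sketched in Python. This is an illustration under stated assumptions, not LogScale's implementation: capped_distance is a hypothetical helper, it uses plain Levenshtein distance without transpositions, and the row-minimum early exit is only one common way to stop once the threshold is reached.

```python
def capped_distance(a: str, b: str, max_distance: int,
                    case_insensitive: bool = False) -> int:
    """Levenshtein distance between a and b, capped at max_distance.

    Hypothetical helper for illustration; not LogScale's implementation.
    """
    if case_insensitive:
        # Mirrors the documented behavior: lowercase both strings first.
        a, b = a.lower(), b.lower()

    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution or match
        if min(current) >= max_distance:
            # Every alignment already costs at least max_distance: stop early.
            return max_distance
        previous = current
    return min(previous[-1], max_distance)


print(capped_distance("crownstrike.com", "crowdstrike.com", 5))
print(capped_distance("logscale.com", "crowdstrike.com", 5))
print(capped_distance("CrowdStrike", "crowdstrike", 5, case_insensitive=True))
```

The three calls print 1, 5 and 0: a single substitution, a distance that exceeds the threshold and is therefore reported as maxDistance, and an exact match once case is ignored.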
Output Format
The text:editDistanceAsArray() function returns an object array with the following structure:
Each array element contains a distance value and its corresponding reference string.
Results are organized as {distance, reference} pairs.
If a reference string expression cannot be compiled to a valid string or is null, the output for that specific (target, reference) pair is not added to the output array.
The array name defaults to _distance unless specified by the asArray parameter.
Special Considerations and Limitations
The function can return incorrect edit distances when it processes uppercase characters whose uppercase and lowercase forms have different lengths in codepoints (see the caseInsensitive parameter).
The following characters are known to cause this issue:
Sr.No. | Uppercase | Lowercase | Comments |
---|---|---|---|
1 | SS | ß | The German sharp s ß (U+00DF) capitalizes to SS. However, the LogScale parser treats SS as two English capital letters S. If possible, replace SS with ẞ (U+1E9E), the capital variant of ß (U+00DF). |
2 | K | k | The K here refers to the kelvin sign (U+212A), not the English capital letter K. The kelvin sign lowercases to the English lowercase letter k, which has a different length in codepoints. |
3 | İ | i | The Turkish İ (U+0130) lowercases to the English small letter i, which has a different length in codepoints. |
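The length changes behind these cases can be inspected with Python's built-in Unicode case conversion. This is only a sketch of the general phenomenon: LogScale's internal conversion may differ, and the variable names are chosen for illustration. Note that Python lowercases U+0130 to i followed by a combining dot above (U+0307) rather than to a bare i.

```python
# Case mappings whose converted form differs in length from the original.
# Uses Python's built-in conversion; LogScale's conversion may differ.
sharp_s = "\u00df"   # German sharp s
kelvin = "\u212a"    # kelvin sign (not the ASCII letter K)
dotted_i = "\u0130"  # Latin capital letter I with dot above

conversions = {
    "sharp s, uppercased": (sharp_s, sharp_s.upper()),
    "kelvin sign, lowercased": (kelvin, kelvin.lower()),
    "dotted I, lowercased": (dotted_i, dotted_i.lower()),
}

for label, (original, converted) in conversions.items():
    codepoints = " ".join(f"U+{ord(ch):04X}" for ch in converted)
    print(f"{label}: {len(original)} -> {len(converted)} codepoint(s) "
          f"[{codepoints}], "
          f"{len(original.encode('utf-8'))} -> "
          f"{len(converted.encode('utf-8'))} UTF-8 byte(s)")
```

In Python the kelvin sign keeps its codepoint count when lowercased but shrinks from three UTF-8 bytes to one, while the other two conversions change the number of codepoints.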
The allowTranspositions parameter determines whether adjacent transpositions are allowed during distance calculation. It defaults to true. Allowing adjacent transpositions means that, during distance calculation, two adjacent characters can be swapped at a distance of 1 (the transposition requires only one operation instead of two).
For example, the distance between abc and acb would be:
2 if adjacent transpositions are not allowed (abc → substitute(b,c) → acc → substitute(c,b) → acb)
1 if adjacent transpositions are allowed (abc → swap(b,c) → acb)
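The abc/acb example can be checked with a short Python sketch. It assumes the restricted (optimal string alignment) form of transposition handling, which this page does not confirm as the variant LogScale uses; osa_distance and its allow_transpositions flag are hypothetical names for illustration.

```python
def osa_distance(a: str, b: str, allow_transpositions: bool = True) -> int:
    """Edit distance with optional adjacent transpositions.

    Assumption: optimal string alignment variant, for illustration only.
    """
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete every character of a[:i]
    for j in range(cols):
        d[0][j] = j  # insert every character of b[:j]

    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution or match
            if (allow_transpositions and i > 1 and j > 1
                    and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]):
                # The last two characters are swapped: count one operation.
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


print(osa_distance("abc", "acb", allow_transpositions=True))   # 1
print(osa_distance("abc", "acb", allow_transpositions=False))  # 2
```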
Grapheme Clusters
The text:editDistanceAsArray() function works on extended grapheme clusters as defined by the Unicode Standard Annex #29, specifically UAX29-C1-1, similar to the text:editDistance() function. This means that the edit distance between 🇩🇰😄😁 and 😄😄😁 would be 1.
Furthermore, the text:editDistanceAsArray() function calculates more accurate edit distances for non-Latin writing systems. For example, the distance between नमस्ते and नमसते would be 2 (replace स्ते with स and add ते).
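The difference between counting codepoints and counting grapheme clusters can be illustrated in Python for the flag example. The sketch makes two assumptions: the Python standard library has no UAX #29 segmentation, so the clusters are listed by hand, and the levenshtein helper is a hypothetical plain Levenshtein implementation over sequences rather than LogScale's algorithm.

```python
from typing import Sequence


def levenshtein(a: Sequence[str], b: Sequence[str]) -> int:
    """Plain Levenshtein distance over sequences of comparable units."""
    previous = list(range(len(b) + 1))
    for i, unit_a in enumerate(a, start=1):
        current = [i]
        for j, unit_b in enumerate(b, start=1):
            cost = 0 if unit_a == unit_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution or match
        previous = current
    return previous[-1]


FLAG_DK = "\U0001F1E9\U0001F1F0"  # two regional indicator codepoints, one cluster
SMILE = "\U0001F604"
GRIN = "\U0001F601"

target = FLAG_DK + SMILE + GRIN
reference = SMILE + SMILE + GRIN

# Python strings are sequences of codepoints, so the flag counts twice here.
by_codepoint = levenshtein(list(target), list(reference))

# Clusters segmented by hand (assumption: no UAX #29 library is used).
by_cluster = levenshtein([FLAG_DK, SMILE, GRIN], [SMILE, SMILE, GRIN])

print(len(target), by_codepoint, by_cluster)  # 4 2 1
```

Comparing by cluster treats the flag as one unit, which gives the distance of 1 stated above; comparing by codepoint gives 2.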
text:editDistanceAsArray() Syntax Examples
This example demonstrates finding possible phishing emails by comparing domain names:
text:editDistanceAsArray(
target=forwarded,
references=["crowdstrike.com","crwd.com"],
maxDistance=5
)
If the input data was forwarded=crowdstrike.com, forwarded=crowdstreak.com, forwarded=crownstrike.com, forwarded=logscale.com and forwarded=crwd.com, it would return:
_distance[0].distance | _distance[0].reference | _distance[1].distance | _distance[1].reference |
---|---|---|---|
0 | crowdstrike.com | 5 | crwd.com |
3 | crowdstrike.com | 5 | crwd.com |
1 | crowdstrike.com | 5 | crwd.com |
5 | crowdstrike.com | 5 | crwd.com |
5 | crowdstrike.com | 0 | crwd.com |
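The rows above can be reproduced with a small Python model of the documented behavior. It is a sketch under stated assumptions rather than LogScale's implementation: edit_distance and edit_distance_as_array are hypothetical names, transpositions use the optimal string alignment variant, and array indexes are assumed to stay contiguous when a null reference is skipped.

```python
from typing import Dict, List, Optional, Union


def edit_distance(a: str, b: str, max_distance: int) -> int:
    """Edit distance with adjacent transpositions, capped at max_distance."""
    d = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) + 1):
        d[i][0] = i
    for j in range(len(b) + 1):
        d[0][j] = j
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1, d[i][j - 1] + 1,
                          d[i - 1][j - 1] + cost)
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return min(d[-1][-1], max_distance)


def edit_distance_as_array(target: Optional[str],
                           references: List[Optional[str]],
                           max_distance: int,
                           as_array: str = "_distance",
                           ) -> Dict[str, Union[float, str]]:
    """Model the output fields for one event (hypothetical helper)."""
    fields: Dict[str, Union[float, str]] = {}
    if target is None:
        return fields  # no output for a null target
    index = 0
    for reference in references:
        if reference is None:
            continue  # null references are left out (index handling assumed)
        distance = edit_distance(target, reference, max_distance)
        fields[f"{as_array}[{index}].distance"] = float(distance)
        fields[f"{as_array}[{index}].reference"] = reference
        index += 1
    return fields


REFERENCES = ["crowdstrike.com", "crwd.com"]
forwarded_values = ["crowdstrike.com", "crowdstreak.com", "crownstrike.com",
                    "logscale.com", "crwd.com"]

for forwarded in forwarded_values:
    print(edit_distance_as_array(forwarded, REFERENCES, max_distance=5))
```

Each printed dictionary corresponds to one table row, with distances stored as floating-point values to mirror the double type described for the function.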
text:editDistanceAsArray() Examples
Compare Domain Names Using Text Edit Distance Array
Calculate edit distance between domain names and reference values using the text:editDistanceAsArray() function
Query
text:editDistanceAsArray(target=forwarded, references=["crowdstrike.com","crwd.com"], maxDistance=5)
Introduction
In this example, the text:editDistanceAsArray() function is used to compare forwarded domains against known legitimate domains to identify potential typosquatting attempts.
Example incoming data might look like this:
@timestamp | forwarded | source_ip | request_type |
---|---|---|---|
2025-10-15T10:00:00Z | crowdstrike.com | 192.168.1.100 | DNS |
2025-10-15T10:01:00Z | crowdstreak.com | 192.168.1.101 | DNS |
2025-10-15T10:02:00Z | crownstrike.com | 192.168.1.102 | DNS |
2025-10-15T10:03:00Z | logscale.com | 192.168.1.103 | DNS |
2025-10-15T10:04:00Z | crwd.com | 192.168.1.104 | DNS |
Step-by-Step
Starting with the source repository events.
text:editDistanceAsArray(target=forwarded, references=["crowdstrike.com","crwd.com"], maxDistance=5)
Calculates the edit distance between the value in the forwarded field and each reference domain (crowdstrike.com and crwd.com).
The maxDistance parameter is set to 5. This means that for (target, reference) pairs where the calculated distance is less than 5, the result contains that distance; otherwise the result contains 5 (maxDistance).
The function returns an array field named _distance containing objects with distance and reference properties for each comparison.
Event Result set.
Summary and Results
The query is used to identify domain names that are similar to known legitimate domains, which can help detect potential phishing or typosquatting attempts.
This query is useful, for example, to monitor DNS queries for slightly misspelled versions of legitimate domain names that might be used in phishing campaigns.
Both the text:editDistance() and text:editDistanceAsArray() functions can be used to calculate Levenshtein edit distances between strings. While they serve similar purposes, they differ in their ability to handle reference values and in their output format. See Calculate Edit Distance Between Domain Names.
Sample output from the incoming example data:
_distance[0].distance | _distance[0].reference | _distance[1].distance | _distance[1].reference |
---|---|---|---|
0 | crowdstrike.com | 5 | crwd.com |
3 | crowdstrike.com | 5 | crwd.com |
1 | crowdstrike.com | 5 | crwd.com |
5 | crowdstrike.com | 5 | crwd.com |
5 | crowdstrike.com | 0 | crwd.com |
Note that a distance of 0 indicates an exact match with the reference domain. The results are in the order of the events.
Also note that each row contains comparison results against all reference domains, even if some are beyond the maxDistance threshold.
This data would be well-suited for visualization in a table widget showing the domain names and their edit distances. For security monitoring, you could create alerts for when domains with small but non-zero edit distances are detected. A bar chart could also be used to show the distribution of edit distances over time, helping identify patterns in typosquatting attempts.