Regular Expression Engine V2 Syntax Patterns

Available: Regular Expression Engine V2 v1.227

Regular Expression Engine V2 is the default regex engine from version 1.227.

The following tables detail the syntax that LogScale supports when using the Regular Expression Engine V2.

The following syntax constructions match a specific character.

Table: Single Character Constructs

Syntax	Description
`x`	The literal character `x`
`\\`	The backslash character
`\xnn`	The character with hexadecimal value `nn`, where `0 <= n <= F`. E.g. `\x21` matches the `!` character.
`\x{nnnn}`	The character with hexadecimal value `nnnn`, where `0 <= n <= F`. Support for up to four digits for use with Unicode characters. E.g. `\x{123}` matches the `ģ` character.
`\unnnn`	The character with hexadecimal value `nnnn`, where `0 <= n <= F`. E.g. `\u0023` matches the `#` character.
`\a`	The alert/bell character (BEL) with hexadecimal value `07`.
`\t`	The horizontal tab character (HT) with hexadecimal value `09`.
`\n`	The newline character (LF) with hexadecimal value `0A`.
`\f`	The form feed character (FF) with hexadecimal value `0C`.
`\r`	The carriage return character (CR) with hexadecimal value `0D`.
`\e`	The escape character (ESC) with hexadecimal value `1B`.
`\cX`	The control character `^X`. E.g. `\cH` matches the backspace character (BS).
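
The escapes in the table above can be checked with a short sketch. It uses Python's `re` module as a stand-in, which is an assumption: it is a different engine from LogScale's, so only escapes that both syntaxes share (`\xnn`, `\unnnn`, `\a`, `\t`) are exercised. The helper name `matches_exactly` is hypothetical.

```python
import re

# Assumption: Python's `re` stands in for the LogScale engine. Only escapes
# shared by both syntaxes are used here.
def matches_exactly(pattern: str, text: str) -> bool:
    """Return True when `pattern` matches the whole of `text`."""
    return re.fullmatch(pattern, text) is not None

print(matches_exactly(r"\x21", "!"))     # True: hexadecimal value 21 is `!`
print(matches_exactly(r"\u0023", "#"))   # True: hexadecimal value 0023 is `#`
print(matches_exactly(r"\a", "\x07"))    # True: BEL has hexadecimal value 07
print(matches_exactly(r"\t", "\x09"))    # True: HT has hexadecimal value 09
```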

The following syntax constructions match a single character from a set of possible characters.

Table: Character Class Constructs

Syntax	Description
`[abc]`	Matches either `a`, `b`, or `c`.
`[^abc]`	Matches any character that is not `a`, `b`, or `c` (negated class).
`[a-z]`	Matches any character in the range from `a` through `z`.
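
As a rough illustration of the three class forms above, the following sketch again uses Python's `re` module as a stand-in engine (an assumption; these constructs behave the same way in most regex dialects). The helper `first_match` is a hypothetical name.

```python
import re

# Assumption: Python's `re` stands in for the LogScale engine.
def first_match(pattern: str, text: str):
    """Return the first substring of `text` matched by `pattern`, or None."""
    m = re.search(pattern, text)
    return m.group(0) if m else None

print(first_match(r"[abc]", "xyzb"))    # b: first character that is a, b, or c
print(first_match(r"[^abc]", "abcd"))   # d: first character outside the set
print(first_match(r"[a-z]", "123q"))    # q: first character in the range a-z
```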

There are several predefined character classes available.

Table: Predefined Character Classes

Syntax	Description
`.`	Matches any character except newline (unless the single-line flag is given).
`\d`	Matches a digit character, `0` through `9`. Equivalent to `[0-9]`.
`\D`	Matches any character that is not a digit character. Equivalent to `[^0-9]`.
`\w`	Matches an ASCII word character. Equivalent to `[a-zA-Z0-9_]`.
`\W`	Matches any character that is not an ASCII word character. Equivalent to `[^\w]`.
`\h`	Matches a horizontal whitespace character. Equivalent to `[\t\x20\xA0\u180e\u2000-\u200a\u202f\u205f\u3000]`.
`\H`	Matches any character that is not a horizontal whitespace character. Equivalent to `[^\h]`.
`\v`	Matches a vertical whitespace character. Equivalent to `[\u000a-\u000d\u0085\u2028\u2029]`.
`\V`	Matches any character that is not a vertical whitespace character. Equivalent to `[^\v]`.
`\s`	Matches any whitespace character, as defined by the Unicode `White_Space` property. Equivalent to `[\h\v]`.
`\S`	Matches any non-whitespace character. Equivalent to `[^\s]`.
`\p{X}`	Matches a character in the Unicode General Category abbreviated `X`. Supported categories are Letters (`L`), Symbols (`S`), Punctuation (`P`), and Control Characters (`Cc`). Case-insensitivity is not supported for Unicode general category matches.
`\P{X}`	Matches any character that is not in the Unicode General Category abbreviated `X`. Case-insensitivity is not supported for Unicode general category matches.
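
A minimal sketch of a few predefined classes follows, using Python's `re` module as a stand-in (an assumption). The `re.ASCII` flag is passed so that `\d` and `\w` are restricted to ASCII as described in the table; `\h`, `\H`, and `\p{X}` are not available in Python's `re` and are left out. The helper `find_all` is a hypothetical name.

```python
import re

# Assumption: Python's `re` stands in for the LogScale engine. re.ASCII makes
# \d and \w ASCII-only, matching the definitions in the table.
def find_all(pattern: str, text: str):
    """Return every non-overlapping match of `pattern` in `text`."""
    return re.findall(pattern, text, flags=re.ASCII)

print(find_all(r"\d", "a1b22"))            # ['1', '2', '2']
print(find_all(r"\D", "a1"))               # ['a']
print(find_all(r"\w+", "user_1 = é"))      # ['user_1']: é is not an ASCII word character
print(find_all(r"a.c", "abc a\nc"))        # ['abc']: `.` does not match the newline
print(find_all(r"(?s)a.c", "abc a\nc"))    # ['abc', 'a\nc']: single-line flag set
```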

There are two primitive operations in LogScale's regex syntax. These are used to express more complicated patterns than single character matches.

Table: Primitive Operations

Syntax	Description
`XY`	Concatenation. Joins the regex patterns X and Y end-to-end. E.g. `ab` matches `a` followed by `b`.
`X|Y`	Alternation. Matches either regex pattern X or pattern Y. E.g. `ab|cd` matches either `ab` or `cd`.
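
The example in the table implies that concatenation binds more tightly than alternation: the pattern `ab|cd` reads as "`ab` or `cd`", not as `a`, then `b` or `c`, then `d`. A small sketch, assuming Python's `re` module as a stand-in engine and using a non-capturing group to force the other reading (the helper name `full` is hypothetical):

```python
import re

# Assumption: Python's `re` stands in for the LogScale engine.
def full(pattern: str, text: str) -> bool:
    """Return True when `pattern` matches the whole of `text`."""
    return re.fullmatch(pattern, text) is not None

# Alternation applies to the whole concatenations on either side.
print(full(r"ab|cd", "ab"))        # True
print(full(r"ab|cd", "cd"))        # True
print(full(r"ab|cd", "abd"))       # False

# A group limits the alternation to `b` or `c`.
print(full(r"a(?:b|c)d", "abd"))   # True
print(full(r"a(?:b|c)d", "acd"))   # True
```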

Anchors and boundary syntax constructions match boundaries of text in-place and not characters.

Table: Anchors / Boundaries

Syntax	Description
`^`	When used outside of character classes, `^` matches the beginning of a line. See the `m` flag for what constitutes a line.
`$`	Matches the end of a line. See the `m` flag for what constitutes a line.
`\b`	Matches an ASCII word (`\w`) boundary in-place. For instance, `\bKingdom` matches `Kingdom` only if `Kingdom` is not preceded by a character in `\w`. For example, this pattern matches `Kingdom` in the text `The Feathered Kingdom`, but not `Kingdom` in the text `007Kingdom`.
`\B`	Matches in-place wherever there is no ASCII word (`\w`) boundary. For instance, `\Bher` matches `her` only if `her` is preceded by a character that is in `\w`. For example, `\Bher` matches `her` in `dispatcher`.
`\A`	Matches the start of the input. E.g. `\AKingdom` matches `Kingdom` in the text `Kingdom Come`.
`\Z`	Matches the end of the input except for the final terminator if one exists. For example, `Kingdom\Z` matches `Kingdom` in the text `The Feathered Kingdom\n`.
`\z`	Matches the end of the input. Like `\Z` but it does not match if a final terminator exists. E.g. `Kingdom\z` matches `Kingdom` in the text `The Feathered Kingdom`, but not in the text `The Feathered Kingdom\n`.
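
The boundary examples from the table can be reproduced with the sketch below. It assumes Python's `re` module as a stand-in, with `re.ASCII` so that `\b` and `\B` use the ASCII definition of `\w`. `\Z` and `\z` are left out on purpose, because Python's `\Z` does not have the same meaning as the `\Z` described above. The helper name `found` is hypothetical.

```python
import re

# Assumption: Python's `re` stands in for the LogScale engine. re.ASCII makes
# \b and \B use the ASCII definition of \w.
def found(pattern: str, text: str) -> bool:
    """Return True when `pattern` matches somewhere in `text`."""
    return re.search(pattern, text, flags=re.ASCII) is not None

print(found(r"\bKingdom", "The Feathered Kingdom"))  # True: preceded by a space
print(found(r"\bKingdom", "007Kingdom"))             # False: preceded by `7`
print(found(r"\Bher", "dispatcher"))                 # True: preceded by `c`
print(found(r"\Bher", "call her"))                   # False: preceded by a space
print(found(r"\AKingdom", "Kingdom Come"))           # True: at the start of the input
print(found(r"^b", "a\nb"))                          # False: `^` is the start of the input
print(found(r"(?m)^b", "a\nb"))                      # True: with `m`, `^` follows each newline
```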

Quantifiers match the preceding pattern a number of times. Quantifiers fall into three categories: greedy, non-greedy, and possessive:

Greedy quantifiers try to match the given pattern as many times as possible.
Non-greedy quantifiers try to match as few times as possible.
Possessive quantifiers try to match as many times as possible and, once they have found the longest possible match, do not try shorter matches if the rest of the regex does not match.

Table: Quantifiers

Syntax	Description	Category
`X?`	Makes X optional. Greedy, so it will prefer a match containing X.	Greedy
`X*`	Matches X zero or more times. Greedy, so it will prefer the match with the most repetitions of X.	Greedy
`X+`	Matches X one or more times.	Greedy
`X{n}`	Matches X exactly `n` times, where `n` is a number between 0 and 14748364.
`X{n,}`	Matches X `n` or more times, where `n` is a number between 0 and 14748364.	Greedy
`X{n,m}`	Matches X at least `n` times and at most `m` times, where 0 <= n,m <= 14748364.	Greedy
`X??`	Makes X optional. Non-greedy, so it will prefer a match that does not contain X.	Non-greedy
`X*?`	Matches X zero or more times. Non-greedy, so it will prefer the match with the fewest repetitions of X.	Non-greedy
`X+?`	Matches X one or more times.	Non-greedy
`X{n}?`	Matches X exactly `n` times.
`X{n,}?`	Matches X at least `n` times.	Non-greedy
`X{n,m}?`	Matches X at least `n` times and at most `m` times.	Non-greedy
`X?+`	Makes X optional. Possessive, so it will prefer a match containing X, but if X matched and the rest of the regex after `X?+` did not match, it will not try again without X.	Possessive
`X*+`	Matches X zero or more times. Possessive, so it will prefer the match with the most repetitions of X, and will not retry the rest of the regex with fewer repetitions of X.	Possessive
`X++`	Matches X one or more times.	Possessive
`X{n}+`	Matches X exactly `n` times.
`X{n,}+`	Matches X at least `n` times.	Possessive
`X{n,m}+`	Matches X at least `n` times and at most `m` times.	Possessive
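
The difference between greedy and non-greedy quantifiers is easiest to see on the same input. The sketch below assumes Python's `re` module as a stand-in engine. Possessive quantifiers are left out because Python only accepts them from version 3.11. The helper name `first_match` is hypothetical.

```python
import re

# Assumption: Python's `re` stands in for the LogScale engine.
def first_match(pattern: str, text: str):
    """Return the first substring of `text` matched by `pattern`, or None."""
    m = re.search(pattern, text)
    return m.group(0) if m else None

# Greedy: take as many repetitions as possible.
print(first_match(r"<.+>", "<a><b>"))      # <a><b>
print(first_match(r"a{2,3}", "aaaa"))      # aaa
print(first_match(r"a?", "a"))             # a

# Non-greedy: take as few repetitions as possible.
print(first_match(r"<.+?>", "<a><b>"))     # <a>
print(first_match(r"a{2,3}?", "aaaa"))     # aa
print(repr(first_match(r"a??", "a")))      # '': prefers the match without `a`
```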

Groups and backreferences allow you to treat a given pattern as one unit and to apply operators to the entire grouped pattern. They also allow you to capture the text matched by the regex inside the group, and to control the behaviour of the regex engine when matching the grouped pattern. Special groups also allow for more advanced behaviour, such as lookarounds.

Table: Groups and Backreferences

Syntax	Description
`(X)`	A numbered capture group of X. Parentheses group the regex `X` between them and capture the text that `X` matches. They also allow you to repeat the entire group, e.g. `(abc){3}` matches `abcabcabc`; the group then captures the final occurrence of `abc`. Numbered capture groups are numbered from left to right by their opening parentheses.
`(?<name>X)`	A named capture group of X, named `name`. Named capture groups are also numbered. Named capture groups perform field extraction in LogScale.
`(?P<name>X)`	A named capture group of X, named `name`.
`(?:X)`	A non-capturing group of X.
`(?flags:X)`	Sets regex flags `flags` for the group (non-capturing). Supported flags are `i`, `m`, and `s`.
`(?flags)X`	Sets the regex flags `flags` for X. Applies across concatenations and alternation branches, but does not escape groups.
`(?=X)`	Zero-width positive lookahead for X. E.g. `abc(?=d)` matches `abc` only if it is followed by `d`.
`(?!X)`	Zero-width negative lookahead for X. E.g. `abc(?!d)` matches `abc` only if it is not followed by `d`.
`(?<=X)`	Zero-width positive lookbehind for X. E.g. `(?<=d)abc` matches `abc` only if it is preceded by `d`.
`(?<!X)`	Zero-width negative lookbehind for X. E.g. `(?<!d)abc` matches `abc` only if it is not preceded by `d`.
`(?>X)`	Atomic group for X. Atomic groups prevent the regex engine from backtracking into the group after a match has been found for the group. The engine can backtrack over the group or to something prior to the atomic group, but it cannot backtrack into the group and try other permutations.
`\n`	Backreference. Matches what was captured by the `nth` group, where `1 <= n <= number of groups`. For example, `(abc)\1` matches `abcabc`.
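
Several rows of the table above can be tried out with the sketch below. It assumes Python's `re` module as a stand-in engine, which accepts only the `(?P<name>X)` spelling of named groups, so that form is used. The field names `user` and `host` and the sample strings are hypothetical.

```python
import re

# Assumption: Python's `re` stands in for the LogScale engine.

# A repeated group captures its final occurrence.
m = re.fullmatch(r"(a|b|c){3}", "abc")
print(m.group(1))                                    # c

# Named groups are also numbered, from left to right.
m = re.search(r"(?P<user>\w+)@(?P<host>\w+)", "alice@example")
print(m.groupdict())                                 # {'user': 'alice', 'host': 'example'}
print(m.group(1) == m.group("user"))                 # True

# A backreference matches the text captured by the group.
print(re.fullmatch(r"(abc)\1", "abcabc") is not None)  # True
print(re.fullmatch(r"(abc)\1", "abcabd") is not None)  # False

# Lookarounds test the surrounding text without consuming it.
print(re.search(r"abc(?=d)", "abcd").group(0))       # abc
print(re.search(r"abc(?!d)", "abcd"))                # None
print(re.search(r"(?<=d)abc", "dabc").group(0))      # abc
print(re.search(r"(?<!d)abc", "dabc"))               # None

# Flags set on a group apply only inside that group.
print(re.fullmatch(r"(?i:ab)c", "ABc") is not None)  # True
print(re.fullmatch(r"(?i:ab)c", "ABC") is not None)  # False
```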

Certain constructions allow you to quote regex meta-characters.

Table: Quotation

Syntax	Description
`\X`	Quotes X, where X is a regex meta-character. E.g. `\(a\)` matches `(a)` and is not a capturing group over `a`.
`\Q`	Quotes all characters following it until reaching `\E`.
`\E`	Ends quotation started with `\Q`.
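
The effect of quoting can be sketched as follows, assuming Python's `re` module as a stand-in engine. Python's `re` does not support `\Q` and `\E`, so its `re.escape` function is used here as a rough analogue that quotes every meta-character in a string. The sample strings are hypothetical.

```python
import re

# Assumption: Python's `re` stands in for the LogScale engine.

# Unquoted parentheses form a capture group; quoted ones match literally.
print(re.search(r"(a)", "f(a)").group(0))       # a
print(re.search(r"\(a\)", "f(a)").group(0))     # (a)

# re.escape is used as an analogue of wrapping the text in \Q and \E.
literal = "1+1=2?"
print(re.fullmatch(literal, literal) is not None)             # False: + and ? act as quantifiers
print(re.fullmatch(re.escape(literal), literal) is not None)  # True: every character is literal
```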
