Regular Expression Syntax Patterns
The following tables describe the regular expression syntax supported by LogScale, compared to the standard JitRex or RE2J implementations. Each table lists the syntax and indicates whether LogScale supports it.
The following table lists the supported single-character syntax:
Table: Single Characters
Syntax | Description |
---|---|
x | The character x |
\\ | The backslash character |
\0n | The character with octal value 0n (0 <= n <= 7) |
\0nn | The character with octal value 0nn (0 <= n <= 7) |
\0mnn | The character with octal value 0mnn (0 <= m <= 3, 0 <= n <= 7) |
\xhh | The character with hexadecimal value 0xhh |
\uhhhh | The character with hexadecimal value 0xhhhh |
\t | The tab character |
\n | The newline character |
\r | The carriage-return character |
\f | The form-feed character |
\a | The alert (bell) character |
\cK | The control character ^K |
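As a rough illustration of the escapes in the table above, the following sketch uses Python's `re` module as a stand-in engine. This is an assumption made for the example only: LogScale's own engine may treat some escapes differently, and the sample string is made up.

```python
import re

# Each escape in the pattern denotes exactly one character:
# \x41 is "A", \u0042 is "B", \t is a tab, and \\ is a literal backslash.
pattern = re.compile(r"\x41\u0042\t\\")

# The subject string spells out the same four characters directly.
matched = pattern.fullmatch("AB\t\\") is not None
print(matched)
```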
The following table lists the supported character classes and ranges, which match any single character from a given set.
Table: Character Classes
Syntax | Description |
---|---|
[abc] | a, b, or c (simple class) |
[^abc] | Not a, b, or c (negated class) |
[a-zA-Z] | a through z or A through Z, inclusive (range) |
Predefined character classes are shorthand for commonly used sets, such as digits, whitespace, and word characters, along with their negations.
Table: Predefined character classes
Syntax | Description |
---|---|
. | Any character except newline (unless given flag d) |
\d | A digit: [0-9] |
\D | A non-digit: [^0-9] |
\s | A whitespace character |
\S | A non-whitespace character |
\w | A word character: [a-zA-Z_0-9] |
\W | A non-word character (inverse of above, equivalent to [^a-zA-Z_0-9]) |
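The character classes and predefined classes above can be tried out with a short sketch. It uses Python's `re` module as a stand-in engine, which is an assumption for illustration and not LogScale itself. The `re.ASCII` flag restricts `\w` and `\d` to the ASCII definitions given in the tables, and the sample strings are invented.

```python
import re

# \w+ collects runs of word characters ([a-zA-Z_0-9] under re.ASCII).
words = re.findall(r"\w+", "user_1 logged-in at 09:15", re.ASCII)

# A negated class matches any one character that is NOT listed.
first_non_vowel = re.search(r"[^aeiou]", "audio").group()

# A range class matches any one character within the given ranges.
letters = re.findall(r"[a-zA-Z]+", "abc123XYZ")

print(words)            # ['user_1', 'logged', 'in', 'at', '09', '15']
print(first_non_vowel)  # d
print(letters)          # ['abc', 'XYZ']
```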
The following table lists logical matching operators.
Table: Logical operators
Syntax | Description |
---|---|
XY | X followed by Y |
X|Y | Either X or Y |
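One detail worth seeing in practice is that sequencing binds more tightly than alternation, so `X|Y` splits the whole surrounding sequence unless it is grouped. The sketch below shows this with Python's `re` module as a stand-in engine (an assumption for illustration). The log-like strings are hypothetical.

```python
import re

# Without grouping, the alternatives are "error" and "warning: disk".
loose = re.compile(r"error|warning: disk")

# With a non-capturing group, the alternatives are only "error" and "warning",
# and ": disk" must follow either one.
tight = re.compile(r"(?:error|warning): disk")

loose_hit = loose.fullmatch("error") is not None
tight_miss = tight.fullmatch("error") is None
tight_hit = tight.fullmatch("error: disk") is not None

print(loose_hit, tight_miss, tight_hit)  # True True True
```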
The following table lists the supported boundary matchers, such as word and line boundaries. See the m regex flag, which governs what is considered a line.
Table: Boundary matchers
Syntax | Description |
---|---|
^ | The beginning of a line |
$ | The end of a line |
\b | A word boundary |
\B | A non-word boundary |
\A | The beginning of the input |
\Z | The end of the input but for the final terminator, if any |
\z | The end of the input |
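The difference between line boundaries, word boundaries, and input boundaries can be sketched as follows. Python's `re` module stands in for the engine here, and `re.MULTILINE` is assumed to play a role roughly comparable to a line-mode flag. Neither is taken from LogScale, and the sample text is invented. The sketch avoids `\Z` and `\z`, because Python interprets them differently from the table above.

```python
import re

text = "cat concat\ncat"

# \b requires a word boundary on both sides, so the "cat" inside "concat" is skipped.
whole_words = re.findall(r"\bcat\b", text)

# Without line mode, ^ only matches at the very start of the input.
starts_default = re.findall(r"^cat", text)

# With line mode, ^ also matches right after each newline.
starts_multiline = re.findall(r"^cat", text, re.MULTILINE)

# \A always means the start of the input, regardless of line mode.
input_start = re.findall(r"\Acat", text, re.MULTILINE)

print(len(whole_words), len(starts_default), len(starts_multiline), len(input_start))
# 2 1 2 1
```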
Quantifiers specify how many times the preceding character, character class, or group must occur.
Table: Quantifiers
Syntax | Description |
---|---|
X? | X, once or not at all |
X* | X, zero or more times |
X+ | X, one or more times |
X{n} | X, exactly n times |
X{n,} | X, at least n times |
X{n,m} | X, at least n times but not more than m times |
X?? | X, once or not at all, prefer less |
X*? | X, zero or more times, prefer less |
X+? | Not supported |
X{n}? | X, exactly n times |
X{n,}? | X, at least n times, prefer less |
X{n,m}? | X, at least n times but not more than m times, prefer less |
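The "prefer less" wording in the table refers to lazy matching: the quantifier takes as few repetitions as the rest of the pattern allows. A minimal sketch, using Python's `re` module as a stand-in engine (an assumption for illustration) and made-up input strings:

```python
import re

text = "<a><b>"

# Greedy: .* takes as much as possible, so the match runs to the last ">".
greedy = re.search(r"<.*>", text).group()

# Lazy ("prefer less"): .*? stops at the first ">" that completes the match.
lazy = re.search(r"<.*?>", text).group()

# Bounded repetition: runs of two or three digits.
bounded = re.findall(r"\d{2,3}", "1 22 333 4444")

# Lazy bounded repetition settles for the minimum count.
lazy_bounded = re.search(r"\d{2,3}?", "4444").group()

print(greedy)        # <a><b>
print(lazy)          # <a>
print(bounded)       # ['22', '333', '444']
print(lazy_bounded)  # 44
```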
Groups collect part of a pattern into a single unit so that it can be captured, given its own flags, or used as a lookahead assertion.
Table: Groups
Syntax | Description | Notes |
---|---|---|
(X) | X, as a numbered capturing group | Works as a non-capturing group. |
(?<name>X) | X, as a named capturing group | Note that LogScale supports a broader set of names than is usual (e.g. containing |
(?P<name>X) | X, as a named capturing group | |
(?:X) | X, as a non-capturing group | |
(?flags) | Sets flags in the group | Uses different flags to LogScale (see Regular Expression Flags) |
(?flags:X) | Sets flags in X | Uses different flags to LogScale (see Regular Expression Flags) |
(?=X) | X, as a zero-width positive lookahead | |
(?!X) | X, as a zero-width negative lookahead | |
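Named groups and lookaheads from the table above can be sketched as follows, again with Python's `re` module as a stand-in engine (an assumption for illustration). Python accepts the `(?P<name>X)` spelling but not `(?<name>X)`, and its plain `(X)` groups do capture, unlike the behavior noted in the table, so the sketch relies on named groups only. The group names and sample strings are hypothetical.

```python
import re

# Two named capturing groups separated by a literal "@".
pattern = re.compile(r"(?P<user>\w+)@(?P<host>[\w.]+)")
m = pattern.search("login from alice@example.com ok")
user, host = m.group("user"), m.group("host")

# Positive lookahead: digits only when immediately followed by "ms";
# the "ms" itself is not part of the match.
millis = re.findall(r"\d+(?=ms)", "12ms 40s 7ms")

# Negative lookahead: "foo" only when NOT followed by "bar".
not_bar = [hit.start() for hit in re.finditer(r"foo(?!bar)", "foobar foobaz")]

print(user, host)  # alice example.com
print(millis)      # ['12', '7']
print(not_bar)     # [7]
```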
The following table lists the methods for quoting special characters and regex meta-characters within a regular expression.
Table: Quotation
Syntax | Description |
---|---|
\ | Quotes the following character for certain characters (i.e. regex meta-characters). |
\Q | Quotes all characters until \E |
\E | Ends quoting started by \Q |
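The effect of quoting can be sketched as follows. Python's `re` module, used here as a stand-in engine, does not support `\Q...\E`, so the sketch swaps in `re.escape` to quote a whole literal string. That substitution is an assumption made for illustration and is not how LogScale does it. The sample strings are invented.

```python
import re

# Unquoted, "." is a meta-character that matches any character.
unquoted_hit = re.fullmatch(r"1.5", "1x5") is not None

# Quoted with a backslash, "." matches only a literal period.
quoted_miss = re.fullmatch(r"1\.5", "1x5") is None
quoted_hit = re.fullmatch(r"1\.5", "1.5") is not None

# Quoting a whole span: re.escape stands in for wrapping the text in \Q ... \E.
literal = re.escape("a.b*c")
span_hit = re.fullmatch(literal, "a.b*c") is not None
span_miss = re.fullmatch(literal, "aXbbc") is None

print(unquoted_hit, quoted_miss, quoted_hit, span_hit, span_miss)
# True True True True True
```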