Best Practice: Using regular expressions for field extractions and matching

Regular expressions are a powerful search tool, and a core capability of LogScale. They can be invoked almost anywhere in LogScale using LogScale Query Language (LQL).

To demonstrate how to use a regular expression for field extraction and matching, let's look at an example. Here we are combining a regex with a case statement to evaluate an application version to look for Chrome versions below 109.5414:

logscale
// Get InstalledApplication events for Google Chrome
#event_simpleName=InstalledApplication AppName="Google Chrome"
// Get latest AppVersion for each system
| groupBy(aid, function=([selectLast([AppVendor, AppName, AppVersion, InstallDate])]))
// Use regex to break AppVersion field into components
| AppVersion = /(?<majorVersion>\d+)\.(?<minorVersion>\d+)\.(?<buildNumber>\d+)\.(?<subBuildNumber>\d+)$/i
// Evaluate builds that need to be patched
| case {
    majorVersion>=110 | needsPatch := "No" ;
    majorVersion>=109 AND buildNumber >= 5414 | needsPatch := "No" ;
    majorVersion>=109 AND buildNumber >= 5414 | needsPatch := "Yes" ;
    majorVersion>=108 | needsPatch := "Yes" ;
    * 
  }
// Check for needed update  and Organize Output
| needsPatch = "Yes"
| select([aid, InstallDate, needsPatch, AppVendor, AppName, AppVersion, InstallDate])
// Convert timestamp
| InstallDate := InstallDate *1000
| InstallDate := formatTime("%Y-%m-%d", field=InstallDate, locale=en_US, timezone=Z)
Example of evaluation using case statements

By default, when using regular expression extractions, they are strict. Meaning if the data being searched does not match, it will be omitted. A quick example would be:

logscale
#event_simpleName=ProcessRollup2 ImageFileName=/\\(?<fileName>\w{3}\.\w{3}$)/i

This query looks for a file with a name that is three characters long, and has an extension that's three characters long. If that condition is not matched, data is not returned:

Example of exclusionary regex

We can also use non-strict field extractions like so:

logscale
#event_simpleName=ProcessRollup2 ImageFileName=/\\(?<fileName>\w+\.\w+$)/i
| regex("(?<fourLetterFileName>^\w{4})\.exe", field=fileName, strict=false)
| groupBy([fileName, fourLetterFileName])

The above looks for file names that contain four characters. If that does not match, that field is left as null.

Example of non-exclusionary regex