Built-in Parsers

LogScale includes built-in parsers for common log formats, including a parser for the widely-used accesslog format for web servers like Apache and Nginx.

The following table summarizes the key features of each built-in parser:

Table: Built-in Parsers Summary

Parser Name | Source Types | Special Handling Required | Key Features
accesslog | Apache and Nginx web server logs in accesslog format | Copy and modify the parser if you have customized your web server logging | Supports standard accesslog format; supports response time at end of log line; good starting point for custom web server log parsing
audit-log | LogScale audit logs in JSON format | None | Processes LogScale internal audit logs; expects JSON format
corelight-es | Corelight Zeek sensors streaming in Elasticsearch format | None | Supports Corelight's Zeek sensors; handles Elasticsearch format output; Zeek log filename becomes #path tag
corelight-json | Corelight Zeek sensors streaming in JSON format | Requires use with Ingest Listeners | Supports Corelight's Zeek sensors; handles JSON format output; Zeek log filename becomes #path tag
json | JSON data with timestamps | Requires @timestamp field in ISO 8601 format; create a custom parser based on this one if you lack control over JSON format | Expects @timestamp property in JSON; processes JSON data in log lines
json-for-action | JSON data from LogScale Repository Action | Expects Unix Time in milliseconds | Default parser for LogScale Repository Action; processes JSON in @rawstring field; expects @timestamp in Unix Time (milliseconds)
kv | Key-value format logs | Expects timestamp with timezone in first 128 characters (configurable) | Default parser if none specified; processes standard key-value patterns; requires timestamp with timezone within first 128 characters; processes rest of line for key-value pairs
kv-generic | Key-value format logs without timezone | Assumes UTC for timestamps without timezone | Similar to kv parser; accepts timestamps without timezone; assumes UTC if timezone not specified
kv-millis | Key-value format logs with Unix timestamps | Expects Unix Time in milliseconds at start of line | Similar to kv parser; expects timestamp at start of line; timestamp must be in UTC time in milliseconds
serilog-jsonformatter | Serilog's JsonFormatter output | Requires renderMessage: true in Serilog configuration | Processes logs from Serilog's JsonFormatter; must configure Serilog with renderMessage: true; displays rendered message instead of raw event; properties available as nested fields (for example, Properties.Position.Latitude)
syslog | Various syslog formats (RFC 3164, RFC 5424) | Defaults to UTC if no timezone specified; copy and modify to set different timezone | Liberal acceptance of syslog formats; supports RFC 3164 and RFC 5424; defaults to UTC for timestamps without timezone; can be copied and modified for custom timezone; leverages built-in key-value parser
syslog-utc | Standard syslog system logs | Expects UTC timezone; copy and modify timezone="UTC" for other timezones | Expects timestamp at start of line; expects optional fields: host, app, pid; expects UTC timezone; leverages built-in key-value parser; better performance than generic syslog parser for this specific format
zeek-json | Zeek (formerly Bro) JSON output | Tailored for Zeek script output; see Zeek (Bro) Network Security Monitor for setup information | Processes JSON data from Zeek; tailored for Zeek script output format; Zeek log filename becomes #path tag

You can examine each of the built-in parsers directly in the LogScale UI. Open the parser page and check the supported regular expression and timestamp formats. When you paste in test data, LogScale shows the result of that parsing.

Before writing a custom parser for data you ship to LogScale, check whether a built-in parser already handles your log format. Even when none fits exactly, the built-in parsers are a good starting point for a custom parser.

Built-in Parser: accesslog

This parser can handle the accesslog format, which is the default log format used by Apache and Nginx. The parser also supports putting the response time at the end of the log line. If you have modified the logging of your web server, copy the built-in accesslog parser and modify it to suit your customizations.

Example Input

accesslog
localhost - - [25/Feb/2017:21:05:16 +0100] "POST /api/v1/ingest/elastic-bulk HTTP/1.1" 200 50 "-" "Go-http-client/1.1" 0.000 848
192.168.1.102 - - [25/Feb/2017:21:06:15 +0100] "GET /api/v1/repositories/gotoconf/queryjobs/855620e9-1d1f-4b0e-91fe-c348795e68c9 HTTP/1.1" 200 591 "referrer" "Mozilla/5.0" 0.008 995
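
To make the structure of these lines concrete, here is a minimal Python sketch that splits an accesslog line into fields. It is an illustration only and not the definition of the built-in parser: the regular expression and every field name (client, status, trailing, and so on) are assumptions, and because this page does not say which of the trailing values is the response time, the sketch keeps them as an uninterpreted list.

```python
import re
from datetime import datetime

# Illustrative pattern for the combined access log layout shown above.
# Group names are hypothetical, not LogScale field names.
ACCESSLOG_RE = re.compile(
    r'(?P<client>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) (?P<version>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<useragent>[^"]*)"'
    r'(?P<trailing>.*)$'
)

def parse_accesslog(line):
    """Return a dict of fields, or None if the line does not match."""
    m = ACCESSLOG_RE.match(line)
    if m is None:
        return None
    fields = m.groupdict()
    # Example timestamp: 25/Feb/2017:21:05:16 +0100
    fields["time"] = datetime.strptime(fields["time"], "%d/%b/%Y:%H:%M:%S %z")
    fields["status"] = int(fields["status"])
    # Values appended after the user agent, kept as raw strings.
    fields["trailing"] = fields["trailing"].split()
    return fields
```

A web server with a customized log layout would need a different pattern, which mirrors the advice above to copy and adapt the built-in parser.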

Built-in Parser: audit-log

This parser can process audit logs in JSON format from LogScale itself.

Built-in Parser: corelight-es

This is a built-in parser that supports Corelight's Zeek sensors. Corelight sensors have default support for streaming out Zeek logs in either JSON or Elasticsearch format. LogScale can receive the streaming data in Elasticsearch format using this parser. For the JSON format, see Built-in Parser: corelight-json.

The name of the Zeek log file will become a #path tag in LogScale.

Built-in Parser: corelight-json

This is a built-in parser that supports Corelight's Zeek sensors. Corelight sensors have default support for streaming out Zeek logs in either JSON or Elasticsearch format. LogScale can receive the streaming data in JSON format using this parser and Ingest Listeners. For the Elasticsearch format, see Built-in Parser: corelight-es above.

The name of the Zeek log file will become a #path tag in LogScale.

Built-in Parser: json

This parser can process JSON data in log lines. It expects to find a JSON property called @timestamp containing an ISO 8601-formatted time string.

If you don't have control over the JSON format, you can create a custom parser based on this one.

Example Input

javascript
{
  "@timestamp": "2017-02-25T20:18:43.598+00:00",
  "loglevel": "INFO",
  "service": {
    "name": "user service",
    "time": 123
  }
}
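
The timestamp requirement can be illustrated with a short Python sketch that reads an event like the one above and converts its @timestamp into epoch milliseconds. This is an assumption-laden illustration of the input contract, not LogScale's implementation; the function name parse_json_event is hypothetical.

```python
import json
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def parse_json_event(line):
    """Parse one JSON log line and return (event, epoch_milliseconds)."""
    event = json.loads(line)
    # Expects an ISO 8601 string with a UTC offset, as in the example above.
    # A value without an offset would make the subtraction below fail.
    ts = datetime.fromisoformat(event["@timestamp"])
    # Integer division by one millisecond avoids float rounding.
    epoch_ms = (ts - EPOCH) // timedelta(milliseconds=1)
    return event, epoch_ms
```

An event without an @timestamp property raises KeyError in this sketch, which is the situation where a custom parser based on the built-in one would be needed.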

Built-in Parser: json-for-action

This parser is the default parser for the Action Type: Falcon LogScale Repository. It processes JSON data in the @rawstring field. It expects to find a JSON property called @timestamp containing a Unix time in milliseconds.

Built-in Parser: kv

This parser is the key-value parser. It is the default parser that LogScale uses if no other parser is specified. It can process standard key-value patterns in log lines. It expects the log line to contain a timestamp with a time zone within the first 128 characters (configurable, see findTimestamp()). The parser processes the rest of the line for key-value pairs.

Example Input

syslog
2017-02-25T20:18:43.598+0000 created a new user user="John Doe" service=user-service as a freemium user

Given the above log line, LogScale parses the fields

logscale
user=John Doe

and

logscale
service=user-service
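
The key-value extraction described above can be approximated with a small Python sketch. The pattern below is an assumption about what counts as a standard key-value pair (a bare key, an equals sign, and either a quoted or an unquoted value); the built-in parser may accept more forms than this.

```python
import re

# Hypothetical pattern: key=value or key="quoted value".
KV_RE = re.compile(
    r'(?P<key>[A-Za-z_][\w.-]*)=(?:"(?P<quoted>[^"]*)"|(?P<bare>\S+))'
)

def extract_key_values(text):
    """Return a dict of all key-value pairs found in the text."""
    pairs = {}
    for m in KV_RE.finditer(text):
        quoted = m.group("quoted")
        # Prefer the quoted form; quotes are stripped from the value.
        pairs[m.group("key")] = quoted if quoted is not None else m.group("bare")
    return pairs
```

Words without an equals sign, such as "created a new user", are simply skipped, which matches the two fields shown above.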

Built-in Parser: kv-generic

This parser, like the Built-in Parser: kv, is a key-value parser. It works in the same way, except that it will also parse timestamps without a timezone. Such timestamps will be assumed to be in UTC.
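
The UTC fallback can be sketched in Python as follows. The function name is hypothetical, and the sketch assumes ISO 8601 input; it only illustrates the rule that a timestamp without a timezone is treated as UTC, while a timestamp that carries its own offset is left alone.

```python
from datetime import datetime, timezone

def parse_timestamp_assume_utc(text):
    """Parse an ISO 8601 timestamp, treating a missing timezone as UTC."""
    ts = datetime.fromisoformat(text)
    if ts.tzinfo is None:
        # No offset in the input: assume UTC rather than local time.
        ts = ts.replace(tzinfo=timezone.utc)
    return ts
```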

Built-in Parser: kv-millis

This parser, like the Built-in Parser: kv, is a key-value parser. However, it expects the timestamp to be at the start of the log line and to be a Unix time in milliseconds (UTC).

Example Input

syslog
1488054417000 created a new user user="John Doe" service=user-service as a freemium user

Given the above log line, LogScale parses the fields

logscale
user=John Doe

and

logscale
service=user-service
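
To show what the leading number represents, the following Python sketch splits off the millisecond timestamp and converts it to a UTC datetime. The function name is hypothetical and the sketch assumes a single space separates the timestamp from the rest of the line, as in the example above.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def split_millis_prefix(line):
    """Return (timestamp, rest_of_line) for a line starting with epoch millis."""
    prefix, _, rest = line.partition(" ")
    millis = int(prefix)  # raises ValueError if the line does not start with a number
    # Adding a timedelta keeps millisecond precision without float rounding.
    return EPOCH + timedelta(milliseconds=millis), rest
```

For the example line, the prefix 1488054417000 corresponds to 2017-02-25T20:26:57 UTC, and the remainder of the line is what gets scanned for key-value pairs.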

Built-in Parser: serilog-jsonformatter

This parser can process log lines written by Serilog's JsonFormatter.

Example Serilog configuration:

none
// renderMessage: true makes JsonFormatter include the RenderedMessage property.
Log.Logger = new LoggerConfiguration()
             .WriteTo.File(formatter: new JsonFormatter(renderMessage: true), path: logPath, rollingInterval: RollingInterval.Day)
             .CreateLogger();

Important

The renderMessage: true part of the configuration is required. LogScale will display the rendered message output by Serilog instead of the raw event.

Example Input

logscale
{"Timestamp":"2019-01-21T13:26:25.1354930+01:00","Level":"Information","MessageTemplate":"Processed {@Position} in {Elapsed:000} ms.","RenderedMessage":"Processed { Latitude: 25, Longitude: 134 } in 034 ms.","Properties":{"Position":{"Latitude":25,"Longitude":134},"Elapsed":34,"ProcessId":"15133"},"Renderings":{"Elapsed":[{"Format":"000","Rendering":"034"}]}}

Properties output by Serilog are available within the parsed event, such as Properties.Position.Latitude from the above example input.
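
The dotted field name corresponds to a path through the nested JSON. The Python sketch below resolves such a path against a trimmed copy of the example event; the helper get_nested is hypothetical and only illustrates how the name maps onto the structure, not how LogScale stores the fields.

```python
import json

def get_nested(event, dotted_path):
    """Follow a dotted path such as 'Properties.Position.Latitude'."""
    value = event
    for key in dotted_path.split("."):
        value = value[key]  # raises KeyError if a segment is missing
    return value

# Trimmed version of the example input above.
event = json.loads(
    '{"Level":"Information",'
    '"Properties":{"Position":{"Latitude":25,"Longitude":134},"Elapsed":34}}'
)
latitude = get_nested(event, "Properties.Position.Latitude")
```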

Built-in Parser: syslog

This parser aims to be compatible with a variety of syslog formats. This includes RFC 3164 and RFC 5424. The parser does not implement every aspect of the syslog RFCs, but is instead liberal in what it accepts.

Example Input

syslog
<34>1 2003-10-11T22:14:15.003Z mymachine.example.com su - ID47 - BOM'su root' failed for foo on /dev/pts/8
<34>1 2003-10-11T22:14:15.003Z server1.com sshd - - pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=10.0.2.
<34>Oct 11 22:14:15 mymachine su: 'su root' failed for foo on /dev/pts/8
Oct 11 22:14:15 su: 'su root' failed for foo on /dev/pts/8

If no timezone is specified, as in the last two examples, the parser defaults to UTC time. To change that, you may create a new parser by copying this parser and modifying:

logscale
timezone="UTC"

to your desired timezone.

The parser also leverages LogScale's built-in key-value parser, Built-in Parser: kv.

If your logs match the specific format that the Built-in Parser: syslog-utc expects, that parser will perform better than this generic one.

Built-in Parser: syslog-utc

This parser can process standard lines generated by the syslog system.

The parser expects lines to start with a timestamp, followed by the optional fields host, app, and pid. It also expects the timestamp to be in the UTC time zone. If your timestamps are in a local timezone other than UTC, create a new parser by copying this parser and modifying

logscale
timezone="UTC"

to your desired timezone.

This parser also leverages LogScale's built-in key-value parser, Built-in Parser: kv.

Example Input

syslog
Feb 25 19:17:01 Myhost CRON[24886]: (root) CMD (   cd / && run-parts --report /etc/cron.hourly)
Feb 25 06:35:01 Myhost CRON[24272]: (root) CMD (command -v deb-sa1 > /dev/null && deb-sa1 1 1)
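
The line layout and the effect of the timezone setting can be sketched in Python. Everything here is an assumption for illustration: the regular expression only covers lines shaped like the examples above (it requires host and app, although the parser treats them as optional), the field names are hypothetical, and the year must be supplied by the caller because the timestamp does not contain one.

```python
import re
from datetime import datetime, timezone

# Hypothetical pattern: timestamp, host, app, optional [pid], then the message.
SYSLOG_RE = re.compile(
    r'^(?P<timestamp>[A-Z][a-z]{2}\s+\d{1,2} \d{2}:\d{2}:\d{2}) '
    r'(?P<host>\S+) (?P<app>[^\s\[:]+)(?:\[(?P<pid>\d+)\])?: (?P<message>.*)$'
)

def parse_syslog_line(line, year, tz=timezone.utc):
    """Return a dict of fields, or None if the line does not match."""
    m = SYSLOG_RE.match(line)
    if m is None:
        return None
    fields = m.groupdict()
    naive = datetime.strptime(f"{year} {fields['timestamp']}", "%Y %b %d %H:%M:%S")
    # tz plays the role of the timezone setting: UTC unless overridden.
    fields["timestamp"] = naive.replace(tzinfo=tz)
    return fields
```

Passing a different tz value changes only how the same wall-clock time is interpreted, which is the effect of editing timezone="UTC" in a copied parser.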

Built-in Parser: zeek-json

This parser can process JSON data generated from Zeek. It is tailored to handle the output generated from the Zeek script, and you can read about how to send Zeek data to LogScale on the Zeek (Bro) Network Security Monitor documentation page.

The name of the Zeek log file will become a #path tag in LogScale.