Creating a Parser
A parser is a piece of code that transforms incoming data into Events. LogScale has built-in parsers for common log formats like accesslog. But if none of the built-in parsers fit your data format or you want to extract more fields, do transformations on the data, or assign datasources, you can build your own parser.
The following diagram provides an overview of where parsers fit in the configuration flow to ingest data using LogScale.
Figure 41. Flow
In this guide we will go through the steps of creating a parser from scratch.
Figure 42. Creating a Parser
Creating a New Parser
Security Requirements and Controls: the Change parsers permission is required.
Go to the Repository and Views page.
Select a Repository.
Click Parsers and then click the button to create a new parser.
Insert a name for your parser (only alphanumeric characters, underscore, and hyphen are allowed). The name is important, as it is used by the API to uniquely identify the parser.
Select how to create the parser:
Empty Parser – Select Empty Parser and click the create button.
Clone Existing – Select Clone Existing, choose a parser from the drop-down menu, and click the create button.
From Template – Select From Template, browse for or drag and drop a parser template, and click the create button.
From Package – Select From Package, choose a package, and click the create button.
Writing a Parser
Once you have created your parser, you will be presented with a code editor.
Figure 43. Writing a Parser
The Parser Editor, showing a simple parser and two test cases.
The programming language used for creating a parser is the same one you use to write queries on the search page. The main difference between writing a parser and writing a search query is that you cannot use aggregate functions like groupBy(), as the parser acts on one event at a time.
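Transformational functions, on the other hand, work on one event at a time and can be combined freely. A minimal sketch of what is allowed (parseJson() is covered later in this guide; lower() is a standard LogScale function; the message field is an assumption about the input):

// Transformational, per-event functions are fine in a parser:
parseJson(field=@rawstring)
| lower(field=message, as=message)
// Aggregate functions such as groupBy() or count() work across many events and cannot be used here.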
The input data is usually log lines or JSON objects, but could be any text format like a stack trace or CSV.
When sending data to LogScale, the text string for the input is put in the field @rawstring. Depending on how data is shipped to LogScale, other fields can be set as well. For example, when sending data with Filebeat, the fields @host and @source will also be set, and it is possible to add more fields through the Filebeat configuration.
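Such shipper-provided fields can be used in the parser like any other field. A minimal sketch (assuming @source holds a file path, which is an assumption about the shipper's configuration):

// Extract the file name portion of @source into its own field.
regex("(?<filename>[^/]+)$", field=@source)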
Using the Parser Code Editor
The editor allows you to create and edit parser code and run tests for your parsers.
To access the editor, go to the Parsers page and select an existing parser from the list, or create a new parser as described above. The code editor is displayed.
Write the script for your parser, or edit an existing parser, in the Parser script area; see the following sections for examples.
Click Save to save your changes.
Optionally, you can export, duplicate, or add a test.
Creating an Event from Incoming Data
The parser converts the data in @rawstring into an event. That means the parser should:
Assign the special @timestamp and @timezone fields.
Extract additional fields that should be stored along with your event.
Let's take a look at a couple of parsers to understand how they work.
Example: Parsing Log Lines
Assume we have a system producing logs like the following two lines:
2018-10-15T12:51:40+00:00 [INFO] This is an example log entry. id=123 fruit=banana
2018-10-15T12:52:42+01:30 [ERROR] Here is an error log entry. class=c.o.StringUtil fruit=pineapple
We want the parser to produce two events (one per line) and use the timestamp of each line as the time at which the event occurred; that is, assign it to the @timestamp field and set @timezone accordingly.
To do this we could write a parser that creates the field ts by extracting the first part of each log line using a regular expression (see regex()). The syntax ?<ts> is called a named group: whatever the group matches will produce a field with that name, in this case a field named ts.
/^(?<ts>\S+)/
| parseTimestamp("yyyy-MM-dd'T'HH:mm:ss[.SSS]XXX", field=ts)
To set the timestamp for the event, use the function parseTimestamp(). It takes the field ts we just extracted and parses the string value into a timestamp, which it assigns to the field @timestamp. Note that the timezone is also parsed and stored in the field @timezone.
This parser assigns the @timestamp and @timezone fields, which is the minimum you can do to create events from the examples above. At this point we have a fully valid parser.
The two log lines contain more useful information, like the INFO and ERROR log levels. We can extract those by extending the regular expression:
// First the timestamp is extracted, then the regex matches the log level, for example [INFO] or [ERROR].
/^(?<ts>\S+) \[(?<loglevel>[^\]]+)\]/
| @timestamp := parseTimestamp("yyyy-MM-dd'T'HH:mm:ss[.SSS]XXX", field=ts)
// The next line finds key-value pairs and creates a field for each.
| kvParse()
The events will now have a field called loglevel. At the bottom of the parser we also added the function kvParse(). This function looks for key-value pairs in the log line and extracts them into fields, like id=123 and fruit=banana.
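Extracted fields can be transformed further within the same parser. A minimal sketch extending the example above (the upper() step is our own addition; upper() is a standard LogScale function):

/^(?<ts>\S+) \[(?<loglevel>[^\]]+)\]/
| @timestamp := parseTimestamp("yyyy-MM-dd'T'HH:mm:ss[.SSS]XXX", field=ts)
| kvParse()
// Normalize the extracted log level so searches do not depend on casing.
| upper(field=loglevel, as=loglevel)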
Parsing JSON
We've seen how to create a parser for unstructured log lines. Now let's create a parser for JSON logs based on the following example input:
{
"ts": 1539602562000,
"message": "An error occurred.",
"host": "webserver-1"
}
{
"ts": 1539602572100,
"message": "User logged in.",
"username": "sleepy",
"host": "webserver-1"
}
Each object is a separate event and will be parsed separately, as with unstructured logs.
The JSON is accessible as a string in the field @rawstring. We can extract fields from the JSON by using the parseJson() function. It takes a field containing a JSON string (in this case @rawstring) and extracts fields automatically, like this:
parseJson(field=@rawstring)
| @timestamp := ts
| @timezone := "Z"
This will result in events with a field for each property in the input JSON, like username and host, and will use the value of ts as the timestamp. If the timestamp is a string, it can be parsed using the parseTimestamp() function.
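For example, if ts had been a string like 2018-10-15T12:52:42+01:30 instead of epoch milliseconds, a minimal sketch (the format string is an assumption about the input) could be:

parseJson(field=@rawstring)
// Parse the string timestamp; this sets both @timestamp and @timezone.
| parseTimestamp("yyyy-MM-dd'T'HH:mm:ss[.SSS]XXX", field=ts)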
Named Capture Groups
LogScale extracts fields using named capture groups, a feature of regular expressions that allows you to name sub-matches, for example:
/(?<firstname>\S+)\s(?<lastname>\S+)/
This defines a regex that expects the input to contain a first name and a last name, and extracts the names into the two fields firstname and lastname. \S matches any character that is not whitespace, and \s matches any whitespace character.
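A minimal sketch of the behavior (the sample input is our own illustration):

// For an event where @rawstring is "Jane Doe", the regex below
// produces the fields firstname=Jane and lastname=Doe.
/(?<firstname>\S+)\s(?<lastname>\S+)/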
Next Steps
Once you have your parser script created, you can start using it by assigning it to ingest tokens; see Ingest Tokens.
You can also learn how parsers can help speed up queries with tags; see Event Tags.