Parsing Timestamps
In LogScale, the time at which an event occurred is stored in the field @timestamp. Every event (be it a log line or a metric) must have a @timestamp; if the parser does not assign one, LogScale automatically assigns the current system time to @timestamp.
All events automatically include the @ingesttimestamp field, which records when the line was parsed and added to LogScale. Because of the latency between the original log being produced, shipped to LogScale, and ingested, the @ingesttimestamp and @timestamp values will usually differ slightly.
Methods
The most important job for a parser is to assign timestamps to events. The @timestamp field must be formatted as Unix Time in milliseconds, for example 1542400149000 for 11/16/2018 @ 8:29pm (UTC).
The problem is that most incoming data will contain the timestamp in some other, more human-readable format, such as ISO 8601.
That is why most parsers will take a formatted timestamp from the input event and convert it to Unix Time using either the parseTimestamp() function, when the format is known, or the findTimestamp() function, for guessing the timestamp.
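To make the target representation concrete, here is a minimal Python sketch (run outside LogScale, purely for illustration) of the conversion from an ISO 8601 string to Unix Time in milliseconds. The helper name iso_to_unix_millis is hypothetical, and the sketch says nothing about how LogScale performs the conversion internally.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def iso_to_unix_millis(text: str) -> int:
    """Convert an ISO 8601 timestamp with a UTC offset to Unix Time in milliseconds.

    Hypothetical helper for illustration; not part of LogScale.
    """
    moment = datetime.fromisoformat(text)
    if moment.tzinfo is None:
        # Without an offset the instant is ambiguous, so refuse to guess.
        raise ValueError("timestamp has no timezone offset: " + text)
    # Integer division by a 1 ms timedelta avoids floating-point rounding.
    return (moment - EPOCH) // timedelta(milliseconds=1)


print(iso_to_unix_millis("2018-11-16T20:29:09+00:00"))  # 1542400149000
```

The same instant written with a different offset, such as 2018-11-16T21:29:09+01:00, yields the same number, which is why the stored value needs no timezone of its own.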
The parseTimestamp() function parses a timestamp in a given field using a given format. It is useful if the timestamp format and its placement in the log line are known, and are the same for all log lines that are parsed with the same parser.
The findTimestamp() function tries to find a timestamp in the first part of the log line using heuristics. It is useful if the timestamp format or its placement in the log line is not known, or is not the same for all log lines that are parsed with the same parser.
Here is an example parser using parseTimestamp() for the following JSON data:
{ "ts": "11/16/2018 @ 8:29pm (UTC)", "eventType": "login", "username": "monkey" }
parseJson()
| parseTimestamp("MM/dd/yyyy @ h:mma (z)", field=ts)
Getting the timestamp format exactly right for parseTimestamp() is important and can be difficult. For details on how to define the timestamp format, see Java's DateTimeFormatter documentation. Alternatively, the findTimestamp() function can be used.
The default timestamp format used by parseTimestamp() is ISO 8601, or yyyy-MM-dd'T'HH:mm:ss[.SSS]XXX.
Here is an example parser using findTimestamp():
kvParse()
| findTimestamp()
Note
The order of the two functions in this query does not matter.
Sub-Millisecond Precision of Timestamps
If your events have timestamps with better than millisecond precision, set the field @timestamp.nanos to the number of nanoseconds within the millisecond, that is, a value in the range [0; 999,999]. Both the parseTimestamp() and findTimestamp() functions do this for you if the format applied includes sub-millisecond precision that matches the input in the event.
The format() function allows including the extra precision when formatting the timestamp for display if specified in the format string.
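The relationship between the two fields can be sketched in Python as follows. This only illustrates the arithmetic described above, assuming the source timestamp is available as whole nanoseconds since the Unix epoch; split_epoch_nanos is a hypothetical helper, not a LogScale function.

```python
from typing import Tuple

NANOS_PER_MILLI = 1_000_000


def split_epoch_nanos(epoch_nanos: int) -> Tuple[int, int]:
    """Split nanoseconds since the epoch into (milliseconds, nanoseconds within the millisecond).

    The first value corresponds to @timestamp and the second to @timestamp.nanos.
    """
    millis, nanos_within_milli = divmod(epoch_nanos, NANOS_PER_MILLI)
    return millis, nanos_within_milli


# 2018-11-16T20:29:09.000123456Z expressed as nanoseconds since the epoch.
millis, nanos = split_epoch_nanos(1_542_400_149_000_123_456)
print(millis)  # 1542400149000
print(nanos)   # 123456
```

Because the remainder of a division by 1,000,000 is taken, the second value always stays within the range 0 to 999,999.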
Timezones
Since LogScale stores timestamps in Unix Time, the timezone present on the input (if any) is stored in a separate field called @timezone.
In most cases this field will have the value Z (UTC).
Dealing with a missing timezone
For the parseTimestamp() function, either the timestamp format must contain a timezone, or a timezone must be passed as an argument. You can do the latter by specifying the timezone parameter to the parseTimestamp() function. For example, consider this input event:
{ "ts": "2018/11/01 14:31:10", "server": "web01", "message": "Out of memory" }
You'll notice that there is no zone information in the timestamp. We can set it explicitly like so:
parseJson()
| parseTimestamp("yyyy/MM/dd HH:mm:ss", timezone="Europe/Paris", field=ts)
See the query function documentation for parseTimestamp() for more options for setting the timezone.
It is also possible to give a timezone parameter to the findTimestamp() function, which will be used for timestamps that do not contain timezone information, like so:
kvParse()
| findTimestamp(timezone="America/New_York")
If no timezone is supplied, timestamps without a timezone will not be parsed.
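To see why a zone is required, note that the same wall-clock string denotes different instants depending on where it was written. The Python sketch below (outside LogScale, for illustration) converts the timestamp from the example event under two zones. As a labeled assumption, it uses fixed UTC offsets in place of the named zones to stay dependency-free: on 2018-11-01, Europe/Paris observed UTC+01:00 and America/New_York observed UTC-04:00. The helper local_to_unix_millis is hypothetical.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

# Assumption: fixed offsets stand in for named zones in this sketch.
# They are only valid for the date used below (2018-11-01).
PARIS = timezone(timedelta(hours=1))       # Europe/Paris, CET
NEW_YORK = timezone(timedelta(hours=-4))   # America/New_York, EDT


def local_to_unix_millis(text: str, zone: timezone) -> int:
    """Interpret a zone-less 'yyyy/MM/dd HH:mm:ss' string in the given zone."""
    naive = datetime.strptime(text, "%Y/%m/%d %H:%M:%S")
    aware = naive.replace(tzinfo=zone)
    return (aware - EPOCH) // timedelta(milliseconds=1)


paris_ms = local_to_unix_millis("2018/11/01 14:31:10", PARIS)
new_york_ms = local_to_unix_millis("2018/11/01 14:31:10", NEW_YORK)
print(paris_ms)                # 1541079070000
print(new_york_ms - paris_ms)  # 18000000, a five-hour difference
```

A real implementation would resolve the zone name through a timezone database so that daylight saving transitions are handled for any date.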