Parsing Timestamps
In Humio, the time at which an event occurred is stored in the field @timestamp. Everything (be it logs or metrics) must have a @timestamp, and if one is not assigned by the parser, Humio will automatically assign the current system time to @timestamp.
All events automatically include the @ingesttimestamp field, which records when the line was parsed and added to Humio. Because of the latency between producing the original log, shipping it to Humio, and ingesting it, the @ingesttimestamp and @timestamp values will differ slightly.
Methods
The most important job for a parser is to assign timestamps to events. The @timestamp field must be formatted as Unix Time in milliseconds, for example 1542400149000 for 11/16/2018 @ 8:29pm (UTC).
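To make the millisecond representation concrete, the following Python sketch converts between a UTC date and Unix Time in milliseconds. It runs outside Humio and is only an illustration; the helper names to_unix_millis and from_unix_millis are hypothetical. Note that the example value carries nine seconds that the human-readable form does not show.

```python
from datetime import datetime, timedelta, timezone

# Unix epoch as a timezone-aware datetime.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_unix_millis(moment: datetime) -> int:
    """Return whole milliseconds since the Unix epoch for an aware datetime."""
    return (moment - EPOCH) // timedelta(milliseconds=1)


def from_unix_millis(millis: int) -> datetime:
    """Return the UTC datetime for a Unix Time value in milliseconds."""
    return EPOCH + timedelta(milliseconds=millis)


# Round trip for the example value from the text.
print(from_unix_millis(1542400149000).isoformat())
print(to_unix_millis(datetime(2018, 11, 16, 20, 29, 9, tzinfo=timezone.utc)))
```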
The problem is that most incoming data will contain the timestamp in some other, more human-readable format, such as ISO 8601.
That is why most parsers will take a formatted timestamp from the input event and convert it to Unix Time using either the parseTimestamp() function, when the format is known, or the findTimestamp() function, for guessing the timestamp.
The parseTimestamp() function parses a timestamp in a given field using a given format. It is useful if the timestamp format and placement in the log line are known, and are the same for all log lines that are parsed with the same parser.
The findTimestamp() function tries to find a timestamp in the first part of the log line using heuristics. It is useful if the timestamp format or placement in the log line is not known, or is not the same for all log lines that are parsed with the same parser.
Here is an example parser using parseTimestamp() for the following JSON data:
{ "ts": "11/16/2018 @ 8:29pm (UTC)", "eventType": "login", "username": "monkey" }
parseJson() | parseTimestamp("MM/dd/yyyy @ h:mma (z)", field=ts)
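As a rough illustration of what this parser does conceptually, the Python sketch below reads the ts field from the JSON event and converts it with a fixed pattern. It is not Humio code: the function name parse_event_timestamp is hypothetical, and Python's strptime directives stand in for the Java-style pattern. The literal (UTC) is matched as plain text and the zone is attached explicitly, which is a simplification.

```python
import json
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

# strptime counterpart of the fixed layout; "(UTC)" is matched literally.
PATTERN = "%m/%d/%Y @ %I:%M%p (UTC)"


def parse_event_timestamp(raw_event: str) -> int:
    """Extract the ts field from a JSON event and return Unix Time in ms."""
    event = json.loads(raw_event)
    parsed = datetime.strptime(event["ts"], PATTERN).replace(tzinfo=timezone.utc)
    return (parsed - EPOCH) // timedelta(milliseconds=1)


raw = '{ "ts": "11/16/2018 @ 8:29pm (UTC)", "eventType": "login", "username": "monkey" }'
print(parse_event_timestamp(raw))
```

Because the input only has minute precision, the result falls on the whole minute (1542400140000). A pattern that does not match the input exactly raises ValueError here, which mirrors why the format string has to be precise.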
Getting the timestamp format exactly right for parseTimestamp() is important and can be difficult. For details on how to define the timestamp format, see Java's DateTimeFormatter documentation. Alternatively, the findTimestamp() function can be used.
The default timestamp format used by parseTimestamp() is ISO 8601, or yyyy-MM-dd'T'HH:mm:ss[.SSS]XXX.
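For comparison, timestamps in this default shape can be read in Python with datetime.fromisoformat. The sketch below is only an illustration outside Humio; iso_to_unix_millis is a hypothetical helper, and a trailing Z is rewritten to +00:00 because older Python versions do not accept it.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def iso_to_unix_millis(text: str) -> int:
    """Convert an ISO 8601 timestamp with a UTC offset to Unix Time in ms."""
    if text.endswith("Z"):
        # Normalise the UTC designator for Python versions before 3.11.
        text = text[:-1] + "+00:00"
    moment = datetime.fromisoformat(text)
    if moment.tzinfo is None:
        raise ValueError("timestamp has no UTC offset")
    return (moment - EPOCH) // timedelta(milliseconds=1)


# The optional fraction and a non-UTC offset are both handled.
print(iso_to_unix_millis("2018-11-16T20:29:09Z"))
print(iso_to_unix_millis("2018-11-16T21:29:09.250+01:00"))
```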
Here is an example parser using findTimestamp():
kvParse() | findTimestamp()
Note that the order of the two functions does not matter.
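The general idea of guessing a timestamp can be sketched in Python by trying a short list of known layouts against the start of the line. This is not how findTimestamp() is implemented; the candidate list, the 64-character window, and the name guess_timestamp are all assumptions made for illustration.

```python
import re
from datetime import datetime, timedelta, timezone, tzinfo
from typing import Optional

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

# Each entry pairs a regex that locates a candidate with the strptime
# pattern used to parse it. Both layouts are illustrative assumptions.
CANDIDATES = [
    (re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}[+-]\d{2}:\d{2}"),
     "%Y-%m-%dT%H:%M:%S%z"),
    (re.compile(r"\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}"),
     "%Y/%m/%d %H:%M:%S"),
]


def guess_timestamp(line: str, default_tz: Optional[tzinfo] = None) -> Optional[int]:
    """Return Unix Time in ms for the first layout found near the line start."""
    head = line[:64]  # only inspect the first part of the line
    for locator, pattern in CANDIDATES:
        match = locator.search(head)
        if match is None:
            continue
        moment = datetime.strptime(match.group(0), pattern)
        if moment.tzinfo is None:
            if default_tz is None:
                return None  # no zone in the text and none supplied
            moment = moment.replace(tzinfo=default_tz)
        return (moment - EPOCH) // timedelta(milliseconds=1)
    return None


print(guess_timestamp("2018-11-16T20:29:09+00:00 level=info msg=ok"))
print(guess_timestamp("2018/11/01 14:31:10 server=web01", timezone.utc))
```

The trade-off is the usual one for heuristics: the function copes with mixed layouts, but it can only recognise layouts that appear in its candidate list.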
Sub-Millisecond Precision of Timestamps
If your events have timestamps with precision finer than milliseconds, you should set the field @timestamp.nanos to the number of nanoseconds within the millisecond, that is, a value in the range [0; 999,999]. Both the parseTimestamp() and findTimestamp() functions do that for you if the format applied includes sub-millisecond precision that matches the input in the event.
The format() function can include the extra precision when formatting the timestamp for display, provided the format string specifies it.
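The arithmetic behind @timestamp.nanos can be sketched in Python as follows. Given a count of nanoseconds since the Unix epoch, the whole milliseconds correspond to @timestamp and the remainder corresponds to @timestamp.nanos. The function name split_epoch_nanos is hypothetical, and the sketch only illustrates the arithmetic, not Humio's internals.

```python
from typing import Tuple

NANOS_PER_MILLI = 1_000_000


def split_epoch_nanos(epoch_nanos: int) -> Tuple[int, int]:
    """Split nanoseconds since the epoch into (milliseconds, leftover nanos)."""
    return divmod(epoch_nanos, NANOS_PER_MILLI)


# 2018-11-16T20:29:09.123456789Z expressed as nanoseconds since the epoch.
millis, nanos = split_epoch_nanos(1542400149123456789)
print(millis, nanos)  # 1542400149123 456789
```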
Timezones
Since Humio stores timestamps in Unix Time, the timezone present on the input (if any) is stored in a separate field called @timezone.
In most cases this field will have the value Z (UTC).
Dealing with a missing timezone
For the parseTimestamp() function, either the timestamp format must contain a timezone, or a timezone must be passed as an argument using the timezone parameter. For example, consider this input event:
{ "ts": "2018/11/01 14:31:10", "server": "web01", "message": "Out of memory" }
You'll notice that there is no zone information in the timestamp. We can set it explicitly like so:
parseJson() | parseTimestamp("yyyy/MM/dd HH:mm:ss", timezone="Europe/Paris", field=ts)
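The effect of supplying a timezone can be shown with a small Python sketch that interprets the same zone-less text in Europe/Paris and compares it with a UTC reading. It is an illustration outside Humio; local_to_unix_millis is a hypothetical helper, and the fixed-offset fallback is an assumption added so the sketch still runs where no IANA timezone database is installed.

```python
from datetime import datetime, timedelta, timezone, tzinfo

try:
    from zoneinfo import ZoneInfo  # Python 3.9+

    PARIS: tzinfo = ZoneInfo("Europe/Paris")
except (ImportError, KeyError):
    # Fallback without IANA data: Paris is at UTC+1 on the example date.
    PARIS = timezone(timedelta(hours=1))

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def local_to_unix_millis(text: str, pattern: str, zone: tzinfo) -> int:
    """Parse a zone-less timestamp, interpret it in `zone`, return ms."""
    naive = datetime.strptime(text, pattern)
    aware = naive.replace(tzinfo=zone)
    return (aware - EPOCH) // timedelta(milliseconds=1)


pattern = "%Y/%m/%d %H:%M:%S"
in_paris = local_to_unix_millis("2018/11/01 14:31:10", pattern, PARIS)
in_utc = local_to_unix_millis("2018/11/01 14:31:10", pattern, timezone.utc)
print(in_paris, in_utc, in_utc - in_paris)
```

The two readings differ by one hour (3,600,000 ms), which is why a timestamp without zone information cannot be converted to Unix Time unambiguously.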
See the query function documentation for parseTimestamp() for more options for setting the timezone.
It is also possible to give a timezone parameter to the findTimestamp() function, which will be used for timestamps that do not contain timezone information, like so:
kvParse() | findTimestamp(timezone="America/New_York")
If no timezone is supplied, timestamps without a timezone will not be parsed.