Transpose a (table-like) query result by creating an event (row) for each
column (attribute name), in which attributes are named
row[1]
,
row[2]
, etc.
Parameter | Type | Required | Default Value | Description |
---|---|---|---|---|
column | string | optional[a] | column | Field to use as column value. |
header | string | optional[a] | Use this field as header value. | |
limit | integer | optional[a] | 5 | Maximum number of rows to transpose. |
Minimum | 1 | |||
Maximum | 1,000 | |||
pivot [b] | string | optional[a] | Use this field as header AND column value. | |
[a] Optional parameters use their default value unless explicitly set. |
Hide omitted argument names for this function
Omitted Argument NamesThe argument name for
pivot
can be omitted; the following forms of this function are equivalent:logscaletranspose("value")
and:
logscaletranspose(pivot="value")
These examples show basic structure only.
For example, given a query that returns a table, such as:
groupby(loglevel)
loglevel | _count |
---|---|
ERROR | 2 |
WARN | 400 |
INFO | 200 |
The result can be transposed to:
groupby(loglevel)
| transpose()
column | row[1] | row[2] | row[3] |
---|---|---|---|
_count | 2 | 400 | 200 |
loglevel | ERROR | WARN | INFO |
To use the loglevel row as the header, use:
...
| transpose(header=loglevel)
column | ERROR | WARN | INFO |
---|---|---|---|
_count | 2 | 400 | 200 |
transpose()
Examples
Create a Pivot Table
Creating a view of LogScale activity
Query
groupBy([type,actor.user.id],function={groupBy(actor.user.id, function=max(@timestamp))})
|transpose(header=type)
|drop(column)
Introduction
The humio-audit repository contains audit events for the LogScale cluster. Reporting on this information can provide a wealth of information, but a useful summary can be created based on the activities, users and which the latest user of that particular operation.
Step-by-Step
Starting with the source repository events.
- logscale
groupBy([type,actor.user.id],function={groupBy(actor.user.id, function=max(@timestamp))})
The first step to creating a pivot table is the base query that will create the initial summary of the information. In this fragment, a nested
groupBy()
aggregation. The embedded aggregation creates a group of the maximum access time for a given user, by usingmax()
on the @timestamp against the actor.user.id. This creates a table of the last event by the user. The outergroupBy()
then creates an aggregation of this maximum user time against the type which defines the operation performed.The result is a table of the last user and time for a specific operation; for example, the last time a query was executed. An example of this table can be seen below:
type actor.user.id _max alert.clear-error
0O7WGPBX9YbvZbKOrBMd5fgH
1700546666592 alert.create
0O7WGPBX9YbvZbKOrBMd5fgH
1699004139419 alert.update
0O7WGPBX9YbvZbKOrBMd5fgH
1700546666676 dashboard.create
0O7WGPBX9YbvZbKOrBMd5fgH
1698417330709 dataspace.query
0O7WGPBX9YbvZbKOrBMd5fgH
1700721296197 - logscale
|transpose(header=type)
The
transpose()
will convert individual columns into rows, switching the orientation. For example, the type column will now become the type row. However, there are no row titles, so the title for the resulting table will by default create a header row containing the column and row numbers, like this:column row[1] row[2] row[3] row[4] row[5] _max 1700546666592 1699004139419 1700546666676 1698417330709 1700722209214 actor.user.id 0O7WGPBX9YbvZbKOrBMd5fgH 0O7WGPBX9YbvZbKOrBMd5fgH 0O7WGPBX9YbvZbKOrBMd5fgH 0O7WGPBX9YbvZbKOrBMd5fgH 0O7WGPBX9YbvZbKOrBMd5fgH type alert.clear-error alert.create alert.update dashboard.create dataspace.query However, the aggregate grouping, type could be used instead as a valid header for each column. To achieve that, use the
header
parameter to specify type as the column. The resulting table now looks like this:alert.clear-error alert.create alert.update column dashboard.create dataspace.query 1700546666592 1699004139419 1700546666676 _max 1698417330709 1700722210073 0O7WGPBX9YbvZbKOrBMd5fgH 0O7WGPBX9YbvZbKOrBMd5fgH 0O7WGPBX9YbvZbKOrBMd5fgH actor.user.id 0O7WGPBX9YbvZbKOrBMd5fgH 0O7WGPBX9YbvZbKOrBMd5fgH - logscale
|drop(column)
The table created contains the summarized information pivoted around the user ID and last event time further summarized by the type of the event. However, there is a column in the table, column, which is now a field in the event stream that was generated from the old row before the table was pivoted.
That column can be removed by dropping the column field from the event using
drop()
to remove the column from the events. Event Result set.
Summary and Results
Pivoting an event set of data allows for the information to be displayed and summarized in a format that may make more logical sense as a display format. The final table will look like this:
alert.clear-error | alert.create | alert.update | dashboard.create | dataspace.query |
---|---|---|---|---|
1700546666592 | 1699004139419 | 1700546666676 | 1698417330709 | 1700722210073 |
0O7WGPBX9YbvZbKOrBMd5fgH | 0O7WGPBX9YbvZbKOrBMd5fgH | 0O7WGPBX9YbvZbKOrBMd5fgH | 0O7WGPBX9YbvZbKOrBMd5fgH | 0O7WGPBX9YbvZbKOrBMd5fgH |
Deduplicate Compound Field Data
Query
splitString(field=userAgent,by=" ",as=agents)
|array:filter(array="agents[]", function={bname=/\//}, var="bname")
|array:union(array=agents,as=browsers)
|transpose()
|drop(column)
|groupBy(row[1])
Introduction
Deduplicating fields of information where there are multiple
occurrences of a value in a single field, maybe separated by a
single character can be achieved in a variety of ways. This
solution makes use of array:union()
,
transpose()
and
groupBy()
to collate and aggregate the
information into the final list of values.
For example, when examining the humio and looking
for the browsers or user agents that have used your instance,
the UserAgent
data will
contain the browser and toolkits used to support them, for
example:
Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/116.0.0.0 Safari/537.36
The actual names are the
Name/Version
pairs showing
compatibility with different browser standards. Resolving this
into a simplified list requires splitting up the list,
simplifying (to remove duplicates), filtering, and then
summarizing the final list.
The process we need to follow is first extract the information into an array of values that we can then simplify and aggregate. It's possible the information in your raw data is already stored in an array of information that needs to be summarised.
Step-by-Step
Starting with the source repository events.
- logscale
splitString(field=userAgent,by=" ",as=agents)
Splits up the userAgent field using a call to
splitString()
and places the output into the array field agentsThis will create individual array entries into the agents array for each event:
agents[0]="Mozilla/5.0" agents[1]="(Macintosh;" agents[2]="Intel" agents[3]="Mac" agents[4]="OS" agents[5]="X" agents[6]="10_15_7)" agents[7]="AppleWebKit/537.36" agents[8]="(KHTML," agents[9]="like" agents[10]="Gecko)" agents[11]="Chrome/116.0.0.0" agents[12]="Safari/537.36" - logscale
|array:filter(array="agents[]", function={bname=/\//}, var="bname")
- logscale
|array:union(array=agents,as=browsers)
Using
array:union()
we aggregate the list of user agents across all the events to create a list of unique entries. This will eliminate duplicates where the value of the user agent is the same value. - logscale
|transpose()
Using the
transpose()
function, we transpose the rows and columns for the list of matching field names and values, turning:browsers[0] browsers[1] browsers[2] Gecko/20100101 Safari/537.36 AppleWebKit/605.1.15 into:
column row[1] browsers[0] Gecko/20100101 browsers[1] Safari/537.36 browsers[2] AppleWebKit/605.1.15 - logscale
|drop(column)
We do not need the column information, just the unique list of browsers in the row[1] column, so we drop the field from the event list.
- logscale
|groupBy(row[1])
Now we can aggregate the list, by the remaining field row[1] to provide the unique list of potential values. This is not a count of the times each has occurred, but a unique list of all the different possible values. The resulting list looks like this:
Event Result set.
Summary and Results
The resulting output from the query is a summarized list of the unique possible values from the original source fields, even though the source information was originally contained within a single field in the source events.
row[1] | _count |
---|---|
AppleWebKit/537.36 | 1 |
AppleWebKit/605.1.15 | 1 |
CFNetwork/1410.0.3 | 1 |
Chrome/116.0.0.0 | 1 |
Darwin/22.6.0 | 1 |
Firefox/116.0 | 1 |
Gecko/20100101 | 1 |
Mobile/15E148 | 1 |
Mozilla/5.0 | 1 |
Safari/18615.3.12.11.2 | 1 |
Safari/537.36 | 1 |
Safari/604.1 | 1 |
Safari/605.1.15 | 1 |
Version/16.4 | 1 |
Version/16.6 | 1 |
Transpose a Basic Table
Transposing an event set effectively switches rows (each event) into columns (an event)
Query
groupby([loglevel)
| transpose(header=loglevel)
| drop(column)
Introduction
By transposing event set, the information can be viewed and summarized in a more human readable form. Transposing also allows for aggregated data to be viewed in a form where the value of an aggregated field becomes the columns. This can be used to summarize the information further by showing multiple rows of data. For example, in the humio repository the event information contains a list of errors, warnings, or informational events for activity within the cluster.
Step-by-Step
Starting with the source repository events.
- logscale
groupby([loglevel)
Aggregates the lows by the loglevel. This field will either be
WARN
,ERROR
orINFO
depending on the level of the event log entry. The default function is used, which results in a count in the number of times an event of this type has been seen:loglevel _count ERROR
27424 INFO
18840156 WARN
2059898 In this output, each event is a row in the table, each with two fields, loglevel and _count.
- logscale
| transpose(header=loglevel)
Transposing the events within the
transpose()
will make each field in each event a row in the new stream of events, for example, the loglevel field with the valueERROR
will become the field ERROR, swapping the rows for columns. By using theheader
parameter,transpose()
uses the value of the aggregated field as the fieldname. The output will now be a table with a column (field) for each value, and a single row with the count:ERROR INFO WARN column 97159 63719404 5716733 _count - logscale
| drop(column)
In the final output, the column field in the events is the one generated from the field names of the original events and it's not needed, so it can be removed by using the
drop()
function to remove the field from the event set. Event Result set.
Summary and Results
The transpose()
is a great way of
reorganizing data into a format is either more readable, or
easily applied to other display formats as part of a widget. The
final table looks like this:
ERROR | INFO | WARN |
---|---|---|
97159 | 63719404 | 5716733 |
However, the information as it is now formatted can more easily be applied to a variety of visualizations. For example, the data can be formatted as a bar chart, as we now have a list of fields and a value:
The difference is that without transpose()
,
the aggregate result set is a list of events with a field name
and value. With transpose()
, the result set
is a single event with multiple fields, and this is interpreted
by the bar chart as a series of values.