Transpose a (table-like) query result by creating an event (row) for each column (attribute name), in which attributes are named row[1], row[2], etc.

Parameter   Type      Required       Default Value   Description
column      string    optional[a]    column          Field to use as column value.
header      string    optional[a]                    Use this field as header value.
limit       integer   optional[a]    5               Maximum number of rows to transpose.
                                                       Minimum: 1
                                                       Maximum: 1,000
pivot[b]    string    optional[a]                    Use this field as header AND column value.

[a] Optional parameters use their default value unless explicitly set.

[b] The argument name pivot can be omitted.
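
As a minimal sketch based on the parameter table above: limit raises the default cap of 5 transposed rows, and, per note [b], the pivot argument name can be left out:

logscale
groupby(loglevel)
| transpose(header=loglevel, limit=100)
// Note [b]: transpose(loglevel) is equivalent to transpose(pivot=loglevel)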

For example, given a query that returns a table, such as:

logscale
groupby(loglevel)
loglevel _count
ERROR 2
WARN 400
INFO 200

The result can be transposed to:

logscale
groupby(loglevel)
| transpose()
column row[1] row[2] row[3]
_count 2 400 200
loglevel ERROR WARN INFO

To use the loglevel row as the header, use:

logscale
...
| transpose(header=loglevel)
column ERROR WARN INFO
_count 2 400 200

transpose() Examples

Create a Pivot Table

Creating a view of LogScale activity

Query
logscale
groupBy([type,actor.user.id],function={groupBy(actor.user.id, function=max(@timestamp))})
|transpose(header=type)
|drop(column)
Introduction

The humio-audit repository contains audit events for the LogScale cluster. Reporting on this information can provide a wealth of insight; a useful summary can be created of the activities, the users performing them, and the latest user to perform each operation.

Step-by-Step
  1. Starting with the source repository events.

  2. logscale
    groupBy([type,actor.user.id],function={groupBy(actor.user.id, function=max(@timestamp))})

    The first step to creating a pivot table is the base query that creates the initial summary of the information. This fragment uses a nested groupBy() aggregation. The embedded aggregation finds the most recent access time for each user by applying max() to @timestamp, grouped by actor.user.id. This creates a table of the last event by each user. The outer groupBy() then aggregates this per-user maximum against the type field, which defines the operation performed.

    The result is a table of the last user and time for a specific operation; for example, the last time a query was executed. An example of this table can be seen below:

    type actor.user.id _max
    alert.clear-error 0O7WGPBX9YbvZbKOrBMd5fgH 1700546666592
    alert.create 0O7WGPBX9YbvZbKOrBMd5fgH 1699004139419
    alert.update 0O7WGPBX9YbvZbKOrBMd5fgH 1700546666676
    dashboard.create 0O7WGPBX9YbvZbKOrBMd5fgH 1698417330709
    dataspace.query 0O7WGPBX9YbvZbKOrBMd5fgH 1700721296197
  3. logscale
    |transpose(header=type)

    The transpose() function converts individual columns into rows, switching the orientation. For example, the type column now becomes the type row. However, there are no row titles, so by default the resulting table gets a header row made up of the column field and numbered row[n] fields, like this:

    column row[1] row[2] row[3] row[4] row[5]
    _max 1700546666592 1699004139419 1700546666676 1698417330709 1700722209214
    actor.user.id 0O7WGPBX9YbvZbKOrBMd5fgH 0O7WGPBX9YbvZbKOrBMd5fgH 0O7WGPBX9YbvZbKOrBMd5fgH 0O7WGPBX9YbvZbKOrBMd5fgH 0O7WGPBX9YbvZbKOrBMd5fgH
    type alert.clear-error alert.create alert.update dashboard.create dataspace.query

    However, the aggregate grouping, type, could be used instead as a valid header for each column. To achieve that, use the header parameter to specify type as the header. The resulting table now looks like this:

    alert.clear-error alert.create alert.update column dashboard.create dataspace.query
    1700546666592 1699004139419 1700546666676 _max 1698417330709 1700722210073
    0O7WGPBX9YbvZbKOrBMd5fgH 0O7WGPBX9YbvZbKOrBMd5fgH 0O7WGPBX9YbvZbKOrBMd5fgH actor.user.id 0O7WGPBX9YbvZbKOrBMd5fgH 0O7WGPBX9YbvZbKOrBMd5fgH
  4. logscale
    |drop(column)

    The table created contains the summarized information, pivoted around the user ID and the last event time, and further summarized by the type of the event. However, the table still contains a field named column, generated from the old row labels before the table was pivoted.

    That field is not needed, so it can be removed from the events using drop().

  5. Event Result set.

Summary and Results

Pivoting an event set allows the information to be displayed and summarized in a format that may make more logical sense for display. The final table will look like this:

alert.clear-error alert.create alert.update dashboard.create dataspace.query
1700546666592 1699004139419 1700546666676 1698417330709 1700722210073
0O7WGPBX9YbvZbKOrBMd5fgH 0O7WGPBX9YbvZbKOrBMd5fgH 0O7WGPBX9YbvZbKOrBMd5fgH 0O7WGPBX9YbvZbKOrBMd5fgH 0O7WGPBX9YbvZbKOrBMd5fgH
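
The _max values above are epoch timestamps in milliseconds. As a hedged variation (not part of the example above), formatTime() could be applied before transposing so the pivot table shows readable dates instead; a minimal sketch:

logscale
groupBy([type,actor.user.id],function={groupBy(actor.user.id, function=max(@timestamp))})
// format the epoch-millisecond _max value as a readable timestamp (assumption: _max holds milliseconds)
| formatTime("%Y-%m-%d %H:%M:%S", field=_max, as=_max)
| transpose(header=type)
| drop(column)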

Deduplicate Compound Field Data

Query
logscale
splitString(field=userAgent,by=" ",as=agents)
|array:filter(array="agents[]", function={bname=/\//}, var="bname")
|array:union(array=agents,as=browsers)
|transpose()
|drop(column)
|groupBy(row[1])
Introduction

Deduplicating values in a field, where a single field contains multiple occurrences of a value, perhaps separated by a single character, can be achieved in a variety of ways. This solution makes use of array:union(), transpose() and groupBy() to collate and aggregate the information into the final list of values.

For example, when examining the humio repository and looking for the browsers or user agents that have used your instance, the userAgent data will contain the browser and the toolkits used to support it, for example:

syslog
Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/116.0.0.0 Safari/537.36

The actual names are the Name/Version pairs showing compatibility with different browser standards. Resolving this into a simplified list requires splitting up the list, simplifying (to remove duplicates), filtering, and then summarizing the final list.

The process is to first extract the information into an array of values that can then be simplified and aggregated. It is possible that the information in your raw data is already stored in an array that simply needs to be summarized.
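
If the values in your raw data are already stored in an array, the splitString() step can be skipped and the pipeline can start at the array functions. A minimal sketch, assuming a hypothetical pre-existing array field named values[]:

logscale
// values[] is a hypothetical array field that already exists on the events
array:filter(array="values[]", function={bname=/\//}, var="bname")
| array:union(array=values, as=browsers)
| transpose()
| drop(column)
| groupBy(row[1])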

Step-by-Step
  1. Starting with the source repository events.

  2. logscale
    splitString(field=userAgent,by=" ",as=agents)

    Splits up the userAgent field using a call to splitString() and places the output into the array field agents.

    This will create individual array entries into the agents array for each event:

    agents[0]="Mozilla/5.0"
    agents[1]="(Macintosh;"
    agents[2]="Intel"
    agents[3]="Mac"
    agents[4]="OS"
    agents[5]="X"
    agents[6]="10_15_7)"
    agents[7]="AppleWebKit/537.36"
    agents[8]="(KHTML,"
    agents[9]="like"
    agents[10]="Gecko)"
    agents[11]="Chrome/116.0.0.0"
    agents[12]="Safari/537.36"
  3. logscale
    |array:filter(array="agents[]", function={bname=/\//}, var="bname")

    Filters the agents array using array:filter(), keeping only the entries that match the regular expression /\//, that is, entries containing a / character. Working from the example above, this discards fragments such as (Macintosh; and like, and keeps the Name/Version pairs such as Mozilla/5.0, AppleWebKit/537.36, Chrome/116.0.0.0 and Safari/537.36.

  4. logscale
    |array:union(array=agents,as=browsers)

    Using array:union() we aggregate the list of user agents across all the events to create a list of unique entries. This eliminates duplicates where the user agent value is identical.

  5. logscale
    |transpose()

    Using the transpose() function, we transpose the rows and columns for the list of matching field names and values, turning:

    browsers[0]       browsers[1]     browsers[2]
    Gecko/20100101    Safari/537.36   AppleWebKit/605.1.15

    into:

    column        row[1]
    browsers[0]   Gecko/20100101
    browsers[1]   Safari/537.36
    browsers[2]   AppleWebKit/605.1.15
  6. logscale
    |drop(column)

    We do not need the column information, just the unique list of browsers in the row[1] column, so we drop the field from the event list.

  7. logscale
    |groupBy(row[1])

    Now we can aggregate the list by the remaining field, row[1], to provide the unique list of potential values. This is not a count of the times each value has occurred, but a unique list of all the different possible values; the resulting list is shown under Summary and Results below.

  8. Event Result set.

Summary and Results

The resulting output from the query is a summarized list of the unique possible values from the original source fields, even though the source information was originally contained within a single field in the source events.

row[1]                    _count
AppleWebKit/537.36        1
AppleWebKit/605.1.15      1
CFNetwork/1410.0.3        1
Chrome/116.0.0.0          1
Darwin/22.6.0             1
Firefox/116.0             1
Gecko/20100101            1
Mobile/15E148             1
Mozilla/5.0               1
Safari/18615.3.12.11.2    1
Safari/537.36             1
Safari/604.1              1
Safari/605.1.15           1
Version/16.4              1
Version/16.6              1
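
As an optional refinement (a hedged sketch, not part of the query above), the generated row[1] field could be renamed to something more descriptive and the list sorted alphabetically using rename() and sort():

logscale
...
| groupBy(row[1])
// rename the generated row[1] field and sort the unique values alphabetically
| rename(field="row[1]", as=browser)
| sort(browser, order=asc)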

Transpose a Basic Table

Transposing an event set effectively switches rows (each event) into columns (fields of a single event).

Query
logscale
groupby([loglevel])
| transpose(header=loglevel)
| drop(column)
Introduction

By transposing the event set, the information can be viewed and summarized in a more human-readable form. Transposing also allows aggregated data to be viewed in a form where the values of an aggregated field become the columns. This can be used to summarize the information further by showing multiple rows of data. For example, in the humio repository the event information contains a list of errors, warnings, or informational events for activity within the cluster.

Step-by-Step
  1. Starting with the source repository events.

  2. logscale
    groupby([loglevel])

    Aggregates the logs by the loglevel field. This field will be either WARN, ERROR or INFO depending on the level of the event log entry. The default function, count(), is used, which results in a count of the number of times an event of this type has been seen:

    loglevel _count
    ERROR 27424
    INFO 18840156
    WARN 2059898

    In this output, each event is a row in the table, each with two fields, loglevel and _count.

  3. logscale
    | transpose(header=loglevel)

    Transposing the events with transpose() makes each field in each event a row in the new stream of events; for example, the loglevel field with the value ERROR becomes the field ERROR, swapping rows for columns. By using the header parameter, transpose() uses the value of the aggregated field as the field name. The output will now be a table with a column (field) for each value, and a single row with the count:

    ERROR INFO WARN column
    97159 63719404 5716733 _count
  4. logscale
    | drop(column)

    In the final output, the column field in the events is the one generated from the field names of the original events. It is not needed, so it can be removed from the event set using the drop() function.

  5. Event Result set.

Summary and Results

The transpose() function is a great way of reorganizing data into a format that is either more readable or more easily applied to other display formats as part of a widget. The final table looks like this:

ERROR INFO WARN
97159 63719404 5716733

However, the information as it is now formatted can more easily be applied to a variety of visualizations. For example, the data can be formatted as a bar chart, as we now have a list of fields with a value for each.

The difference is that without transpose(), the aggregate result set is a list of events with a field name and value. With transpose(), the result set is a single event with multiple fields, and this is interpreted by the bar chart as a series of values.