SHA-256 Hash Multiple Fields

SHA-256 hash multiple fields using the crypto:sha256() function

Query

logscale
crypto:sha256(field=[a,b,c])

Introduction

In LogScale, strings can be hashed with several algorithms, such as MD5, SHA-1 and SHA-256, to produce a hash, also called a fingerprint. MD5 is the weakest of the three and SHA-256 is the strongest. The crypto:sha256() function takes a string of any length and computes its 256-bit SHA-256 fingerprint, which is returned as 64 hexadecimal digits. Hashing the same string with SHA-256 always produces the same output.
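The fixed length and reproducibility described above can be checked outside LogScale. The following is a minimal sketch using Python's standard hashlib module; the helper name sha256_hex and the sample strings are illustrative only and are not part of LogScale.

```python
import hashlib


def sha256_hex(text):
    """Return the SHA-256 digest of text as lowercase hexadecimal digits."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


first = sha256_hex("abc")
second = sha256_hex("abc")
different = sha256_hex("abd")

print(len(first))          # number of hex digits (4 bits each)
print(first == second)     # same input compared with itself
print(first == different)  # input differing by one character
```

The digest length does not depend on the input length, which is why the result is usable as a fixed-size fingerprint.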

In this example, the crypto:sha256() function hashes the fields a, b and c and returns the result in a field named _sha256.

Step-by-Step

  1. Start with the events in the source repository.

  2. [Flowchart: Events → Augment Data (this step) → Result Set]
    logscale
    crypto:sha256(field=[a,b,c])

    Computes a cryptographic SHA-256 hash of the fields a, b and c. The field argument name can be omitted, so the same call can be written as: crypto:sha256([a,b,c])

  3. Event Result set.

Summary and Results

The query hashes field values using SHA-256. When called with multiple fields, the crypto:sha256() function creates a single SHA-256 sum from the combined value of the supplied fields. Combining fields in this way and converting them to a SHA-256 hash can be an effective method of creating a unique ID for a given fieldset, which could be used to identify a specific event type. The SHA-256 sum is reproducible (that is, supplying the same values will produce the same SHA-256 sum), so it can sometimes be an effective method of creating unique identifier or lookup fields for a join() across two different datasets.
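The idea of using a combined hash as a lookup key across two datasets can be sketched in Python. This is an illustration of the general technique, not of LogScale internals: how crypto:sha256() combines field values is not described here, so joining the values with a separator character is an assumption, and the helper fieldset_key, the sample events and their extra fields (status, severity) are all hypothetical.

```python
import hashlib

KEY_FIELDS = ["a", "b", "c"]


def fieldset_key(event, fields):
    """Hash the named fields of one event into a single hex key.

    Assumption: values are joined with a unit-separator character before
    hashing. This is an illustrative choice, not documented LogScale behavior.
    """
    combined = "\x1f".join(str(event[name]) for name in fields)
    return hashlib.sha256(combined.encode("utf-8")).hexdigest()


# Two hypothetical datasets that share the fields a, b and c.
requests = [{"a": "10.0.0.1", "b": "GET", "c": "/login", "status": 200}]
alerts = [{"a": "10.0.0.1", "b": "GET", "c": "/login", "severity": "high"}]

# Index one dataset by its key, then look up events from the other.
alerts_by_key = {fieldset_key(e, KEY_FIELDS): e for e in alerts}

joined = []
for req in requests:
    match = alerts_by_key.get(fieldset_key(req, KEY_FIELDS))
    if match is not None:
        joined.append({**req, **match})

print(joined)
```

Because the key depends only on the values of the chosen fields, events from either dataset with the same a, b and c values receive the same key, which is the property that makes the hash usable as a join field.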