SHA-256 Hash Multiple Fields
SHA-256 hash multiple fields using the crypto:sha256() function
Query
crypto:sha256(field=[a,b,c])
Introduction
In LogScale, strings can be hashed with several algorithms, such as
MD5, SHA-1 and SHA-256, to create a hash, also called a fingerprint.
MD5 is the weakest of the three, whereas SHA-256 is the strongest. The
crypto:sha256() function creates a SHA-256 hash by taking a string of
any length and encoding it into a 256-bit fingerprint. The fingerprint
is returned as 64 hexadecimal characters. Hashing the same string with
the SHA-256 algorithm always produces the same 256-bit output.
In this example, the crypto:sha256() function is used to hash the
fields a, b and c and return the result in a field named _sha256.
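The fingerprint properties described above (fixed 256-bit length, hexadecimal output, same input giving the same hash) can be checked outside LogScale. The following is a minimal sketch using Python's standard hashlib module; the helper name sha256_hex is hypothetical and the code is not part of LogScale.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Return the SHA-256 digest of text as lowercase hexadecimal.

    Hypothetical helper for illustration; it is not a LogScale API.
    """
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


short_digest = sha256_hex("abc")
long_digest = sha256_hex("abc" * 10_000)

# Input length does not matter: the digest is always 256 bits,
# which is 64 hexadecimal characters.
print(len(short_digest), len(long_digest))

# Hashing the same string again yields the same fingerprint.
print(short_digest == sha256_hex("abc"))
```

Both lengths print as 64, and the final comparison prints True, because SHA-256 is deterministic.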
Step-by-Step
Starting with the source repository events.
- Flowchart: Events → Augment Data → Result Set
crypto:sha256(field=[a,b,c])
Performs a cryptographic SHA-256 hash of the fields a, b and c. The
field argument name can be omitted, so the query can also be written
as: crypto:sha256([a,b,c])
Event Result set.
Summary and Results
The query is used to encode a string using the SHA-256 hash. When called
with multiple values, the crypto:sha256() function creates a single
SHA-256 sum from the combined value of the supplied fields. Combining
fields in this way and converting them to a SHA-256 sum can be an
effective method of creating a unique ID for a given fieldset, which
could be used to identify a specific event type. The SHA-256 sum is
reproducible (that is, supplying the same values will produce the same
SHA-256 sum), and so it can sometimes be an effective method of creating
unique identifier or lookup fields for a join() across two different
datasets.
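The idea of using one hash over several fields as a lookup key can be sketched in Python. This is an illustration under stated assumptions, not LogScale's implementation: the text does not say how LogScale combines the field values before hashing, so the separator used here is hypothetical, as are the function name fieldset_id and the sample events. The resulting hashes are therefore not expected to match the _sha256 values LogScale produces.

```python
import hashlib

# Assumption: values are joined with a delimiter before hashing so that
# ("1", "23") and ("12", "3") do not collide. LogScale's actual method
# of combining fields is not described in the source text.
SEPARATOR = "\x1f"


def fieldset_id(event: dict, fields=("a", "b", "c")) -> str:
    """Hash the named fields of an event into one SHA-256 hex string."""
    combined = SEPARATOR.join(str(event.get(name, "")) for name in fields)
    return hashlib.sha256(combined.encode("utf-8")).hexdigest()


# Two hypothetical datasets that share the same values for a, b and c.
dataset_one = [{"a": "10.0.0.1", "b": "443", "c": "tcp", "source": "firewall"}]
dataset_two = [{"a": "10.0.0.1", "b": "443", "c": "tcp", "owner": "web-team"}]

# Index the second dataset by its fieldset hash, then look up each event
# from the first dataset by the same hash (a simple hash join).
index = {fieldset_id(event): event for event in dataset_two}
joined = [
    {**event, **index[fieldset_id(event)]}
    for event in dataset_one
    if fieldset_id(event) in index
]

print(joined[0]["source"], joined[0]["owner"])
```

Because the same field values always produce the same sum, the events from the two datasets meet on a single key column instead of three separate fields.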