SHA-1 Hash Multiple Fields
SHA-1 hash multiple fields using the crypto:sha1() function
Query
crypto:sha1(field=[a,b,c])
Introduction
In LogScale it is possible to hash strings using different algorithms,
such as MD5, SHA-1, and SHA-256. The resulting hash is also called a
fingerprint. The MD5 hash function is the weakest of the three, whereas
SHA-256 is the strongest. The crypto:sha1() function creates a SHA-1 hash
by taking a string of any length and encoding it into a 160-bit
fingerprint. The fingerprint is returned as 40 hexadecimal characters.
Hashing the same string with the SHA-1 algorithm always produces the same
160-bit output.
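The fixed length and repeatability described above can be checked outside LogScale. The following is a minimal sketch using Python's standard hashlib module; the helper name sha1_hex is hypothetical and is not part of LogScale.

```python
import hashlib


def sha1_hex(text: str) -> str:
    """Return the SHA-1 fingerprint of text as lowercase hexadecimal."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


short_digest = sha1_hex("abc")
long_digest = sha1_hex("abc" * 10_000)

# The well-known SHA-1 test vector for the string "abc".
print(short_digest)  # a9993e364706816aba3e25717850c26c9cd0d89d

# Input length does not change output length: 160 bits = 40 hex digits.
print(len(short_digest), len(long_digest))  # 40 40

# Hashing the same string again yields the same fingerprint.
print(sha1_hex("abc") == short_digest)  # True
```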
In this example, the crypto:sha1() function is used to hash the fields
a, b, and c and return the result in a field named _sha1.
Step-by-Step
Starting with the source repository events.
Flowchart: Events -> Augment Data (highlighted) -> Result Set
crypto:sha1(field=[a,b,c])
Computes a cryptographic SHA-1 hash of the fields a, b, and c. The
field argument name can be omitted, so the call can also be written as
crypto:sha1([a,b,c])
Event Result set.
Summary and Results
The query hashes the supplied field values using SHA-1. When called
with multiple fields, the crypto:sha1() function creates a single SHA-1
sum from the combined value of the supplied fields. Combining fields in
this way and converting them to a SHA-1 sum can be an effective method of
creating a unique ID for a given set of fields, which could be used to
identify a specific event type. The SHA-1 sum is reproducible (that is,
supplying the same values will produce the same SHA-1 sum), so it can
sometimes be an effective method of creating unique identifiers or lookup
fields for a join() across two different datasets.
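The idea of using a multi-field fingerprint as a join key can be illustrated outside LogScale. The following is a rough Python sketch, not LogScale's implementation: the fingerprint helper, the sample records, and the field values are all hypothetical. Plain concatenation of the values is also an assumption, since how crypto:sha1() combines fields is not stated here, so these digests may not match the ones LogScale produces.

```python
import hashlib


def fingerprint(event: dict, fields: list[str]) -> str:
    """Hash the named field values into one SHA-1 hex digest."""
    # Assumption: values are combined by plain concatenation, in order.
    combined = "".join(str(event[name]) for name in fields)
    return hashlib.sha1(combined.encode("utf-8")).hexdigest()


# Two hypothetical datasets that share the fields a, b, and c.
dataset_one = [{"a": "web01", "b": "GET", "c": "/login", "status": 200}]
dataset_two = [{"a": "web01", "b": "GET", "c": "/login", "owner": "team-auth"}]

key_fields = ["a", "b", "c"]

# Index the second dataset by fingerprint, then match rows from the first.
lookup = {fingerprint(row, key_fields): row for row in dataset_two}
joined = [
    {**row, **lookup[fingerprint(row, key_fields)]}
    for row in dataset_one
    if fingerprint(row, key_fields) in lookup
]

print(joined)
```

Because the same values always produce the same digest, rows from both datasets land on the same key. Note that plain concatenation cannot distinguish the values "ab", "c" from "a", "bc", so a separator between values may be worth adding in a sketch like this.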