Hash a Field Using Different Seeds

Generate hash values using the hash() function with different seeds

Query

Diagram: Events -> Expression -> Result Set
logscale
| hash_seed10 := hash(field=[username], seed=10)
| hash_seed20 := hash(field=[username], seed=20)

Introduction

The hash() function can be used to generate deterministic hash values from field contents. The seed parameter acts as an initialization value for the hashing algorithm, allowing you to generate different but consistent hash values for the same input.
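
The role of a seed can be illustrated outside LogScale with a short Python sketch. This is not the algorithm that hash() uses; the helper name seeded_hash and the choice of BLAKE2b with the seed supplied as its salt are assumptions made purely for illustration. It shows only the general property described above: the same input and seed always give the same value, and changing the seed changes the value.

```python
import hashlib


def seeded_hash(value: str, seed: int) -> int:
    """Return a 64-bit integer hash of value, varied by seed (illustrative only)."""
    # Assumption: the seed is fed in as BLAKE2b's salt. This is a stand-in
    # for the idea of a seeded hash, not LogScale's actual implementation.
    salt = seed.to_bytes(8, "little")
    digest = hashlib.blake2b(value.encode("utf-8"), digest_size=8, salt=salt).digest()
    return int.from_bytes(digest, "big")


if __name__ == "__main__":
    for seed in (10, 20):
        # Same input, different seed: the printed values differ between seeds
        # but stay the same every time the script is run.
        print(seed, seeded_hash("alice", seed))
```

The numbers this sketch prints will not match the values LogScale produces, because the underlying algorithms differ.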

In this example, the hash() function is used to demonstrate how different seed values affect the hash output while maintaining consistency for the same input values.

Example incoming data might look like this:

@timestamp                username  action
2025-08-27T08:51:51.312Z  alice     login
2025-08-27T09:15:22.445Z  bob       login
2025-08-27T10:30:15.891Z  alice     logout
2025-08-27T11:45:33.167Z  charlie   login
2025-08-27T12:20:44.723Z  bob       logout

Step-by-Step

  1. Starting with the source repository events.

  2. Diagram: Events -> Expression -> Result Set (Expression step highlighted)
    logscale
    | hash_seed10 := hash(field=[username], seed=10)
    | hash_seed20 := hash(field=[username], seed=20)

    Creates two new fields with different hash values for the same input:

    • Field hash_seed10 contains hash values generated with seed=10

    • Field hash_seed20 contains hash values generated with seed=20

    The field parameter takes an array of field names, here containing only username. Each call passes a different seed, so the two new fields hold different hash values for the same username, while a given seed always produces the same value for the same input.

  3. Event Result set.

Summary and Results

The query is used to demonstrate how different seed values affect hash generation while maintaining consistency for identical inputs.

This query is useful, for example, to create several different pseudonymous identifiers for the same data, to compare hash distributions across seeds, or to understand how seed values affect hash generation.

Sample output from the incoming example data:

username  action  hash_seed10          hash_seed20
alice     login   7234981073614532891  8945672301234567890
bob       login   4123567890123456789  5678901234567890123
alice     logout  7234981073614532891  8945672301234567890
charlie   login   9876543210987654321  2345678901234567890
bob       logout  4123567890123456789  5678901234567890123

Note that the same username produces different hash values with different seeds (compare hash_seed10 and hash_seed20 for alice). Each seed consistently produces the same hash value for the same input (notice how alice always has the same hash value within each seed).
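
The two properties noted above can also be checked programmatically. The following Python sketch applies an illustrative seeded hash to the sample events and verifies both observations. The helpers seeded_hash, hash_events, and is_consistent are hypothetical names, and BLAKE2b with the seed as salt is an assumed stand-in for LogScale's own hashing, so the values will not match the sample output.

```python
import hashlib
from collections import defaultdict

# (username, action) pairs taken from the example incoming data.
EVENTS = [
    ("alice", "login"),
    ("bob", "login"),
    ("alice", "logout"),
    ("charlie", "login"),
    ("bob", "logout"),
]


def seeded_hash(value: str, seed: int) -> int:
    # Assumption: BLAKE2b salted with the seed stands in for a seeded hash.
    salt = seed.to_bytes(8, "little")
    digest = hashlib.blake2b(value.encode("utf-8"), digest_size=8, salt=salt).digest()
    return int.from_bytes(digest, "big")


def hash_events(events, seeds=(10, 20)):
    """Add one hash_seed<N> column per seed to each event."""
    rows = []
    for username, action in events:
        row = {"username": username, "action": action}
        for seed in seeds:
            row[f"hash_seed{seed}"] = seeded_hash(username, seed)
        rows.append(row)
    return rows


def is_consistent(rows, column):
    """Return True if every username maps to exactly one value in column."""
    seen = defaultdict(set)
    for row in rows:
        seen[row["username"]].add(row[column])
    return all(len(values) == 1 for values in seen.values())


if __name__ == "__main__":
    rows = hash_events(EVENTS)
    # Each seed is consistent for repeated usernames.
    print(is_consistent(rows, "hash_seed10"), is_consistent(rows, "hash_seed20"))
    # Different seeds give different values for the same username.
    print(all(r["hash_seed10"] != r["hash_seed20"] for r in rows))
```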