FAQ: How do I do a join() statement?

The join() function is generally used when you have two query results that you'd like to combine, and both results share a common value. There is also a selfJoin() and selfJoinFilter() function for certain situations, both described in the official documentation. Please note the field names can be different: it's the value of those fields that should be identical.

Before we provide the example, some background:

We want to combine ProcessRollup2 and NetworkListenIP4 to find processes that have created a listener on a port.
ProcessRollup2 contains the process information. The field we're using in the results is TargetProcessId to designate the ID of the process.
NetworkListenIP4 contains the listener information. Instead of TargetProcessId, it uses ContextProcessId. This also designates the ID of the process.
Both TargetProcessId and ContextProcessId share a common value, e.g. 123456 would represent the same process.

And the join() to combine them:

#event_simpleName=ProcessRollup2
| join({#event_simpleName=NetworkListenIP4 LocalPort>1024 LocalPort!=0}, field=TargetProcessId, key=ContextProcessId, include=[LocalAddressIP4, LocalPort])

But what does that query actually mean?

The first query should be the larger of the two. In this case, #event_simpleName=ProcessRollup2 is the larger query.
The query within join() should be the smaller of the two queries, and is known as the subquery. It's the query within the { ... } after the join() function. The query is simple enough: #event_simpleName=NetworkListenIP4 LocalPort>1024 LocalPort!=0.
The field= and key= are the two fields that share a common value.
field= is tied to the larger query, i.e. #event_simpleName=ProcessRollup2. This is the field TargetProcessId in ProcessRollup2 events.
key= is tied to the smaller query, i.e. #event_simpleName=NetworkListenIP4 LocalPort>1024 LocalPort!=0. This is the field ContextProcessId in NetworkListenIP4 events.
The include=[ ... ] is a list of fields that should be pulled from the smaller query into the results.

And a visualization of everything that was just described:

Another example:

We want to find the UserName from a UserSid in a ProcessRollup2 event.
The ProcessRollup2 events only have a UserSid.
The UserIdentity events have a UserName and a UserSid.
The common field between them is UserSid.

Easy! We just join them together:

// Get the PR2 events.
#event_simpleName=ProcessRollup2
// Only grab the ones with a UserCid.
| UserSid=*
// Join the two together based on the UserSid.
| join({#event_simpleName=UserIdentity}, field=UserSid, include=[UserName], limit=100000)

Knowledge Base

FAQ: How do I do a join() statement?

Enter search term