Using the selfJoin() Function

The selfJoin() function doesn't perform a join in the traditional sense, but is designed to collate information from a repository within a query where they share a command field.

selfJoin() executes two queries, the primary query and sub query, and joins on a commaon field (specified by the field parameter) between the two queries in a similar fashion to join().

For example, given input data the includes the sourvce and destination address for connections and a common connection ID:

Raw Events
from=192.168.1.15 to=10.0.0.22 connid=b36ecce120f181c6e79560c3160dd950
from=192.168.1.11 to=10.0.0.0 connid=532d011f0a5bf80e9b346e9e0dccbd0b
from=192.168.1.6 to=10.0.0.9 connid=d8b674841c6fc75f24572ccd6c9e92ea
from=192.168.1.20 to=10.0.0.11 connid=7fb64aeef6f08fd2f2724c07540adb02
from=192.168.1.14 to=10.0.0.7 connid=d8b674841c6fc75f24572ccd6c9e92ea
from=192.168.1.230 to=10.0.0.203 connid=2057644393c7ff4636d6911bfb674729
from=192.168.1.3 to=10.0.0.17 connid=1644bdaa1cad36b146955e09e0ac0319
from=192.168.1.11 to=10.0.0.13 connid=cd244fc908fb252f9a8ee5e1eb4f4d5c
from=192.168.1.22 to=10.0.0.5 connid=7b5ff133f72d6d5772f72d5db9fc5978
from=192.168.1.21 to=10.0.0.16 connid=fa59f655637787487486ab6a39c6c76e
from=192.168.1.15 to=10.0.0.0 connid=4f69ae37daf6f0f3c12cee4f0349201a
from=192.168.1.24 to=10.0.0.22 connid=b4454f0cca4d4d57f808bdc97db0cb4f
from=192.168.1.10 to=10.0.0.22 connid=904f87cd6281192c8ce4baacee1bf0c2
from=192.168.1.12 to=10.0.0.20 connid=305bb06e3a1c58a1806136b1de4a659b

The following query will summarize the information for source and destination across the connection ID:

logscale
selfJoin(field=connid,where=[{from="192.168.1.7"},{to="10.0.0.12"}])

This would output the following result:

connid
262565314144352e22375b816abbcd3c
55d29bf268390a96fc6891f8f5f3fb35
7b5ff133f72d6d5772f72d5db9fc5978
9ea512baa6655f0e27f247650f36aca5
bbad6ec69d8cc416aaefc229452586a3
cd244fc908fb252f9a8ee5e1eb4f4d5c

This process can be more efficient than using a groupBy() for larger datasets.