Sample Queries

The Corelight sample data is an ideal data set for learning how to query LogScale event data, and for extracting information from Corelight events for network analysis and threat hunting.

The following sections provide some guidance on how to search and extract information from the sample data set.

Note

Because the Corelight sample data is the same for all users, the output shown for each query will match what you see when you run it, provided that the same time range is selected. All the examples shown were executed over the entire data set.

Identifying Event Data

To start processing and identifying the individual events and what information can be extracted from the sample data, it is useful to understand the basic structure of the event data. One way to achieve this is to filter and summarise the information by the event types, protocols and session information.

Top Event Types

First, identify the different event types by listing the most frequent ones. This basic search can be a good way to identify outliers in the overall event stream, for example, odd protocols, or protocols used less or more than you might expect.

The #path tag identifies the top level type:

logscale

top(#path, limit=100)

We increase the limit on the output so that we get the full set of different event types.

#path	_count
conn	626182
dns	309720
files	99111
http	65871
ssl	61076
ntp	25809
dhcp	25648
notice	20378
smartpcap-stats	19708
x509	15073
intel	11615
corelight_overall_capture_loss	9855
suricata_corelight	9245
weird	4971
dce_rpc	2012
specific_dns_tunnels	1712
smartpcap	1007
etc_viz	811
rdp	679
ssh	410
smb_mapping	379
kerberos	367
smtp	286
smtp_links	268
ntlm	208
smb_files	184
reporter	177
software	163
dpd	113
pe	103
snmp	95
ftp	39
meterpreter	10
meterpreter_headers	10
radius	9
stepping	7
dga	6
generic_icmp_tunnels	2
tunnel	2
log4j	1

The results include several event types that may warrant further investigation, such as meterpreter, dga, and log4j.
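
Because rarely seen event types are often the interesting ones, it can help to sort the same summary in the opposite direction. The following is a minimal sketch that lists the ten least common event types:

```logscale
// Count events per type, then show the ten rarest types first
groupBy(#path)
| sort(_count, order=asc, limit=10)
```

A small set of types can then be isolated to see when they occurred. The three types below are an illustrative choice taken from the table above, not a recommended list, and first_seen and last_seen are hypothetical output names; the values are returned as raw @timestamp values.

```logscale
// Summarise a few rare event types by count and time span
#path = meterpreter or #path = dga or #path = log4j
| groupBy(#path, function=[count(), min(@timestamp, as=first_seen), max(@timestamp, as=last_seen)])
```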

Top Protocols

The sample data uses the service field to track specific protocols used in the events. Some attacks will use or overload a given protocol specification in order to initiate an attack, or they use invalid protocols to trigger a memory failure.

Let's see what output we get from this query:

logscale

top(service, limit=100)

The query generates a list of protocols:

service	_count
dns	184074
ssl	60934
http	48452
dhcp	1033
tls	535
ssh	354
smb	294
krb_tcp	278
ntp	172
dce_rpc	156
IPC	132
krbtgt/ACMECORP.COM	97
failed	90
smtp	63
gssapi,smb,ntlm	53
gssapi,ntlm,smb	44
gssapi,smb,krb	40
ssl,smtp	36
ssl,xmpp	34
gssapi,smb	33
smtp,ssl	31
ftp	25
LDAP/DC1.ACMECORP.com/ACMECORP.com	25
cifs/dc1.acmecorp.com	22
smb,gssapi,ntlm	21
ldap/DC1.ACMECORP.com	20
`<finance$@ACMECORP.COM>`	20
xmpp,ssl	20
gssapi	19
dce_rpc,ntlm	17
smb,ntlm,gssapi	15
krbtgt/ACMECORP	14
smb,krb,gssapi	14
smb,gssapi	13
rdp	11
cifs/DC1.ACMECORP.com	11
ldap/DC1.ACMECORP.com/ACMECORP.com	10
ntlm,gssapi,smb	10
FINANCE$	10
krbtgt/ACMECORP.com	10
host/finance.acmecorp.com	10
ldap/dc1.acmecorp.com	10
radius	9
A:	8
krb,smb,gssapi	8
ntlm,smb,gssapi	7
gssapi,krb,smb	7
ntlm,dce_rpc	7
rdpeudp	6
krbtgt/windomain.local	6
krbtgt/PODTRONICS.ORG	6
gssapi,smb,dce_rpc,krb	6
gssapi,ntlm,dce_rpc,smb	5
dce_rpc,ntlm,gssapi,smb	4
smb,gssapi,krb	4
TERMSRV/bas-ad-01.lab.local	4
ftp-data	3
http,smtp,ssl	3
gssapi,dce_rpc,ntlm,smb	3
ldap/podtronics-dc.podtronics.org	3
ssl,smtp,http	3
krb,gssapi,smb	3
dce_rpc,gssapi,ntlm,smb	3
ntlm,smb,gssapi,dce_rpc	2
dce_rpc,krb,gssapi,smb	2
smb,ntlm,gssapi,dce_rpc	2
smb,gssapi,ntlm,dce_rpc	2
gssapi,smb,dce_rpc,ntlm	2
ntlm,gssapi,smb,dce_rpc	2
krb,smb,gssapi,dce_rpc	2
gssapi,smb,ntlm,dce_rpc	2
smb,dce_rpc,ntlm,gssapi	2
spicy_ipsec_ike_udp	1
dce_rpc,ntlm,smb,gssapi	1
smb,gssapi,dce_rpc,ntlm	1
dce_rpc,smb,krb,gssapi	1
dce_rpc,smb,gssapi,ntlm	1
smb,dce_rpc,krb,gssapi	1
gssapi,dce_rpc,smb,ntlm	1
gssapi,dce_rpc,krb,smb	1

The output includes perfectly valid protocols, such as http and smb, but also values that do not look like protocols at all. For example, there is no protocol named A: or TERMSRV/bas-ad-01.lab.local.
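
Many of the unusual values contain a / character and resemble Kerberos-style service names rather than protocol names. One way to check where they come from is to filter on that character and group by event type. This is a sketch only, and it assumes that matching on / is a reasonable way to separate these values from real protocol names in this data set:

```logscale
// Keep only service values that contain a "/" and show which event type they belong to
service = /\//
| groupBy([#path, service])
| sort(_count, order=desc, limit=20)
```

If all of these values come from a single event type, the service field is probably being used for a different purpose in those events, rather than indicating an invalid protocol.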

High or Low Session Counts

The unique ID that Corelight assigns to each session (see Session Identifier (uid)) can also help identify unusual network traffic. A high number of events within a single session ID may be suspicious. The opposite is also true: a low number of events for a given session may indicate an attacker merely probing for potential attack vectors.

Let's look at the sessions with the highest event counts, the top ten by unique session ID:

logscale

groupBy(uid)
| sort(order=desc,limit=10)

This generates the following summary output:

uid	_count
CTYqn61mJNPJsIVG96	15655
CDZxIM1utVOc5M1GSk	1457
CxbxKB3bGrLfxvYe4c	1229
CiRMgRsjQR7ksp7Me	688
CTyrEe2rZtqGUNnnj5	682
CLd2aI1qvBiKZ1vlTb	561
CIKUnZ1EPs7PAW2ZIi	499
CsgULS3wsvCcooOv8	479
CcY0Gu2zJP3r7iWmR	452
CBBHS11aPS4RsdHkRe	451

Let's take a closer look at that last UID:

logscale

uid=CBBHS11aPS4RsdHkRe
| groupBy([#path,service])

#path	service	_count
conn	ntp	1
ntp	<no value>	451
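
The sessions with the fewest events can be listed by reversing the sort order. This is a minimal sketch of that query:

```logscale
// Count events per session, then show the ten sessions with the fewest events
groupBy(uid)
| sort(_count, order=asc, limit=10)
```

Many sessions are likely to share the lowest count, so the ten rows returned may be an arbitrary selection among ties. Note also that groupBy() limits the number of groups it returns, so on a data set with a large number of sessions the result may not be exhaustive.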

High Data Transfers

High transfer rates, or large amounts of data transferred by protocols or services whose exchanges are normally small and discrete, can be worth investigating.

DHCP requests, for example, should not normally return large payloads, as the information returned is small. High data returns from a DHCP exchange might indicate a fake or spoofed DHCP server masquerading on your network.

The size of the response payload is recorded in the resp_bytes field. You can summarise the response sizes for DHCP events with the following query:

logscale

service = dhcp 
| top(resp_bytes)

The query returns the following data set:

resp_bytes	_count
0	947
1096	11
1644	5
300	5
548	5
2192	4
900	4
2740	3
600	3
7672	2

We can see here that the majority of DHCP requests have an empty response, but there are some that return much larger payloads that require investigating.
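
To focus on the larger responses, the empty ones can be filtered out and the remaining events grouped by session. The following is a sketch under two assumptions: that the DHCP events carrying resp_bytes also carry a uid, and that largest_resp_bytes is simply a hypothetical name chosen for the output column.

```logscale
// Drop empty responses, then rank sessions by their largest DHCP response
service = dhcp
| resp_bytes > 0
| groupBy(uid, function=[count(), max(resp_bytes, as=largest_resp_bytes)])
| sort(largest_resp_bytes, order=desc, limit=10)
```

Each uid returned can then be examined on its own, in the same way as the session lookup shown earlier.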
