AWS VPC Flow

This package parses incoming AWS VPC Flow Logs in the default format.

VPC Flow Logs is a feature of AWS that enables you to capture information about the IP traffic going to and from network interfaces in your VPC. After you've created a flow log, you can retrieve and view its data in the chosen destination.

The parser normalizes data to a common schema based on CrowdStrike Parsing Standard (CPS) 1.0. With this schema you can search the data knowing only the common field names, without needing to know the source-specific format. It also makes it easier to combine the data with other data sources that conform to the same schema.

Breaking Changes

This update includes parser changes, which means that data ingested after the upgrade will not be backwards compatible with logs ingested with the previous version.

Note

If you are using a previous version of this package and need to continue using the queries and dashboards provided, you must stay on that version.

Updating to version 1.0.0 or newer will therefore cause issues with existing queries, for example in dashboards or alerts created prior to this version.

See CrowdStrike Parsing Standard (CPS) 1.0 for more details on the new parser schema.

Follow the CPS Migration to update your queries to use the fields and tags that are available in data parsed with version 1.0.0.

Installing the Package in LogScale

Find the repository where you want to send the AWS VPC Flow events, or create a new one (see Creating a Repository or View).

  1. Navigate to your repository in the LogScale interface, click Settings and then Packages on the left.

  2. Click Marketplace and install the LogScale package for AWS VPC Flow (i.e. aws/vpcflow).

Configurations and Sending the Logs to LogScale

  1. This package relies on ingesting Amazon VPC Flow Logs in the default format. When you configure log collection in the default format, this package expects each log's @rawstring to be set to a value such as the following:

    2 051856882135 eni-00770bc8b36e4698a 172.24.54.150 172.24.24.102 9094 46990 6 770 470591 1611840313 1611840325 ACCEPT OK
    2 051856882135 eni-00770bc8b36e4698a 172.24.24.102 172.24.54.150 46990 9094 6 622 62217 1611840313 1611840325 ACCEPT OK
    2 051856882135 eni-00770bc8b36e4698a 172.24.54.150 172.24.24.102 9094 48864 6 20 9750 1611840313 1611840325 ACCEPT OK

    If your data collection wraps the message with additional information, or the format of the VPC Flow Logs is not the default, you can adapt the parser to match your data format. Specifically, if you have configured non-default fields for the flow log format, you will need to adapt the parser.

  2. Once this configuration is completed, your logs will be automatically transferred from the AWS cloud repository to your pre-defined AWS S3 bucket.

  3. Then configure LogScale to collect data from AWS S3 buckets using vpcflow_default. To send logs directly from the S3 bucket into your LogScale repository, see the Ingest Data from AWS S3 documentation for cloud or for self-hosted deployments.
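
The sample records in step 1 follow the field order that AWS documents for the default (version 2) flow log format. As a rough illustration of what each position means, the following Python sketch splits one record into named fields. It is not the package's parser; the helper name parse_default_flow_record is hypothetical, and the sketch assumes an unwrapped, space-separated record with exactly the 14 default fields.

```python
# Field order of the AWS default (version 2) VPC Flow Logs format.
DEFAULT_FIELDS = [
    "version", "account-id", "interface-id", "srcaddr", "dstaddr",
    "srcport", "dstport", "protocol", "packets", "bytes",
    "start", "end", "action", "log-status",
]


def parse_default_flow_record(line):
    """Hypothetical helper: map one default-format record to a dict of strings."""
    values = line.split()
    if len(values) != len(DEFAULT_FIELDS):
        # A different count suggests a custom format or a wrapped message.
        raise ValueError(
            "expected %d fields, got %d" % (len(DEFAULT_FIELDS), len(values))
        )
    return dict(zip(DEFAULT_FIELDS, values))


# First sample record from step 1.
record = parse_default_flow_record(
    "2 051856882135 eni-00770bc8b36e4698a 172.24.54.150 172.24.24.102 "
    "9094 46990 6 770 470591 1611840313 1611840325 ACCEPT OK"
)
print(record["srcaddr"], record["dstaddr"], record["action"])
```

A record that fails the field-count check is a sign that the collection path wraps the message or that the flow log uses a custom format, which is the case where the parser needs to be adapted.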

Verify Data is Arriving in LogScale

Once you have completed the above steps, the AWS VPC Flow data should be arriving in your LogScale repository.

You can verify this by running a simple search to see the events:

logscale
#Vendor = "aws"
|#event.module="cloud-vpc"

Package Contents Explained

This package parses incoming data and normalizes it as part of that parsing. The parser normalizes the data to the CrowdStrike Parsing Standard (CPS) 1.0 schema, based on OpenTelemetry standards, while still preserving the original data.

If you want to search using the original field names and values, you can access those in the fields whose names are prefixed with the word Vendor. Fields which are not prefixed with Vendor are standard fields which are either based on the schema (e.g. source.ip) or on LogScale conventions (e.g. @rawstring).

The fields that the parser currently maps the data to are chosen based on what seems most relevant, and may be expanded in the future. The parser will not necessarily normalize every field that could be normalized.

Event Categorisation

As part of the schema, events are categorized by fields:

  • event.kind

  • event.category

  • event.type

  • #event.outcome

(#event.outcome is a tag, hence the "#")

event.kind and #event.outcome can be searched using Field Filters, but event.category and event.type are arrays, so they need to be searched using the following syntax:

logscale
array:contains("event.category[]", value="network")

This will find events where some event.category[n] field contains the value network, regardless of what n is.

Note that not all events will be categorized to this level of detail.
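
To make the "regardless of what n is" behaviour concrete, here is a small Python sketch of the matching semantics, assuming array elements are stored as separate fields named event.category[0], event.category[1], and so on. It only models the idea; it is not how LogScale implements array:contains(). The helper name array_contains and the sample event values are hypothetical.

```python
import re


def array_contains(event, array, value):
    """Hypothetical model: True if any field array[n] in the event equals value."""
    # Accept the "name[]" form used in the query syntax.
    base = array[:-2] if array.endswith("[]") else array
    pattern = re.compile(re.escape(base) + r"\[\d+\]")
    return any(
        pattern.fullmatch(name) is not None and field_value == value
        for name, field_value in event.items()
    )


# Illustrative event only; the values are made up for this sketch.
event = {
    "event.kind": "event",
    "event.category[0]": "network",
    "event.type[0]": "connection",
    "event.type[1]": "allowed",
}

print(array_contains(event, "event.category[]", "network"))
print(array_contains(event, "event.type[]", "allowed"))
```

The match succeeds whichever index holds the value, which is why a plain field filter on a single index is not enough for these array fields.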

The same syntax applies to event.type: pass "event.type[]" as the array name to find events where some event.type[n] field contains the given value, regardless of what n is.

Normalized Fields

Here are some of the normalized fields which are being set by this parser:

  • source.* (e.g. source.ip, source.port, source.address)

  • observer.* (e.g. observer.ingress.interface.id)

  • ecs.* (e.g. ecs.version)

  • server.* (e.g. server.ingress.interface.id)

  • error.* (e.g. error.field)

  • network.* (e.g. network.bytes, network.type, network.packets, network.iana)

  • event.* (e.g. event.type, event.start, event.category, event.action, event.module, event.kind, event.end)

  • destination.* (e.g. destination.port, destination.ip, destination.address)

Next Steps and Use Cases

You can get actionable insights from your AWS VPC Flow data by searching for suspicious activity in LogScale using the search UI, dashboards or alerts.