Ingest Data from AWS S3

Security Requirements and Controls

CrowdStrike Falcon LogScale enables organizations to ingest and manage various AWS log types (including VPC flow, CloudTrail, and CloudWatch) from S3 buckets through Amazon Simple Queue Service (SQS), providing scalable data ingestion capabilities with comprehensive monitoring features. The documentation covers the complete workflow from prerequisites and configuration requirements to monitoring ingest feed status, managing configurations, and troubleshooting ingestion errors, ensuring efficient log data management within the LogScale environment.

Falcon LogScale can ingest logs from AWS S3 buckets; once ingested, the data can be managed in Falcon LogScale and leveraged using queries, alerts, and alarms. The following sections walk through the process of configuring this ingestion.

Falcon LogScale will consume from the SQS queue and scale ingest based on the number of messages on the SQS queue.

There will typically be some latency between events occurring and becoming available, both on the side producing the events (for example, AWS CloudTrail) and on the LogScale consumer side. Reconfiguring an ingest feed resets the scaling and may therefore temporarily increase latency until LogScale scales back up.

Amazon Web Services log data is an extremely valuable data source that comes in a variety of flavors depending on the services you are looking to learn more about. Some of the most common data sources include AWS VPC Flow Logs, CloudTrail, and CloudWatch. These logs can be directed to S3 buckets, from which they can be ingested by LogScale.

Ingesting data using this method operates through the use of the Amazon Simple Queue Service (SQS) to provide the information about which S3 buckets to read for data, as shown in the following diagram:

graph LR
  %%{init: {"flowchart": {"defaultRenderer": "elk"}} }%%
  s3[("Amazon S3")]
  Q[SQS Topic]
  QS[SQS Subscriber]
  LS[LogScale]
  s3 <--> LS
  Q --Message--> QS
  QS --> LS

During this process:

  1. LogScale reads a message from the SQS queue containing the information about the S3 bucket where the data is located.

  2. LogScale downloads the raw data from S3, and parses and ingests the files in the bucket.

  3. LogScale removes the message from the queue to prevent re-processing of the material.
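The first step, reading the SQS message to locate the data in S3, can be sketched as follows. This is an illustrative example, not LogScale's internal implementation: it parses the standard Amazon S3 event notification format that S3 delivers to SQS, and the function name is our own.

```python
import json
import urllib.parse


def s3_objects_from_sqs_message(body: str) -> list[tuple[str, str]]:
    """Extract (bucket, key) pairs from an SQS message body that
    carries a standard S3 event notification.

    Illustrative sketch only; LogScale's own consumer may handle
    additional message shapes and error cases.
    """
    event = json.loads(body)
    objects = []
    for record in event.get("Records", []):
        s3 = record.get("s3", {})
        bucket = s3.get("bucket", {}).get("name")
        # Object keys in S3 event notifications are URL-encoded.
        key = urllib.parse.unquote_plus(s3.get("object", {}).get("key", ""))
        if bucket and key:
            objects.append((bucket, key))
    return objects
```

Each (bucket, key) pair identifies one object to download and parse; once every object referenced by the message has been ingested, the message can safely be deleted from the queue.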

For more details on these logs, see the documentation for each AWS service.

Prerequisites for Ingesting AWS Data

To follow these steps, you will need:

  • Access to AWS and basic knowledge of AWS architecture.

  • Your AWS source configured to log to an S3 bucket; refer to the AWS documentation for the relevant service. This can be a separate bucket or a directory within a bucket. These log files are then pulled into Falcon LogScale for analysis and visualization. The data format can be line-delimited or AWS JSON events; example AWS events are available in the AWS documentation.

  • Access to a Falcon LogScale environment, with a repository where you want to ingest the data.

  • The Change ingest feed permission.

  • To have performed the S3 Ingest Self-hosted Preparation before using ingest feeds in self-hosted scenarios.
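The two data formats mentioned above differ in how a file splits into individual events. The sketch below shows one way to distinguish them; it is an assumption-laden illustration (the CloudTrail-style top-level "Records" array is a real AWS convention, but the function itself is not part of LogScale).

```python
import json


def split_events(raw: str) -> list[str]:
    """Split a log file's contents into individual events.

    Handles two common shapes (assumed for illustration):
    - CloudTrail-style AWS JSON with a top-level "Records" array,
      where each array element is one event;
    - plain line-delimited logs, where each non-empty line is one event.
    """
    text = raw.strip()
    if text.startswith("{"):
        doc = json.loads(text)
        if isinstance(doc, dict) and isinstance(doc.get("Records"), list):
            return [json.dumps(record) for record in doc["Records"]]
    return [line for line in text.splitlines() if line]
```

A line-delimited VPC Flow Log file would yield one event per line, while a CloudTrail delivery file would yield one event per entry in its "Records" array.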

Once these requirements are met, you are ready to follow Set up a New AWS Ingest Feed and configure your ingest feeds.