Troubleshooting FDR Ingest
Duplicate Messages in the SQS Queue
If the same message appears in the SQS queue more than once, make sure your consumer script reads, processes, and explicitly deletes the SQS message within the visibility timeout period (typically two hours). If the consumer does not download, process, and delete the SQS message within the timeout period, the message returns to the queue to be consumed again.
If your consumer script is based on the sample that CrowdStrike provides, data_replicator_sample_consumer.py, be sure the msg.delete() call is not commented out. Also, in the data_replicator_config.py configuration file for the sample script, be sure that the VISIBILITY_TIMEOUT value allows enough time for your consumer to process any downloaded files and delete the SQS message.
Duplicate messages might start to appear as the result of an increase in the volume of events. The extra events produce more files per SQS message, which in turn increases the time needed to process the data in an SQS message.
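The receive, process, and delete cycle described above can be sketched as follows. This is a minimal illustration, not the CrowdStrike sample script: consume_batch, handle_message, and clock are hypothetical names, and the queue argument is assumed to behave like a boto3 SQS Queue resource (a receive_messages method that accepts VisibilityTimeout, returning messages with a body attribute and a delete method).

```python
import time


def consume_batch(queue, handle_message, visibility_timeout, clock=time.monotonic):
    """Receive, process, and delete SQS messages.

    Returns the bodies of messages whose processing finished after the
    visibility timeout had elapsed; those may be delivered again.
    """
    overran = []
    # Assumption: queue follows the boto3 Queue resource interface.
    messages = queue.receive_messages(VisibilityTimeout=visibility_timeout)
    received_at = clock()  # the visibility timeout starts counting at receive time
    for msg in messages:
        handle_message(msg.body)  # download and process the files the message lists
        elapsed = clock() - received_at
        if elapsed >= visibility_timeout:
            # Too slow: SQS may already have made this message visible again.
            overran.append(msg.body)
        # Explicit delete; without it the message always returns to the queue.
        msg.delete()
    return overran
```

If the returned list is regularly non-empty, that suggests the visibility timeout is too short for the current event volume.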
FDR Ingest Lag
If you notice a large lag (i.e., more than two hours) between FDR event creation and ingest time in LogScale, you may need to check and adjust the fileDownloadParallelism setting. You can do this using the GraphQL API, which can be accessed from the LogScale UI with the API Explorer.
The query here uses the repository field to check the value of fileDownloadParallelism:
query {
  repository(name: "reponame") {
    fdrFeedControl(id: "fdrFeedId") {
      id, maxNodes, fileDownloadParallelism
    }
  }
}
A null result means that fileDownloadParallelism is set to 1, which is the default. To set it to some other value, run the following GraphQL mutation:
mutation {
  updateFdrFeedControl(
    input: {
      repositoryName: "reponame",
      id: "fdrFeedId",
      fileDownloadParallelism: { value: 8 }
    }
  ) {
    id, fileDownloadParallelism, maxNodes
  }
}
The mutation above sets fileDownloadParallelism to 8. This means LogScale will use at most eight threads to download files. It will most likely use all eight threads during working hours, when messages tend to contain many files. However, setting the value too high may affect other ingests, queries, and so on. Should this happen, try setting it to a lower value and then monitor the number of messages in the queue.
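If you prefer to script the change rather than use the API Explorer, the mutation above could be sent from Python along these lines. This is a sketch under assumptions: build_mutation and build_request are hypothetical helpers, and the /graphql path and Bearer token header are assumed to match your LogScale deployment; check both before relying on it.

```python
import json
import urllib.request


def build_mutation(repository, feed_id, parallelism):
    """Return the updateFdrFeedControl mutation as a GraphQL string."""
    # json.dumps is used here only to quote and escape the string arguments.
    return (
        "mutation { updateFdrFeedControl(input: { "
        f"repositoryName: {json.dumps(repository)}, "
        f"id: {json.dumps(feed_id)}, "
        f"fileDownloadParallelism: {{ value: {int(parallelism)} }} "
        "}) { id, fileDownloadParallelism, maxNodes } }"
    )


def build_request(base_url, token, query):
    """Build (but do not send) a POST request carrying the GraphQL query."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/graphql",  # assumed endpoint path
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",  # assumed auth scheme
        },
        method="POST",
    )
```

To send the request, pass the result of build_request to urllib.request.urlopen and decode the JSON response; the host name and token are placeholders you supply.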
If maxNodes is 2 and fileDownloadParallelism is 2, then the total number of files downloaded in parallel across the entire cluster is four (i.e., 2 * 2). The maxNodes setting affects the number of SQS messages that are processed in parallel; a message can contain multiple files. Which values are most suitable depends on the structure of the SQS messages: the number of files per SQS message can determine what fileDownloadParallelism should be.
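The arithmetic above can be written as a small helper for comparing candidate settings. The function name max_parallel_downloads is hypothetical; it treats a null (None) fileDownloadParallelism as the default of 1, as described earlier.

```python
def max_parallel_downloads(max_nodes, file_download_parallelism=None):
    """Upper bound on files downloaded in parallel across the cluster."""
    # A null fileDownloadParallelism means the default of 1 thread per node.
    per_node = 1 if file_download_parallelism is None else file_download_parallelism
    return max_nodes * per_node


print(max_parallel_downloads(2, 2))  # the example from the text: 2 * 2
```

This is only an upper bound: a node working on a message with fewer files than fileDownloadParallelism cannot use all of its download threads.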