Falcon LogScale Collector Sizing Guide

The numbers in this guide are based on measurements and experience from running the LogScale Collector in production. However, the actual needed size of your LogScale Collector instances depends on the workloads, and we recommend testing to determine those numbers.


Minimum Resource Recommendations

In cases where the LogScale Collector is used on a laptop or desktop to gather system logs, the requirements are quite modest, and the service running in the background should not be noticeable.

An example of such a setup could consist of:

  • The System and Application channels from the Windows Event Log source

  • Log files from your VPN

  • A cmd source measuring the system's resource usage

In a scenario like this we recommend these resources as a minimum:

Table: Minimum Resources

Resource Recommendation
Memory 4 GB
Disk 4 GB


These numbers are conservative to account for peak buffer and queue usage. During normal operation with a working network connection, the actual memory consumption in a scenario like the above would be below 100 MB.


Generally speaking, the concurrency model behind the LogScale Collector automatically takes advantage of the system's CPU resources.

Source Throughput

Each source has different performance characteristics. The throughput numbers below are based on measurements, but they will vary depending on your actual workload.

Source Throughput Notes
File 154 MB/s/vCPU Throughput of the file source is bound by disk and/or network I/O. This measurement was done with AWS io1 disks (64,000 IOPS).
Journald 32 MB/s  
Syslog (TCP) 100 MB/s/vCPU The vCPUs are only utilized when multiple TCP connections are sending data to the LogScale Collector.
Syslog (UDP) 26 MB/s The throughput was measured with UDP packets of 1472 bytes.
Windows Event Logs 5 MB/s Measured average of around 3,000 events/s. Currently the WinEventLog source does not scale automatically with the number of vCPUs. To improve throughput, isolate high-load channels to their own source in the configuration.

1 vCPU = 1 ARM physical CPU or 0.5 Intel physical CPU with hyperthreading.

Sink Workers

In some high throughput scenarios the LogScale ingestion endpoint can be a bottleneck, meaning that the measured throughput of a LogScale Collector deployment is lower than expected given the table above.

In those cases it can be beneficial to increase the number of concurrent requests a sink is using to ship logs towards the LogScale ingestion endpoint.

The default number of concurrent requests per sink is 4; it can be increased in the configuration using workers:

    sinks:
      my_sink:
        type: humio
        url: <..>
        token: <..>
        # Increases the number of concurrent connections to LogScale to 8
        workers: 8

Sink Workers Example

How many workers to use in a given situation depends on the response time per request of the LogScale server, which in turn depends on the parser used, whether requests go to a self-hosted or SaaS deployment, the server configuration, and so on.

Description Value
Goal 11 TB/day = 139 MB/s
Measured server response time 600 ms

Using the default and recommended batchSize of 16 MB, the theoretical limit per worker in this example is: 1/0.600s * 16 MB = 26.66 MB/s.

Thus, the number of workers should be: 139/26.66 = 5.2, rounded up to 6 workers.

This calculation is based on the assumption that data can be read fast enough from the source.
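The worker calculation above can be sketched as a small helper. The function name is illustrative (not part of the LogScale Collector); the figures (16 MB batch size, 600 ms response time, 139 MB/s goal) come from the example:

```python
import math

def workers_needed(goal_mb_s, batch_size_mb=16, response_time_s=0.6):
    """Sink workers needed to sustain a throughput goal.

    Each worker can have one batch in flight per server round trip,
    so its theoretical limit is batch_size / response_time.
    """
    per_worker_mb_s = batch_size_mb / response_time_s  # 16 / 0.6 ~ 26.66 MB/s
    return math.ceil(goal_mb_s / per_worker_mb_s)

# Example from the text: 139 MB/s goal, 600 ms response time
print(workers_needed(139))  # 139 / 26.66 = 5.2, rounded up to 6
```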


The memory requirement is linearly proportional to the number of sinks in the configuration plus a constant baseline requirement of 1 GB.

The default queue size per sink is 1 GB and can be increased (or decreased) in the configuration:

    sinks:
      my_sink:
        type: humio
        token: <..>
        url: <..>
        # Increases queue size to 2 GiB
        queue:
          type: memory
          maxLimitInMB: 2048
      another_sink:
        type: humio
        token: <..>
        url: <..>

The configuration above therefore has a total memory requirement of 1 GB (baseline) + 2 GB (my_sink) + 1 GB (another_sink) = 4 GB.
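The memory arithmetic can be expressed directly. This is a minimal sketch (the helper name is my own); queue sizes are in MB as in maxLimitInMB, with the 1 GB baseline added on top:

```python
def total_memory_mb(queue_sizes_mb, baseline_mb=1024):
    """Total memory requirement: the 1 GB baseline plus one queue per sink."""
    return baseline_mb + sum(queue_sizes_mb)

# my_sink has a 2 GiB queue, another_sink keeps the 1 GiB default
queues = {"my_sink": 2048, "another_sink": 1024}
print(total_memory_mb(queues.values()) // 1024)  # 4 (GB)
```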

Memory Usage Log Messages

The LogScale Collector will output the log messages below in case of high memory usage.

For sinks with the default queue configuration (fullAction: pause):

Table: fullAction: pause

Queue Utilization Log Level Log Message
100% Warning Memory queue is full. Sources that are sending to this sink are paused until space is available again.
80% Warning Memory queue is 80% full. If the queue becomes full, sources will be paused until there is space.
50% Warning Memory queue is 50% full. If the queue becomes full, sources will be paused until there is space.

For sinks with the queue configuration fullAction: deleteOldest:

Table: fullAction: deleteOldest

Queue Utilization Log Level Log Message
100% Info mem-queue is full, dropping oldest batch as configured
80% Warning Memory queue is 80% full. If the queue becomes full, the oldest data will be deleted to make space for new data.
50% Warning Memory queue is 50% full. If the queue becomes full, the oldest data will be deleted to make space for new data.


A running LogScale Collector that is able to deliver logs continuously to LogScale would not normally use the resources listed above. However, some situations can cause log data to pile up, for instance if a machine is without an internet connection for a while but still generates logs.

In such a scenario the LogScale Collector will back-fill the log data once the internet connection is re-established. The internal memory buffers will fill up for efficient log shipping, and queue utilization can reach 100% (by default a limit of 1 GB per sink).

In addition, if the LogScale Collector is unable to deliver logs to the server fast enough, or at all, a large amount of memory can potentially be used.

For instance, if the LogScale Collector is tasked with back-filling 1000 large files, data will potentially be read into the system faster than it can be delivered to the LogScale server. In such an example the memory usage would rise to: 1 GB (baseline) + 1 GB (sink) + 1000 * 16 MB (internal buffer per file, one batch size) = 18 GB.
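The worst-case estimate above can be written as a short formula. This is a sketch of the arithmetic only (the function is illustrative, not part of the product):

```python
def backfill_memory_gb(open_files, sinks=1, batch_mb=16,
                       baseline_gb=1, queue_gb=1):
    """Worst-case memory while back-filling: baseline + sink queues
    + one in-flight batch (16 MB) per open file."""
    return baseline_gb + sinks * queue_gb + open_files * batch_mb / 1000

print(backfill_memory_gb(1000))  # 1 + 1 + 16 = 18.0 GB
```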


Disk size is only relevant if the disk queue is used. Whether the disk queue makes sense depends on the deployment setup.

For instance, the disk queue is unnecessary if the LogScale Collector is able to read back the data from a source after an interruption. This is the case for the Windows Event Logs, journald, and file sources, all of which use a bookmarking system to keep track of how far data has been read and processed. Essentially, the disk queue only makes sense for sources where such a bookkeeping system is impossible, which at the moment is only the syslog source.

When using the disk queue, it is usually sufficient to keep 10 minutes' worth of data. For example, if the data flowing through a LogScale Collector deployment averages 40 MB/s, you should provision at least 24 GB of disk space (40 MB/s * 60 seconds * 10 minutes = 24,000 MB).
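The ten-minute provisioning rule is simple enough to express directly (a sketch; the helper name is my own):

```python
def disk_queue_mb(avg_mb_s, minutes=10):
    """Provision roughly ten minutes of buffered data."""
    return avg_mb_s * 60 * minutes

print(disk_queue_mb(40))  # 24000 MB, i.e. about 24 GB
```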

Example Deployments

Make sure your LogScale deployment is provisioned accordingly and meets the requirements for the ingestion amount. See Recommended Installation Architectures.

Large Syslog (TCP) Deployment - 10 TB/day
  • 10 TB/day = 121.4 MB/s

  • (121.4 MB/s) / (100 MB/s/vCPU) = 1.21 vCPUs, rounded up to 2 vCPUs

  • Recommended: m6i.xlarge with 4 vCPUs to account for spikes in traffic and possible backpressure from the network
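The sizing steps above (converting TB/day to MB/s using binary units, then dividing by per-vCPU throughput and rounding up) can be sketched as follows; the function names are illustrative:

```python
import math

def tb_per_day_to_mb_s(tb_per_day):
    """Convert TB/day to MB/s using binary units (1 TB = 1,048,576 MB),
    matching the conversion used in the examples."""
    return tb_per_day * 1024 * 1024 / 86400

def vcpus_needed(goal_mb_s, per_vcpu_mb_s):
    """Round the vCPU requirement up to a whole core."""
    return math.ceil(goal_mb_s / per_vcpu_mb_s)

rate = tb_per_day_to_mb_s(10)    # ~121.4 MB/s
print(vcpus_needed(rate, 100))   # 1.21 vCPUs, rounded up to 2
```

The same two steps yield the instance counts in the Windows Event Logs and file source examples below.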

Table: Large Syslog Source

Software Instances EC2 Instance Type / vCPU Memory Storage
LogScale Collector 1 m6i.xlarge / 4 16 GB gp2

Medium Windows Event Logs Deployment - 1 TB/Day
  • By isolating the ForwardedEvents channel to its own source in the configuration, it is possible to get a throughput of roughly 10 MB/s on an instance.

  • 1 TB/Day = 12.14 MB/s

  • (12.14 MB/s) / (10 MB/s/instance) = 1.2 instances, rounded up to 2

Table: Medium Windows Event Source

Software Instances EC2 Instance Type / vCPU Memory Storage
LogScale Collector 2 m6i.large / 2 16 GB gp2

Large File Source Deployment - 100 TB/Day
  • 100 TB/Day = 1214 MB/s

  • (1214 MB/s) / (154 MB/s/vCPU) = 7.9 vCPUs, rounded up to 8

  • Since 1214 MB/s exceeds the maximum throughput of an AWS io1 volume (1,000 MB/s), we go with two instances

Table: Large File Source

Software Instances EC2 Instance Type / vCPU Memory Storage
LogScale Collector 2 m6i.xlarge / 4 16 GB io2