Configuring Falcon LogScale Collector
The Falcon LogScale Collector is configured through a YAML configuration file, which can be found in:
Linux
/etc/humio-log-collector/config.yaml
Windows
C:\Program Files (x86)\CrowdStrike\Humio Log Collector\config.yaml
On Linux, additional environment variables can be configured in the file /etc/default/humio-log-collector. On Windows, environment variables have to be configured in the system properties.
Editing the Configuration
These steps explain how to configure the config.yaml file to ship data to Humio.
Open the config.yaml file using the editor of your choice, for example on Linux:
sudo vi /etc/humio-log-collector/config.yaml
Edit the file and specify the fields and values described in Possible Sources and Example Configuration Files, or try out data ingestion by specifying:
a name under sources, where you must specify type and include
under sinks, where you must specify type, token and url
Save the changes and restart the service.
sudo systemctl restart humio-log-collector.service
Minimal Configuration Example: File Collection
This configuration is the minimal configuration needed to collect events from local log files. The sources section describes the data that should be collected, and the sinks section describes where those events should be sent. The sinks can be reused and are referenced by name in the source.
dataDirectory: data
sources:
  apache_logs:
    type: file
    include: /var/log/apache/*.log
    sink: my_humio_instance
sinks:
  my_humio_instance:
    type: humio
    token: <ingest-token>
    url: https://cloud.community.humio.com
Note
You must set the url and token values that correspond to your Humio instance and repository.
Possible Sources and Example Configuration Files
The following sections detail the specific configuration for each source type, along with example configuration files; you can find a description of the fields below each example.
fleetManagement:
  token: b2XXXXXX-fd23-XXXX-98e9-1890e6XXXXXX
  ## Change the URL if needed to reflect your LogScale URL.
  url: https://cloud.us.humio.com
dataDirectory: data
sources:
  apache_logs:
    type: file
    # Glob patterns
    include: /var/log/apache/*.log
    exclude: /var/log/apache/not_me.log
    sink: my_humio_instance
    parser: accesslog
    multiLineBeginsWith: ^20\d{2}-
    transforms:
      # static_fields transform adds configured key, value pairs as fields
      - type: static_fields
        fields:
          mykey: myvalue
          # Passing environment variables is supported
          myenvvar: $MY_ENV_VAR
sinks:
  my_humio_instance:
    type: humio
    token: <ingest-token-repo2> or an environment variable
    url: https://cloud.us.humio.com
    compression: gzip
    compressionLevel: 9
    tls:
      insecure: false
      caCert: |
        -----BEGIN CERTIFICATE-----
        ...
        -----END CERTIFICATE-----
      caFile: /etc/ssl/cert.pem
    proxy: none
    queue:
      fullAction: deleteOldest
      memory:
        flushTimeOutInMillisecond: 200
        maxLimitInMB: 1024
File Source
When type is set to file, the following configurations apply; the include and exclude fields must be specified.
include
Specify which logs to include by specifying the path of the file or using a glob pattern.
exclude
Specify which logs to exclude, also using a glob pattern; this is only applied to type file.
inactivityTimeout
Specify the period of inactivity (the file has not been written to for a configurable period, default: 60 seconds) after which the file descriptor of a monitored file is closed to release system resources and the file is watched for changes instead. Whenever the file changes, it is re-opened.
parser
Specify the parser to use to parse the logs. If you install the parser through a package, you must specify the type and name as displayed on the parsers page, for example linux/system-logs:linux-filebeat.
multiLineBeginsWith or multiLineContinuesWith
The file input can join consecutive lines together to create multiline events by using a regular expression. It can be configured to use a pattern to look for the beginning or the continuation of multiline events.
Example: all multiline events begin with a date, e.g. 2022-
multiLineBeginsWith: ^20\d{2}-
In this case every line that doesn't match the pattern gets appended to the latest line that did.
Example: lines that start with whitespace are continuations of the previous line
multiLineContinuesWith: ^\s+
In this case every line that matches the pattern gets appended to the latest line that didn't.
transforms
Specify transforms to use for this source (optional). If static_fields is specified, you must specify a key and a value, which can be an environment variable, for example myenvvar: $MY_ENV_VAR
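For illustration, the multiline and timeout options above could be combined in a file source like this sketch (the source name, paths, and sink name are placeholders, not taken from the product documentation):
sources:
  my_app_logs:
    type: file
    include: /var/log/myapp/*.log
    # Close the file descriptor after 30 seconds without writes and watch for changes instead
    inactivityTimeout: 30
    # Lines starting with whitespace are appended to the previous line
    multiLineContinuesWith: ^\s+
    sink: my_humio_instance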
fleetManagement:
  token: b2XXXXXX-fd23-XXXX-98e9-1890e6XXXXXX
  ## Change the URL if needed to reflect your LogScale URL.
  url: https://cloud.us.humio.com
dataDirectory: data
sources:
  syslog:
    type: syslog
    # Mode must be 'udp' or 'tcp'
    mode: udp
    # Port number to listen on
    # Default: 514
    port: 514
    # Optional bind address.
    # If unspecified the source will listen on all interfaces
    # Don't specify port here. Use 'port' field for that
    bind: 0.0.0.0
    sink: my_other_humio_instance
sinks:
  my_other_humio_instance:
    type: humio
    token: <ingest-token_repo1>
    url: https://cloud.us.humio.com
    tls:
      caCert: |
        -----BEGIN CERTIFICATE-----
        ...
        -----END CERTIFICATE-----
      caFile: /etc/ssl/cert.pem
    proxy: none
    queue:
      fullAction: deleteOldest
      memory:
        flushTimeOutInMillisecond: 200
        maxLimitInMB: 1024
Syslog Source
If type is set to syslog, you must specify the port, address and mode fields.
port
Specify the number of the port on which to listen. The default is 514.
address
Specify the address to bind to. This defaults to all addresses.
mode
Specify the protocol to listen to, which can be tcp or udp.
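As an illustration of these fields, a sketch of a syslog source listening on TCP instead of UDP might look like this (the port, bind address, and sink name are assumptions for the example):
sources:
  syslog_tcp:
    type: syslog
    mode: tcp
    # Listen on a non-default port
    port: 1514
    # Optional: bind to a single interface only
    bind: 10.0.0.5
    sink: my_other_humio_instance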
fleetManagement:
  token: b2XXXXXX-fd23-XXXX-98e9-1890e6XXXXXX
  ## Change the URL if needed to reflect your LogScale URL.
  url: https://cloud.us.humio.com
  ## Keep this option as "none" unless you actually need a proxy.
  proxy: none
  ## The TLS option can be uncommented if you're using a self-signed certificate.
  #tls:
  #  insecure: true
dataDirectory: C:\ProgramData\CrowdStrike\Humio Log Collector\
sources:
  windows_events:
    type: wineventlog
    ## Add other channels by simply adding additional "name" lines.
    ## The following command can be used to find other channels:
    ## Get-WinEvent -ListLog * -EA silentlycontinue | sort-object -Property Recordcount -desc
    channels:
      - name: Application
        excludeEventIDs: [ 11 ]
      - name: Security
      - name: System
      - name: Windows PowerShell
    ## You can manually specify a parser to be used here.
    ## This overrides the parser specified in the LogScale UI.
    #parser: myparser
    includeXML: false
    sink: humio
sinks:
  humio:
    type: humio
    token: 2eXXXXXX-81d1-XXXX-bc22-05e430XXXXXX
    ## Change the URL if needed to reflect your LogScale URL.
    url: https://cloud.us.humio.com
    ## Keep this option as "none" unless you actually need a proxy; this must be set to none if fleet management is enabled.
    proxy: none
    ## The TLS option can be uncommented if you're using a self-signed certificate.
    #tls:
    #  insecure: true
    ## This increases the maximum single event size to 8 MB. You can change as needed.
    maxEventSize: 8388608
    ## Uncomment if you would like to force a specific level of gzip compression. 9 is the highest.
    #maxBatchSize: 16777216
    #compression: gzip
    #compressionLevel: 9
Windows Event Log Source
If type is set to wineventlog, you must specify the channels.
channels
Specify the Windows event log channels to read; if no channel is specified, the log collector will subscribe to all available channels. You can also specify IDs using onlyEventIDs or exclude specific event IDs using excludeEventIDs.
Important
Subscribing to all channels may impact performance as the amount of data logged would be very high.
channels:
  - <Channel Name>
  - ...
includeXML
Set to false to exclude XML data from the source.
providers
Specify an array of provider names to filter events by provider.
parser
Specify the parser to use to parse the logs. If you install the parser through a package, you must specify the type and name as displayed on the parsers page, for example linux/system-logs:linux-filebeat; see Parsers for more information.
Important
Override the proxy configuration for the sink. It must be set to none for Windows Server and if fleet management is enabled.
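As a sketch of the filtering options described above (channel names, event IDs, and provider names are illustrative, and placing providers per channel is an assumption):
sources:
  windows_security:
    type: wineventlog
    channels:
      - name: Security
        # Only collect these event IDs (illustrative values)
        onlyEventIDs: [ 4624, 4625 ]
      - name: Application
        excludeEventIDs: [ 11 ]
        # Filter this channel to specific providers (illustrative name)
        providers:
          - MsiInstaller
    includeXML: false
    sink: humio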
fleetManagement:
  token: b2XXXXXX-fd23-XXXX-98e9-1890e6XXXXXX
  ## Change the URL if needed to reflect your LogScale URL.
  url: https://cloud.us.humio.com
dataDirectory: data
sources:
  journal:
    # Example for reading journald log data (Linux only)
    type: journald
    sink: my_humio
    # Optional. If not specified, collect from the local journal
    directory: /var/log/journal
    # If specified, only collect from these units
    includeUnits:
      - systemd-modules-load.service
    # If specified, collect from all units except these
    excludeUnits:
      - systemd-modules-load.service
    # Default: false. If true, only collect logs from the current boot
    currentBootOnly: false
sinks:
  my_humio:
    type: humio
    token: <ingest-token-repo2> or an environment variable
    url: https://cloud.us.humio.com
    compression: gzip
    compressionLevel: 9
    tls:
      insecure: false
      caCert: |
        -----BEGIN CERTIFICATE-----
        ...
        -----END CERTIFICATE-----
      caFile: /etc/ssl/cert.pem
    proxy: none
    queue:
      fullAction: deleteOldest
      memory:
        flushTimeOutInMillisecond: 200
        maxLimitInMB: 1024
Journal Source
When type is set to journald in order to read journald log data (Linux only), the following fields can be specified:
directory
Allows you to specify the journal directory to collect from; if not specified, collects from the local journal.
includeUnits
If specified, only collect from these units.
excludeUnits
If specified, collect from all units except these.
currentBootOnly
Set to false by default. If true, only collect logs from the current boot.
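For instance, a minimal journald source that only reads current-boot logs from selected units could look like this sketch (the unit and sink names are illustrative):
sources:
  journal_current_boot:
    type: journald
    currentBootOnly: true
    includeUnits:
      - sshd.service
    sink: my_humio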
fleetManagement:
  token: b2XXXXXX-fd23-XXXX-98e9-1890e6XXXXXX
  ## Change the URL if needed to reflect your LogScale URL.
  url: https://cloud.us.humio.com
dataDirectory: data
sources:
  cmd_ls:
    type: cmd
    cmd: ls
    # scheduled or streaming
    mode: scheduled
    args:
      - -l
      - -h
    workingDir: /foo
    # Interval between each invocation of the cmd
    interval: 60
    # Output mode when using mode 'scheduled'. Either 'streaming' (default) or 'consolidateOutput'.
    # When outputMode is set to 'consolidateOutput', the entire output of the scheduled command is sent as a single event.
    # outputMode: consolidateOutput
    # Environment variables can be configured and passed to the command
    environment:
      # define CONFIGURED_ENV1 as environment variable
      CONFIGURED_ENV1: my_configured_env_1
      # Pass environment variable: MY_ENV_VAR to command
      MY_ENV_VAR: $MY_ENV_VAR
    sink: my_humio
  cmd_tail:
    type: cmd
    cmd: tail
    mode: streaming
    args:
      - -F
    workingDir: /foo
    sink: my_humio
sinks:
  my_humio:
    type: humio
    token: <ingest-token-repo2> or an environment variable
    url: https://cloud.us.humio.com
    compression: gzip
    compressionLevel: 9
    tls:
      insecure: false
      caCert: |
        -----BEGIN CERTIFICATE-----
        ...
        -----END CERTIFICATE-----
      caFile: /etc/ssl/cert.pem
    proxy: none
    queue:
      fullAction: deleteOldest
      memory:
        flushTimeOutInMillisecond: 200
        maxLimitInMB: 1024
Exec Source
If type is set to cmd, you must specify the following fields:
cmd
Specify the command to run.
mode
Can be set to scheduled to collect data at intervals, in which case you must specify the interval, or streaming to collect data constantly. To create a single multiline event when running in scheduled mode, set outputMode to consolidateOutput.
args
The arguments of the command.
workingDir
Specifies the directory in which to run the command.
interval
Specifies how frequently the command should be invoked when mode is set to scheduled.
environment
Specify environment variables and pass them to the command using this section.
sink
The name of the configured sink to send events to, for example humio.
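As a sketch of scheduled mode with consolidated output (the command, interval, and sink name are illustrative assumptions), the entire output of each run is sent as a single event:
sources:
  cmd_df:
    type: cmd
    cmd: df
    args:
      - -h
    mode: scheduled
    # Run every 5 minutes
    interval: 300
    # Send the whole output of each run as one event
    outputMode: consolidateOutput
    sink: my_humio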
fleetManagement:
  token: b2XXXXXX-fd23-XXXX-98e9-1890e6XXXXXX
  ## Change the URL if needed to reflect your LogScale URL.
  url: https://cloud.us.humio.com
dataDirectory: /var/lib/humio-log-collector
sources:
  # Collect local files.
  var_log:
    type: file
    include: /var/log/*
    exclude: /var/log/*.gz
    sink: humio
  # Collect syslog udp 5140.
  syslog_udp_5140:
    type: syslog
    mode: udp
    port: 5140
    sink: humio
  # Collect syslog tcp 5140.
  syslog_tcp_5140:
    type: syslog
    mode: tcp
    port: 5140
    sink: humio
sinks:
  humio:
    type: humio
    # Replace with your specified ingest token.
    token: $INGEST_TOKEN
    # Replace with your "standard endpoint" API URL: https://library.humio.com/endpoints/
    url: $HUMIO_URL
File Linux Source
This configuration example uses the file source with specific values for collecting /var/log files. See Common Configuration Elements for information on the common elements in the configuration file.
Common Configuration Elements
Configuration elements that apply to all log sources.
Fleet Management (fleetManagement)
The fleetManagement block configures instances of the log collector to work with Log Collector Fleet Management.
fleetManagement:
  token: 4b09c4f7-2364-605t-a55f-d5d2fg881d66
  url: https://cloud.us.humio.com
token
This key specifies the token that allows instances of the log collector to be visualized on the Log Collector Fleet Management page.
url
URL of the Humio installation where the fleet management page is hosted.
Note
Proxy must be set to none except for Linux use cases.
Sources (sources)
The sources block configures the sources of data that the log collector will send to Humio.
type
This key specifies the type of log; possible values are file, syslog, journald, cmd, and wineventlog. See Possible Sources and Example Configuration Files for more information.
Sinks (sinks)
The sinks block configures the sinks that are used by the source or sources.
sinks:
  my_other_humio_instance:
    type: humio
    token: <ingest-token_repo1>
    url: https://cloud.us.humio.com
  my_humio_instance:
    type: humio
    token: <ingest-token-repo2> or an environment variable
    url: https://cloud.us.humio.com
    # maxEventSize (default 1MB) sets the limit for a single event in bytes; if exceeded the event will be truncated.
    maxEventSize: 1048576
    # maxBatchSize (default: 16 MB) sets the maximum size in bytes of a batch which is sent to the configured sink.
    # This includes fields as well as event data. If exceeded, data will be sent in a subsequent batch.
    maxBatchSize: 16777216
    # auto, none, gzip, deflate. Default: auto
    compression: gzip
    # Number between 1 ... 9.
    # 1 = highest speed
    # 9 = highest compression
    # If unspecified or 0, the default value for the compression algorithm specified in compression is used
    compressionLevel: 9
    # Override default TLS configuration
    # Only one of the following options should be used at a time.
    # If multiple are given, the precedence is: 'insecure', 'caCert', 'caFile'.
    tls:
      # Specify insecure to skip certificate validation
      insecure: false
      # Specify caCert to load a PEM certificate from the config file
      caCert: |
        -----BEGIN CERTIFICATE-----
        ...
        -----END CERTIFICATE-----
      # Specify caFile to load a PEM certificate from an external file.
      caFile: /etc/ssl/cert.pem
    # Override proxy configuration for the sink. Must be set to 'none' for Windows Server and fleet management.
    # Accepted values: 'system', 'none', or a URL such as http://127.0.0.1:3129 for an HTTP proxy.
    # Defaults to system, which will try to determine the appropriate proxy or fall back to none.
    proxy: none
type
Specify the type of sink. This must be set to humio.
token
Specify the ingest token for your Humio repository or an environment variable.
url
Specify the URL of your Humio account, for example https://cloud.humio.com.
maxBatchSize
Specifies the maximum size of a batch (default 16 MB) and works along with the maximum events per request. The limits are propagated to the queue and replace maxEventsPerRequest. The limits are also propagated to all the sources that reference the sink.
maxEventsPerRequest
Specify the maximum number of events per request by size (default 1 MB); works with maxBatchSize.
compression
Specify the type of data compression; possible values: auto, none, gzip, deflate. The default value is auto.
compressionLevel
Specify the level of compression, where 1 is best speed and 9 is best compression; if unspecified or 0, the default value for the compression algorithm specified in compression is applied.
tls
This object contains details on the PEM certificates and allows you to override the defaults. Only one of the following options should be specified:
insecure
Specify whether certificate validation is needed; if set to true, certificate validation is skipped.
caCert
Specify this key to load a PEM certificate from the config file.
caFile
Specify this key to load the PEM certificate from an external file.
proxy
Set to none for Windows Server, or specify, if required, an override proxy configuration for the sink. Possible values: 'system', 'none', or a URL such as http://127.0.0.1:3129 for an HTTP proxy. The default is system, which will try to determine the appropriate proxy or fall back to none.
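For example, a minimal sketch of a sink that reads both the ingest token and the URL from environment variables (the variable names are illustrative):
sinks:
  my_humio_instance:
    type: humio
    token: $INGEST_TOKEN
    url: $HUMIO_URL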
Queue (queue)
The queue block is part of the Sinks (sinks) configuration and configures the behaviour of the queue.
Note
The memory queue no longer supports configuration of maxEventsPerRequest; it inherits the maximum bytes per request from the sink's maxBatchSize.
queue:
  # fullAction determines queue behavior when it is full.
  # pause = queue pauses ingesting new batches if it is full (default if not mentioned). deleteLatest is no longer supported and is automatically set to pause.
  # deleteOldest = queue deletes the oldest batch to accept new batches if it is full
  # Default: pause
  fullAction: deleteOldest
  memory:
    # Default: 1000
    flushTimeOutInMillisecond: 200
    # Default: 2048
    maxLimitInMB: 1024
type
This key defines how the queue is managed and can be set to:
memory
The default. maxLimitInMB can be set but defaults to 1024 MB.
disk
When set to disk, the queued data is written to dataDirectory/queue/sinkName unless a different location is specified using storageDir. maxLimitInMB must be set to the maximum size of the queue when set to disk; it defaults to 1024.
fullAction
Specify the action to take when the queue is full. The possible values are:
deleteOldest
Accepts new batches but deletes the oldest batch.
pause
This is the default value; the queue does not ingest new batches when it is full. Note that deleteLatest is no longer supported and is automatically set to pause.
flushTimeOutInMillisecond
Specify how often (in milliseconds) data is sent to the Humio log shipper. The default is 1000.
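As a sketch of a disk-backed queue using the fields described above (the storageDir path is illustrative, and the exact layout is assumed from the field descriptions):
queue:
  type: disk
  # Data is written to dataDirectory/queue/sinkName unless storageDir is specified
  storageDir: /var/lib/humio-log-collector/queue
  # Maximum size of the on-disk queue in MB
  maxLimitInMB: 1024
  fullAction: pause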
Disabling Updates
By default, the log collector is automatically updated. However, if you have connection issues or the server on which you are installing the log collector is not connected to the internet, you may need to disable automatic updates.
LOG_COLLECTOR_UPDATE_SERVER=disabled
The LOG_COLLECTOR_UPDATE_SERVER setting can be:
Set to disabled. In this case, updates are disabled. This is useful in air-gapped environments.
Not set. In this case, LogScale uses our update server via a URL defined in the code.
Set to a specific URL. In this case, we will connect to the specified URL for updates.
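For example, on Linux this could be set in the environment file mentioned at the top of this page (a sketch, assuming updates should be disabled):
# /etc/default/humio-log-collector
LOG_COLLECTOR_UPDATE_SERVER=disabled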
Checkpoints
By default, the configuration file points to the directory /var/lib/humio-log-collector as the storage for checkpoints. To reset the state of the installation:
Stop the Log Collector service humio-log-collector.service.
Delete the checkpoints.json file.
Restart the Humio Log Collector service.
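On Linux, the reset could be performed like this (a sketch, assuming the default checkpoint directory /var/lib/humio-log-collector mentioned above):
sudo systemctl stop humio-log-collector.service
# Remove the stored checkpoints to reset the state of the installation
sudo rm /var/lib/humio-log-collector/checkpoints.json
sudo systemctl start humio-log-collector.service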
Troubleshooting
You can troubleshoot the Falcon LogScale Collector using Console Stderr or the Debug Log.
Using Console Stderr
The Log Collector sends information to stderr if run from the CLI; the information is sent in JSON format and the detail level is controlled by the log-level. The log-level can be specified using two different approaches (highest priority first):
Using a command line argument: -log-level debug
Configuring a log-level in the config file (yaml):
logLevel: debug
The following log-levels are supported:
trace (highest verbosity)
debug
info
warn
error (default)
fatal
The -log-pretty command line argument enables pretty-printing of console output for all logs; it has no effect on logs sent to Humio, which use JSON format.
Debug Log
The Falcon LogScale Collector debug log can be sent to a Humio instance by setting the HUMIO_DEBUG_LOG_ADDRESS and HUMIO_DEBUG_LOG_TOKEN environment variables:
HUMIO_DEBUG_LOG_ADDRESS=https://<your-humio-instance>
HUMIO_DEBUG_LOG_TOKEN=<your-ingest-token>
The default log-level for information sent to the Humio instance is trace; it is possible to change this with an environment variable:
HUMIO_DEBUG_LOG_LEVEL=<desired-log-level>
To stop sending the debug log, the environment variables need to be undefined.