The following sections detail the specific configurations for each source
type, along with example configuration files. A description of the fields
appears below each example.
File Example
yaml
# Define the sink (destination) for the logs
sinks:
  logscale_sink:
    type: logscale  # Using LogScale as the destination
    url: "https://cloud.humio.com/"  # Replace with your LogScale instance URL
    token: "${LOGSCALE_TOKEN}"  # Use environment variable for the ingest token
    # Configure the queue for buffering events
    queue:
      type: memory  # Use a memory-based queue
      maxLimitInMB: 64  # Set the queue size to 64 MB
      # The queue size is reduced to 64 MB because the input is read from
      # persistent files. In case of a shutdown or network issues, the
      # collector can resume reading from where it left off, reducing the
      # need for a large buffer. This helps optimize memory usage while
      # still providing adequate buffering for most scenarios.

# Define the source for Apache access logs
sources:
  apache_access_logs:
    type: file  # File-based source
    include:
      - "/var/log/apache2/access.log"  # Path to Apache access log file
      # You can add multiple log files if needed
      # - "/var/log/apache2/other_access.log"
    # Optional: Exclude specific files or patterns
    # exclude:
    #   - "/var/log/apache2/access.log.1"
    #   - "/var/log/apache2/excluded_access.log"
    # Optional: Exclude files with specific extensions
    # excludeExtensions:
    #   - "gz"
    #   - "zip"
    # Configure multiline parsing if needed
    # multiLineBeginsWith: '^[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}'
    # Reference the sink defined above
    sink: logscale_sink
    # Optional: Specify a parser to be used in LogScale
    # parser: "apache_combined"
    # Add static fields to all events from this source (optional)
    # transforms:
    #   - type: static_fields
    #     fields:
    #       log_type: "apache_access"
    #       environment: "${ENV}"  # Use an environment variable
File Source
The file source allows you to ship logs from files, selected using glob
patterns, and also supports gzip and bzip2 compressed formats. When
type is set to file, the following configuration fields apply:
Specify the file paths to exclude when collecting data. This field supports environment variable expansions. To use an environment variable, reference it using the syntax ${VAR}, where VAR is the name of the variable. The {}-braces may be omitted, however in that case the variable name can only contain: [a-z], [A-Z], [0-9] and "_".
Specify the file extensions to exclude when collecting data. Some file extensions are automatically ignored even if they match an included pattern: xz, tgz, z, zip, 7z. To include all formats set excludeExtensions to an empty array. This will have the effect that files will not be decompressed before ingest.
Specify the period of inactivity in seconds for a file being monitored before the file descriptor is closed to release system resources. Whenever the file changes, it is re-opened and the timeout is restarted.
Specify the file paths to include when collecting data. This field supports environment variable expansions. To use an environment variable, reference it using the syntax ${VAR}, where VAR is the name of the variable. The {}-braces may be omitted, however in that case the variable name can only contain: [a-z], [A-Z], [0-9] and "_".
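For instance, environment variables can drive both include and exclude (a minimal sketch; the paths, variable names, and sink name are illustrative):
yaml
sources:
  app_logs:
    type: file
    include:
      - "${LOG_DIR}/app/*.log"      # expanded from the LOG_DIR environment variable
      - "/var/log/$APP_NAME/*.log"  # braces omitted: name limited to [a-z], [A-Z], [0-9] and "_"
    exclude:
      - "${LOG_DIR}/app/debug.log"
    sink: logscale_sink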
The file input can join consecutive lines together to create multiline events by using a regular expression. It can be configured to use a pattern to look for the beginning or the continuation of multiline events.
For example, to match all multiline events beginning with a date, e.g. 2022, you would use:
yaml
multiLineBeginsWith: ^20\d{2}-
In this case, every line that doesn't match the pattern gets appended to the latest line that did.
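For illustration, with the pattern above, indented continuation lines are folded into the preceding event (the log lines below are invented for this sketch):
yaml
multiLineBeginsWith: ^20\d{2}-
## Raw input:
##   2023-04-01 12:00:00 ERROR request failed
##       at com.example.Handler.run(Handler.java:42)
##   2023-04-01 12:00:05 INFO recovered
## Result: two events; the indented stack trace line is appended to the first event.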
Alternatively, the file input can use a pattern that marks the continuation of a multiline event, treating lines that start with whitespace as continuations of the previous line. For example, to concatenate lines indented by whitespace (instead of starting at column 0):
yaml
multiLineContinuesWith: ^\s+
In this case, every line that matches the pattern gets appended to the latest line that didn't.
Specify the parser within LogScale to use to parse the logs. If you install the parser through a package, you must specify the type and name as displayed on the parsers page, for example linux/system-logs:linux-filebeat. If a parser is assigned to the ingest token being used, this parser setting is ignored.
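For example, assigning the package parser mentioned above to a file source (a sketch; the source name, log path, and sink name are illustrative):
yaml
sources:
  system_logs:
    type: file
    include:
      - /var/log/syslog
    parser: linux/system-logs:linux-filebeat  # package parser: type and name
    sink: logscale_sink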
[a] Optional parameters use their default value unless explicitly set.
See Configuration Elements for information on
the common elements in the configuration, for example sinks, and their
configuration parameters and details on the structure of the
configuration files.
File Rotation Support
The Falcon LogScale Collector strives to support all kinds of file rotation.
The Collector fingerprints files larger than 256 bytes and
increases the fingerprint block size up to 4096 bytes, as
applicable.
The Collector supports rotation using the following methods:
rename
compression
truncation
With rename and compression, rotated files are detected as duplicates of the original.
Compressed files are considered static. Renamed files keep their
fingerprints and further updates are supported. When files are
truncated, the read offset is set to the new size, which may be
0 or non-zero. In the situation where the file is truncated
followed by a quick update, the read offset depends on the time
between the write and the processing of the event.
Reading Compressed Files
The Falcon LogScale Collector supports reading gzip and bzip2 compressed files.
If gzip or bzip2 compressed files are matched by the configured
include patterns, they will be auto-detected as gzip/bzip2 files
(using the magic number at the beginning of the file), decompressed,
and ingested.
By default files with the following extensions will be
ignored/skipped even if they match a configured include pattern:
.xz
.tgz
.z
.zip
.7z
File extensions to ignore/skip can be configured with the
excludeExtensions config
option; the default is the list of extensions above.
Setting excludeExtensions to
an empty array overrides the default setting, so no extensions are
skipped; these files will not be decompressed before ingest. For
example:
yaml
excludeExtensions: []
This effectively sends such files in their compressed format.
If you want to exclude gzip and bzip2 files in
addition to the other excluded file extensions, the following option
can be used (provided the compressed files are named
*.gz,
*.bz2):
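yaml
excludeExtensions:
  ## The default extensions plus gz and bz2
  - gz
  - bz2
  - xz
  - tgz
  - z
  - zip
  - 7z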
Syslog Example
yaml
## This is YAML, so structure and indentation is important.
## Lines can be uncommented by removing the #. You should not need to change the number of spaces after that.
## Configuration options have a single #, comments have a ##. Only uncomment the single # lines if you need them.
#####
# Define the sink (destination) for the logs
sinks:
  logscale_sink:
    type: logscale  # Using LogScale as the destination
    url: "https://cloud.humio.com/"  # Replace with your LogScale instance URL
    token: "${LOGSCALE_TOKEN}"  # Use environment variable for the ingest token
    # Configure the queue for buffering events
    queue:
      # It is recommended to use a disk queue to persist syslog messages,
      # ensuring data integrity during network issues or system restarts.
      type: disk  # Use a disk-based queue for persistence
      maxLimitInMB: 10240  # Set the queue size to 10 GB (10 * 1024 MB)
      # A large disk queue is used to ensure data persistence and handle
      # high volumes of incoming syslog data, providing a robust buffer
      # against network issues or temporary outages.
      # fullAction: deleteOldest
      # Uncomment the line above to delete the oldest events when the queue is full.
      # This can be useful in high-volume environments where it's preferable to
      # lose some old data rather than pause ingestion of new data. However, use
      # this option with caution as it can result in data loss.

# Define the sources for syslog data
sources:
  syslog_udp:
    type: syslog
    mode: udp  # UDP syslog
    port: 514  # Standard syslog port
    sink: logscale_sink
    # Optional: Bind to a specific address
    # bind: "0.0.0.0"
    # Optional: Set the maximum event size (in bytes)
    # maxEventSize: 1048576  # 1 MB
    # The default maxEventSize is 2048 bytes. Increase this value if you expect
    # larger syslog messages. Be cautious when increasing this value, as it
    # affects memory usage and network bandwidth.
    # Optional: Set the number of worker threads (Linux only)
    # workers: 4
    # The 'workers' option controls the number of threads used to read syslog messages.
    # By default, it uses the number of CPU cores available on the system.
    # Adjust this value based on your system's capabilities and the expected message volume.
    # Optional: Configure the parser to be used in LogScale
    # parser: "syslog_rfc5424"
    # Optional: Add static fields
    # transforms:
    #   - type: static_fields
    #     fields:
    #       source_type: "syslog_udp"
    #       environment: "${ENV}"
  syslog_tcp:
    type: syslog
    mode: tcp  # TCP syslog
    port: 1514  # Using a different port for TCP
    sink: logscale_sink
    # Optional: Bind to a specific address
    # bind: "0.0.0.0"
    # Optional: Set the maximum event size (in bytes)
    # maxEventSize: 1048576  # 1 MB
    # The default maxEventSize is 2048 bytes. Increase this value if you expect
    # larger syslog messages. Be cautious when increasing this value, as it
    # affects memory usage and network bandwidth.
    # Optional: Enable strict parsing for TCP
    # strict: true
    # When strict parsing is enabled, the connection will be closed if an
    # invalid message is encountered. This helps maintain data integrity
    # but may result in lost messages if the client doesn't handle reconnection properly.
    # Optional: Support RFC6587 octet counting
    # supportsOctetCounting: true
    # Optional: Configure the parser to be used in LogScale
    # parser: "syslog_rfc5424"
    # Optional: Add static fields
    # transforms:
    #   - type: static_fields
    #     fields:
    #       source_type: "syslog_tcp"
    #       environment: "${ENV}"
Syslog Source
If type is set to
syslog you must specify the
port and
mode fields.
Maximum allowed syslog event size; syslog events larger than this will be truncated. If maxEventSize is also defined at the sinks level, the lower of the two values is applied. Set this to the maximum value to avoid truncation issues.
UDP only. Specifies how many workers to use. Set it to 1 to keep the 1.5 behavior, or to another value to override the default of auto-scaling to the number of CPU cores.
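As a minimal sketch (the port and sink values are illustrative), a UDP syslog source that raises the event size limit and pins reading to a single worker might look like:
yaml
sources:
  syslog_udp_custom:
    type: syslog
    mode: udp
    port: 5140
    maxEventSize: 1048576  # allow events up to 1 MB before truncation
    workers: 1             # single reader thread instead of auto-scaling to CPU cores
    sink: logscale_sink    # assumes the sink defined in the example above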
[a] Optional parameters use their default value unless explicitly set.
See Configuration Elements for information on
the common elements in the configuration, for example sinks, and their
configuration parameters and details on the structure of the
configuration files.
Windows Event Log Example
yaml
## This is YAML, so structure and indentation is important.
## Lines can be uncommented by removing the #. You should not need to change the number of spaces after that.
## Config options have a single #, comments have a ##. Only uncomment the single # lines if you need them.
#####
sources:
  windows_events:
    type: wineventlog
    ## Add other channels by simply adding additional "name" lines.
    ## The following command can be used to find other channels:
    ## Get-WinEvent -ListLog * -EA silentlycontinue | sort-object -Property Recordcount -desc
    channels:
      - name: Application
        excludeEventIDs: [903, 900]
      - name: Security
        onlyEventIDs: [4624, 4634, 4672]
      - name: System
        providers:
          - "Microsoft-Windows-NLB"
        levels: [0, 1, 2, 3]
      - name: ForwardedEvents
      - name: Application
        type: query
        query: '*[System[Level<4] and System/Provider[@Name="Microsoft-Windows-Security-SPP"]]'
      - name: CustomQueryXML
        type: query
        query: |
          <QueryList>
            <Query>
              <Select Path="Application">
                *[System[Level>0] and System/Provider[@Name="Microsoft-Windows-Security-SPP"]]
              </Select>
              <Suppress Path="Application">
                *[System[EventID=1004]]
              </Suppress>
            </Query>
          </QueryList>
    ## You can manually specify a parser to be used here.
    ## The parser assigned to the ingest token will override this parser setting.
    #parser: SampleWindowsParser
    ## Set language to en-US
    language: 1033
    ## Don't send the raw XML
    includeXML: false
    sink: logscale
sinks:
  logscale:
    type: humio
    token: XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX
    ## Replace with the "Ingest URL" on the FLC download page. It must include the "https://" at the beginning.
    url: https://your.logscale.cluster
Windows Event Log Source
If type is set to
wineventlog, you can specify the
channels field.
channels Specify the Windows Event Log channels to read. If no channels are specified, the Log Collector subscribes to all available channels. To add a filter, choose one of the options below for either named channels or a query.
Optional fields for named channels:
onlyEventIDs read only specific event IDs
excludeEventIDs exclude specific event IDs
providers read only specific providers
levels read only specific severity levels
Important
Subscribing to all channels may impact performance as the amount of data logged would be very high.
Example:
yaml
channels:
  - name: <channel name>
  - name: ...
Fields when using a query:
type: query enables query mode
query The XML or XPath query. Use "|" character to enable indented multiline queries for improved readability.
Specify the language for the event message, collected as @rawstring, using the Windows LCID language code. This only applies to rendering of the event message (no other values) and only for local events.
In the case of forwarded events, the message is rendered locally by the Windows Event Forwarder, and when collected on the Windows Event Collector the message arrives as plain text to the Falcon LogScale Collector.
The default setting is 0, which corresponds to the previous behaviour: the active language on the host.
Name of the configured sink that the collected events should be sent to.
[a] Optional parameters use their default value unless explicitly set.
See Configuration Elements for information on
the common elements in the configuration, for example sinks, and their
configuration parameters and details on the structure of the
configuration files.
Journal
yaml
sources:
  journal:
    # Example for reading journald log data (linux only)
    type: journald
    sink: my_humio
    # Optional. If not specified collect from the local journal
    directory: /var/log/journal
    # If specified only collect from these units
    includeUnits:
      - systemd-modules-load.service
    # If specified collect from all units except these
    excludeUnits:
      - systemd-modules-load.service
    # Default: false. If true only collect logs from the current boot
    currentBootOnly: false
sinks:
  my_humio:
    type: humio
    token: <ingest-token-repo2>  ## or an environment variable
    url: https://cloud.us.humio.com
    compression: gzip
    compressionLevel: 9
    tls:
      insecure: false
      ## -----BEGIN CERTIFICATE-----
      ## ...
      ## -----END CERTIFICATE-----
      caFile: /etc/ssl/cert.pem
    proxy: none
    queue:
      fullAction: deleteOldest
      memory:
        flushTimeOutInMillisecond: 200
        maxLimitInMB: 1024
Journal Source
type is set to
Journald in order to read JournalD
log data (linux only) you must specify the following fields:
includeUnits If specified, the Collector will only collect from these units.
sink (string, required)
Name of the sink, configured in sinks, that the collected events should be sent to.
[a] Optional parameters use their default value unless explicitly set.
See Configuration Elements for information on
the common elements in the configuration, for example sinks, and their
configuration parameters and details on the structure of the
configuration files.
Syslog via TLS
yaml
sources:
  MySourceName:
    type: syslog_tls
    # Required: PEM certificate file.
    certificateFile: cert.pem
    # Required: PEM private key for the certificate.
    keyFile: privkey.pem
    ## Optional: Max allowed event size (default = 2048 bytes), messages larger than this will be truncated.
    ## NOTE: Setting maxEventSize above the max allowed value will cause the FLC service to not start.
    # maxEventSize: 1048576
    ## Optional: Receive buffer size. Defaults to 16x maxEventSize dynamically.
    ## NOTE: receiveBufferSize must be set higher than the maxEventSize value otherwise FLC service won't start.
    # receiveBufferSize: 16777216
    ## Optional: Enable strict event handling. Events that don't start with '<' or an octet counting header are discarded and the connection is closed.
    # strict: false
    ## Optional: Port number to bind to. Default 6514.
    # port: 6514
    ## Optional: Address to bind to. Default "", which is all addresses.
    # bind: "127.0.0.1"
    ## No client validation, default if section is omitted.
    # clientAuthentication:
    #   type: none
    ## Verify client via CA cert:
    # clientAuthentication:
    #   type: ca
    #   caFile: ca.pem
    ## Verify client via cert fingerprint:
    # clientAuthentication:
    #   type: fingerprint
    #   fingerprints:
    #     - sha-1:bf:88:e7:9e:58:04:d6:85:e6:06:2e:e0:de:d1:3c:44:cd:33:b6:ba
    #     - sha-256:89:83:8E:56:61:EC:D4:BF:ED:DA:88:2B:A4:8A:27:25:EF:B5:39:F9:5E:59:2D:CA:38:AC:51:8D:C6:7C:D9:59
    ## Optional: TLS options.
    # tls:
    ## Optional: minimum TLS version to accept. Default 1_2. Valid values are 1_0, 1_1, 1_2, 1_3.
    #   minVersion: 1_2
    ## Optional: maximum TLS version to accept. Default 1_3. Valid values are 1_0, 1_1, 1_2, 1_3.
    #   maxVersion: 1_3
    ## Optional: List of cipher suites to accept. Defaults to all of the valid values. Valid values are listed.
    #   ciphers:
    #     - TLS_RSA_WITH_AES_128_CBC_SHA
    #     - TLS_RSA_WITH_AES_256_CBC_SHA
    #     - TLS_RSA_WITH_AES_128_GCM_SHA256
    #     - TLS_RSA_WITH_AES_256_GCM_SHA384
    #     - TLS_AES_128_GCM_SHA256
    #     - TLS_AES_256_GCM_SHA384
    #     - TLS_CHACHA20_POLY1305_SHA256
    #     - TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA
    #     - TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA
    #     - TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA
    #     - TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA
    #     - TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256
    #     - TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384
    #     - TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
    #     - TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384
    #     - TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305_SHA256
    #     - TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305_SHA256
sinks:
  logscale:
    type: logscale
    # Replace with your ingest token.
    token: XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX
    # Replace with the "Ingest URL" on the FLC download page. It must include the "https://" at the beginning.
    url: https://XXX.YYY.ZZZ
    # This sets the maximum allowed single event size to 1 MB; larger messages will be truncated
    #maxEventSize: 1048576
Syslog TLS Source
type is set to
syslog_tls in order to read syslog
data via TLS you must specify the following fields:
certificateFile PEM certificate file. This file should include a chain of all intermediate certificates up to the trusted root certificate, so that clients only have to trust the root certificate.
Name of the configured sink that the collected events should be sent to.
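A minimal sketch of a syslog_tls source with just the required fields (file paths and sink name are illustrative; the port defaults to 6514 per the example above):
yaml
sources:
  secure_syslog:
    type: syslog_tls
    certificateFile: /etc/flc/cert.pem  # PEM certificate (chain)
    keyFile: /etc/flc/privkey.pem       # PEM private key for the certificate
    sink: logscale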
[a] Optional parameters use their default value unless explicitly set.
See Configuration Elements for information on
the common elements in the configuration, for example sinks, and their
configuration parameters and details on the structure of the
configuration files.
Client Authentication
To configure client authentication, the following modes are
available:
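For example, to verify clients against a CA certificate (a short sketch based on the commented clientAuthentication options in the example above; the file name is illustrative):
yaml
clientAuthentication:
  type: ca
  caFile: ca.pem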
Unified Log Source
type is set to
unifiedlog in order to read Unified Log data (macOS only). To achieve the
best results from Unified Log data, we recommend using the following
fields and installing the
apple/unifiedlog package.
Specify a dedicated unifiedlog parser, e.g. apple/unifiedlog:unifiedlog-compact. If a parser is assigned to the ingest token being used, this parser setting is ignored.
[a] Optional parameters use their default value unless explicitly set.
See Configuration Elements for information on
the common elements in the configuration, for example sinks, and their
configuration parameters and details on the structure of the
configuration files.
The subsystem, process and predicate identifiers may be combined as
demonstrated in the examples.
Predicates
The log output can be filtered if only a specific type of log is
required or the data load needs to be limited. The filter option is
forwarded directly to the built-in log command in the configuration
for each channel.
Pattern Clauses
The filter expression matches one or more of the following pattern
clauses:
eventMessage - specify a text
pattern, or text, within the message, or an activity name.
processImagePath - this
matches the text pattern in the name of the process which
originated the event.
senderImagePath - this
matches the text pattern in the name of the sender, which
might be the name of a library, extension, or executable.
subsystem - this matches the
subsystem specifier, e.g. com.apple.TimeMachine. Although
potentially valuable, subsystems are not yet widely used, and
discovering which is which is not easy. Use with caution.
category - this matches the
category, and should be used in conjunction with the subsystem
filter; for the whole specifier
com.apple.TimeMachine.TMLogInfo,
the subsystem is
com.apple.TimeMachine and the
specifier is TMLogInfo.
eventType - matches the type
of event, such as logEvent (1024), traceEvent (768),
activityCreateEvent (513), or activityTransitionEvent (514).
Can be given as characters (case-sensitive) or digits as shown
in parentheses. Use these only with the operators
== or
!=, as they are treated as
numbers rather than text.
messageType - matches the
type of message for logEvent and traceEvent, and includes
default (0), release (0), info (1), debug (2), error (16), and
fault (17). Can be given as characters (case-sensitive) or
digits as shown in parentheses. Use these only with the
operators
== or
!=, as they are treated as
numbers rather than text.
Operators
The following comparison and other operators are available:
== (or =) for equality
!= or <> for inequality
>= or => for greater than or equal to
<= or =< for less than or equal to
> for greater than
< for less than
AND or && for logical and
OR or || for logical or
NOT or ! for logical not
BEGINSWITH, CONTAINS, ENDSWITH, LIKE, MATCHES for string
comparisons, using regex expressions when desired; strings
can be compared with case insensitivity and diacritic
insensitivity by appending [cd] to the operator, e.g.
CONTAINS[c] means case-insensitive comparison
UTI-CONFORMS-TO, UTI-EQUALS support comparison of UTIs like
com.adobe.pdf
ANY, SOME, NONE, IN, and array operators are available but
unlikely to be used
FALSE, TRUE, NULL have their expected literal meanings.
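Putting clauses and operators together, a predicate that keeps only Time Machine error messages could be written as the following expression (shown as a bare predicate; it is passed to the built-in log command via the channel configuration as described above):
subsystem == "com.apple.TimeMachine" AND messageType == error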
Exec Example
yaml
sources:
  cmd_ls:
    type: cmd
    cmd: ls
    # scheduled or streaming
    mode: scheduled
    args:
      - -l
      - -h
    workingDir: /foo
    # Interval between each invocation of the cmd
    interval: 60
    # Output mode when using mode 'scheduled'. Either 'streaming' (default) or 'consolidateOutput'.
    # When outputMode is set to 'consolidateOutput', the entire output of the scheduled command is sent as a single event.
    # outputMode: consolidateOutput
    # Environment variables can be configured and passed to the command
    environment:
      # define CONFIGURED_ENV1 as environment variable
      CONFIGURED_ENV1: my_configured_env_1
      # Pass environment variable MY_ENV_VAR to command
      MY_ENV_VAR: $MY_ENV_VAR
    sink: my_humio
  cmd_tail:
    type: cmd
    cmd: tail
    mode: streaming
    args:
      - -F
    workingDir: /foo
    sink: my_humio
sinks:
  my_humio:
    type: humio
    token: <ingest-token-repo2>  ## or an environment variable
    url: https://cloud.us.humio.com
    compression: gzip
    compressionLevel: 9
    tls:
      insecure: false
      ## -----BEGIN CERTIFICATE-----
      ## ...
      ## -----END CERTIFICATE-----
      caFile: /etc/ssl/cert.pem
    proxy: none
    queue:
      fullAction: deleteOldest
      memory:
        flushTimeOutInMillisecond: 200
        maxLimitInMB: 1024
Exec Source
If type is set to
cmd you must specify the following fields:
mode Can be set to scheduled to collect data at intervals, in which case you must specify the interval, or streaming to collect data constantly. To create a single multiline event when running in scheduled mode, set the option consolidateOutput to true.
Name of the sink, configured in sinks, that the collected events should be sent to.
workingDir (string, required)
Specifies the directory in which to run the command.
[a] Optional parameters use their default value unless explicitly set.
See Configuration Elements for information on
the common elements in the configuration, for example sinks, and their
configuration parameters and details on the structure of the
configuration files.
Linux Example
yaml
sources:
  # Collect local files.
  var_log:
    type: file
    include: /var/log/*
    exclude: /var/log/*.gz
    sink: humio
  # Collect syslog udp 5140.
  syslog_udp_5140:
    type: syslog
    mode: udp
    port: 5140
    sink: humio
    workers: 1
  # Collect syslog tcp 5140.
  syslog_tcp_5140:
    type: syslog
    mode: tcp
    port: 5140
    sink: humio
sinks:
  humio:
    type: humio
    # Replace with your specified ingest token.
    token: $INGEST_TOKEN
    # Replace with your "standard endpoint" API URL: https://library.humio.com/endpoints/
    url: $HUMIO_URL
File Linux Source
This configuration example uses the file source with specific
values for collecting logs from /var/log.
See Configuration Elements for information on
the common elements in the configuration, for example sinks, and their
configuration parameters and details on the structure of the
configuration files.