Sources and Examples

The following sections detail the specific configuration for each source type, along with example configuration files. A description of the fields follows each example.

File

yaml
fleetManagement:
  token: b2XXXXXX-fd23-XXXX-98e9-1890e6XXXXXX
  ## Change the URL if needed to reflect your LogScale URL.
  url: https://cloud.us.humio.com

dataDirectory: data
sources:
  apache_logs:
    type: file
    # Glob patterns
    include: /var/log/apache/*.log
    exclude: /var/log/apache/not_me.log
    sink: my_humio_instance
    parser: accesslog
    multiLineBeginsWith: ^20\d{2}-
    transforms:
      # static_fields transform adds configured key, value pairs as fields
      - type: static_fields
        fields:
          mykey: myvalue
          # Passing environment variables is supported
          myenvvar: $MY_ENV_VAR
sinks:
  my_humio_instance:
    type: humio
    token: <ingest-token-repo2> or an environment variable
    url: https://cloud.us.humio.com
    compression: gzip
    compressionLevel: 9
    tls:
      insecure: false
      caCert: |
        -----BEGIN CERTIFICATE-----
        ...
        -----END CERTIFICATE-----
      caFile: /etc/ssl/cert.pem

    proxy: none

    queue:
      fullAction: deleteOldest
      memory:
        flushTimeOutInMillisecond: 200
        maxLimitInMB: 1024

File Source

When type is set to file the following configurations apply:

Table: File Source

  • inactivity-timeout (integer, default: 60)

    Specifies the period of inactivity for a monitored file before its file descriptor is closed to release system resources. Whenever the file changes, it is re-opened and the timeout is restarted.

  • include (string)

    Glob pattern of the files to collect, for example /var/log/apache/*.log.

  • multiLineBeginsWith (regex)

    The file input can join consecutive lines into multiline events by using a regular expression. The pattern can match either the beginning or the continuation of a multiline event.

    For example, if every multiline event begins with a date such as 2022-, use:

    yaml
    multiLineBeginsWith: ^20\d{2}-

    In this case, every line that doesn't match the pattern is appended to the latest line that did.

  • multiLineContinuesWith (regex)

    Joins lines that match the pattern onto the preceding line. For example, to treat lines indented by whitespace (instead of starting at column 0) as continuations of the previous line:

    yaml
    multiLineContinuesWith: ^\s+

    In this case, every line that matches the pattern is appended to the latest line that didn't.

  • parser (string)

    Specifies the parser within LogScale to use to parse the logs. If you install the parser through a package, you must specify the type and name as displayed on the parsers page, for example linux/system-logs:linux-filebeat.

  • sink (string)

    Name of the configured sink that the collected events are sent to.

  • transforms (string)

    Specifies the transforms to use for this source (optional). If static_fields is specified, you must specify a key and a value, which can be an environment variable, for example myenvvar: $MY_ENV_VAR.
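
As an illustration of how the multiline options group lines, the following sketch shows a file source that uses multiLineContinuesWith. The source name app_logs and the path /var/log/myapp/*.log are hypothetical; the sink name is taken from the example above. The comments trace how three sample lines would be grouped under the behavior described for this option.

```yaml
sources:
  app_logs:                          # hypothetical source name
    type: file
    include: /var/log/myapp/*.log    # hypothetical path
    sink: my_humio_instance
    # Sample input and how it would be grouped:
    #   ERROR request failed          <- starts event 1 (no leading whitespace)
    #     at handler.process          <- matches ^\s+, appended to event 1
    #   INFO request served           <- starts event 2
    multiLineContinuesWith: ^\s+
```

With multiLineBeginsWith: ^20\d{2}- the logic is inverted: only lines that start with a year such as 2022- open a new event, and every other line is appended to the event before it.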

See LogScale Collector Configuration Elements for information on the common elements in the configuration file.

Syslog

yaml
fleetManagement:
  token: b2XXXXXX-fd23-XXXX-98e9-1890e6XXXXXX
  ## Change the URL if needed to reflect your LogScale URL.
  url: https://cloud.us.humio.com

dataDirectory: data
sources:
  syslog:
    type: syslog
    # Mode must be 'udp' or 'tcp'
    mode: udp
    # Port number to listen on
    # Default: 514
    port: 514
    # Optional bind address.
    # If unspecified the source will listen on all interfaces
    # Don't specify port here. Use 'port' field for that
    bind: 0.0.0.0
    # Maximum syslog event size.
    maxEventSize: 2048
    # Size of receive buffer. Default: 64 * maxEventSize.
    receiveBufferSize: 131072
    sink: my_other_humio_instance

sinks:
  my_other_humio_instance:
    type: humio
    token: <ingest-token_repo1>
    url: https://cloud.us.humio.com

    tls:
      caCert: |
        -----BEGIN CERTIFICATE-----
        ...
        -----END CERTIFICATE-----
      caFile: /etc/ssl/cert.pem

    proxy: none

    queue:
      fullAction: deleteOldest
      memory:
        flushTimeOutInMillisecond: 200
        maxLimitInMB: 1024

Syslog Source

If type is set to syslog, configure the port, bind address, and mode fields:

  • port

    Specify the number of the port on which to listen. The default is 514.

  • bind

    Specify the address to bind to. If unspecified, the source listens on all interfaces. Do not include the port here; use the port field instead.

  • mode

    Specify the protocol to listen to, which can be tcp or udp.

  • maxEventSize

    Maximum syslog event size. maxEventSize can also be defined at the sink level.

  • receiveBufferSize

    Size of receive buffer. Default: 64 times maxEventSize.
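
To make the receiveBufferSize default concrete, here is a minimal sketch of a syslog source with a larger event size. The source name syslog_large_events and the value 4096 are illustrative assumptions; the sink name comes from the example above. The explicit buffer size simply writes out the stated default of 64 times maxEventSize.

```yaml
sources:
  syslog_large_events:        # hypothetical source name
    type: syslog
    mode: udp
    port: 514
    # Illustrative value; the example above uses 2048
    maxEventSize: 4096
    # 64 * 4096 = 262144, which equals the stated default,
    # so this line could be omitted
    receiveBufferSize: 262144
    sink: my_other_humio_instance
```

The value 131072 in the example above follows the same rule: 64 * 2048 = 131072.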

See LogScale Collector Configuration Elements for information on the common elements in the configuration file.

Windows Event Log Example

yaml
fleetManagement:
  token: b2XXXXXX-fd23-XXXX-98e9-1890e6XXXXXX
  ## Change the URL if needed to reflect your LogScale URL.
  url: https://cloud.us.humio.com
  ## Keep this option as "none" unless you actually need a proxy.
  proxy: none
  ## The TLS option can be uncommented if you're using a self-signed certificate. 
  #tls:
    #insecure: true
 
dataDirectory: C:\ProgramData\CrowdStrike\Humio Log Collector\
 
sources:
  windows_events:
    type: wineventlog
    ## Add other channels by simply adding additional "name" lines.
    ## The following command can be used to find other channels:
    ## Get-WinEvent -ListLog * -EA silentlycontinue | sort-object -Property Recordcount -desc
    channels:
      - name: Application
        excludeEventIDs: [ 11 ]
      - name: Security
      - name: System
      - name: Windows PowerShell
    ## You can manually specify a parser to be used here.
    ## This overrides the parser specified in the LogScale UI.
    #parser: myparser
    includeXML: false
    sink: humio
     
sinks:
  humio:
    type: humio
    token: 2eXXXXXX-81d1-XXXX-bc22-05e430XXXXXX
    ## Change the URL if needed to reflect your LogScale URL.    
    url: https://cloud.us.humio.com
    ## Keep this option as "none" unless you actually need a proxy. This must be set to none if fleet management is enabled.
    proxy: none
    ## The TLS option can be uncommented if you're using a self-signed certificate. 
    #tls:
      #insecure: true
    ## This increases the maximum single event size to 8 MB. You can change as needed.
    maxEventSize: 8388608
    ## Uncomment to set the maximum batch size to 16 MB.
    #maxBatchSize: 16777216
    ## Uncomment if you would like to force a specific level of gzip compression. 9 is the highest.
    #compression: gzip
    #compressionLevel: 9

Windows Event Log Source

If type is set to wineventlog, the following fields apply:

  • channels Specify the Windows event log channels to read. If no channel is specified, the log collector subscribes to all available channels. You can also restrict a channel to specific event IDs using onlyEventIDs, or exclude specific event IDs using excludeEventIDs.

    Important

    Subscribing to all channels may impact performance as the amount of data logged would be very high.

    yaml
    channels:
      - name: <Channel Name>
      - ...

  • includeXML Set to false to exclude XML from the source.

  • providers Specify an array of provider names to filter events by provider.

  • parser Specify the parser to use to parse the logs. If you install the parser through a package, you must specify the type and name as displayed on the parsers page, for example linux/system-logs:linux-filebeat. See Parsers for more information.
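
The event ID filters can be sketched as follows. This fragment assumes that onlyEventIDs takes the same list form, at the same level, as excludeEventIDs in the example above; that placement is an assumption and should be checked against your collector version. The IDs 4624 and 4625 are used purely for illustration.

```yaml
sources:
  windows_events:
    type: wineventlog
    channels:
      # Collect only the listed event IDs from the Security channel
      # (assumed syntax, mirroring excludeEventIDs)
      - name: Security
        onlyEventIDs: [ 4624, 4625 ]
      # Collect everything except event ID 11, as in the example above
      - name: Application
        excludeEventIDs: [ 11 ]
    includeXML: false
    sink: humio
```

Listing channels explicitly, as here, also avoids the performance cost of subscribing to every available channel.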

See LogScale Collector Configuration Elements for information on the common elements in the configuration file.

Important

The proxy option overrides the proxy configuration for the sink. It must be set to none on Windows Server and whenever fleet management is enabled.

Journal

yaml
fleetManagement:
  token: b2XXXXXX-fd23-XXXX-98e9-1890e6XXXXXX
  ## Change the URL if needed to reflect your LogScale URL.
  url: https://cloud.us.humio.com

dataDirectory: data
sources:
  journal:
    # Example for reading journald log data (linux only)
    type: journald
    sink: my_humio
    # Optional. If not specified collect from the local journal
    directory: /var/log/journal
    # If specified only collect from these units
    includeUnits:
      - systemd-modules-load.service
    # If specified collect from all units except these
    excludeUnits:
      - systemd-modules-load.service
    # Default: false. If true only collect logs from the current boot
    currentBootOnly: false
sinks:
    my_humio:
      type: humio
      token: <ingest-token-repo2> or an environment variable
      url: https://cloud.us.humio.com
      compression: gzip
      compressionLevel: 9
      tls:
        insecure: false
        caCert: |
          -----BEGIN CERTIFICATE-----
          ...
          -----END CERTIFICATE-----
        caFile: /etc/ssl/cert.pem
  
      proxy: none
  
      queue:
        fullAction: deleteOldest
        memory:
          flushTimeOutInMillisecond: 200
          maxLimitInMB: 1024

Journal Source

If type is set to journald, the collector reads journald log data (Linux only). The following fields are available:

  • directory

    Specifies the journal directory to collect from. If not specified, the collector reads from the local journal.

  • includeUnits

    If specified, only collect from these units.

  • excludeUnits

    If specified, collect from all units except these.

  • currentBootOnly

    Set to false by default. If true, only collect logs from the current boot.
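
For comparison with the full example, a minimal sketch that reads the local journal for a single unit since the last boot might look like the following. The source name journal_ssh and the unit name ssh.service are hypothetical choices; the directory field is left out so that the local journal is used.

```yaml
sources:
  journal_ssh:                 # hypothetical source name
    type: journald
    sink: my_humio
    # No 'directory' set, so the local journal is read
    includeUnits:
      - ssh.service            # hypothetical unit
    # Skip entries from earlier boots
    currentBootOnly: true
```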

See LogScale Collector Configuration Elements for information on the common elements in the configuration file.

Exec Example

yaml
fleetManagement:
  token: b2XXXXXX-fd23-XXXX-98e9-1890e6XXXXXX
  ## Change the URL if needed to reflect your LogScale URL.
  url: https://cloud.us.humio.com
dataDirectory: data
sources:
   cmd_ls:
     type: cmd
     cmd: ls
     # scheduled or streaming
     mode: scheduled
     args:
       - -l
       - -h
     workingDir: /foo
     # Interval between each invocation of the cmd
     interval: 60
     
     # Output mode when using mode 'scheduled'. Either 'streaming' (default) or 'consolidateOutput'.
     # When outputMode is set to 'consolidateOutput', the entire output of the scheduled command is sent as a single event.
     # outputMode: consolidateOutput

     # Environment variables can be configured and passed to the command
     environment:
       # define CONFIGURED_ENV1 as environment variable
       CONFIGURED_ENV1: my_configured_env_1
       # Pass environment variable: MY_ENV_VAR to command
       MY_ENV_VAR: $MY_ENV_VAR
     sink: my_humio

   cmd_tail:
     type: cmd
     cmd: tail
     mode: streaming
     args:
       - -F
     workingDir: /foo
     sink: my_humio

sinks:
  my_humio:
    type: humio
    token: <ingest-token-repo2> or an environment variable
    url: https://cloud.us.humio.com
    compression: gzip
    compressionLevel: 9
    tls:
      insecure: false
      caCert: |
        -----BEGIN CERTIFICATE-----
        ...
        -----END CERTIFICATE-----
      caFile: /etc/ssl/cert.pem

    proxy: none

    queue:
      fullAction: deleteOldest
      memory:
        flushTimeOutInMillisecond: 200
        maxLimitInMB: 1024

Exec Source

If type is set to cmd, specify the following fields:

  • cmd

    Specify the command to run.

  • mode

    Can be set to scheduled to run the command at intervals, in which case you must specify the interval, or to streaming to collect data continuously. To send the entire output of a scheduled command as a single multiline event, set outputMode to consolidateOutput, as shown in the commented-out line of the example above.

  • args

    The arguments of the command.

  • workingDir

    Specifies the directory in which to run the command.

  • interval

    Specifies how frequently the command should be invoked when set to scheduled.

  • environment

    Define environment variables in this section and pass them to the command.

  • sink

    Name of the configured sink that receives the collected events (my_humio in the example above).
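
A sketch of a scheduled command whose output is sent as one event, based on the outputMode comment in the example above. The source name cmd_df, the df command, and the interval value 300 are illustrative assumptions; the unit of interval is not stated here, so the value is only a placeholder.

```yaml
sources:
  cmd_df:                      # hypothetical source name
    type: cmd
    cmd: df
    args:
      - -h
    mode: scheduled
    # Interval between invocations (same field as in the example above)
    interval: 300
    # Send the whole output of each run as a single event
    outputMode: consolidateOutput
    sink: my_humio
```

Consolidated output suits commands like this one, where the lines of a single run only make sense when read together.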

See LogScale Collector Configuration Elements for information on the common elements in the configuration file.

Linux Example

yaml
fleetManagement:
  token: b2XXXXXX-fd23-XXXX-98e9-1890e6XXXXXX
  ## Change the URL if needed to reflect your LogScale URL.
  url: https://cloud.us.humio.com
dataDirectory: /var/lib/humio-log-collector
sources:
  # Collect local files.
  var_log:
    type: file
    include: /var/log/*
    exclude: /var/log/*.gz
    sink: humio

  # Collect syslog udp 5140.
  syslog_udp_5140:
    type: syslog
    mode: udp
    port: 5140
    sink: humio

  # Collect syslog tcp 5140.
  syslog_tcp_5140:
    type: syslog
    mode: tcp
    port: 5140
    sink: humio

sinks:
  humio:
    type: humio
    # Replace with your specified ingest token.
    token: $INGEST_TOKEN
    # Replace with your "standard endpoint" API URL: https://library.humio.com/endpoints/
    url: $HUMIO_URL

File Linux Source

This configuration example uses the file and syslog sources with specific values: it collects files under /var/log and listens for syslog on UDP and TCP port 5140.

See LogScale Collector Configuration Elements for information on the common elements in the configuration file.