Journal Source

Overview

The Journal Source is a feature of the Falcon LogScale Collector designed specifically for Linux systems using systemd. It enables the direct collection of events from the systemd journal (journald), which is the centralized logging system used by modern Linux distributions.

The Journal Source monitors the systemd journal for new log entries and automatically ingests them into LogScale. The Collector supports filtering by systemd units and boot sessions, and can read from both local and remote journal directories.

How it works

The Journal Source reads log entries through the systemd journal API. When a new journal entry matches the configured filters, the Collector ingests it into LogScale as an individual event.

Key Features

  • Unit Filtering: Include or exclude logs from specific systemd units

  • Boot Session Filtering: Optionally collect only logs from the current boot

  • Remote Journal Support: Read from journal directories on remote systems or mounted filesystems

  • Structured Data: Journal entries include rich metadata (unit, priority, timestamp, etc.)

Prerequisites

Before configuring the Journal Source, ensure that you have:

  • A Linux system running systemd (most modern distributions)

  • Appropriate permissions to read the systemd journal (typically requires running as root or being in the systemd-journal group)

  • A configured sink (destination) for the collected events
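
The permission prerequisite above can be checked before the Collector is started. The following is a minimal sketch and not part of the Collector: the function names has_journal_group and can_read_journal are illustrative, and it assumes that root or membership of the systemd-journal group is what grants journal access on your distribution (some distributions also grant access to groups such as adm or wheel).

```shell
#!/bin/sh
# Succeeds if the space-separated group list in $1 contains systemd-journal.
has_journal_group() {
    for g in $1; do
        [ "$g" = "systemd-journal" ] && return 0
    done
    return 1
}

# can_read_journal USER_ID "GROUP LIST"
# Succeeds for root (UID 0) or for members of systemd-journal.
can_read_journal() {
    [ "$1" -eq 0 ] && return 0
    has_journal_group "$2"
}

# Check the user running this script.
if can_read_journal "$(id -u)" "$(id -Gn)"; then
    echo "journal access: likely OK"
else
    echo "journal access: run as root or join the systemd-journal group"
fi
```

Run the script as the same user that runs the Collector, since the check only inspects the calling user's UID and groups.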

Configuration

Prerequisites

First, define a sink that will receive the collected events:

yaml
sinks:
  logscale_sink:
    type: logscale
    url: "https://cloud.humio.com/"
    token: "${LOGSCALE_TOKEN}"

Example 1: Basic Journal Collection

Collect all journal entries from the local system:

yaml
sources:
  system_journal:
    type: journald
    sink: logscale_sink

Example 2: Specific Units Only

Collect logs from specific systemd units:

yaml
sources:
  specific_services:
    type: journald
    includeUnits:
      - sshd.service
      - nginx.service
      - docker.service
    sink: logscale_sink

Example 3: Exclude Specific Units

Collect all logs except from specific units:

yaml
sources:
  filtered_journal:
    type: journald
    excludeUnits:
      - systemd-modules-load.service
      - systemd-udevd.service
    sink: logscale_sink

Example 4: Current Boot Only

Collect only logs from the current boot session:

yaml
sources:
  current_boot:
    type: journald
    currentBootOnly: true
    sink: logscale_sink

Example 5: Remote Journal Directory

Collect from a specific journal directory (e.g., from a mounted remote system):

yaml
sources:
  remote_journal:
    type: journald
    directory: /mnt/remote/var/log/journal
    sink: logscale_sink

Example 6: Complete Configuration

yaml
sinks:
  logscale_sink:
    type: logscale
    url: "https://cloud.humio.com/"
    token: "${LOGSCALE_TOKEN}"

sources:
  # System services journal
  system_services:
    type: journald
    includeUnits:
      - sshd.service
      - systemd-logind.service
      - cron.service
    currentBootOnly: false
    parser: "linux/systemd-logs:linux/systemd-logs"
    transforms:
      - type: static_fields
        fields:
          log_source: "systemd_journal"
          host: "${HOSTNAME}"
          environment: "${ENV}"
    sink: logscale_sink
  
  # Application-specific journal
  docker_logs:
    type: journald
    includeUnits:
      - docker.service
      - containerd.service
    currentBootOnly: true
    transforms:
      - type: static_fields
        fields:
          log_type: "container_runtime"
    sink: logscale_sink
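
Example 6 relies on the placeholders ${LOGSCALE_TOKEN}, ${HOSTNAME}, and ${ENV}. Assuming these are resolved from the environment of the Collector process (this page does not state how they are resolved), a pre-flight check such as the sketch below can report which ones are unset. The helper name missing_vars is illustrative.

```shell
#!/bin/sh
# missing_vars NAME...
# Prints the name of every listed variable that is unset or empty, one per line.
missing_vars() {
    for name in "$@"; do
        # Indirect lookup of the variable called $name (POSIX sh has no ${!name}).
        eval "value=\${$name:-}"
        [ -n "$value" ] || printf '%s\n' "$name"
    done
}

# Report the placeholders used in Example 6 that are not set in this shell.
missing_vars LOGSCALE_TOKEN HOSTNAME ENV
```

An empty result means all three variables are set in the shell that ran the check. Note that HOSTNAME is set automatically by some shells but is not necessarily exported to child processes.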

Unit Filtering

Include Units

When includeUnits is specified, only logs from the listed systemd units are collected:

yaml
includeUnits:
  - sshd.service
  - nginx.service
  - postgresql.service

Exclude Units

When excludeUnits is specified, logs from all units except the listed ones will be collected:

yaml
excludeUnits:
  - systemd-udevd.service
  - systemd-modules-load.service

Note: You cannot use both includeUnits and excludeUnits simultaneously. Choose one filtering approach.
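
A configuration file can be scanned for sources that set both options. The sketch below is a heuristic, not a YAML parser: it assumes the layout used in the examples on this page (source names indented by two spaces, their options by four), and the function name find_filter_conflicts is illustrative.

```shell
#!/bin/sh
# find_filter_conflicts
# Reads a Collector configuration on stdin and prints the name of every
# two-space-indented block that contains both includeUnits and excludeUnits.
find_filter_conflicts() {
    awk '
        function flush() { if (inc && exc) print name }
        # A new block starts at a line such as "  my_source:".
        /^  [A-Za-z0-9_-]+:[ \t]*$/ {
            flush()
            name = $1
            sub(/:$/, "", name)
            inc = 0
            exc = 0
            next
        }
        /^    includeUnits:/ { inc = 1 }
        /^    excludeUnits:/ { exc = 1 }
        END { flush() }
    '
}
```

Feed the configuration file to the function on standard input, for example find_filter_conflicts < config.yaml, where config.yaml stands for the path to your own configuration. No output means no block sets both options.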

Finding Unit Names

To find systemd unit names on your system:

shell
# List all active units
systemctl list-units

# List all units (including inactive)
systemctl list-units --all

# List only service units
systemctl list-units --type=service

# Show journal entries for a specific unit
journalctl -u sshd.service

Boot Session Filtering

Current Boot Only

Set currentBootOnly: true to collect only logs from the current boot session:

yaml
sources:
  current_boot_logs:
    type: journald
    currentBootOnly: true
    sink: logscale_sink

This is useful for:

  • Reducing data volume by excluding historical logs

  • Focusing on current system state

  • Avoiding duplicate ingestion after system restarts
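
To confirm that a collected event belongs to the current boot, its _BOOT_ID field can be compared with the kernel's boot ID. The sketch below assumes /proc/sys/kernel/random/boot_id is readable; the kernel reports the ID with hyphens, while the journal stores it as 32 hexadecimal characters without them. The helper names normalize_boot_id and is_current_boot are illustrative.

```shell
#!/bin/sh
# normalize_boot_id ID
# Strips hyphens and lowercases hex digits so both formats compare equal.
normalize_boot_id() {
    printf '%s\n' "$1" | sed 's/-//g' | tr 'A-F' 'a-f'
}

# is_current_boot EVENT_BOOT_ID KERNEL_BOOT_ID
# Succeeds if both IDs refer to the same boot session.
is_current_boot() {
    [ "$(normalize_boot_id "$1")" = "$(normalize_boot_id "$2")" ]
}

# Print the current boot ID in the journal's format, if the kernel exposes it.
if [ -r /proc/sys/kernel/random/boot_id ]; then
    echo "current boot: $(normalize_boot_id "$(cat /proc/sys/kernel/random/boot_id)")"
fi
```

The printed value can be compared with the _BOOT_ID field of an ingested event to see whether it came from the current boot session.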

All Boots

Set currentBootOnly: false (default) to collect logs from all boot sessions:

yaml
sources:
  all_boot_logs:
    type: journald
    currentBootOnly: false
    sink: logscale_sink

Journal Directory

Local Journal

By default, the collector reads from the local system journal. No directory parameter is needed.

yaml
sources:
  local_journal:
    type: journald
    sink: logscale_sink

Custom Journal Directory

Specify a custom journal directory to read from:

yaml
sources:
  custom_journal:
    type: journald
    directory: /var/log/journal
    sink: logscale_sink

This is useful for:

  • Reading from mounted remote filesystems

  • Collecting from archived journal directories

  • Centralized log collection from multiple systems
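
Before pointing the directory parameter at a mounted or archived location, it can help to confirm that the path actually contains journal files. A minimal sketch follows; the function name has_journal_files is illustrative, and it assumes journal files use the .journal extension and typically sit in a machine-ID subdirectory of the journal directory.

```shell
#!/bin/sh
# has_journal_files DIR
# Succeeds if DIR exists and contains at least one *.journal file,
# searching subdirectories as well.
has_journal_files() {
    [ -d "$1" ] || return 1
    [ -n "$(find "$1" -type f -name '*.journal' 2>/dev/null | head -n 1)" ]
}

# Path taken from Example 5; replace it with your own directory.
dir="/mnt/remote/var/log/journal"
if has_journal_files "$dir"; then
    echo "$dir: journal files found"
else
    echo "$dir: missing or contains no journal files"
fi
```

The same directory can then be inspected with journalctl -D to confirm that the entries are readable.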

Event Structure

Journal entries are ingested as structured events with rich metadata. Common fields include:

  • MESSAGE: The log message content

  • PRIORITY: Syslog priority level (0-7)

  • SYSLOG_IDENTIFIER: Program name

  • _SYSTEMD_UNIT: Systemd unit name

  • _PID: Process ID

  • _UID: User ID

  • _GID: Group ID

  • _HOSTNAME: Hostname

  • _BOOT_ID: Boot session identifier

  • _MACHINE_ID: Machine identifier

  • _TRANSPORT: How the log was received (e.g., journal, syslog, kernel)

Additional fields may be present depending on the source of the log entry.
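
The PRIORITY field carries the numeric syslog severity. The helper below maps the numbers to the standard syslog severity keywords, as a sketch for scripts that post-process or spot-check events; the function name priority_name is illustrative.

```shell
#!/bin/sh
# priority_name LEVEL
# Prints the syslog severity keyword for a numeric PRIORITY value (0-7).
# Prints "unknown" and fails for anything else.
priority_name() {
    case "$1" in
        0) echo emerg ;;
        1) echo alert ;;
        2) echo crit ;;
        3) echo err ;;
        4) echo warning ;;
        5) echo notice ;;
        6) echo info ;;
        7) echo debug ;;
        *) echo unknown; return 1 ;;
    esac
}

priority_name 3   # prints "err"
```

To see which fields a given entry carries on the local system, journalctl -o json -n 1 prints the most recent entry with all of its fields.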

Common Use Cases

System Service Monitoring

Monitor critical system services:

yaml
sources:
  critical_services:
    type: journald
    includeUnits:
      - sshd.service
      - systemd-logind.service
      - firewalld.service
    sink: logscale_sink

Container Runtime Logs

Collect container runtime logs:

yaml
sources:
  container_runtime:
    type: journald
    includeUnits:
      - docker.service
      - containerd.service
      - podman.service
    sink: logscale_sink

Security Auditing

Focus on security-related services:

yaml
sources:
  security_logs:
    type: journald
    includeUnits:
      - sshd.service
      - sudo.service
      - polkit.service
      - auditd.service
    sink: logscale_sink

Database Monitoring

Monitor database services:

yaml
sources:
  database_logs:
    type: journald
    includeUnits:
      - postgresql.service
      - mysql.service
      - mongodb.service
    sink: logscale_sink

Best Practices

Filtering Strategy

  • Use includeUnits when monitoring specific services

  • Use excludeUnits when you want most logs but need to filter out noisy units

  • Start with broad collection and refine filters based on actual needs

  • Monitor collector resource usage with broad filters

Performance Optimization

  • Use unit filtering to reduce data volume

  • Enable currentBootOnly if historical logs aren't needed

  • Monitor journal disk usage and rotation settings

  • Consider journal vacuum operations for large journals

Security Considerations

  • Run the collector with minimum required privileges

  • Be aware that journal entries may contain sensitive information

  • Use appropriate access controls on collected log data

  • Regularly review what units are being collected

Journal Maintenance

  • Monitor journal disk usage: journalctl --disk-usage

  • Configure journal size limits in /etc/systemd/journald.conf

  • Use journalctl --vacuum-size= or --vacuum-time= to clean old logs

  • Ensure journal rotation is properly configured

Monitoring and Troubleshooting

Monitoring Collector Status

Monitor your Journal Source using the following approaches:

  • Check collector logs for connection status and ingestion metrics

  • Monitor log ingestion rates and volumes

  • Track any filtering or parsing errors

  • Set up alerts for collection failures or stalls

Common Issues and Solutions

Each issue below lists its symptom, followed by potential causes and solutions.

Issue: No Logs Collected

Symptom: Collector runs but no logs appear in LogScale

  • Verify the collector has appropriate permissions to read the journal

  • Add the collector user to the systemd-journal group

  • Check that journal entries exist: journalctl -n 100

  • Verify LogScale sink configuration is correct

  • Review collector logs for permission errors

Issue: Permission Denied

Symptom: Collector fails to access journal

  • Run collector as root or add user to systemd-journal group

  • Check journal directory permissions

  • Verify SELinux/AppArmor policies allow journal access

  • Ensure journal files are readable

Issue: Missing Expected Logs

Symptom: Some logs don't appear

  • Verify unit names are correct (check with systemctl list-units)

  • Ensure units are actually generating logs

  • Check includeUnits/excludeUnits configuration

  • Verify currentBootOnly setting matches expectations

  • Test with journalctl -u <unit-name>

Issue: Duplicate Events

Symptom: Same events appear multiple times

  • Check for multiple collector instances monitoring the same journal

  • Verify checkpoint storage is working correctly

  • Review collector restart behavior

  • Check for overlapping unit filters

Issue: High Resource Usage

Symptom: Collector consumes excessive CPU/memory

  • Reduce number of monitored units

  • Enable currentBootOnly to limit historical data

  • Check journal size and rotation settings

  • Monitor journal read performance

  • Review transform complexity

Issue: Journal Directory Not Found

Symptom: Collector fails to start

  • Verify the specified directory path exists

  • Check directory contains journal files

  • Ensure proper permissions on the directory

  • Validate path syntax in configuration

Issue: Old Logs Being Ingested

Symptom: Historical logs appear unexpectedly

  • Set currentBootOnly: true to limit to current boot

  • Check collector checkpoint state

  • Review journal retention settings

  • Consider using journalctl --vacuum-time= to clean old logs

Testing Journal Collection

Test journal collection using the journalctl command:

shell
# View recent journal entries
journalctl -n 100

# View logs from a specific unit
journalctl -u sshd.service

# View logs from current boot only
journalctl -b

# View logs from a specific boot
journalctl -b -1  # Previous boot

# Follow journal in real-time
journalctl -f

# View logs from a specific directory
journalctl -D /var/log/journal

# Check journal disk usage
journalctl --disk-usage

# Verify journal integrity
journalctl --verify

Managing Journal Size

shell
# Check current journal size
journalctl --disk-usage

# Remove old journal entries (keep last 2 days)
journalctl --vacuum-time=2d

# Limit journal size to 1GB
journalctl --vacuum-size=1G

# Configure persistent limits in /etc/systemd/journald.conf.
# These are settings for the [Journal] section of that file, not shell commands:
#   SystemMaxUse=1G
#   SystemKeepFree=500M

Configuration Parameters

Table: Journal Source

| Parameter | Type | Required | Default Value | Description |
| --- | --- | --- | --- | --- |
| currentBootOnly | boolean | optional[a] | false | If true, only collect logs from the current boot. |
| directory | string | optional[a] | | Allows you to specify the journal directory to collect from. If not specified collection is from the local journal. |
| excludeUnits | array of strings | optional[a] | | If specified, LogScale will not collect from these units. |
| includeUnits | array of strings | optional[a] | | If specified, LogScale only collects from these units. |
| parser | string | optional[a] | | Specify the parser used to parse the logs. If you install the parser through a package you must specify the type and name as displayed on the parser's page. For example linux/system-logs:linux/system-logs. If a parser is assigned to the ingest token, this parser is ignored. |
| sink | string | required | | Name of the sink that collected events should be sent to. |
| transforms | transform | optional[a] | | Specify transforms to use for this source (optional). See All Sources: How to Use Transforms to learn how to use transforms. |
| type | journald | required | | The source type must be set to journald. |

[a] Optional parameters use their default value unless explicitly set.