Log Sources

LogScale was developed to take in large volumes of log data: the log files generated by your operating systems, applications, servers, services, routers, network equipment, and security devices. It lets you query that data and extract information and insights from it.

These files contain a wealth of information, from basic notifications of activity such as start-up or shutdown of a service, to debug logs and detailed information about faults and failures. This data can provide useful information about how your systems and services are operating.

For example, here's a line from an NGINX HTTP access log:

accesslog - - [28/Feb/2019:13:17:10 +0000] "GET /?p=1 HTTP/2.0" 200 5316 "https://domain1.com/?p=1" "Mozilla/5.0 (Windows NT 6.1)"

This line shows the host that made the request, the timestamp of the access, the URL requested, the HTTP status code and response size, and the web browser used. Collating this information across multiple lines can reveal the most frequently accessed pages, or how busy your web site is.
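To see how such a line breaks down into fields, here is a minimal sketch in Python using a regular expression; the group names are illustrative and do not reflect how LogScale itself parses access logs:

```python
import re

# Pattern for the leading fields of a combined-style access log line;
# the field names below are invented for this example
ACCESS_RE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<proto>\S+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)

line = ('accesslog - - [28/Feb/2019:13:17:10 +0000] "GET /?p=1 HTTP/2.0" '
        '200 5316 "https://domain1.com/?p=1" "Mozilla/5.0 (Windows NT 6.1)"')

fields = ACCESS_RE.match(line).groupdict()
print(fields["status"], fields["size"])  # 200 5316
```

Once split into fields like these, the values can be counted or filtered, which is exactly the kind of work a log-analysis system automates at scale.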

Alternatively, here is a line from the system log, or syslog, on Linux:

Mar  4 05:05:49 shipper systemd-timesyncd[503]: Initial synchronization to time server (ntp.ubuntu.com).

There are common elements in the output, such as the timestamp used to identify when the original log event occurred. Different systems will produce different logs, in different formats. Systems may also produce logs with different levels of detail, for example debug logs rather than info logs.
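The two timestamps above illustrate the point: both identify when the event occurred, but in different formats, and the syslog form does not even carry a year. A small Python sketch, using the example values from the two log lines:

```python
from datetime import datetime

# NGINX access-log timestamp includes the year and a UTC offset
nginx_ts = datetime.strptime("28/Feb/2019:13:17:10 +0000",
                             "%d/%b/%Y:%H:%M:%S %z")

# Classic syslog timestamps omit the year, so one must be supplied
# from context (2019 is assumed here for illustration)
syslog_ts = datetime.strptime("Mar  4 05:05:49",
                              "%b %d %H:%M:%S").replace(year=2019)

print(nginx_ts.isoformat())   # 2019-02-28T13:17:10+00:00
print(syslog_ts.isoformat())  # 2019-03-04T05:05:49
```

Normalizing timestamps like this is what makes it possible to order and correlate events that originated on different systems.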

Collating this information on a single server can help you to:

  • Identify faults and failures

  • Identify security threats

  • Collect metrics on systems and performance

Looking at the log files on a single server is useful, but if you manage hundreds or thousands of servers, finding, identifying and reacting to errors, warnings, and security issues can be complex without LogScale to store, analyze and report on the data. Collecting the data across multiple servers enables you to:

  • Correlate failures across multiple servers and identify the source or trigger of a fault

  • Monitor and identify multi-vector security threats, where an attacker is trying multiple services or potential points of weakness

  • Identify performance bottlenecks, or identify areas to balance and distribute load

For example, identifying a failure in an internal DNS server could help indicate why other services failed, but you need to identify and compare the timestamps of different issues across multiple servers. When tracking security threats, failed login attempts across multiple services, or connections from the same IP address to multiple servers and services, may indicate a hacking attempt.
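The failed-login scenario can be sketched in a few lines of Python; the events, IP addresses, and threshold here are invented for illustration:

```python
from collections import Counter

# Hypothetical login events gathered from several servers:
# (server, source IP, outcome)
events = [
    ("web-1", "203.0.113.9",  "failed"),
    ("web-2", "203.0.113.9",  "failed"),
    ("db-1",  "203.0.113.9",  "failed"),
    ("web-1", "198.51.100.4", "ok"),
]

# Count failed attempts per source IP across all servers
failures = Counter(ip for _, ip, result in events if result == "failed")

# An IP failing against several hosts is worth a closer look
suspects = {ip for ip, n in failures.items() if n >= 3}
print(suspects)  # {'203.0.113.9'}
```

The point is not the toy threshold but the correlation: no single server's log shows anything unusual here, yet the combined view does.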

Logs on source servers, routers, and endpoints must be sent to LogScale through a suitable API using a log shipper, such as the Falcon LogScale Collector, which manages reading the logs and sending them to LogScale for processing.

For LogScale to make use of the data in the log files, the data must be ingested.