Scheduled Searches

A scheduled search is a static query, set to run on a schedule. At a scheduled interval, the query will run and if its result is non-empty, the scheduled search will trigger its associated actions.

Use Case

Scheduled searches are related to Alerts and can trigger the same actions. However, scheduled searches also cover use cases that alerts do not, such as when:

  • You need to automatically report a search result on a schedule. For instance, you have stakeholders who expect an email every Monday at 10:00 containing the most important security events from the previous week.

  • You have an ingest delay on some logs, which results in them never appearing in searches made by alerts. For instance, if an alert looks back in time using a 1h time window, it won't trigger on logs ingested with a 12-hour delay. With a scheduled search, you can choose to run your search at a point in time when you're fairly certain that every log of interest has been ingested.

  • You need to take delayed action on search results. For instance, if you trigger user bans using an alert, offending users will be banned immediately upon a transgression and can then easily figure out what triggered their ban. Using a scheduled search, you can choose to ban all offending users at the same time every day, so as to obscure the conditions of a ban.

If your situation doesn't fall into one of these use cases, you should probably use an alert instead. Alerts run as live queries, rather than historic ones, and should thus generally be considered more performant.
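The ingest-delay case above can be illustrated with a small Python sketch. This is a simplified model, not LogScale's implementation: the function `visible_to_search` and its parameters are hypothetical, and an alert is approximated as a search that runs at the moment the log is ingested. The dates are arbitrary.

```python
from datetime import datetime, timedelta


def visible_to_search(event_time, ingest_time, run_time, lookback):
    """Return True if a search running at run_time would see the event.

    Hypothetical model: the event must already be ingested, and its
    timestamp must fall inside the window [run_time - lookback, run_time].
    """
    window_start = run_time - lookback
    already_ingested = ingest_time <= run_time
    return already_ingested and window_start <= event_time <= run_time


# A log with timestamp 08:00 that is ingested 12 hours late, at 20:00.
event_time = datetime(2024, 1, 1, 8, 0)
ingest_time = event_time + timedelta(hours=12)

# An alert with a 1h window, evaluated when the log arrives.
seen_by_alert = visible_to_search(
    event_time, ingest_time, ingest_time, timedelta(hours=1)
)

# A scheduled search at midnight covering the previous 24 hours.
seen_by_scheduled = visible_to_search(
    event_time, ingest_time, datetime(2024, 1, 2, 0, 0), timedelta(hours=24)
)
```

In this model the alert's window covers 19:00 to 20:00, so the delayed log is missed, while the scheduled search at midnight includes it.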

Creating a Scheduled Search

  1. Go to the Repository and Views page.

  2. Select a Repository.

  3. Click the Automation tab on the top bar of the User Interface and select Scheduled Searches from the menu on the left. The full list of available Scheduled Searches appears. Scheduled searches can have labels attached to them, which are displayed next to the scheduled search name. Labels are a useful way to tag scheduled searches with meaningful data and to locate them later by tag.

    Figure 185. Creating Scheduled Search from Tab


  4. Click + New Scheduled Search

  5. The New Scheduled Search form is displayed. Click Import from on the top right if you wish to import the scheduled search from:

    • Template: browse for or drag and drop a template based on an existing scheduled search

    • Package: use scheduled search templates that are part of a LogScale package

  6. Fill in the information required:

    Figure 186. Setting Scheduled Search Properties


  7. When you're finished setting the properties for the new scheduled search, click Create scheduled search.

Alternatively, you can use the GraphQL API to view, create, update and delete scheduled searches using the associated queries and mutations.

Note

By default, scheduled searches will not trigger any actions if a query result contains a warning. Scheduled searches behave the same way as alerts with regard to warnings and errors; see the Diagnosing Alerts page.

The following sections describe some of the fields you set on a scheduled search.

Scheduled Search Run on Behalf of

You can run the scheduled search with the permissions of another user. Click the Run on behalf of field to get a list of available names to pick from, or directly enter the name of the user you want to transfer ownership of the scheduled search to.

Search Schedule

This field allows you to specify the schedule on which your scheduled search should be run. The schedule is defined using a UNIX cron expression, as known from the crontab file found in many UNIX-like systems. Scheduled searches are not allowed to run more than once an hour. Therefore the minutes field in the cron expression is restricted to only allow values in the range [0-59]. Many online tools can help you generate UNIX cron expressions if you need help writing one for your use case.

Cron Schedule Templates

Instead of providing a fixed minute at which a scheduled search should run, you can provide a template by specifying the character H in the UI. The scheduled search will then run at a random but fixed minute past the hour (based on the hash of the ID), as shown in the figure below:

Figure 187. Cron Schedule with Template


The smallest interval that can be configured is every minute, using:

cron
* * * * *

Take care with such short intervals, as they may increase the load on the cluster.

When using H, the template picks a number in the range [0-59] for the minute when the search should run, and all subsequent runs use the same minute. The search therefore runs at a fixed, arbitrary minute past every hour. For instance, consider a search with the cron expression schedule:

cron
H * * * *

If the minute chosen is 48, the search will run at 00:48, 01:48, 02:48, and so on. The template is only supported for the minute field, and only a lone H is supported. The interval cannot be controlled or modified, for example by using notation such as H/15 or H-15.
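The idea of deriving a stable minute from a hash of the ID can be sketched in Python as follows. This is only an illustration under stated assumptions: the hash function LogScale uses is not specified here, so SHA-256 is a stand-in, and the function names `template_minute` and `expand_template` as well as the sample ID are hypothetical.

```python
import hashlib


def template_minute(search_id: str) -> int:
    """Map a scheduled search ID to a stable minute in the range 0-59.

    Assumption: SHA-256 is used purely for illustration; the actual
    hash applied by LogScale is not documented in this section.
    """
    digest = hashlib.sha256(search_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 60


def expand_template(cron_expr: str, search_id: str) -> str:
    """Replace a lone H in the minute field with the derived minute."""
    fields = cron_expr.split()
    if len(fields) != 5:
        raise ValueError("expected a five-field cron expression")
    if fields[0] == "H":
        fields[0] = str(template_minute(search_id))
    elif "H" in fields[0]:
        # Forms such as H/15 or H-15 are rejected, matching the text above.
        raise ValueError("only a lone H is supported in the minute field")
    return " ".join(fields)


# The same (hypothetical) ID always yields the same minute.
resolved = expand_template("H * * * *", "example-search-id")
```

Because the minute depends only on the ID, repeated runs of the same search land on the same minute, while different searches tend to be spread across the hour.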

Note

Letting LogScale select the execution minute itself is designed to spread the load of scheduled search execution.

UTC Offset

The Coordinated Universal Time (UTC) offset defines the offset from UTC in which the schedule is interpreted. For instance, consider the schedule:

cron
0 6 * * *

With an offset of UTC+01:00, the search runs at 06:00 local time, which is 05:00 UTC.
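The conversion can be checked with Python's standard datetime module. This is a minimal sketch; the helper name `to_utc` is hypothetical and the calendar date is arbitrary, since only the time of day matters here.

```python
from datetime import datetime, timedelta, timezone


def to_utc(hour: int, minute: int, offset_minutes: int) -> str:
    """Convert a scheduled local time at a fixed UTC offset to UTC (HH:MM)."""
    local_zone = timezone(timedelta(minutes=offset_minutes))
    # Arbitrary date, used only to build a complete timestamp.
    local_time = datetime(2024, 1, 15, hour, minute, tzinfo=local_zone)
    return local_time.astimezone(timezone.utc).strftime("%H:%M")


# Schedule "0 6 * * *" with offset UTC+01:00 (60 minutes).
utc_time = to_utc(6, 0, 60)  # "05:00"
```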

Time Interval

As for all searches, a time interval must be specified. For scheduled searches, the time interval is given by a start and end time relative to the scheduled execution time. For instance, if a scheduled search executes at midnight on Jan 2nd with a time interval of start=24h and end=now, the search will consider all logs within the time interval [20xx-01-01T00:00, 20xx-01-02T00:00].
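As a rough illustration of how the relative interval resolves to absolute timestamps, consider the following Python sketch. The function `search_window` is hypothetical, and the year 2024 is an arbitrary stand-in for 20xx.

```python
from datetime import datetime, timedelta


def search_window(execution_time, start_ago, end_ago=timedelta(0)):
    """Resolve relative start and end offsets against the scheduled execution time.

    end_ago defaults to zero, which corresponds to end=now.
    """
    return execution_time - start_ago, execution_time - end_ago


# Scheduled execution at midnight on Jan 2nd with start=24h and end=now.
window_start, window_end = search_window(
    datetime(2024, 1, 2, 0, 0), timedelta(hours=24)
)
```

Here `window_start` is 2024-01-01T00:00 and `window_end` is 2024-01-02T00:00, matching the interval described above.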

Backfill Limit

If LogScale is down or an error prevents an action from being triggered, you will miss searches that would otherwise have been scheduled and executed. When it again becomes possible for scheduled searches to run and trigger actions, LogScale will attempt to backfill the searches that were missed.

The backfilling behavior depends on the value given to the backfill limit, which determines how many missed searches will be executed before any new searches are scheduled.

For example, suppose we schedule a search every hour:

cron
0 * * * *

If LogScale is down between 10:30 and 14:15, the searches at 11:00, 12:00, 13:00 and 14:00 are missed.

Executing the most recent missed search is not considered backfilling, as this can also occur under normal operation if there is a slight delay within LogScale. Thus, if the backfill limit is set to 0, which is the default, only the search at 14:00 will be executed at startup.

If we increase the limit to 1, execution starts with the search scheduled at 13:00; with a limit of 2, it starts with the search at 12:00; and with a limit of 3, it starts with the search at 11:00. Increasing the backfill limit beyond this point has no effect in this example. Missed searches are executed in sequence from oldest to newest.

The backfill limit may not exceed the global maximum backfill limit.
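The selection of missed searches described in this section can be modeled with a short Python sketch. It only mirrors the behavior as described here and is not LogScale's implementation; the function name `searches_to_run` is hypothetical.

```python
from typing import List


def searches_to_run(missed: List[str], backfill_limit: int) -> List[str]:
    """Select which missed searches are executed, oldest first.

    `missed` must be ordered from oldest to newest. The most recent
    missed search always runs and does not count toward the limit.
    """
    if backfill_limit < 0:
        raise ValueError("backfill limit must not be negative")
    if not missed:
        return []
    count = min(len(missed), backfill_limit + 1)
    return missed[-count:]


missed = ["11:00", "12:00", "13:00", "14:00"]
default_run = searches_to_run(missed, 0)  # ["14:00"]
limit_two = searches_to_run(missed, 2)    # ["12:00", "13:00", "14:00"]
```

With a limit of 3 or more, all four missed searches in this example are executed.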

Spacing Out Searches

LogScale will always attempt to run a search exactly according to schedule. This makes scheduled searches predictable, but also risks that many scheduled searches will be configured to run at the same time, which might cause delays.

For example, it is common to schedule many jobs for midnight, if they are to be run daily, but if you experience delays in search execution because of a sudden high search load, try to space the searches out over a larger span of time.

One way to automate this, rather than manually tracking individual searches, is to use the H notation. Using H in the minute field of the cron specification allows LogScale to automatically choose the number of minutes past each hour at which a scheduled search will run.

Using this notation can be an effective way of spreading out the execution of scheduled searches without having to manually monitor and track each one. The notation also makes it easy to copy and duplicate scheduled searches.

This method is only appropriate if you do not need to explicitly control the search window. Because H automatically chooses a minute, the search window is determined by the chosen minute. For example, if you use the H notation to run a search from 24 hours ago to now and the search runs at 0:48, then the search window will be from 0:48 yesterday until 0:48 today.

If you decide to run a search on another schedule but wish to keep the same search window, you need to update start and end on your scheduled search. For instance, if your search was running at midnight and searching through the previous day, you would have configured the interval parameters as start=24h and end=now. If you need to reschedule this search to run at 3AM instead, you would have to update the interval parameters to start=27h and end=3h to search within the same 24-hour time window.
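The arithmetic behind this adjustment can be sketched in Python. The helper names `shift_interval` and `window` are hypothetical, and the dates are arbitrary; the sketch only checks that both configurations cover the same absolute window.

```python
from datetime import datetime, timedelta


def shift_interval(start_ago, end_ago, schedule_shift):
    """Return new relative offsets after moving the run time later by schedule_shift."""
    return start_ago + schedule_shift, end_ago + schedule_shift


def window(run_time, start_ago, end_ago):
    """Resolve relative offsets into absolute start and end timestamps."""
    return run_time - start_ago, run_time - end_ago


# Original schedule: midnight, start=24h, end=now (zero offset).
old_window = window(datetime(2024, 1, 2, 0, 0), timedelta(hours=24), timedelta(0))

# New schedule: 3AM, so both offsets grow by three hours.
new_start, new_end = shift_interval(
    timedelta(hours=24), timedelta(0), timedelta(hours=3)
)
new_window = window(datetime(2024, 1, 2, 3, 0), new_start, new_end)
```

Here `new_start` is 27 hours and `new_end` is 3 hours, and `old_window` equals `new_window`.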