Best Practice: Estimating Local Disk Threshold

You can estimate the local disk threshold value by running two queries against the LogScale repository.

The following two queries calculate the percentage of user queries that search data within the last 30 days, 60 days, 90 days, or further back. This helps you estimate your LogScale query usage and set a realistic threshold for better disaster recovery efforts.

The local disk storage threshold should be calculated carefully so that query speed is not compromised. If a large portion of the queries are longer-term queries, customers may not be able to lower the threshold. The default is 95%.

logscale
"creating new query"
| Relative
| top([start],percent=true,limit=20)
| sort(_count)

This query can be broken down as follows:

  • Search the LogScale event log for the text that LogScale writes when it reports a new query:

    logscale
    "creating new query"
  • Search for Relative events

    logscale
    | Relative
  • Aggregate the events by the start field, returning the top 20 values with the percentage share of each

    logscale
    | top([start],percent=true,limit=20)

    For more information: top()

  • Sort the output by the count of items

    logscale
    | sort(_count)

    For more information see sort()
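
To make the aggregation step concrete, the following Python sketch mimics the kind of result that top() with percent=true produces: a count and a percentage share for each distinct value, most common first. It is an illustration only. The function name top_with_percent and the sample start values are hypothetical and are not taken from actual LogScale log output.

```python
from collections import Counter

def top_with_percent(values, limit=20):
    """Return (value, count, percent) rows for the most common values."""
    counts = Counter(values)
    total = sum(counts.values())
    # most_common() orders the rows by descending count
    return [(value, count, 100.0 * count / total)
            for value, count in counts.most_common(limit)]

# Hypothetical start values, standing in for the start field of query events
sample = ["7d", "7d", "30d", "1d", "7d", "90d", "30d", "7d"]
for value, count, percent in top_with_percent(sample):
    print(f"{value:>4} {count:>3} {percent:5.1f}%")
```

A large share of long ranges in output like this would suggest that many searches reach back to older data.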

The following query, which needs to be run in the LogScale repository, shows the distribution of searches across fixed time buckets: the last 30 days, 30 to 60 days ago, and more than 60 days ago.

logscale
"creating new query"
| Instant
| /start=Instant\((?<startTime>\d+)\)/
| /end=Instant\((?<endTime>\d+)\)/
| time:monthName(startTime,as="startMonth")
| time:monthName(endTime,as="endMonth")
| timeDiff := endTime - startTime
| timeDiffMinute := timeDiff/1000/60
| now()
| 30dTime := _now - 2592000000
| 60dTime := _now - (2592000000*2)
| case {
    test(startTime > 30dTime) | timeGroup := "last30d";
    test(startTime > 60dTime) | test(startTime <= 30dTime) | timeGroup := "30dto60d";
    test(startTime <= 60dTime) | timeGroup := "60dplus";
    *;
}
| top(timeGroup,percent=true)

The query extracts the start and end time of each search, assigns each search to a time group, and outputs the results collated by these groups. The query can be broken down as follows:

  • Search for the events from the log:

    logscale
    "creating new query"
  • Filter the events by the Instant entries

    logscale
    | Instant
  • Extract the time from the event using a regular expression to use as the startTime

    logscale
    | /start=Instant\((?<startTime>\d+)\)/

    See Regular Expression-based Field Extraction for more information on extracting fields using regular expressions.

  • Extract the end time in the same way to create the endTime field

    logscale
    | /end=Instant\((?<endTime>\d+)\)/
  • Extract the month name to form the startMonth

    logscale
    | time:monthName(startTime,as="startMonth")

    See time:monthName()

  • Extract the month name to form the endMonth

    logscale
    | time:monthName(endTime,as="endMonth")
  • Determine the difference between the start and the end time for each event

    logscale
    | timeDiff := endTime - startTime
  • Calculate the difference in minutes. Times are in milliseconds, so the value needs to be divided by 1000 to get seconds, and then by 60 to get minutes

    logscale
    | timeDiffMinute := timeDiff/1000/60
  • Get the current time; this will create the new field _now:

    logscale
    | now()

    See now().

  • Calculate the time 30 days ago by subtracting 30 days (30 days x 24 hours x 60 minutes x 60 seconds x 1000 milliseconds = 2592000000 milliseconds) from _now to create 30dTime

    logscale
    | 30dTime := _now - 2592000000
  • Calculate the time 60 days ago by subtracting 60 days (2 x 30 days x 24 hours x 60 minutes x 60 seconds x 1000 milliseconds) from _now to create 60dTime

    logscale
    | 60dTime := _now - (2592000000*2)
  • Assign each event to a time group by comparing startTime against the time boundaries calculated above, storing the result in a new field, timeGroup:

    logscale
    | case {
        test(startTime > 30dTime) | timeGroup := "last30d";
        test(startTime > 60dTime) | test(startTime <= 30dTime) | timeGroup := "30dto60d";
        test(startTime <= 60dTime) | timeGroup := "60dplus";
        *;
    }
  • Aggregate the event data as percentages using the time group:

    logscale
    | top(timeGroup,percent=true)
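
As a way to check the millisecond arithmetic and the bucket boundaries used in the steps above, the following Python sketch reproduces the same classification outside of LogScale. It is a minimal illustration, assuming start times are epoch timestamps in milliseconds as in the query. The names DAY_MS, THIRTY_DAYS_MS and time_group are hypothetical helpers, not part of LogScale.

```python
DAY_MS = 24 * 60 * 60 * 1000       # milliseconds in one day
THIRTY_DAYS_MS = 30 * DAY_MS       # 2592000000, the constant used in the query

def time_group(start_time_ms, now_ms):
    """Classify a search start time (epoch ms) like the case statement does."""
    t30 = now_ms - THIRTY_DAYS_MS        # corresponds to 30dTime
    t60 = now_ms - 2 * THIRTY_DAYS_MS    # corresponds to 60dTime
    if start_time_ms > t30:
        return "last30d"
    if start_time_ms > t60:              # start_time_ms <= t30 already holds here
        return "30dto60d"
    return "60dplus"

# Illustrative current time in epoch milliseconds (arbitrary value)
now_ms = 1_700_000_000_000
print(time_group(now_ms - 5 * DAY_MS, now_ms))
print(time_group(now_ms - 45 * DAY_MS, now_ms))
print(time_group(now_ms - 120 * DAY_MS, now_ms))
```

Note that a start time exactly 30 days old falls into the 30dto60d group, and one exactly 60 days old falls into 60dplus, matching the comparison operators in the query.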