Optimize Ingestion

To get the most out of your LogScale SaaS license, remove data you will not need. This prevents it from counting toward your license usage. This example filters data using the Removing Fields functionality in Parser settings.

Important

The remove fields feature can be used to filter information from ingested structured data. It cannot be used to modify the @rawstring or fields created when parsing the @rawstring. For more information, see Removing Fields.

Screenshot of the LogScale Parser settings interface showing the 'Fields to remove' configuration, with an example of removing the 'kubernetes.image' field from incoming events. This interface demonstrates how administrators can optimize license usage by excluding specific structured data fields from usage calculations, reducing costs without losing essential log information.

Figure 2. Usage Page


In this example, the remove fields feature removes the field named kubernetes.image from the incoming events. The usage calculation then disregards this field.
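To get a rough feel for how much a single field contributes to an event, the effect of removing it can be sketched offline. The following is a minimal illustration in plain Python, not a description of how LogScale measures usage: the helper names `remove_fields` and `json_size` are hypothetical, the event contents other than the kubernetes.image field name are made up, and serialized JSON size is only an approximation of what any license metric would count.

```python
import json


def remove_fields(event: dict, fields_to_remove: set) -> dict:
    """Return a copy of the event without the named top-level fields."""
    return {k: v for k, v in event.items() if k not in fields_to_remove}


def json_size(event: dict) -> int:
    """Size in bytes of the event serialized as compact UTF-8 JSON."""
    return len(json.dumps(event, separators=(",", ":")).encode("utf-8"))


# Hypothetical structured event; only the field name kubernetes.image
# comes from the example above.
event = {
    "kubernetes.image": "registry.example.com/team/service:1.4.2",
    "kubernetes.pod": "service-7d9f",
    "message": "request completed",
}

trimmed = remove_fields(event, {"kubernetes.image"})
saved = json_size(event) - json_size(trimmed)
print(f"before: {json_size(event)} bytes, after: {json_size(trimmed)} bytes, saved: {saved}")
```

Running a sketch like this over a sample of real events can help decide which fields are worth adding to the 'Fields to remove' configuration.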

Important

If you use the dropEvent() function during parsing to optimize usage, the amount of data saved in LogScale will be reduced.

However, the license calculation includes the original data as it was before the events were dropped. Dropping events therefore does not reduce your license usage as measured by the ingestAfterFieldRemoval metric.

Unless your order form specifies that your volume is calculated as raw (uncompressed) data ingested by the product (segment_save metric), we recommend that you switch to either dropping events at the source or using the Removing Fields functionality.

Best Practices to Optimize Data Ingestion

You can reduce costs by preparing data before ingesting it into LogScale. Define the problem the data should solve. Then prepare the data to reduce the amount you need to ingest.

Here are some suggestions for preparing data:

  • Handle missing or inconsistent data, remove duplicates, and handle outliers. When you inspect your data, you might find entries with missing or inconsistent values. Clean the data by correcting inconsistent or unrealistic values, and remove entries with missing values that you do not need.

  • Transform data as needed to enable easier analysis. You might need to create new variables that combine two values into one. You might want to make continuous variables more discrete. You can also convert categorical variables into dummy variables.

  • Reduce the amount of data to make it more manageable. You can ingest only events within a certain time range. You can also focus on specific categories within the dataset.

  • Combine data from different sources to resolve inconsistencies or missing data. This can help reduce costs. Instead of ingesting two datasets whose data might overlap, ingest a single combined dataset.
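Several of the suggestions above (skipping incomplete entries, removing duplicates, and restricting the time range) can be combined in a small pre-ingest step. The following is a minimal sketch in plain Python, assuming events are dictionaries with ISO 8601 `timestamp`, `host`, and `message` fields. The function name `prepare`, the field names, and the sample events are all hypothetical and would need adapting to your own data.

```python
from datetime import datetime, timezone


def prepare(events, start, end, required=("timestamp", "message")):
    """Clean, deduplicate, and time-filter a list of event dicts."""
    seen = set()
    kept = []
    for event in events:
        # Skip events where a required field is missing or empty.
        if any(not event.get(field) for field in required):
            continue
        ts = datetime.fromisoformat(event["timestamp"])
        # Keep only events inside the half-open range [start, end).
        if not (start <= ts < end):
            continue
        # Drop exact duplicates (same timestamp, host, and message).
        key = (event["timestamp"], event.get("host"), event["message"])
        if key in seen:
            continue
        seen.add(key)
        kept.append(event)
    return kept


events = [
    {"timestamp": "2024-05-01T10:00:00+00:00", "host": "web-1", "message": "login ok"},
    {"timestamp": "2024-05-01T10:00:00+00:00", "host": "web-1", "message": "login ok"},   # duplicate
    {"timestamp": "2024-05-01T10:05:00+00:00", "host": "web-2", "message": ""},           # empty message
    {"timestamp": "2024-04-30T23:59:00+00:00", "host": "web-1", "message": "old event"},  # out of range
    {"timestamp": "2024-05-01T11:00:00+00:00", "host": "web-2", "message": "login failed"},
]

start = datetime(2024, 5, 1, tzinfo=timezone.utc)
end = datetime(2024, 5, 2, tzinfo=timezone.utc)
prepared = prepare(events, start, end)
print(f"kept {len(prepared)} of {len(events)} events")  # kept 2 of 5 events
```

Only the events returned by such a step would then be shipped for ingestion. The same logic could be written with pandas for larger datasets.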

Several tools exist to help you prepare and analyze data before ingest. Examples include Jupyter Notebook and pandas. Many of these tools are open source. Take this into account if your data is sensitive.

Test and sample data are crucial when preparing data for ingestion into a log management system. Sample data helps you ensure that parsing the complete dataset will produce the expected events. It confirms that the parser functions as expected. Use createEvents() to generate temporary events for sample data. This function does NOT count against your usage. For examples, see createEvents().
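One way to obtain sample data is to draw a small, reproducible subset of lines from a larger log file before testing a parser against it. The following is a minimal sketch in plain Python. The function name `sample_lines` and the generated log lines are hypothetical, and the fixed seed is an assumption made so that repeated runs give the same sample.

```python
import random


def sample_lines(lines, k, seed=0):
    """Return up to k lines chosen at random, kept in their original order."""
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    k = min(k, len(lines))
    chosen = sorted(rng.sample(range(len(lines)), k))
    return [lines[i] for i in chosen]


# Hypothetical log lines standing in for a real file's contents.
lines = [
    f"2024-05-01T10:{minute:02d}:00Z host=web-1 status={200 if minute % 5 else 500}"
    for minute in range(60)
]

sample = sample_lines(lines, 5)
for line in sample:
    print(line)
```

The sampled lines could then serve as parser test data, for instance as input to createEvents().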