Measure Data Ingest

A key consideration is that LogScale attempts to count only data in the ingestion phase towards license usage. The raw event data may have additional fields and tags added to it by the client; these are referred to as explicit fields and explicit tags.

Data is then passed to the optional field removal phase, where explicit fields can be removed. Explicit field removal is a feature of the parser, and is described in the Removing Fields documentation. Note that field removal occurs before parsing of the data, and any fields removed at this stage do not then count towards usage. The ingested data is then parsed. Any fields or tags derived by the parser from the raw event string are also not counted towards usage.

CrowdStrike Parsing Standard (CPS) compliant parsers may add additional fields at this stage, but do so after the ingested data is parsed, and so fields added by the CPS-compliant parser do not count towards usage.

Note

NG-SIEM does not currently support the field removal feature in parser settings; only LogScale does.

A summary of the ingested data flow is given here:

  1. Data is submitted from the client, with the optional addition of explicit fields and tags.

  2. Data is processed by the optional field removal filter, if configured in the parser settings (LogScale only).

  3. The parser parses the data and adds any fields and tags required for storage. For example, if there's no timestamp in the submitted data, then @timestamp is added.

  4. Usage cost calculation is carried out.

  5. A CPS-compliant parser adds required fields.

Note

See the table later in this section for details of possible exceptions in the previous flow.

LogScale uses the following formula when calculating the amount of data ingested:

ingestAfterFieldRemovalSize = @rawstring + explicit fields + explicit tags - removed explicit fields

Each item in this calculation is described in the following table, along with any exclusions that may apply:

Item | Included in usage | Exclusions
@rawstring | The length of the raw event string. All content of @rawstring counts towards usage. | None. Note that data can't be removed from @rawstring using the field removal functionality.
fields | Any fields (keys and values) added by the client. For example, fields added when using the structured or HEC endpoints. | Fields added by LogScale. Fields derived from @rawstring. Fields added by a parser.
tags | Any tags (keys and values) added by the client. For example, tags added when using the structured or HEC endpoints. | Any tags added by the parser during parsing. Tags derived from @rawstring.
Removed fields | Fields removed using the parser field removal functionality. See Optimize Ingestion for more details. | Tags can't be removed using field removal. Fields derived from @rawstring can't be removed using this feature.
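The formula can be illustrated with a minimal Python sketch. This is not the product's implementation: it assumes that sizes are counted as UTF-8 bytes of each key plus its value, which the description above implies ("keys and values") but does not state exactly. The function name, helper names, and sample event are all hypothetical.

```python
def utf8_len(text: str) -> int:
    """Size of a string in UTF-8 bytes (assumed unit of measurement)."""
    return len(text.encode("utf-8"))


def pairs_size(pairs: dict[str, str]) -> int:
    """Combined size of keys and values for a set of fields or tags."""
    return sum(utf8_len(key) + utf8_len(value) for key, value in pairs.items())


def ingest_after_field_removal_size(
    rawstring: str,
    explicit_fields: dict[str, str],
    explicit_tags: dict[str, str],
    removed_fields: tuple[str, ...] = (),
) -> int:
    """@rawstring + explicit fields + explicit tags - removed explicit fields."""
    # Only explicit fields can be removed; tags and @rawstring are never reduced.
    removed = {
        name: value
        for name, value in explicit_fields.items()
        if name in removed_fields
    }
    return (
        utf8_len(rawstring)
        + pairs_size(explicit_fields)
        + pairs_size(explicit_tags)
        - pairs_size(removed)
    )


# Hypothetical event: 19 bytes of raw text, two explicit fields, one explicit tag.
raw = "GET /index.html 200"
fields = {"source": "web", "service": "shop"}
tags = {"#host": "h1"}

print(ingest_after_field_removal_size(raw, fields, tags))                # 46
print(ingest_after_field_removal_size(raw, fields, tags, ("service",)))  # 35
```

Fields and tags that the parser derives from @rawstring never appear as inputs here, which mirrors the fact that they do not count towards usage.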

The usage cost is determined by the size of the incoming data (@rawstring) plus the fields needed for storage and classification, such as @timestamp and any tags and standard fields not extracted or derived from @rawstring.

Note

Your data needs to have a @timestamp field in order to be searchable in LogScale. This field counts as ingest even if it is extracted from the @rawstring.

For parsed data, in the majority of cases, there are no added or implied fields that are not derived from the incoming @rawstring. So while LogScale may add fields based on that information, those fields don't count towards the usage cost calculation.

To measure the ingest amount, query the humio-usage repository for ingestAfterFieldRemovalSize. See The humio-usage Repository for more details.

To monitor current usage in Organization Settings, see Usage Page.

Data Not Measured

The following do not count towards the usage cost calculation:

  • If you use LogScale Log Collector, metadata (fields) added by the Log Collector. These typically start with @collect.*.

  • Fields that are removed using the field removal feature in parser settings.

  • Fields and tags derived by the parser from the @rawstring.

  • Internally generated fields that start with @ (other than @rawstring and @timestamp, which count as described above).

  • Fields added by CPS parsers. Examples include fields such as Cps.version and Parser.version. See CrowdStrike Parsing Standard (CPS) 1.0 for further details.

Examples

The following table describes some scenarios and indicates what data would be included in usage costs, and any exclusions that might apply:

Scenario | Usage | Exclusions
LogScale Log Collector gathers log entries containing a timestamp, which are parsed by a CPS-compliant parser. There are no explicit fields added by the client, and field removal is not used. | The raw log data, @rawstring. | Fields added by LogScale Log Collector, and fields derived from the @rawstring. Fields added by the parser are also ignored in this case.
A service submits log data using the Structured endpoint, with the addition of the explicit fields source and service. The explicit tags #host and #datacenter are also added. However, service can be extracted from the log data @rawstring, so it is removed by a setting in the parser field removal configuration. The event timestamp is a required field for this endpoint. The parser used is not CPS compliant. | @rawstring + timestamp + #host + #datacenter + source | None.
A source sends log data including a timestamp to NG-SIEM with additional explicit fields, source and service. On NG-SIEM there is no field removal option for the parser. The data is parsed by a CPS-compliant parser. | @rawstring + source + service | Implicit fields added by the CPS-compliant parser. Fields derived from the @rawstring.
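The scenarios above can be checked against a small Python sketch that lists which items count towards usage. It is an illustration only: the function name and the field_removal_supported flag are hypothetical, and the sketch treats the required event timestamp in the second scenario as just another explicit field supplied by the client.

```python
def counted_items(
    explicit_fields: list[str],
    explicit_tags: list[str],
    removed_fields: tuple[str, ...] = (),
    field_removal_supported: bool = True,
) -> list[str]:
    """Return the names of the items that count towards usage for one event."""
    # Field removal has no effect where the feature is unavailable (NG-SIEM).
    removed = set(removed_fields) if field_removal_supported else set()
    items = ["@rawstring"]  # the raw event string always counts
    items += [name for name in explicit_fields if name not in removed]
    items += list(explicit_tags)  # tags can't be removed by field removal
    return items


# Scenario 1: no explicit fields or tags, no field removal.
scenario_1 = counted_items([], [])

# Scenario 2: Structured endpoint, "service" removed in the parser settings.
scenario_2 = counted_items(
    ["timestamp", "source", "service"],
    ["#host", "#datacenter"],
    removed_fields=("service",),
)

# Scenario 3: NG-SIEM, so field removal is not available.
scenario_3 = counted_items(
    ["source", "service"], [], field_removal_supported=False
)

for scenario in (scenario_1, scenario_2, scenario_3):
    print(" + ".join(scenario))
```

Parser-derived fields and fields added by a CPS-compliant parser are deliberately not inputs to the function, since they are excluded in every scenario.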