The humio-usage Repository
humio-usage contains aggregated data based on measurements, used by the Usage Page in the user interface. This repository is available directly to self-hosted customers; cloud customers use the humio-organization-usage view instead. The logs in this repository are the results of an hourly query against the humio-measurements repository. It differs from the humio-measurements repository in the following ways: it has unlimited retention, data is logged once every hour, and it does not include data on ingestion source. Moreover, the usage measurements are provided as fields in each log line.
You can filter the logs by the repository they come from using the repo field in the humio-usage repository.
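For example, a query along the lines of the following sketch (your_repo_name is a placeholder, and the choice of timeChart() over the ingestAfterFieldRemovalSize field is illustrative) restricts the usage logs to a single repository and charts its hourly licensed ingest:

```
// Keep only the usage lines for one specific repository
repo = "your_repo_name"
| #sampleType = repository
// Chart licensed ingest over time; each log line covers one hour
| timeChart(function=sum(ingestAfterFieldRemovalSize))
```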
The table below describes some of the more notable fields a log line can contain:
| Field | Example Value | Explanation |
|---|---|---|
| #repo | humio-usage | The repository the usage-related logs come from. |
| #sampleRate | hour | The period to which the values in this log line pertain. One hour in most cases. |
| #sampleType | usageTag | Whether this log line refers to an organization, a repository, or a set of repositories grouped under the same usageTag. The value can be one of the following: organization, usageTag, or repository. |
| @id | ...Kz_467_7_1759133983... | The ID of the event. |
| averageDailyIngestAfterFieldRemovalSizeOver30Days | 11630390754468 | Average daily data volume ingested after field removal processing over the past 30 days |
| averageDailyDataScannedOver30Days | 233157158845848 | Average amount of data scanned daily when running queries over the past 30 days |
| averageDailySegmentWriteBytesOver30Days | 6413599076174 | Average daily volume of data written to segments over the past 30 days. May differ from raw ingestion volumes due to compression, field removal, and other storage optimizations. |
| cid | | The 32-character hexadecimal customer identifier. |
| collectorFieldsSize | 54182037 | Size of fields data collected by LogScale collectors before being sent to LogScale. Represents the volume of metadata and field information gathered at the collection point, and can help you understand how much of your data volume is attributed to fields/metadata versus raw log content. Excessive field size might indicate opportunities to optimize collection configurations by removing unnecessary fields at the source, which could improve overall system efficiency and reduce costs. Related to the size of the Log Collector ingested metadata fields, none of which are part of averageDailyDataScannedOver30Days. |
| contractedDailyIngestBase10 | 9999999000000000 | The organization's contracted daily ingest limit, expressed in bytes using base-10 units. Use this value when comparing the customer's actual daily ingest to their contracted limit. |
| contractedRetention | 9223372036854775807 | The agreed-upon data retention period specified in a LogScale contract or service agreement. Retention might differ on certain repositories. |
| contractedUsers | 999999 | Maximum number of user licenses included in the LogScale contract or service agreement. |
| dataScanned | 123546 | The amount of data scanned in the last hour by the entity identified by #sampleType. |
| falconCollectorFieldsSize | 54182037 | Size of fields data specifically collected by Falcon sensors/collectors before being sent to LogScale. Distinct from general collectorFieldsSize as it focuses only on Falcon-sourced data. This metric can help identify opportunities to optimize Falcon data collection by adjusting field collection settings if the metadata volume is unexpectedly high. Related to the size of the Log Collector ingested metadata fields which are not part of the falconAverageDailyDataScannedOver30Days. |
| falconIngestAfterFieldRemovalSize | 12311214 | Volume of data ingested specifically from Falcon sensors after field removal processing has been applied. Distinct from general ingest metrics as it focuses only on CrowdStrike Falcon-sourced data. By comparing this to raw Falcon ingest volumes, teams can assess the effectiveness of their field removal strategies specifically for security telemetry data. |
| falconIngestBytes | 23123 | Total raw volume of data (unprocessed data size before any field removal or compression) ingested from CrowdStrike Falcon sensors/endpoints. Measures specifically Falcon-sourced data. Security teams can use this to track trends in endpoint telemetry volume, identify unusual spikes in security data, and ensure they have adequate capacity for their security monitoring needs. When compared with falconIngestAfterFieldRemovalSize, it helps quantify the effectiveness of data optimization strategies specifically for security telemetry. |
| falconProcessedEventsSize | 612352905 | Amount of Falcon event data after initial processing but before storage optimization. Represents the size of Falcon telemetry after parsing, normalization, and enrichment, but before compression and field removal. This metric helps security teams understand how their Falcon data changes during the processing pipeline. By comparing falconIngestBytes (raw data coming in), falconProcessedEventsSize (data after processing), falconIngestAfterFieldRemovalSize (data after field removal), and falconStorageSize (final stored data), teams can identify where in the pipeline data volume changes significantly and optimize accordingly. For example, if processed events are much larger than raw ingest, it might indicate excessive enrichment or parsing that could be streamlined. This metric is particularly valuable for tuning the performance and efficiency of security data processing. |
| falconRemovedFieldsSize | 48493 | Total volume of field data that has been removed from CrowdStrike Falcon telemetry during ingestion. Shows how much data you've prevented from being stored by dropping unnecessary fields. |
| falconSegmentWriteBytes | 12313214 | Amount of data in bytes written to storage segments specifically for CrowdStrike Falcon telemetry. |
| falconStorageSize | 129071068836 | Total storage space used to store data from CrowdStrike Falcon sensors/endpoints. |
| ignoreContract | false | Boolean flag indicating whether the contract is ignored. |
| ingestAfterFieldRemovalSize | 12311214 | Measures the size of log data after specific fields have been removed during the ingestion process. This is the LogScale data ingest volume as it counts towards license usage. The value is calculated as ingestAfterFieldRemovalSize = processedEventsSize - removedFieldSize (see the example query after this table). For detailed information, see Measure Data Ingest. |
| ingestBytes | 23123 | The amount of data ingested for this #sampleType in the last #sampleRate, measured in bytes. |
| logId | 17591123 | The ID that binds the logs with different #sampleType together. See LogId in LogScale Usage Repository. |
| measurementPoint | ProcessedEventsSize | The measurement type used for the event in the log line. |
| measurementsQueryUrl | https://your-logscale-instance.com/humio-measurements/search?query=orgId%20%3D%20SINGLE_ORGANIZATION_ID%0A%7C%20dataspaceId%20%3D%20YOUR_DATASPACE_ID%0A%7C%20groupBy%28%22%23measurement%22%2C%20function%3Dsum%28byteCount%29%29&live=false&start=1234567890123&end=1234567890456 | The query used in the humio-measurements repo to generate this log line. |
| missingContract | false | Boolean flag indicating whether a valid contract is associated with LogScale. False indicates the contract exists. True indicates the contract is missing. |
| orgId | | The organization ID that the measurements in this log line pertain to. |
| orgName | MyOrgName | The organization name that the measurements in this log line pertain to, if #sampleType is organization. |
| processedEventsSize | 612352905 | Amount of event data after initial processing but before storage optimization. Represents the size after parsing, normalization, and enrichment, but before compression and field removal. This metric helps security teams understand how their data changes during the processing pipeline. By comparing ingestBytes (raw data coming in), processedEventsSize (data after processing), ingestAfterFieldRemovalSize (data after field removal), and storageSize (final stored data), teams can identify where in the pipeline data volume changes significantly and optimize accordingly. For example, if processed events are much larger than raw ingest, it might indicate excessive enrichment or parsing that could be streamlined. This metric is particularly valuable for tuning the performance and efficiency of data processing. |
| queryEnd | 2021-06-28T07:31:23.044Z | The end of the time window for the query against the humio-measurements repository. |
| queryStart | 2021-06-28T07:31:23.044Z | The beginning of the time window for the query against the humio-measurements repository. |
| removedFieldSize | 48493 | Size of the fields removed during ingestion for this log line. |
| repo | your_repo_name | The repository name that the measurements in this log line pertain to, if #sampleType is repository. |
| repoId | iGLmYFfpyRA9Tq1RxYWc8IF9 | The repository ID that the measurements in this log line pertain to, if #sampleType is repository. |
| segmentWriteBytes | 12313214 | The amount of data in bytes written to the disk in the last hour. |
| storageSize | 129071068836 | Total disk usage for the #sampleType. |
| subscription | Paying | Indicates which LogScale subscription type the log line data belongs to. |
| userCount | 48 | Total number of user accounts currently configured. |
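As a sketch of how the ingest fields above fit together (field names as listed in the table; the daily span and aggregation choices are illustrative), the following query recomputes the licensed ingest volume from its components and charts daily organization-level ingest next to the contracted limit:

```
// Keep only organization-level usage lines
#sampleType = organization
// Recompute licensed ingest from its components;
// this should match ingestAfterFieldRemovalSize
| computedIngest := processedEventsSize - removedFieldSize
// Daily licensed ingest alongside the contracted daily limit
| timeChart(function=[sum(ingestAfterFieldRemovalSize, as=dailyIngest), max(contractedDailyIngestBase10, as=contractedLimit)], span=1d)
```

If dailyIngest regularly approaches contractedLimit, that is exactly the comparison contractedDailyIngestBase10 is intended for.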
LogId in LogScale Usage Repository
The logs with different #sampleType values share one value: the logId.
| #sampleType | processedEventsSize | logId |
|---|---|---|
| repository | 2909 | 2 |
| repository | 1290 | 2 |
| repository | 879 | 2 |
| organization | 5078 | 2 |
By tracing the logId, it is possible to drill down into usage and find out what the usage was in a specific time period, down to an hour, by repository. Since this repository has unlimited retention, it is always possible to see usage from the beginning of your use of LogScale.
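As a sketch of what that drill-down might look like (the logId value 2 comes from the example table above; the aggregation choices are illustrative), the following query breaks an organization-level total down into its per-repository contributions:

```
// Repository-level lines that share a logId with the organization-level line
#sampleType = repository
| logId = 2
// Per-repository contribution to the organization total for that hour
| groupBy([repo], function=sum(processedEventsSize, as=processedEventsSize))
| sort(processedEventsSize, order=desc)
```

In the example table, the per-repository values (2909 + 1290 + 879) add up to the organization-level value (5078) for the same logId.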