| Option | AUTOSHARDING_MAX | |
| Description | Controls the maximum number of data sources (shards) that LogScale's auto-sharding mechanism can create to distribute data across the cluster. This setting directly impacts query performance, memory usage, and data distribution efficiency. | |
Auto-sharding automatically partitions incoming data based on volume and access patterns to optimize query performance and resource utilization. Higher values allow for better parallelization but increase memory overhead.
AUTOSHARDING_MAX controls how many different data sources are affected by auto-sharding. For more information, see Configure Auto-Sharding for High-Volume Data Sources.
The benefits of higher values include:
- Better query parallelization and performance
- More efficient data distribution across nodes
- Improved handling of high-volume data streams
- Better load balancing during query execution
The costs of higher values can include:
- Increased memory usage (approximately 1-2 MB per active shard)
- Higher cluster startup and recovery times
- More complex data management overhead
- Potential for increased storage fragmentation
When setting the value, reserve 1-2 MB of RAM per 1,000 potential shards. Consider available memory when setting maximum values and be sure to account for memory growth as data volume increases. It is best to start with the default value and adjust based on performance observations.
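The sizing rule above can be turned into a quick back-of-the-envelope calculation. The following is a minimal sketch, not part of LogScale: the function name `shard_memory_reserve_mb` and the example value of 50,000 are hypothetical, and the only figure taken from the guidance is the 1-2 MB per 1,000 potential shards range.

```python
from typing import Tuple


def shard_memory_reserve_mb(
    max_shards: int,
    mb_per_1000_low: float = 1.0,   # lower bound from the guidance: 1 MB per 1,000 shards
    mb_per_1000_high: float = 2.0,  # upper bound from the guidance: 2 MB per 1,000 shards
) -> Tuple[float, float]:
    """Return the (low, high) RAM reserve in MB for a candidate AUTOSHARDING_MAX value."""
    if max_shards < 0:
        raise ValueError("max_shards must be non-negative")
    thousands = max_shards / 1000
    return (thousands * mb_per_1000_low, thousands * mb_per_1000_high)


# Hypothetical candidate value, chosen only to illustrate the arithmetic.
low, high = shard_memory_reserve_mb(50_000)
print(f"Reserve roughly {low:.0f}-{high:.0f} MB of RAM")
```

Comparing the upper bound against the memory actually free on each node gives a rough check before the value is raised. Because data volume tends to grow, it is worth leaving headroom rather than sizing to the exact estimate.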
Signs that the value needs to be increased include query performance degradation with high data volumes, uneven data distribution across cluster nodes, high CPU usage during query execution, and ingestion bottlenecks during peak periods. Signs that it needs to be decreased include out-of-memory errors and startup failures.
When increasing the value, do so gradually, in 2x increments, rather than making large jumps.
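As an illustration of stepping up in 2x increments, the sketch below lists the intermediate values to try on the way from a current setting to a desired one. The helper `doubling_steps` and the numbers used are hypothetical and are not LogScale defaults. Each step would be applied, observed for a while, and only then followed by the next.

```python
from typing import List


def doubling_steps(current: int, target: int) -> List[int]:
    """Return the values to apply in order, doubling each time and stopping at target."""
    if current <= 0:
        raise ValueError("current must be positive")
    if target < current:
        raise ValueError("target must be at least the current value")
    steps = []
    value = current
    while value < target:
        # Double the value, but never overshoot the desired target.
        value = min(value * 2, target)
        steps.append(value)
    return steps


# Hypothetical current and target values, for illustration only.
for step in doubling_steps(1000, 6000):
    print(f"AUTOSHARDING_MAX={step}")
```

The last step is capped at the target, so the final increase may be smaller than 2x.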
Some metrics to monitor when observing performance for this value are time-livequery, event-latency, and jvm-heap-usage, among others. You can also monitor the system repositories for log messages about shard creation and allocation, memory pressure, and query performance degradation.