Adjust Polling Nodes Per Feed
Learn how to configure and adjust FDR feed polling across LogScale nodes, including the ability to customize the number of polling nodes per feed from the default of 5 nodes and manage parallel file downloads from SQS messages. This guide provides detailed GraphQL mutations and queries for setting global polling limits, overriding settings for specific feeds, and controlling file download parallelism to optimize ingestion performance.
Each FDR feed is polled by only a subset of the LogScale nodes on which FDR polling is enabled. To enable features, see Enabling & Disabling Feature Flags.
This avoids having every node poll every registered feed at the same time, which is not required in most cases.
By default, a feed is polled by at most 5 nodes.
If you want to increase this limit, you can set it to a higher value using
the GraphQL mutation below, which increases the number of polling nodes
per feed to 10:
mutation {
  setDynamicConfig(input: { config: FdrMaxNodesPerFeed, value: "10" })
}
If you need every node to poll every feed, set FdrMaxNodesPerFeed to the number of polling nodes in your cluster.
A higher value is allowed, but has no additional effect.
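The capping behavior described above can be sketched as a small calculation. This is an illustrative model only, not LogScale code; the function name effective_polling_nodes and its parameters are hypothetical.

```python
def effective_polling_nodes(fdr_max_nodes_per_feed: int,
                            polling_nodes_in_cluster: int) -> int:
    """Model how many nodes can actually poll a single feed.

    Hypothetical helper: a limit above the number of polling nodes in the
    cluster has no additional effect, so the smaller number wins.
    """
    if fdr_max_nodes_per_feed < 0 or polling_nodes_in_cluster < 0:
        raise ValueError("node counts must not be negative")
    return min(fdr_max_nodes_per_feed, polling_nodes_in_cluster)


# Default limit of 5 in an 8-node cluster: 5 nodes poll each feed.
print(effective_polling_nodes(5, 8))
# Limit raised to 10 in the same cluster: capped at the 8 available nodes.
print(effective_polling_nodes(10, 8))
```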
You can look up the currently configured value of
FdrMaxNodesPerFeed by running the following GraphQL
query:
query {
  dynamicConfig(dynamicConfig: FdrMaxNodesPerFeed)
}
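If you prefer to send these GraphQL documents from a script, the request can be prepared roughly as follows. This is a minimal sketch, not an official client: the endpoint URL, the bearer-token authentication, and the helper names are assumptions that you need to adapt to your own cluster. The sketch only builds the request; it does not send it.

```python
import json
import urllib.request

# Assumptions: replace both values with the ones for your own cluster.
GRAPHQL_URL = "https://logscale.example.com/graphql"
API_TOKEN = "YOUR_API_TOKEN"

# The lookup query from this page, unchanged.
GET_MAX_NODES_QUERY = "query {\n  dynamicConfig(dynamicConfig: FdrMaxNodesPerFeed)\n}"


def set_max_nodes_mutation(value: int) -> str:
    """Return the setDynamicConfig mutation for the given node limit."""
    return (
        "mutation {\n"
        '  setDynamicConfig(input: { config: FdrMaxNodesPerFeed, value: "%d" })\n'
        "}" % int(value)
    )


def build_request(document: str) -> urllib.request.Request:
    """Wrap a GraphQL document in a JSON POST request (nothing is sent here)."""
    body = json.dumps({"query": document}).encode("utf-8")
    return urllib.request.Request(
        GRAPHQL_URL,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_TOKEN}",
        },
        method="POST",
    )


request = build_request(set_max_nodes_mutation(10))
# To actually send it, call urllib.request.urlopen(request).
```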
If you need some feeds to have more or fewer polling nodes than FdrMaxNodesPerFeed allows, you can override FdrMaxNodesPerFeed by setting the number of nodes for a specific feed.
This can be done with the GraphQL mutation shown below:
mutation {
  updateFdrFeedControl(
    input: {
      repositoryName: "REPO_NAME"
      id: "FEED_ID"
      maxNodes: { value: 10 }
    }
  ) {
    id
    maxNodes
  }
}
Where REPO_NAME is the name of the repository and FEED_ID is the identifier of the FDR feed.
To find the number of polling nodes configured for a feed, you can use the following GraphQL query:
query {
  repository(name: "REPO_NAME") {
    fdrFeedControl(id: "FEED_ID") {
      id
      maxNodes
    }
  }
}
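The result of this query can be combined with the global limit to work out which value applies to a feed. The sketch below assumes the response follows the standard GraphQL shape that mirrors the query, and that maxNodes comes back as null when no per-feed override is set; that second point is an assumption, and the sample responses are hand-written for illustration, not captured from a real cluster. The helper name max_nodes_for_feed is hypothetical.

```python
from typing import Optional


def max_nodes_for_feed(response: dict, global_max_nodes: int) -> int:
    """Pick the per-feed override if present, else the global limit.

    Assumption: an unset override is reported as null (None in Python).
    """
    control = response["data"]["repository"]["fdrFeedControl"]
    override: Optional[int] = control.get("maxNodes")
    return global_max_nodes if override is None else override


# Hand-written sample responses, for illustration only.
with_override = {
    "data": {"repository": {"fdrFeedControl": {"id": "FEED_ID", "maxNodes": 10}}}
}
without_override = {
    "data": {"repository": {"fdrFeedControl": {"id": "FEED_ID", "maxNodes": None}}}
}

print(max_nodes_for_feed(with_override, global_max_nodes=5))
print(max_nodes_for_feed(without_override, global_max_nodes=5))
```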
If you need to download multiple files from the same SQS message in parallel, you can use the fileDownloadParallelism setting, which determines how many files a single node downloads in parallel.
If maxNodes is 2 and fileDownloadParallelism is 2, then the total number of files downloaded in parallel across the entire cluster is 4 (2 * 2), provided that there are 2 SQS messages with at least 2 files each.
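The arithmetic above can be written out as a small model. It is a simplified sketch under stated assumptions, not LogScale's scheduling logic: it assumes each polling node works on one SQS message at a time and that the first maxNodes messages in the list are the ones being processed. The function name and parameters are hypothetical.

```python
from typing import List


def total_parallel_downloads(max_nodes: int,
                             file_download_parallelism: int,
                             files_per_message: List[int]) -> int:
    """Estimate how many files are downloaded in parallel across the cluster.

    files_per_message holds the number of files in each pending SQS message.
    """
    # Assumption: one message per node, so at most max_nodes messages are active.
    active_messages = files_per_message[:max_nodes]
    # A node cannot download more files in parallel than its message contains.
    return sum(min(file_download_parallelism, files) for files in active_messages)


# Two messages with at least 2 files each: 2 * 2 = 4 parallel downloads.
print(total_parallel_downloads(2, 2, [2, 3]))
# Only one message is available, so only one node is busy: 2 downloads.
print(total_parallel_downloads(2, 2, [5]))
```

In this model, raising maxNodes helps only when there are enough SQS messages to keep the extra nodes busy, while raising fileDownloadParallelism helps only when individual messages contain several files.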
This setting must be configured on a per-feed basis: if a feed ingests slowly because its SQS messages contain too many files, fileDownloadParallelism should be increased.
The default value of the fileDownloadParallelism setting is 1; this default can be changed via the default-file-download-parallelism-for-fdr-feed dynamic configuration.
The updateFdrFeedControl GraphQL mutation is used to set this parallelism value. Both maxNodes and fileDownloadParallelism take an UpdateLong object, which makes it possible to distinguish between unsetting a value and keeping the current value: passing { value: null } resets the field to its default, while passing null keeps the current value.
The example below resets maxNodes to the default by passing { value: null } and writes 2 to fileDownloadParallelism:
mutation {
  updateFdrFeedControl(
    input: {
      repositoryName: "REPO_NAME"
      id: "FEED_ID"
      maxNodes: { value: null }
      fileDownloadParallelism: { value: 2 }
    }
  ) {
    id
    maxNodes
    fileDownloadParallelism
  }
}
The example below keeps the current value of maxNodes by passing null and writes 2 to fileDownloadParallelism:
mutation {
  updateFdrFeedControl(
    input: {
      repositoryName: "REPO_NAME"
      id: "FEED_ID"
      maxNodes: null
      fileDownloadParallelism: { value: 2 }
    }
  ) {
    id
    maxNodes
    fileDownloadParallelism
  }
}
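The three cases (set a value, reset to the default, keep the current value) are easy to mix up when the input is generated by a script. The sketch below shows one way to build the input object so that the intent stays explicit. The KEEP sentinel and the helper names are hypothetical; only the field names and the null versus { value: null } distinction come from the examples above.

```python
import json

# Hypothetical sentinel meaning "leave this setting as it is".
KEEP = object()


def update_long(value):
    """Translate a Python value into an UpdateLong-style argument.

    KEEP -> None (JSON null): keep the current value.
    None -> {"value": None}: reset the setting to its default.
    int  -> {"value": int}: write the given number.
    """
    if value is KEEP:
        return None
    return {"value": value}


def feed_control_input(repository_name, feed_id,
                       max_nodes=KEEP, file_download_parallelism=KEEP):
    """Build the input object for the updateFdrFeedControl mutation."""
    return {
        "repositoryName": repository_name,
        "id": feed_id,
        "maxNodes": update_long(max_nodes),
        "fileDownloadParallelism": update_long(file_download_parallelism),
    }


# Reset maxNodes to the default and write 2 to fileDownloadParallelism.
reset_input = feed_control_input("REPO_NAME", "FEED_ID",
                                 max_nodes=None, file_download_parallelism=2)
# Keep the current maxNodes and write 2 to fileDownloadParallelism.
keep_input = feed_control_input("REPO_NAME", "FEED_ID",
                                file_download_parallelism=2)

print(json.dumps(reset_input, indent=2))
print(json.dumps(keep_input, indent=2))
```

Using a dedicated sentinel instead of overloading None keeps "reset" and "keep" from being confused at the call site, which mirrors the reason the mutation uses an object wrapper in the first place.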