[7.17.1] Adjust indices.recovery.max_bytes_per_sec according to external settings #83413
Merged
elasticsearchmachine merged 8 commits into elastic:7.17 on Feb 9, 2022
Today the setting indices.recovery.max_bytes_per_sec defaults to different
values depending on the node roles, the JVM version and the system total
memory that can be detected.
The current logic to set the default value can be summarized as:
- 40 MB/s for non-data nodes
- 40 MB/s for data nodes that run on a JVM version < 14
- 40 MB/s for data nodes that have one of the data_hot, data_warm, data_content or data roles
- Nodes whose only data roles are data_cold and/or data_frozen have a
  default value that depends on the available memory:
  - with ≤ 4 GB of available memory, the default is 40 MB/s
  - with more than 4 GB and less than or equal to 8 GB, the default is 60 MB/s
  - with more than 8 GB and less than or equal to 16 GB, the default is 90 MB/s
  - with more than 16 GB and less than or equal to 32 GB, the default is 125 MB/s
  - above 32 GB, the default is 250 MB/s
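The rules above can be sketched as a small function. This is only an illustration of the summary in this description, not the Elasticsearch implementation: the function name, the role sets and the `total_memory_bytes` parameter are hypothetical, and binary units (1 MB = 1024 * 1024 bytes) are an assumption.

```python
MB = 1024 * 1024  # assumption: binary units
GB = 1024 * MB

# Hypothetical groupings of the data roles named in the description.
HOT_WARM_CONTENT_ROLES = {"data", "data_hot", "data_warm", "data_content"}
COLD_FROZEN_ROLES = {"data_cold", "data_frozen"}


def default_max_bytes_per_sec(roles, jvm_major_version, total_memory_bytes):
    """Return the default recovery rate limit in bytes per second."""
    roles = set(roles)
    data_roles = roles & (HOT_WARM_CONTENT_ROLES | COLD_FROZEN_ROLES)

    # Non-data nodes, old JVMs and hot/warm/content/data nodes all get 40 MB/s.
    if not data_roles:
        return 40 * MB
    if jvm_major_version < 14:
        return 40 * MB
    if data_roles & HOT_WARM_CONTENT_ROLES:
        return 40 * MB

    # Only cold and/or frozen data roles remain: scale with memory.
    if total_memory_bytes <= 4 * GB:
        return 40 * MB
    if total_memory_bytes <= 8 * GB:
        return 60 * MB
    if total_memory_bytes <= 16 * GB:
        return 90 * MB
    if total_memory_bytes <= 32 * GB:
        return 125 * MB
    return 250 * MB
```

For example, under these assumptions a dedicated data_cold node on JVM 17 with 64 GB of memory would default to 250 MB/s, while the same node with an additional data_hot role would stay at 40 MB/s.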
While those defaults have served us well, we want to evaluate whether more
appropriate defaults can be defined when Elasticsearch knows more about the limits
(or properties) of the hardware it is running on. Elasticsearch cannot detect
these by itself, but it can derive them from settings that are provided at startup.
This pull request introduces the following new node settings:
node.bandwidth.recovery.network
node.bandwidth.recovery.disk.read
node.bandwidth.recovery.disk.write
Those settings are not dynamic and must be set before the node starts.
When they are set, Elasticsearch takes the minimum of the network, disk read
and disk write bandwidths and computes a maximum bytes-per-second limit as a
fraction of that minimum. By default 40% of the minimum bandwidth is used, but
the fraction can be changed dynamically by an operator
(using the node.bandwidth.recovery.operator.factor setting) or by the user
directly (using a different setting, node.bandwidth.recovery.factor).
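As a rough sketch of that computation: the function and parameter names below are hypothetical, all bandwidths are assumed to be expressed in bytes per second, and rounding to the nearest byte is an assumption rather than something stated in this description.

```python
def bandwidth_based_limit(network, disk_read, disk_write, factor=0.4):
    """Return a rate limit in bytes/s derived from the configured bandwidths.

    network, disk_read, disk_write: configured bandwidths in bytes per second.
    factor: fraction of the minimum bandwidth to use (40% by default).
    """
    # The slowest of the three resources bounds what recovery can sustain.
    min_bandwidth = min(network, disk_read, disk_write)
    # Only a fraction of that minimum is granted to recoveries.
    return round(min_bandwidth * factor)
```

With a network bandwidth of 1000 bytes/s, a disk read bandwidth of 500 bytes/s and a disk write bandwidth of 800 bytes/s, the minimum is 500 bytes/s and the default factor yields a limit of 200 bytes/s.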
The limit computed from the available bandwidths is then compared to pre-existing
limits, such as the one set through the indices.recovery.max_bytes_per_sec setting
or the one that Elasticsearch computes from the node's physical memory
on dedicated cold/frozen nodes. Elasticsearch tries to use the highest possible
limit among those values, without exceeding an overcommit ratio that is also
defined through a node setting
(see node.bandwidth.recovery.operator.factor.max_overcommit).
This overcommit ratio prevents the rate limit from being set to a value
greater than 100 times (by default) the minimum available bandwidth.
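One way to read this selection step is sketched below. It is a simplification under stated assumptions: the exact precedence Elasticsearch applies between the candidate limits is not spelled out in this description, so the sketch simply takes the highest candidate and caps it at the overcommit ratio times the minimum bandwidth. The function and parameter names are hypothetical.

```python
def effective_limit(candidate_limits, min_bandwidth, max_overcommit=100.0):
    """Pick the highest candidate limit, capped by the overcommit ratio.

    candidate_limits: limits in bytes/s (for example the bandwidth-based
        limit, a configured indices.recovery.max_bytes_per_sec value, or the
        memory-based default on cold/frozen nodes).
    min_bandwidth: minimum of the network, disk read and disk write
        bandwidths, in bytes/s.
    max_overcommit: maximum allowed ratio between the rate limit and the
        minimum bandwidth (100 by default, per the description).
    """
    # Assumption: the cap is the overcommit ratio times the minimum bandwidth.
    cap = round(min_bandwidth * max_overcommit)
    return min(max(candidate_limits), cap)
```

For instance, with a minimum bandwidth of 500 bytes/s the cap is 50,000 bytes/s, so a configured limit of 10,000,000 bytes/s would be reduced to 50,000 bytes/s in this sketch, while candidates of 200 and 40 bytes/s would resolve to 200 bytes/s.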
Backport of elastic#82819 for 7.17.1
…tic#83350) The setting node.bandwidth.recovery.operator.factor.max_overcommit wasn't added to the list of cluster settings and to the list of settings to consume for updates. Relates elastic#82819
Collaborator
Pinging @elastic/es-distributed (Team:Distributed)
henningandersen
approved these changes
Feb 4, 2022
docs/changelog/82819.yaml
Outdated
@@ -0,0 +1,6 @@
pr: 82819
summary: "[Draft] Adjust `indices.recovery.max_bytes_per_sec` according to external\
Contributor
Can we remove [Draft] here (and preferably in 8.1 too)?
Member
Author
I opened #83527 for 8.1, thanks for catching this
docs/changelog/83350.yaml
Outdated
pr: 83350
summary: Add missing max overcommit factor to list of (dynamic) settings
area: Recovery
type: bug
Contributor
Perhaps we can remove this changelog entry?
Member
Author
Thanks Henning and David
This pull request backports the node bandwidth settings merged in 8.1.0.
It cherry-picks the following changes:
The commit e235fea contains the changes for 7.17.1:
node.bandwidth.recovery.factor.read and node.bandwidth.recovery.factor.write are removed to only keep the operator variants set by the platform