
Document and test operator-only node bandwidth recovery settings#83372

Merged
tlrx merged 1 commit into elastic:master from tlrx:doc-and-test-operator-settings on Feb 2, 2022

Conversation

@tlrx
Member

@tlrx tlrx commented Feb 1, 2022

This commit updates the Operator-only functionality doc to mention the operator-only settings introduced in #82819.

It also adds an integration test for those operator-only settings that would have caught #83359.

@tlrx tlrx requested a review from DaveCTurner February 1, 2022 13:54
@tlrx tlrx added :Distributed/Recovery Anything around constructing a new shard, either from a local or a remote source. >docs General docs changes labels Feb 1, 2022
@elasticmachine elasticmachine added Team:Distributed Meta label for distributed team. Team:Docs Meta label for docs team labels Feb 1, 2022
@elasticmachine
Collaborator

Pinging @elastic/es-distributed (Team:Distributed)

@elasticmachine
Collaborator

Pinging @elastic/es-docs (Team:Docs)

Member

@DaveCTurner DaveCTurner left a comment


LGTM

@tlrx tlrx merged commit 7f827bb into elastic:master Feb 2, 2022
@tlrx tlrx deleted the doc-and-test-operator-settings branch February 2, 2022 10:50
@tlrx
Member Author

tlrx commented Feb 2, 2022

Thanks David!

tlrx added a commit to tlrx/elasticsearch that referenced this pull request Feb 2, 2022
…stic#83372)

This commit updates the Operator-only functionality doc to 
mention the operator-only settings introduced in elastic#82819.

It also adds an integration test for those operator-only 
settings that would have caught elastic#83359.
elasticsearchmachine pushed a commit that referenced this pull request Feb 9, 2022
…al settings (#83414)

* Adjust indices.recovery.max_bytes_per_sec according to external settings (#82819)

Today the setting indices.recovery.max_bytes_per_sec defaults to different 
values depending on the node roles, the JVM version and the system total 
memory that can be detected.

The current logic to set the default value can be summarized as:

    40 MB for non-data nodes
    40 MB for data nodes that run on a JVM version < 14
    40 MB for data nodes that have one of the data_hot, data_warm, data_content or data roles

Nodes with only data_cold and/or data_frozen roles as data roles have a 
default value that depends on the available memory:

    with at most 4 GB of available memory, the default is 40 MB
    with more than 4 GB and at most 8 GB, the default is 60 MB
    with more than 8 GB and at most 16 GB, the default is 90 MB
    with more than 16 GB and at most 32 GB, the default is 125 MB
    and above 32 GB, the default is 250 MB
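The memory-based tiers above can be sketched as a simple lookup. This is an illustrative reconstruction of the behavior described in the commit message, not the Elasticsearch source:

```python
def default_max_bytes_per_sec_mb(total_memory_gb: float) -> int:
    """Sketch of the memory-based default for
    indices.recovery.max_bytes_per_sec (in MB) on nodes whose only
    data roles are data_cold and/or data_frozen, per the tiers above."""
    if total_memory_gb <= 4:
        return 40
    elif total_memory_gb <= 8:
        return 60
    elif total_memory_gb <= 16:
        return 90
    elif total_memory_gb <= 32:
        return 125
    else:
        return 250
```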

While those defaults have served us well, we want to evaluate whether we can define 
more appropriate defaults if Elasticsearch knew more about the limits 
(or properties) of the hardware it is running on: something Elasticsearch 
cannot detect by itself but can derive from settings provided at startup.

This pull request introduces the following new node settings:

    node.bandwidth.recovery.network
    node.bandwidth.recovery.disk.read
    node.bandwidth.recovery.disk.write
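Since these are static node settings, they would be set in the node's configuration file before startup. A hypothetical example (the bandwidth values below are illustrative, not recommendations):

```yaml
# elasticsearch.yml -- illustrative values only
node.bandwidth.recovery.network: 100mb
node.bandwidth.recovery.disk.read: 200mb
node.bandwidth.recovery.disk.write: 200mb
```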

Those settings are not dynamic and must be set before the node starts. 
When they are set, Elasticsearch detects the minimum available bandwidth 
among the network, disk read, and disk write bandwidths and computes 
a maximum bytes-per-second limit as a fraction of that minimum 
bandwidth. By default 40% of the minimum bandwidth is used, but this can be 
dynamically configured by an operator 
(using the node.bandwidth.recovery.operator.factor setting) or by the user 
directly (using a different setting, node.bandwidth.recovery.factor).
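The bandwidth-derived limit described above amounts to scaling the minimum of the three bandwidths by the configured factor. A minimal sketch of that computation, assuming bandwidths expressed in bytes per second:

```python
def bandwidth_derived_limit(network_bps: float,
                            disk_read_bps: float,
                            disk_write_bps: float,
                            factor: float = 0.4) -> float:
    """Sketch: take the minimum of the available network, disk read,
    and disk write bandwidths, then apply the configured factor
    (40% by default, adjustable via node.bandwidth.recovery.factor
    or node.bandwidth.recovery.operator.factor)."""
    min_bandwidth = min(network_bps, disk_read_bps, disk_write_bps)
    return factor * min_bandwidth
```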

The limit computed from available bandwidths is then compared to pre-existing 
limitations, such as the one set through the indices.recovery.max_bytes_per_sec setting 
or the one that Elasticsearch computes from the node's physical memory 
on dedicated cold/frozen nodes. Elasticsearch will try to use the highest possible 
limit among those values, while not exceeding an overcommit ratio that is also 
defined through a node setting 
(see node.bandwidth.recovery.operator.factor.max_overcommit).

This overcommit ratio prevents the rate limit from being set to a value 
greater than 100 times (by default) the minimum available bandwidth.
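Putting the two steps above together: the final limit is the highest of the candidate limits, clamped by the overcommit ceiling. Again, this is a sketch of the described behavior, not the actual implementation:

```python
def effective_recovery_limit(bandwidth_limit: float,
                             other_limits: list[float],
                             min_bandwidth: float,
                             max_overcommit: float = 100.0) -> float:
    """Sketch: pick the highest limit among the bandwidth-derived one
    and the pre-existing ones (e.g. indices.recovery.max_bytes_per_sec
    or the memory-based default), but never exceed max_overcommit
    times the minimum available bandwidth."""
    candidate = max(bandwidth_limit, *other_limits)
    ceiling = max_overcommit * min_bandwidth
    return min(candidate, ceiling)
```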

* Add missing max overcommit factor to list of (dynamic) settings (#83350)

The setting node.bandwidth.recovery.operator.factor.max_overcommit 
was not added to the list of cluster settings or to the list of settings 
consumed for updates.

Relates #82819

* Add docs for node bandwidth settings (#83361)

Relates #82819

* Operator factor settings should have the OperatorDynamic setting property (#83359)

Relates #82819

* Document and test operator-only node bandwidth recovery settings (#83372)

This commit updates the Operator-only functionality doc to 
mention the operator-only settings introduced in #82819.

It also adds an integration test for those operator-only 
settings that would have caught #83359.

* remove draft

* remove docs/changelog/83350.yaml

Co-authored-by: David Turner <david.turner@elastic.co>

Labels

:Distributed/Recovery Anything around constructing a new shard, either from a local or a remote source. >docs General docs changes Team:Distributed Meta label for distributed team. Team:Docs Meta label for docs team v8.1.0
