kvserver: set StoresInterval to 10s #83808
Merged
craig[bot] merged 1 commit into cockroachdb:master on Jul 14, 2022
I'm looking at a fix for #81669 at the moment. Before we change the interval, I think we should note the amount of data transferred at different intervals.
kvoli
approved these changes
Jul 13, 2022
I ran some experiments at 60s/30s/10s intervals.
10s seems perfectly reasonable: it only increases egress/ingress by 1% of total network bandwidth.
repro
export gce_project=cockroach-ephemeral
export email=austen-gossip-interval
roachprod create $email -n 32 --gce-machine-type=n1-standard-16
roachprod put $email cockroach-remote cockroach
roachprod start $email
roachprod grafana-start $email
roachprod sql $email:1 -- -e "set cluster setting gossip.stores.interval = '60s';"
sleep 600
roachprod sql $email:1 -- -e "set cluster setting gossip.stores.interval = '30s';"
sleep 600
roachprod sql $email:1 -- -e "set cluster setting gossip.stores.interval = '10s';"
sleep 600
roachprod sql $email:1 -- -e "set cluster setting gossip.stores.interval = '60s';"
roachprod run $email -- './cockroach workload run kv --tolerate-errors --min-block-bytes=127 --max-rate=1000 --max-block-bytes=256 --read-percent=95 --concurrency=256 --drop --duration=10m'
roachprod sql $email:1 -- -e "set cluster setting gossip.stores.interval = '30s';"
roachprod run $email -- './cockroach workload run kv --tolerate-errors --min-block-bytes=127 --max-rate=1000 --max-block-bytes=256 --read-percent=95 --concurrency=256 --drop --duration=10m'
roachprod sql $email:1 -- -e "set cluster setting gossip.stores.interval = '10s';"
roachprod run $email -- './cockroach workload run kv --tolerate-errors --min-block-bytes=127 --max-rate=1000 --max-block-bytes=256 --read-percent=95 --concurrency=256 --drop --duration=10m'

In cockroachdb#81669 @kvoli discovered that we often gossip much more frequently than at a 60s interval. Since we are also adding I/O liveness signals to the store capacity to use in a solution for cockroachdb#79215, we want to make this more reactive interval explicit.

This change has the side effect of reducing the minimum allowed value for `server.time_until_store_dead` to 25s[^1]; this seems reasonable.

[^1]: https://github.com/cockroachdb/cockroach/blob/263fbb7c8fcf001fcf47d7d35894b5824c78dc14/pkg/kv/kvserver/allocator/storepool/store_pool.go#L94-L99

Release note: None
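For anyone checking the 25s figure: a rough sketch of the arithmetic, assuming the validation linked in the footnote is "store gossip interval plus a fixed 15s buffer". The 15s buffer and the helper name `min_time_until_store_dead` are assumptions for illustration, not taken from this PR; the buffer is only inferred from 10s + 15s = 25s, so check `store_pool.go` for the real check.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of CockroachDB or roachprod.
# ASSUMPTION: minimum allowed value = store gossip interval + 15s buffer.
min_time_until_store_dead() {
  local stores_interval_s="$1"  # store gossip interval, in seconds
  local buffer_s=15             # assumed buffer; verify against store_pool.go
  echo "$((stores_interval_s + buffer_s))s"
}

min_time_until_store_dead 60  # previous 60s interval
min_time_until_store_dead 10  # 10s interval set by this change
```

Under that assumption, the second call reproduces the 25s minimum mentioned in the description.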
Thanks for the experiment!
bors r=kvoli
Build succeeded:
