server: don't drain when decommissioning#28707
Merged
craig[bot] merged 1 commit into cockroachdb:master on Aug 16, 2018
Conversation
Member
Force-pushed from 068f8f9 to 89a2852
nvb approved these changes on Aug 16, 2018
nvb (Contributor) left a comment:
but I think you mean "don't drain when decommissioning" in the commit/PR title.
Reviewed 3 of 3 files at r1.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale)
pkg/acceptance/decommission_test.go, line 284 at r1 (raw file):
exp := [][]string{ decommissionHeader, // Expect the same usual, except this time the node should be draining
s/same usual/same as usual/
Prior to this commit, a server which enters decommissioning would automatically drain. This wasn't a great idea since it meant that pointing the decommissioning command at a set of nodes required for the cluster to work would brick that cluster, and would be difficult to recover from (since the draining state is picked up by the nodes when they start).

Instead, decouple draining from decommissioning. Decommissioning simply tells the allocator to move data off that node; when the node gets shut down cleanly (the operator's responsibility), it will drain.

The resulting behavior when trying to decommission a too-large set of nodes is now that the decommission command will simply not finish. For example, starting a three-node cluster and decommissioning all nodes will simply hang (though the nodes will be marked as decommissioning). Add three more nodes to the cluster and replicas will move over to the newly added nodes, and the decommissioning command will finish.

As a bonus, recommissioning a node now doesn't require the target node to restart. As a second bonus, the decommissioning acceptance test now takes around 60% of the previous time (~45s down from 70+).

One remaining caveat is that users may forget that they attempted to decommission nodes. We need to check that we prominently alert in the UI when nodes are decommissioning. This isn't a new problem.

Fixes cockroachdb#27444.
Fixes cockroachdb#27025.

Release note (bug fix): decommissioning multiple nodes is now possible without posing a risk to cluster health. Recommissioning a node no longer requires a restart of the target node to take effect.
Force-pushed from 89a2852 to f7c66c1
Member
Author
Changed both, TFTR! bors r=nvanbenschoten
craig bot
pushed a commit
that referenced
this pull request
Aug 16, 2018
28684: storage: make subsumption atomic with merge commit application r=bdarnell,tschottdorf a=benesch

~Please review the first commit separately in #28661.~

During a merge, the subsumed range logically ceases to exist at the moment that the merge transaction commit applies. It's important that we remove the subsumed range from disk in the same batch as the merge commit to protect against an ill-timed crash. Tease apart Replica.destroyRaftMuLocked to make this possible.

Release note: None

28707: server: don't drain when decommissioning r=nvanbenschoten a=tschottdorf

Prior to this commit, a server which enters decommissioning would automatically drain. This wasn't a great idea since it meant that pointing the decommissioning command at a set of nodes required for the cluster to work would brick that cluster, and would be difficult to recover from (since the draining state is picked up by the nodes when they start).

Instead, decouple draining from decommissioning. Decommissioning simply tells the allocator to move data off that node; when the node gets shut down cleanly (the operator's responsibility), it will drain.

The resulting behavior when trying to decommission a too-large set of nodes is now that the decommission command will simply not finish. For example, starting a three-node cluster and decommissioning all nodes will simply hang (though the nodes will be marked as decommissioning). Add three more nodes to the cluster and replicas will move over to the newly added nodes, and the decommissioning command will finish.

As a bonus, recommissioning a node now doesn't require the target node to restart. As a second bonus, the decommissioning acceptance test now takes around 60% of the previous time (~45s down from 70+).

One remaining caveat is that users may forget that they attempted to decommission nodes. We need to check that we prominently alert in the UI when nodes are decommissioning. This isn't a new problem.

Fixes #27444.
Fixes #27025.

Release note (bug fix): decommissioning multiple nodes is now possible without posing a risk to cluster health. Recommissioning a node no longer requires a restart of the target node to take effect.

Co-authored-by: Nikhil Benesch <nikhil.benesch@gmail.com>
Co-authored-by: Tobias Schottdorf <tobias.schottdorf@gmail.com>
Contributor
Build succeeded
tbg added a commit to tbg/cockroach that referenced this pull request on Sep 18, 2018
This is somehow the salient part of cockroachdb#28707 that I managed to miss during the initial backport of that PR: we don't want to set the node to draining when it is being decommissioned.

Note that the /health?ready=1 endpoint will continue to return a 503 error once decommissioning starts due to a readiness check (something I am preparing a PR against master for).

Release note: None
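The 503-on-readiness behavior mentioned above can be sketched as a toy model. This is not CockroachDB code; the `Node` class and `ready_status` helper are hypothetical names used purely to illustrate the interaction of decommissioning/draining state with the readiness check:

```python
from http import HTTPStatus

# Illustrative model only (not CockroachDB code): a node that is
# decommissioning or draining reports not-ready on /health?ready=1,
# even though it can still serve traffic.
class Node:
    def __init__(self):
        self.decommissioning = False
        self.draining = False

def ready_status(node: Node) -> int:
    # The readiness check fails with 503 for decommissioning or
    # draining nodes; healthy nodes return 200.
    if node.decommissioning or node.draining:
        return int(HTTPStatus.SERVICE_UNAVAILABLE)  # 503
    return int(HTTPStatus.OK)  # 200

n = Node()
assert ready_status(n) == 200
n.decommissioning = True
assert ready_status(n) == 503
```

Under this model, load balancers polling `/health?ready=1` stop routing to a decommissioning node well before it drains, which matches the behavior the comment describes.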
tbg added a commit to tbg/cockroach that referenced this pull request on Sep 25, 2018
This is somehow the salient part of cockroachdb#28707 that I managed to miss during the initial backport of that PR: we don't want to set the node to draining when it is being decommissioned.

Note that the /health?ready=1 endpoint will continue to return a 503 error once decommissioning starts due to a readiness check (something I am preparing a PR against master for).

Release note: None
Prior to this commit, a server which enters decommissioning would
automatically drain. This wasn't a great idea since it meant that
pointing the decommissioning command at a set of nodes required
for the cluster to work would brick that cluster, and would be
difficult to recover from (since the draining state is picked
up by the nodes when they start).
Instead, decouple draining from decommissioning. Decommissioning simply
tells the allocator to move data off that node; when the node gets shut
down cleanly (the operator's responsibility), it will drain.
The resulting behavior when trying to decommission a too-large set of
nodes is now that the decommission command will simply not finish.
For example, starting a three node cluster and decommissioning all
nodes will simply hang (though the nodes will be marked as
decommissioning). Add three more nodes to the cluster and replicas
will move over to the newly added nodes, and the decommissioning
command will finish.
As a bonus, recommissioning a node now doesn't require the target
node to restart.
As a second bonus, the decommissioning acceptance test now takes
around 60% of the previous time (~45s down from 70+).
One remaining caveat is that users may forget that they attempted
to decommission nodes. We need to check that we prominently alert
in the UI when nodes are decommissioning. This isn't a new problem.
Fixes #27444.
Fixes #27025.
Release note (bug fix): decommissioning multiple nodes is now possible
without posing a risk to cluster health. Recommissioning a node no
longer requires a restart of the target node to take effect.
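The hang-instead-of-brick behavior in the commit message can be illustrated with a toy model. The `decommission_can_finish` helper below is hypothetical, not CockroachDB code; it captures only the allocator constraint that a replica can move off a decommissioning node only if enough non-decommissioning nodes exist to hold the range (assuming a replication factor of 3):

```python
# Toy model (not CockroachDB code) of the constraint described above:
# decommissioning finishes only once every replica can live on a node
# that is not itself decommissioning.

def decommission_can_finish(total_nodes: int,
                            decommissioning: set,
                            replication_factor: int = 3) -> bool:
    # Each range needs `replication_factor` distinct non-decommissioning
    # nodes to hold its replicas.
    healthy = total_nodes - len(decommissioning)
    return healthy >= replication_factor

# Three-node cluster, decommission all three: the command hangs forever
# (the nodes are marked decommissioning, but replicas have nowhere to go).
assert not decommission_can_finish(3, {1, 2, 3})

# Add three more nodes: replicas move to the new nodes and the
# decommissioning command finishes.
assert decommission_can_finish(6, {1, 2, 3})
```

This is why the commit can safely leave an over-broad decommission request pending rather than draining the nodes: the cluster keeps serving, and adding capacity later lets the pending decommission complete.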