cli: evaluate readiness prior to node decommission by AlexTalks · Pull Request #96100 · cockroachdb/cockroach

AlexTalks · 2023-01-27T16:14:22Z

This changes the functionality of cockroach node decommission to run
preliminary readiness checks prior to starting the decommission of the
nodes. These checks, if they evaluate and find that nodes are not ready
for decommission, will report the errors observed and on which nodes so
that the cluster's configuration can be rectified prior to reattempting
node decommission. The readiness checks are enabled by default, but can
be controlled with the following new flags:

--dry-run               Only evaluate decommission readiness and check decommission status, without
                        actually decommissioning the node.

--checks string         Specifies how to evaluate readiness checks prior to node decommission. Takes
                        any of the following values:
                            - enabled  evaluate readiness prior to starting node decommission.
                            - strict   use strict readiness evaluation mode prior to node decommission.
                            - skip     skip readiness checks and immediately request node decommission.

Issues blocking decommission are presented grouped by node and error, e.g.

$ ./cockroach node decommission 1 4 5 --insecure

  id | is_live | replicas | is_decommissioning | membership | is_draining |     readiness     | blocking_ranges
-----+---------+----------+--------------------+------------+-------------+-------------------+------------------
   1 |  true   |       53 |       false        |   active   |    false    | allocation errors |              47
   4 |  true   |       52 |       false        |   active   |    false    | allocation errors |              46
   5 |  true   |       54 |       false        |   active   |    false    | allocation errors |              48
(3 rows)

ranges blocking decommission detected

n1 has 34 replicas blocked with error: "0 of 1 live stores are able to take a new replica for the range (2 already have a voter, 0 already have a non-voter); likely not enough nodes in cluster"
n1 has 13 replicas blocked with error: "0 of 1 live stores are able to take a new replica for the range (2 already have a voter, 0 already have a non-voter); replicas must match constraints [{+node1:1} {+node4:1} {+node5:1}]; voting replicas must match voter_constraints []"
n4 has 13 replicas blocked with error: "0 of 1 live stores are able to take a new replica for the range (2 already have a voter, 0 already have a non-voter); replicas must match constraints [{+node1:1} {+node4:1} {+node5:1}]; voting replicas must match voter_constraints []"
n4 has 33 replicas blocked with error: "0 of 1 live stores are able to take a new replica for the range (2 already have a voter, 0 already have a non-voter); likely not enough nodes in cluster"
n5 has 35 replicas blocked with error: "0 of 1 live stores are able to take a new replica for the range (2 already have a voter, 0 already have a non-voter); likely not enough nodes in cluster"
...more blocking errors detected.

ERROR: Cannot decommission nodes.
Failed running "node decommission"

Fixes: #91893

Release note (cli change): cockroach node decommission operations now
preliminarily check the ability of the node to complete decommissioning,
given the cluster configuration and the ranges with replicas present
on the node. This step can be skipped by using the flag --checks=skip.
When errors are detected that would result in the inability to complete
node decommission, they will be printed to stderr and the command will
exit, instead of marking the node as decommissioning and beginning the
node decommission process. When the strict readiness evaluation mode
is used by setting the flag --checks=strict, any ranges that need any
preliminary actions prior to replacement for the decommission process
(e.g. ranges that are not yet fully upreplicated) will block the
decommission process.

cockroach-teamcity · 2023-01-27T16:14:36Z

This change is

kvoli

Overall LGTM, few small comments.

Reviewed 4 of 4 files at r1, 9 of 9 files at r2, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @AlexTalks, @herkolategan, and @smg260)

-- commits line 27 at r2:
nit: can be controlled [...]

-- commits line 42 at r2:
nit: Is this still meant to be TODO?

pkg/cli/cliflags/flags.go line 1226 at r2 (raw file):

  - enabled  evaluate readiness prior to starting node decommission.
  - strict   use strict readiness evaluation mode prior to node decommission.
  - skip     skip readiness checks and immediately request node decommission.

Curious, how does the dry run check interacts with a running decommission? We filter out already decommissioning nodes in the prior commit - so should it should work the same?

pkg/server/admin.go line 2697 at r1 (raw file):

	}

	// Evaluate readiness for each node to check based on how many ranges have

nit: It doesn't matter how many replicas did not pass the check - only if it is == 0 or not?

pkg/server/admin_test.go line 2544 at r1 (raw file):

}

// TestDecommissionPreCheckUnready tests the functionality of the

Nice tests!

kvoli

Reviewed 11 of 11 files at r3, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @AlexTalks, @herkolategan, and @smg260)

-- commits line 35 at r2:
nit :consider renaming strict/enabled to something like non-strict

This changes the functionality of `cockroach node decommission` to run preliminary readiness checks prior to starting the decommission of the nodes. These checks, if they evaluate and find that nodes are not ready for decommission, will report the errors observed and on which nodes so that the cluster's configuration can be rectified prior to reattempting node decommission. The readiness checks are enabled by default, but can be controlled with the following new flags: ``` --dry-run Only evaluate decommission readiness and check decommission status, without actually decommissioning the node. --checks string Specifies how to evaluate readiness checks prior to node decommission. Takes any of the following values: - enabled evaluate readiness prior to starting node decommission. - strict use strict readiness evaluation mode prior to node decommission. - skip skip readiness checks and immediately request node decommission. ``` Issues blocking decommission are presented grouped by node and error, e.g. ``` $ ./cockroach node decommission 1 4 5 --checks=enabled --insecure --dry-run id | is_live | replicas | is_decommissioning | membership | is_draining | readiness | blocking_ranges -----+---------+----------+--------------------+------------+-------------+-------------------+------------------ 1 | true | 53 | false | active | false | allocation errors | 47 4 | true | 52 | false | active | false | allocation errors | 46 5 | true | 54 | false | active | false | allocation errors | 48 (3 rows) ranges blocking decommission detected n1 has 34 replicas blocked with error: "0 of 1 live stores are able to take a new replica for the range (2 already have a voter, 0 already have a non-voter); likely not enough nodes in cluster" n1 has 13 replicas blocked with error: "0 of 1 live stores are able to take a new replica for the range (2 already have a voter, 0 already have a non-voter); replicas must match constraints [{+node1:1} {+node4:1} {+node5:1}]; voting replicas must match voter_constraints []" n4 has 13 replicas blocked with error: "0 of 1 live stores are able to take a new replica for the range (2 already have a voter, 0 already have a non-voter); replicas must match constraints [{+node1:1} {+node4:1} {+node5:1}]; voting replicas must match voter_constraints []" n4 has 33 replicas blocked with error: "0 of 1 live stores are able to take a new replica for the range (2 already have a voter, 0 already have a non-voter); likely not enough nodes in cluster" n5 has 35 replicas blocked with error: "0 of 1 live stores are able to take a new replica for the range (2 already have a voter, 0 already have a non-voter); likely not enough nodes in cluster" ...more blocking errors detected. ERROR: Cannot decommission nodes. Failed running "node decommission" ``` Fixes: cockroachdb#91893 Release note (cli change): `cockroach node decommission` operations now preliminarily check the ability of the node to complete decommissioning, given the cluster configuration and the ranges with replicas present on the node. This step can be skipped by using the flag `--checks=skip`. When errors are detected that would result in the inability to complete node decommission, they will be printed to stderr and the command will exit, instead of marking the node as `decommissioning` and beginning the node decommission process. When the strict readiness evaluation mode is used by setting the flag `--checks=strict`, any ranges that need any preliminary actions prior to replacement for the decommission process (e.g. ranges that are not yet fully upreplicated) will block the decommission process.

AlexTalks · 2023-03-02T18:42:25Z

bors r+

AlexTalks · 2023-03-02T22:50:53Z

bors r=kvoli

craig · 2023-03-02T23:21:31Z

Build failed:

Bazel Essential CI (Cockroach)

AlexTalks · 2023-03-03T00:56:06Z

bors r=kvoli

craig · 2023-03-03T02:19:16Z

Build succeeded:

Bazel Essential CI (Cockroach)

This change fixes decommission roachtests to properly be aware of the new output introduced by cockroachdb#96100, and to utilize the decommission pre-checks accordingly. In some places, the decommission pre-checks are skipped, especially because the decommission used in the test is not expected to be completeable. Fixes: cockroachdb#98026, cockroachdb#98018, cockroachdb#98017. Release note: None

97766: kvflowcontrol: implement kvflowcontrol.Dispatch r=irfansharif a=irfansharif Part of #95563. Dispatch is a concrete implementation of the kvflowcontrol.Dispatch interface. It's used to dispatch information about admitted raft log entries to the specific nodes where (i) said entries originated, (ii) flow tokens were deducted and (iii) are waiting to be returned. This type is also used to read pending dispatches, which will be used in the raft transport layer when looking to piggyback information on traffic already bound to specific nodes. Since timely dispatching (read: piggybacking) is not guaranteed, we allow querying for all long-overdue dispatches. Internally it's able to coalesce dispatches bound for a given node. If dispatching admission information for two log entries with the same <RangeID,StoreID,WorkPriority> triple, with log positions L1 and L2 where L1 < L2, we can simply dispatch the one with L2. We leave the integration of this type with the {Store,}WorkQueue (#97599) + raft transport to future PRs. Release note: None 97826: sql: introduce array_cat_agg aggregate builtin r=yuzefovich a=yuzefovich This commit introduces a new `array_cat_agg` aggregate builtin function that takes in an array type as its input, and then unnests each array and appends all its elements into a single result array. In other words, it behaves similar to `array_agg(unnest(array_column))`. This function doesn't have an analogue in Postgres. However, some of our SQL observability tools need this functionality, and the current workaround of using a LATERAL JOIN often results in slow apply joins, so this new builtin should speed things up significantly. In particular, `crdb_internal.statement_statistics` view is now refactored to use the new builtin which removes an apply join from it. The choice of this particular name comes from the fact that we have the `array_cat` builtin which concatenates two arrays. Fixes: #97502. Release note (sql change): New aggregate builtin function `array_cat_agg` is introduced. It behaves similar to how `array_agg(unnest(array_column))` would - namely, it takes arrays as its input, unnests them into the array elements which are then aggregated into a single result array (i.e. it's similar to concatenating all input arrays into a single one). 98079: generate-bazel-extra: add instructions to update timeouts list r=rickystewart a=healthy-pod This code change removes `pkg/ccl/backupccl` from the really enormous targets list because it's only there for CI stress purposes. It also adds instructions to follow when adding a new test target to the list of really enormous timeouts. Release note: None Epic: none 98113: roachtest: decommission roachtests conform to new output r=kvoli a=AlexTalks This change fixes decommission roachtests to properly be aware of the new output introduced by #96100, and to utilize the decommission pre-checks accordingly. In some places, the decommission pre-checks are skipped, especially because the decommission used in the test is not expected to be completeable. Fixes: #98026, #98018, #98017. Release note: None Co-authored-by: irfan sharif <irfanmahmoudsharif@gmail.com> Co-authored-by: Yahor Yuzefovich <yahor@cockroachlabs.com> Co-authored-by: healthy-pod <ahmad@cockroachlabs.com> Co-authored-by: Alex Sarkesian <sarkesian@cockroachlabs.com>

With the introduction of decommission pre-checks executed by the CLI in cockroachdb#96100, the decommission readiness was used to exit with a non-zero exit code when a node is not ready to be decommissioned. However, as the decommission command is intended to be idempotent, decommissioning an already decommissioned node should not be considered an error, especially since we already print a warning stating as such. This change fixes that check to exit with a code of 0 if a node has already been decommissioned. Fixes: cockroachdb#98149. Release note (cli change): Running `cockroch node decommission <nodeID>` for a node that has already been decommissioned will now exit with code 0, as had been the case in CockroachDB versions prior to 23.1.0.

103555: cli: exit without error on repeated decommission r=AlexTalks a=AlexTalks With the introduction of decommission pre-checks executed by the CLI in #96100, the decommission readiness was used to exit with a non-zero exit code when a node is not ready to be decommissioned. However, as the decommission command is intended to be idempotent, decommissioning an already decommissioned node should not be considered an error, especially since we already print a warning stating as such. This change fixes that check to exit with a code of 0 if a node has already been decommissioned. Fixes: #98149. Release note (cli change): Running `cockroch node decommission <nodeID>` for a node that has already been decommissioned will now exit with code 0, as had been the case in CockroachDB versions prior to 23.1.0. Co-authored-by: Alex Sarkesian <sarkesian@cockroachlabs.com>

With the introduction of decommission pre-checks executed by the CLI in #96100, the decommission readiness was used to exit with a non-zero exit code when a node is not ready to be decommissioned. However, as the decommission command is intended to be idempotent, decommissioning an already decommissioned node should not be considered an error, especially since we already print a warning stating as such. This change fixes that check to exit with a code of 0 if a node has already been decommissioned. Fixes: #98149. Release note (cli change): Running `cockroch node decommission <nodeID>` for a node that has already been decommissioned will now exit with code 0, as had been the case in CockroachDB versions prior to 23.1.0.

This change adds a new interactive tests for the decommission pre-checks executed by the CLI that were introduced in cockroachdb#96100. The tests validate the output and exit codes of the decommission command when the pre-checks are enabled (as they are by default). Epic: none Release note: none

AlexTalks requested review from a team as code owners January 27, 2023 16:14

AlexTalks requested a review from a team January 27, 2023 16:14

AlexTalks requested review from a team as code owners January 27, 2023 16:14

AlexTalks requested review from herkolategan and smg260 and removed request for a team January 27, 2023 16:14

kvoli reviewed Feb 6, 2023

View reviewed changes

dhartunian removed the request for review from a team February 6, 2023 15:55

AlexTalks force-pushed the dpf_client branch 2 times, most recently from 55e17f1 to 4759073 Compare February 7, 2023 23:57

AlexTalks force-pushed the dpf_client branch from 4759073 to 5a2ad1a Compare February 16, 2023 01:24

kvoli approved these changes Feb 22, 2023

View reviewed changes

AlexTalks force-pushed the dpf_client branch 4 times, most recently from 55bad21 to 8162779 Compare March 2, 2023 17:37

AlexTalks force-pushed the dpf_client branch from 8162779 to cefa614 Compare March 2, 2023 17:48

craig bot merged commit 8c54290 into cockroachdb:master Mar 3, 2023

cockroach-teamcity mentioned this pull request Mar 3, 2023

PR #96100 - cli: evaluate readiness prior to node decommission cockroachdb/docs#16410

Closed

AlexTalks mentioned this pull request Mar 7, 2023

roachtest: decommission roachtests conform to new output #98113

Merged

AlexTalks mentioned this pull request Mar 8, 2023

cli, kv: evaluate latency impact of pre-checks on decommission. #98246

Open

AlexTalks mentioned this pull request May 17, 2023

cli: exit without error on repeated decommission #103555

Merged

AlexTalks mentioned this pull request May 25, 2023

cli: exit without error on repeated decommission AlexTalks/cockroach#1

Closed

blathers-crl bot mentioned this pull request May 26, 2023

release-23.1: cli: exit without error on repeated decommission #103933

Merged

AlexTalks mentioned this pull request Jun 16, 2023

cli: add interactive tests for decommissioning #105029

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

cli: evaluate readiness prior to node decommission#96100

cli: evaluate readiness prior to node decommission#96100
craig[bot] merged 1 commit intocockroachdb:masterfrom
AlexTalks:dpf_client

AlexTalks commented Jan 27, 2023 •

edited

Loading

Uh oh!

cockroach-teamcity commented Jan 27, 2023

Uh oh!

kvoli left a comment

Uh oh!

kvoli left a comment

Uh oh!

AlexTalks commented Mar 2, 2023

Uh oh!

AlexTalks commented Mar 2, 2023

Uh oh!

craig bot commented Mar 2, 2023

Uh oh!

AlexTalks commented Mar 3, 2023

Uh oh!

craig bot commented Mar 3, 2023

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

AlexTalks commented Jan 27, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

cockroach-teamcity commented Jan 27, 2023

Uh oh!

kvoli left a comment

Choose a reason for hiding this comment

Uh oh!

kvoli left a comment

Choose a reason for hiding this comment

Uh oh!

AlexTalks commented Mar 2, 2023

Uh oh!

AlexTalks commented Mar 2, 2023

Uh oh!

craig bot commented Mar 2, 2023

Uh oh!

AlexTalks commented Mar 3, 2023

Uh oh!

craig bot commented Mar 3, 2023

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

AlexTalks commented Jan 27, 2023 •

edited

Loading