Skip to content

cli: evaluate readiness prior to node decommission#96100

Merged
craig[bot] merged 1 commit intocockroachdb:masterfrom
AlexTalks:dpf_client
Mar 3, 2023
Merged

cli: evaluate readiness prior to node decommission#96100
craig[bot] merged 1 commit intocockroachdb:masterfrom
AlexTalks:dpf_client

Conversation

@AlexTalks
Copy link
Copy Markdown
Contributor

@AlexTalks AlexTalks commented Jan 27, 2023

This changes the functionality of cockroach node decommission to run
preliminary readiness checks prior to starting the decommission of the
nodes. These checks, if they evaluate and find that nodes are not ready
for decommission, will report the errors observed and on which nodes so
that the cluster's configuration can be rectified prior to reattempting
node decommission. The readiness checks are enabled by default, but can
be controlled with the following new flags:

--dry-run               Only evaluate decommission readiness and check decommission status, without
                        actually decommissioning the node.

--checks string         Specifies how to evaluate readiness checks prior to node decommission. Takes
                        any of the following values:
                            - enabled  evaluate readiness prior to starting node decommission.
                            - strict   use strict readiness evaluation mode prior to node decommission.
                            - skip     skip readiness checks and immediately request node decommission.

Issues blocking decommission are presented grouped by node and error, e.g.

$ ./cockroach node decommission 1 4 5 --insecure

  id | is_live | replicas | is_decommissioning | membership | is_draining |     readiness     | blocking_ranges
-----+---------+----------+--------------------+------------+-------------+-------------------+------------------
   1 |  true   |       53 |       false        |   active   |    false    | allocation errors |              47
   4 |  true   |       52 |       false        |   active   |    false    | allocation errors |              46
   5 |  true   |       54 |       false        |   active   |    false    | allocation errors |              48
(3 rows)

ranges blocking decommission detected

n1 has 34 replicas blocked with error: "0 of 1 live stores are able to take a new replica for the range (2 already have a voter, 0 already have a non-voter); likely not enough nodes in cluster"
n1 has 13 replicas blocked with error: "0 of 1 live stores are able to take a new replica for the range (2 already have a voter, 0 already have a non-voter); replicas must match constraints [{+node1:1} {+node4:1} {+node5:1}]; voting replicas must match voter_constraints []"
n4 has 13 replicas blocked with error: "0 of 1 live stores are able to take a new replica for the range (2 already have a voter, 0 already have a non-voter); replicas must match constraints [{+node1:1} {+node4:1} {+node5:1}]; voting replicas must match voter_constraints []"
n4 has 33 replicas blocked with error: "0 of 1 live stores are able to take a new replica for the range (2 already have a voter, 0 already have a non-voter); likely not enough nodes in cluster"
n5 has 35 replicas blocked with error: "0 of 1 live stores are able to take a new replica for the range (2 already have a voter, 0 already have a non-voter); likely not enough nodes in cluster"
...more blocking errors detected.

ERROR: Cannot decommission nodes.
Failed running "node decommission"

Fixes: #91893

Release note (cli change): cockroach node decommission operations now
preliminarily check the ability of the node to complete decommissioning,
given the cluster configuration and the ranges with replicas present
on the node. This step can be skipped by using the flag --checks=skip.
When errors are detected that would result in the inability to complete
node decommission, they will be printed to stderr and the command will
exit, instead of marking the node as decommissioning and beginning the
node decommission process. When the strict readiness evaluation mode
is used by setting the flag --checks=strict, any ranges that need any
preliminary actions prior to replacement for the decommission process
(e.g. ranges that are not yet fully upreplicated) will block the
decommission process.

@AlexTalks AlexTalks requested review from a team as code owners January 27, 2023 16:14
@AlexTalks AlexTalks requested a review from a team January 27, 2023 16:14
@AlexTalks AlexTalks requested review from a team as code owners January 27, 2023 16:14
@AlexTalks AlexTalks requested review from herkolategan and smg260 and removed request for a team January 27, 2023 16:14
@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

Copy link
Copy Markdown
Contributor

@kvoli kvoli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall LGTM, few small comments.

Reviewed 4 of 4 files at r1, 9 of 9 files at r2, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @AlexTalks, @herkolategan, and @smg260)


-- commits line 27 at r2:
nit: can be controlled [...]


-- commits line 42 at r2:
nit: Is this still meant to be TODO?


pkg/cli/cliflags/flags.go line 1226 at r2 (raw file):

  - enabled  evaluate readiness prior to starting node decommission.
  - strict   use strict readiness evaluation mode prior to node decommission.
  - skip     skip readiness checks and immediately request node decommission.

Curious, how does the dry run check interacts with a running decommission? We filter out already decommissioning nodes in the prior commit - so should it should work the same?


pkg/server/admin.go line 2697 at r1 (raw file):

	}

	// Evaluate readiness for each node to check based on how many ranges have

nit: It doesn't matter how many replicas did not pass the check - only if it is == 0 or not?


pkg/server/admin_test.go line 2544 at r1 (raw file):

}

// TestDecommissionPreCheckUnready tests the functionality of the

Nice tests!

@dhartunian dhartunian removed the request for review from a team February 6, 2023 15:55
@AlexTalks AlexTalks force-pushed the dpf_client branch 2 times, most recently from 55e17f1 to 4759073 Compare February 7, 2023 23:57
Copy link
Copy Markdown
Contributor

@kvoli kvoli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 11 of 11 files at r3, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @AlexTalks, @herkolategan, and @smg260)


-- commits line 35 at r2:
nit :consider renaming strict/enabled to something like non-strict

@AlexTalks AlexTalks force-pushed the dpf_client branch 4 times, most recently from 55bad21 to 8162779 Compare March 2, 2023 17:37
This changes the functionality of `cockroach node decommission` to run
preliminary readiness checks prior to starting the decommission of the
nodes. These checks, if they evaluate and find that nodes are not ready
for decommission, will report the errors observed and on which nodes so
that the cluster's configuration can be rectified prior to reattempting
node decommission.  The readiness checks are enabled by default, but can
be controlled with the following new flags:
```
--dry-run               Only evaluate decommission readiness and check decommission status, without
                        actually decommissioning the node.

--checks string         Specifies how to evaluate readiness checks prior to node decommission. Takes
                        any of the following values:
                            - enabled  evaluate readiness prior to starting node decommission.
                            - strict   use strict readiness evaluation mode prior to node decommission.
                            - skip     skip readiness checks and immediately request node decommission.
```

Issues blocking decommission are presented grouped by node and error,
e.g.
```
$ ./cockroach node decommission 1 4 5 --checks=enabled --insecure --dry-run

  id | is_live | replicas | is_decommissioning | membership | is_draining |     readiness     | blocking_ranges
-----+---------+----------+--------------------+------------+-------------+-------------------+------------------
   1 |  true   |       53 |       false        |   active   |    false    | allocation errors |              47
   4 |  true   |       52 |       false        |   active   |    false    | allocation errors |              46
   5 |  true   |       54 |       false        |   active   |    false    | allocation errors |              48
(3 rows)

ranges blocking decommission detected

n1 has 34 replicas blocked with error: "0 of 1 live stores are able to take a new replica for the range (2 already have a voter, 0 already have a non-voter); likely not enough nodes in cluster"
n1 has 13 replicas blocked with error: "0 of 1 live stores are able to take a new replica for the range (2 already have a voter, 0 already have a non-voter); replicas must match constraints [{+node1:1} {+node4:1} {+node5:1}]; voting replicas must match voter_constraints []"
n4 has 13 replicas blocked with error: "0 of 1 live stores are able to take a new replica for the range (2 already have a voter, 0 already have a non-voter); replicas must match constraints [{+node1:1} {+node4:1} {+node5:1}]; voting replicas must match voter_constraints []"
n4 has 33 replicas blocked with error: "0 of 1 live stores are able to take a new replica for the range (2 already have a voter, 0 already have a non-voter); likely not enough nodes in cluster"
n5 has 35 replicas blocked with error: "0 of 1 live stores are able to take a new replica for the range (2 already have a voter, 0 already have a non-voter); likely not enough nodes in cluster"
...more blocking errors detected.

ERROR: Cannot decommission nodes.
Failed running "node decommission"
```

Fixes: cockroachdb#91893

Release note (cli change): `cockroach node decommission` operations now
preliminarily check the ability of the node to complete decommissioning,
given the cluster configuration and the ranges with replicas present
on the node. This step can be skipped by using the flag `--checks=skip`.
When errors are detected that would result in the inability to complete
node decommission, they will be printed to stderr and the command will
exit, instead of marking the node as `decommissioning` and beginning the
node decommission process. When the strict readiness evaluation mode
is used by setting the flag `--checks=strict`, any ranges that need any
preliminary actions prior to replacement for the decommission process
(e.g. ranges that are not yet fully upreplicated) will block the
decommission process.
@AlexTalks
Copy link
Copy Markdown
Contributor Author

bors r+

@AlexTalks
Copy link
Copy Markdown
Contributor Author

bors r=kvoli

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Mar 2, 2023

Build failed:

@AlexTalks
Copy link
Copy Markdown
Contributor Author

bors r=kvoli

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Mar 3, 2023

Build succeeded:

@craig craig bot merged commit 8c54290 into cockroachdb:master Mar 3, 2023
AlexTalks added a commit to AlexTalks/cockroach that referenced this pull request Mar 7, 2023
This change fixes decommission roachtests to properly be aware of the
new output introduced by cockroachdb#96100, and to utilize the decommission
pre-checks accordingly. In some places, the decommission pre-checks are
skipped, especially because the decommission used in the test is not
expected to be completeable.

Fixes: cockroachdb#98026, cockroachdb#98018, cockroachdb#98017.

Release note: None
AlexTalks added a commit to AlexTalks/cockroach that referenced this pull request Mar 7, 2023
This change fixes decommission roachtests to properly be aware of the
new output introduced by cockroachdb#96100, and to utilize the decommission
pre-checks accordingly. In some places, the decommission pre-checks are
skipped, especially because the decommission used in the test is not
expected to be completeable.

Fixes: cockroachdb#98026, cockroachdb#98018, cockroachdb#98017.

Release note: None
craig bot pushed a commit that referenced this pull request Mar 7, 2023
97766: kvflowcontrol: implement kvflowcontrol.Dispatch r=irfansharif a=irfansharif

Part of #95563. Dispatch is a concrete implementation of the kvflowcontrol.Dispatch interface. It's used to dispatch information about admitted raft log entries to the specific nodes where (i) said entries originated, (ii) flow tokens were deducted and (iii) are waiting to be returned. This type is also used to read pending dispatches, which will be used in the raft transport layer when looking to piggyback information on traffic already bound to specific nodes. Since timely dispatching (read: piggybacking) is not guaranteed, we allow querying for all long-overdue dispatches.

Internally it's able to coalesce dispatches bound for a given node. If dispatching admission information for two log entries with the same <RangeID,StoreID,WorkPriority> triple, with log positions L1 and L2 where L1 < L2, we can simply dispatch the one with L2. We leave the integration of this type with the {Store,}WorkQueue (#97599) + raft transport to future PRs.

Release note: None

97826: sql: introduce array_cat_agg aggregate builtin r=yuzefovich a=yuzefovich

This commit introduces a new `array_cat_agg` aggregate builtin function
that takes in an array type as its input, and then unnests each array
and appends all its elements into a single result array. In other
words, it behaves similar to `array_agg(unnest(array_column))`. This
function doesn't have an analogue in Postgres. However, some of our SQL
observability tools need this functionality, and the current workaround
of using a LATERAL JOIN often results in slow apply joins, so this new
builtin should speed things up significantly. In particular,
`crdb_internal.statement_statistics` view is now refactored to use the
new builtin which removes an apply join from it. The choice of this
particular name comes from the fact that we have the `array_cat` builtin
which concatenates two arrays.

Fixes: #97502.

Release note (sql change): New aggregate builtin function
`array_cat_agg` is introduced. It behaves similar to how
`array_agg(unnest(array_column))` would - namely, it takes arrays as its
input, unnests them into the array elements which are then aggregated
into a single result array (i.e. it's similar to concatenating all input
arrays into a single one).

98079: generate-bazel-extra: add instructions to update timeouts list r=rickystewart a=healthy-pod

This code change removes `pkg/ccl/backupccl` from the really enormous targets list because it's only there for CI stress purposes.

It also adds instructions to follow when adding a new test target to the list of really enormous timeouts.

Release note: None
Epic: none

98113: roachtest: decommission roachtests conform to new output r=kvoli a=AlexTalks

This change fixes decommission roachtests to properly be aware of the new output introduced by #96100, and to utilize the decommission pre-checks accordingly. In some places, the decommission pre-checks are skipped, especially because the decommission used in the test is not expected to be completeable.

Fixes: #98026, #98018, #98017.

Release note: None

Co-authored-by: irfan sharif <irfanmahmoudsharif@gmail.com>
Co-authored-by: Yahor Yuzefovich <yahor@cockroachlabs.com>
Co-authored-by: healthy-pod <ahmad@cockroachlabs.com>
Co-authored-by: Alex Sarkesian <sarkesian@cockroachlabs.com>
AlexTalks added a commit to AlexTalks/cockroach that referenced this pull request May 25, 2023
With the introduction of decommission pre-checks executed by the CLI in cockroachdb#96100,
the decommission readiness was used to exit with a non-zero exit code
when a node is not ready to be decommissioned. However, as the
decommission command is intended to be idempotent, decommissioning an
already decommissioned node should not be considered an error,
especially since we already print a warning stating as such. This change
fixes that check to exit with a code of 0 if a node has already been
decommissioned.

Fixes: cockroachdb#98149.

Release note (cli change): Running `cockroch node decommission <nodeID>`
for a node that has already been decommissioned will now exit with code
0, as had been the case in CockroachDB versions prior to 23.1.0.
AlexTalks added a commit to AlexTalks/cockroach that referenced this pull request May 25, 2023
With the introduction of decommission pre-checks executed by the CLI in cockroachdb#96100,
the decommission readiness was used to exit with a non-zero exit code
when a node is not ready to be decommissioned. However, as the
decommission command is intended to be idempotent, decommissioning an
already decommissioned node should not be considered an error,
especially since we already print a warning stating as such. This change
fixes that check to exit with a code of 0 if a node has already been
decommissioned.

Fixes: cockroachdb#98149.

Release note (cli change): Running `cockroch node decommission <nodeID>`
for a node that has already been decommissioned will now exit with code
0, as had been the case in CockroachDB versions prior to 23.1.0.
craig bot pushed a commit that referenced this pull request May 26, 2023
103555: cli: exit without error on repeated decommission r=AlexTalks a=AlexTalks

With the introduction of decommission pre-checks executed by the CLI in #96100, the decommission readiness was used to exit with a non-zero exit code when a node is not ready to be decommissioned. However, as the decommission command is intended to be idempotent, decommissioning an already decommissioned node should not be considered an error, especially since we already print a warning stating as such. This change fixes that check to exit with a code of 0 if a node has already been decommissioned.

Fixes: #98149.

Release note (cli change): Running `cockroch node decommission <nodeID>` for a node that has already been decommissioned will now exit with code 0, as had been the case in CockroachDB versions prior to 23.1.0.

Co-authored-by: Alex Sarkesian <sarkesian@cockroachlabs.com>
blathers-crl bot pushed a commit that referenced this pull request May 26, 2023
With the introduction of decommission pre-checks executed by the CLI in #96100,
the decommission readiness was used to exit with a non-zero exit code
when a node is not ready to be decommissioned. However, as the
decommission command is intended to be idempotent, decommissioning an
already decommissioned node should not be considered an error,
especially since we already print a warning stating as such. This change
fixes that check to exit with a code of 0 if a node has already been
decommissioned.

Fixes: #98149.

Release note (cli change): Running `cockroch node decommission <nodeID>`
for a node that has already been decommissioned will now exit with code
0, as had been the case in CockroachDB versions prior to 23.1.0.
AlexTalks added a commit to AlexTalks/cockroach that referenced this pull request Jun 16, 2023
This change adds a new interactive tests for the decommission pre-checks
executed by the CLI that were introduced in cockroachdb#96100. The tests validate
the output and exit codes of the decommission command when the
pre-checks are enabled (as they are by default).

Epic: none

Release note: none
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

cli: implement CLI flags for decommission pre-check

3 participants