cli: evaluate readiness prior to node decommission#96100
cli: evaluate readiness prior to node decommission#96100craig[bot] merged 1 commit intocockroachdb:masterfrom
Conversation
kvoli
left a comment
There was a problem hiding this comment.
Overall LGTM, few small comments.
Reviewed 4 of 4 files at r1, 9 of 9 files at r2, all commit messages.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @AlexTalks, @herkolategan, and @smg260)
-- commits line 27 at r2:
nit: can be controlled [...]
-- commits line 42 at r2:
nit: Is this still meant to be TODO?
pkg/cli/cliflags/flags.go line 1226 at r2 (raw file):
- enabled evaluate readiness prior to starting node decommission. - strict use strict readiness evaluation mode prior to node decommission. - skip skip readiness checks and immediately request node decommission.
Curious, how does the dry run check interacts with a running decommission? We filter out already decommissioning nodes in the prior commit - so should it should work the same?
pkg/server/admin.go line 2697 at r1 (raw file):
} // Evaluate readiness for each node to check based on how many ranges have
nit: It doesn't matter how many replicas did not pass the check - only if it is == 0 or not?
pkg/server/admin_test.go line 2544 at r1 (raw file):
} // TestDecommissionPreCheckUnready tests the functionality of the
Nice tests!
55e17f1 to
4759073
Compare
4759073 to
5a2ad1a
Compare
kvoli
left a comment
There was a problem hiding this comment.
Reviewed 11 of 11 files at r3, all commit messages.
Reviewable status:complete! 1 of 0 LGTMs obtained (waiting on @AlexTalks, @herkolategan, and @smg260)
-- commits line 35 at r2:
nit :consider renaming strict/enabled to something like non-strict
55bad21 to
8162779
Compare
This changes the functionality of `cockroach node decommission` to run
preliminary readiness checks prior to starting the decommission of the
nodes. These checks, if they evaluate and find that nodes are not ready
for decommission, will report the errors observed and on which nodes so
that the cluster's configuration can be rectified prior to reattempting
node decommission. The readiness checks are enabled by default, but can
be controlled with the following new flags:
```
--dry-run Only evaluate decommission readiness and check decommission status, without
actually decommissioning the node.
--checks string Specifies how to evaluate readiness checks prior to node decommission. Takes
any of the following values:
- enabled evaluate readiness prior to starting node decommission.
- strict use strict readiness evaluation mode prior to node decommission.
- skip skip readiness checks and immediately request node decommission.
```
Issues blocking decommission are presented grouped by node and error,
e.g.
```
$ ./cockroach node decommission 1 4 5 --checks=enabled --insecure --dry-run
id | is_live | replicas | is_decommissioning | membership | is_draining | readiness | blocking_ranges
-----+---------+----------+--------------------+------------+-------------+-------------------+------------------
1 | true | 53 | false | active | false | allocation errors | 47
4 | true | 52 | false | active | false | allocation errors | 46
5 | true | 54 | false | active | false | allocation errors | 48
(3 rows)
ranges blocking decommission detected
n1 has 34 replicas blocked with error: "0 of 1 live stores are able to take a new replica for the range (2 already have a voter, 0 already have a non-voter); likely not enough nodes in cluster"
n1 has 13 replicas blocked with error: "0 of 1 live stores are able to take a new replica for the range (2 already have a voter, 0 already have a non-voter); replicas must match constraints [{+node1:1} {+node4:1} {+node5:1}]; voting replicas must match voter_constraints []"
n4 has 13 replicas blocked with error: "0 of 1 live stores are able to take a new replica for the range (2 already have a voter, 0 already have a non-voter); replicas must match constraints [{+node1:1} {+node4:1} {+node5:1}]; voting replicas must match voter_constraints []"
n4 has 33 replicas blocked with error: "0 of 1 live stores are able to take a new replica for the range (2 already have a voter, 0 already have a non-voter); likely not enough nodes in cluster"
n5 has 35 replicas blocked with error: "0 of 1 live stores are able to take a new replica for the range (2 already have a voter, 0 already have a non-voter); likely not enough nodes in cluster"
...more blocking errors detected.
ERROR: Cannot decommission nodes.
Failed running "node decommission"
```
Fixes: cockroachdb#91893
Release note (cli change): `cockroach node decommission` operations now
preliminarily check the ability of the node to complete decommissioning,
given the cluster configuration and the ranges with replicas present
on the node. This step can be skipped by using the flag `--checks=skip`.
When errors are detected that would result in the inability to complete
node decommission, they will be printed to stderr and the command will
exit, instead of marking the node as `decommissioning` and beginning the
node decommission process. When the strict readiness evaluation mode
is used by setting the flag `--checks=strict`, any ranges that need any
preliminary actions prior to replacement for the decommission process
(e.g. ranges that are not yet fully upreplicated) will block the
decommission process.
|
bors r+ |
|
bors r=kvoli |
|
Build failed: |
|
bors r=kvoli |
|
Build succeeded: |
This change fixes decommission roachtests to properly be aware of the new output introduced by cockroachdb#96100, and to utilize the decommission pre-checks accordingly. In some places, the decommission pre-checks are skipped, especially because the decommission used in the test is not expected to be completeable. Fixes: cockroachdb#98026, cockroachdb#98018, cockroachdb#98017. Release note: None
This change fixes decommission roachtests to properly be aware of the new output introduced by cockroachdb#96100, and to utilize the decommission pre-checks accordingly. In some places, the decommission pre-checks are skipped, especially because the decommission used in the test is not expected to be completeable. Fixes: cockroachdb#98026, cockroachdb#98018, cockroachdb#98017. Release note: None
97766: kvflowcontrol: implement kvflowcontrol.Dispatch r=irfansharif a=irfansharif Part of #95563. Dispatch is a concrete implementation of the kvflowcontrol.Dispatch interface. It's used to dispatch information about admitted raft log entries to the specific nodes where (i) said entries originated, (ii) flow tokens were deducted and (iii) are waiting to be returned. This type is also used to read pending dispatches, which will be used in the raft transport layer when looking to piggyback information on traffic already bound to specific nodes. Since timely dispatching (read: piggybacking) is not guaranteed, we allow querying for all long-overdue dispatches. Internally it's able to coalesce dispatches bound for a given node. If dispatching admission information for two log entries with the same <RangeID,StoreID,WorkPriority> triple, with log positions L1 and L2 where L1 < L2, we can simply dispatch the one with L2. We leave the integration of this type with the {Store,}WorkQueue (#97599) + raft transport to future PRs. Release note: None 97826: sql: introduce array_cat_agg aggregate builtin r=yuzefovich a=yuzefovich This commit introduces a new `array_cat_agg` aggregate builtin function that takes in an array type as its input, and then unnests each array and appends all its elements into a single result array. In other words, it behaves similar to `array_agg(unnest(array_column))`. This function doesn't have an analogue in Postgres. However, some of our SQL observability tools need this functionality, and the current workaround of using a LATERAL JOIN often results in slow apply joins, so this new builtin should speed things up significantly. In particular, `crdb_internal.statement_statistics` view is now refactored to use the new builtin which removes an apply join from it. The choice of this particular name comes from the fact that we have the `array_cat` builtin which concatenates two arrays. Fixes: #97502. Release note (sql change): New aggregate builtin function `array_cat_agg` is introduced. It behaves similar to how `array_agg(unnest(array_column))` would - namely, it takes arrays as its input, unnests them into the array elements which are then aggregated into a single result array (i.e. it's similar to concatenating all input arrays into a single one). 98079: generate-bazel-extra: add instructions to update timeouts list r=rickystewart a=healthy-pod This code change removes `pkg/ccl/backupccl` from the really enormous targets list because it's only there for CI stress purposes. It also adds instructions to follow when adding a new test target to the list of really enormous timeouts. Release note: None Epic: none 98113: roachtest: decommission roachtests conform to new output r=kvoli a=AlexTalks This change fixes decommission roachtests to properly be aware of the new output introduced by #96100, and to utilize the decommission pre-checks accordingly. In some places, the decommission pre-checks are skipped, especially because the decommission used in the test is not expected to be completeable. Fixes: #98026, #98018, #98017. Release note: None Co-authored-by: irfan sharif <irfanmahmoudsharif@gmail.com> Co-authored-by: Yahor Yuzefovich <yahor@cockroachlabs.com> Co-authored-by: healthy-pod <ahmad@cockroachlabs.com> Co-authored-by: Alex Sarkesian <sarkesian@cockroachlabs.com>
With the introduction of decommission pre-checks executed by the CLI in cockroachdb#96100, the decommission readiness was used to exit with a non-zero exit code when a node is not ready to be decommissioned. However, as the decommission command is intended to be idempotent, decommissioning an already decommissioned node should not be considered an error, especially since we already print a warning stating as such. This change fixes that check to exit with a code of 0 if a node has already been decommissioned. Fixes: cockroachdb#98149. Release note (cli change): Running `cockroch node decommission <nodeID>` for a node that has already been decommissioned will now exit with code 0, as had been the case in CockroachDB versions prior to 23.1.0.
With the introduction of decommission pre-checks executed by the CLI in cockroachdb#96100, the decommission readiness was used to exit with a non-zero exit code when a node is not ready to be decommissioned. However, as the decommission command is intended to be idempotent, decommissioning an already decommissioned node should not be considered an error, especially since we already print a warning stating as such. This change fixes that check to exit with a code of 0 if a node has already been decommissioned. Fixes: cockroachdb#98149. Release note (cli change): Running `cockroch node decommission <nodeID>` for a node that has already been decommissioned will now exit with code 0, as had been the case in CockroachDB versions prior to 23.1.0.
103555: cli: exit without error on repeated decommission r=AlexTalks a=AlexTalks With the introduction of decommission pre-checks executed by the CLI in #96100, the decommission readiness was used to exit with a non-zero exit code when a node is not ready to be decommissioned. However, as the decommission command is intended to be idempotent, decommissioning an already decommissioned node should not be considered an error, especially since we already print a warning stating as such. This change fixes that check to exit with a code of 0 if a node has already been decommissioned. Fixes: #98149. Release note (cli change): Running `cockroch node decommission <nodeID>` for a node that has already been decommissioned will now exit with code 0, as had been the case in CockroachDB versions prior to 23.1.0. Co-authored-by: Alex Sarkesian <sarkesian@cockroachlabs.com>
With the introduction of decommission pre-checks executed by the CLI in #96100, the decommission readiness was used to exit with a non-zero exit code when a node is not ready to be decommissioned. However, as the decommission command is intended to be idempotent, decommissioning an already decommissioned node should not be considered an error, especially since we already print a warning stating as such. This change fixes that check to exit with a code of 0 if a node has already been decommissioned. Fixes: #98149. Release note (cli change): Running `cockroch node decommission <nodeID>` for a node that has already been decommissioned will now exit with code 0, as had been the case in CockroachDB versions prior to 23.1.0.
This change adds a new interactive tests for the decommission pre-checks executed by the CLI that were introduced in cockroachdb#96100. The tests validate the output and exit codes of the decommission command when the pre-checks are enabled (as they are by default). Epic: none Release note: none
This changes the functionality of
cockroach node decommissionto runpreliminary readiness checks prior to starting the decommission of the
nodes. These checks, if they evaluate and find that nodes are not ready
for decommission, will report the errors observed and on which nodes so
that the cluster's configuration can be rectified prior to reattempting
node decommission. The readiness checks are enabled by default, but can
be controlled with the following new flags:
Issues blocking decommission are presented grouped by node and error, e.g.
Fixes: #91893
Release note (cli change):
cockroach node decommissionoperations nowpreliminarily check the ability of the node to complete decommissioning,
given the cluster configuration and the ranges with replicas present
on the node. This step can be skipped by using the flag
--checks=skip.When errors are detected that would result in the inability to complete
node decommission, they will be printed to stderr and the command will
exit, instead of marking the node as
decommissioningand beginning thenode decommission process. When the strict readiness evaluation mode
is used by setting the flag
--checks=strict, any ranges that need anypreliminary actions prior to replacement for the decommission process
(e.g. ranges that are not yet fully upreplicated) will block the
decommission process.