cli,kv: no guaranteed state switch from DECOMMISSIONING to DECOMMISSIONED if node decommission stops early #94430

@knz

Filing the issue on behalf of @mdlinville and @a-entin

Describe the problem

The flip from "DECOMMISSIONING" to "DECOMMISSIONED" will not happen if EITHER

  1. the user ran the cockroach node decommission --wait=all command and then interrupted it (e.g. ctrl+c);
    OR
  2. the user ran cockroach node decommission --wait=none.

The reason is that the state flip is performed by the CLI program as its last step. Only the CLI (or its underlying API call) can finalize the "decommissioned" state. So if you interrupt the command, or use --wait=none, the node only flips to "decommissioned" when you run the CLI program again after decommissioning has finished all its work.
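
The behavior described above can be illustrated with a small toy model. This is only a sketch, not CockroachDB code: the `Node` class, the `run_cli` and `cluster_moves_replicas` functions, and the lowercase membership strings are all hypothetical names chosen for illustration. The assumption it encodes is the one stated in this issue: replica movement happens in the cluster, but only a CLI run that reaches its end performs the final flip.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical stand-in for a node's membership record."""
    replicas: int
    membership: str = "active"


def cluster_moves_replicas(node: Node) -> None:
    """Background rebalancing drains the node but never touches membership."""
    node.replicas = 0


def run_cli(node: Node, wait: str = "all", interrupted: bool = False) -> str:
    """Model one run of the decommission command (assumed behavior)."""
    if node.membership == "active":
        node.membership = "decommissioning"
    if wait == "none" or interrupted:
        # The command exits early, so the final flip below is never reached.
        return node.membership
    # The real command polls; this model just checks once at the end.
    if node.replicas == 0:
        node.membership = "decommissioned"
    return node.membership


node = Node(replicas=5)
run_cli(node, wait="none")      # marks the node, then returns immediately
cluster_moves_replicas(node)    # the cluster finishes its work on its own
print(node.membership)          # still "decommissioning"
run_cli(node, wait="all")       # re-running the CLI finalizes the state
print(node.membership)          # now "decommissioned"
```

In this model the node stays in the decommissioning state indefinitely after the replicas are gone, until some later CLI run reaches the final step.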

Expected behavior

The state flip from "DECOMMISSIONING" to "DECOMMISSIONED" should be done automatically by the cluster, even when the CLI command is stopped before decommissioning completes.
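
One way to picture the expected behavior is a cluster-side pass that finalizes any node that has finished draining, independent of whether a CLI process is still attached. The sketch below is purely illustrative and makes no claim about how CockroachDB would implement this: `finalize_drained_nodes`, the record layout, and the membership strings are hypothetical.

```python
def finalize_drained_nodes(nodes: dict) -> list:
    """Flip every fully drained decommissioning node to decommissioned.

    `nodes` maps a node ID to a record with the hypothetical keys
    "membership" and "replicas". Returns the IDs that were flipped.
    """
    flipped = []
    for node_id in sorted(nodes):
        record = nodes[node_id]
        if record["membership"] == "decommissioning" and record["replicas"] == 0:
            record["membership"] = "decommissioned"
            flipped.append(node_id)
    return flipped


cluster = {
    1: {"membership": "active", "replicas": 40},
    2: {"membership": "decommissioning", "replicas": 7},   # still draining
    3: {"membership": "decommissioning", "replicas": 0},   # drained, CLI gone
}
print(finalize_drained_nodes(cluster))
```

Run periodically, a pass like this would make the final state independent of the CLI's lifetime: a node that is still draining is left alone, and a drained node is finalized on the next pass.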

Context:
Slack thread https://cockroachlabs.slack.com/archives/C01PHNMUFLN/p1670361351636919

Environment:

  • CockroachDB version v22.2

Jira issue: CRDB-22888

Labels

  • A-kv-distribution: Relating to rebalancing and leasing.
  • C-bug: Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.
  • T-kv: KV Team
  • docs-done
  • docs-known-limitation