[wip,dnr,dnm] server,cli: bar decommissioned nodes from re-joining the cluster#54373

Closed
irfansharif wants to merge 1 commit into cockroachdb:master from irfansharif:200910.decomm-gate

Conversation

@irfansharif
Contributor

*: persist a prevent startup file on decomm

Does not yet work when decommissioning non-live nodes, and the file is
not actually checked anywhere yet. Not sure whether we want to use a
file for this or a store-local key. We're also arbitrarily writing to
the first store only (should we just write it to every store?).

Release note: None

*: consult gating file on start up

And introduce a (broken) --force flag to `cockroach node decommission`.

Release note: None

@irfansharif irfansharif requested a review from a team as a code owner September 14, 2020 22:59
@cockroach-teamcity
Member

This change is Reviewable

@irfansharif irfansharif removed the request for review from a team September 14, 2020 23:00
@irfansharif irfansharif force-pushed the 200910.decomm-gate branch 2 times, most recently from 525a6eb to a8077fb September 23, 2020 23:44
*: consult gating file on start up

And introduce a --force flag to `cockroach node decommission`.

Release note: None
@irfansharif
Contributor Author

Abandoning this PR, @tbg is instead going to attempt the following approach:

We'll install a gossip listener near pkg/rpc that listens for changes to liveness records. When this listener learns that a node is fully decommissioned, it will persist that information to a store-local key/file. That information (also cached in memory) will be checked in our rpc layer when heartbeating currently open connections and when accepting new ones. This will effectively let us close out all connections to all fully decommissioned nodes. The file will also be read during startup to populate the cache, so that we maintain a running list of "nodes we shouldn't talk to anymore".
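
A rough sketch of that flow, to make the moving parts concrete. Everything here is hypothetical and simplified: `decommRegistry`, `tombstoneStore`, `memStore`, and the `MembershipStatus` values are illustrative names, not the actual CockroachDB types, and the in-memory store merely stands in for the store-local key/file.

```go
package main

import (
	"fmt"
	"sync"
)

// NodeID and MembershipStatus are simplified stand-ins for liveness data.
type NodeID int32

type MembershipStatus int

const (
	Active MembershipStatus = iota
	Decommissioning
	Decommissioned
)

// tombstoneStore is a hypothetical persistence layer (store-local key or file).
type tombstoneStore interface {
	Load() ([]NodeID, error)
	Save(id NodeID) error
}

// memStore is an in-memory tombstoneStore used only for illustration.
type memStore struct{ ids []NodeID }

func (m *memStore) Load() ([]NodeID, error) { return append([]NodeID(nil), m.ids...), nil }
func (m *memStore) Save(id NodeID) error    { m.ids = append(m.ids, id); return nil }

// decommRegistry caches the set of nodes we should no longer talk to.
type decommRegistry struct {
	mu         sync.RWMutex
	store      tombstoneStore
	tombstones map[NodeID]struct{}
}

// newDecommRegistry populates the cache from persisted state at start-up.
func newDecommRegistry(store tombstoneStore) (*decommRegistry, error) {
	ids, err := store.Load()
	if err != nil {
		return nil, err
	}
	r := &decommRegistry{store: store, tombstones: make(map[NodeID]struct{})}
	for _, id := range ids {
		r.tombstones[id] = struct{}{}
	}
	return r, nil
}

// onLivenessUpdate is what the gossip listener would invoke per liveness
// record. Only fully decommissioned nodes are persisted, then cached.
func (r *decommRegistry) onLivenessUpdate(id NodeID, status MembershipStatus) error {
	if status != Decommissioned {
		return nil
	}
	r.mu.Lock()
	defer r.mu.Unlock()
	if _, ok := r.tombstones[id]; ok {
		return nil // already recorded
	}
	if err := r.store.Save(id); err != nil {
		return err
	}
	r.tombstones[id] = struct{}{}
	return nil
}

// checkPeer would run when heartbeating an open connection and when
// accepting a new one; a non-nil error means the connection should be closed.
func (r *decommRegistry) checkPeer(id NodeID) error {
	r.mu.RLock()
	defer r.mu.RUnlock()
	if _, ok := r.tombstones[id]; ok {
		return fmt.Errorf("n%d is decommissioned; refusing connection", id)
	}
	return nil
}
```

Persisting before updating the cache means a crash between the two steps at worst leaves a tombstone on disk that is picked up on the next start-up, rather than a cached entry that is lost.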

+cc @knz, this overlaps with areas I think you were planning on otherwise working on.

@irfansharif irfansharif deleted the 200910.decomm-gate branch September 24, 2020 16:22