
server: introduce the Migration service#56476

Merged
craig[bot] merged 1 commit into cockroachdb:master from
irfansharif:201109.migrations-service
Nov 11, 2020

Conversation

@irfansharif
Contributor

@irfansharif irfansharif commented Nov 10, 2020

The upcoming migration manager (prototyped in #56107) will want to
execute a few known RPCs on every node in the cluster. As part of being the
"migration infrastructure", we also want authors of individual
migrations to be able to define arbitrary node-level operations to
execute on each node in the system.

To this end we introduce a Migration service, and populate it with the
two known RPCs the migration manager will want to depend on:

  • ValidateTargetClusterVersion: used to verify that the target node is
    running a binary that's able to support the given cluster version.
  • BumpClusterVersion: used to inform the target node about a (validated)
    cluster version bump.

Neither of these RPCs is currently wired up to anything, and
BumpClusterVersion will be fleshed out just a tiny bit further in a
future PR, but they'll both be used to propagate cluster version bumps
across the crdb cluster through direct RPCs, supplanting our existing
gossip-based distribution mechanism. This will let the migration manager
bump version gates in a more controlled fashion. See #56107 for what
that will end up looking like, and see the long-running migrations RFC
(#48843) for the motivation.

As mentioned earlier, we expect this service to pick up more RPCs
over time to service specific migrations.

Release note: None
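To make the semantics of ValidateTargetClusterVersion concrete, here's a hedged Go sketch of the validation rule described above (node's minimum supported version <= target version <= node's binary version). The `Version` type and function name are simplifications for illustration, not the actual `roachpb`/`pkg/server` code:

```go
package main

import "fmt"

// Version is a simplified stand-in for roachpb.Version (hypothetical here);
// real cluster versions also carry patch and internal components.
type Version struct{ Major, Minor int }

func (v Version) Less(o Version) bool {
	if v.Major != o.Major {
		return v.Major < o.Major
	}
	return v.Minor < o.Minor
}

// validateTargetClusterVersion mirrors the check described above:
//
//	node's minimum supported version <= target version <= node's binary version
func validateTargetClusterVersion(minSupported, binary, target Version) error {
	if target.Less(minSupported) {
		return fmt.Errorf("target version %v below minimum supported %v", target, minSupported)
	}
	if binary.Less(target) {
		return fmt.Errorf("binary version %v cannot support target %v", binary, target)
	}
	return nil
}

func main() {
	// A 20.2 binary can accept a 20.2 target cluster version.
	err := validateTargetClusterVersion(Version{20, 1}, Version{20, 2}, Version{20, 2})
	fmt.Println(err == nil) // prints true
}
```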


Ignore the first four commits. They're from #56368 and #56474

@irfansharif irfansharif requested a review from a team as a code owner November 10, 2020 02:34
@cockroach-teamcity
Member

This change is Reviewable

@irfansharif irfansharif requested review from knz and tbg November 10, 2020 02:34
@irfansharif irfansharif force-pushed the 201109.migrations-service branch 3 times, most recently from 3991cde to 33c2588 on November 10, 2020 03:31
irfansharif added a commit to irfansharif/cockroach that referenced this pull request Nov 10, 2020
...in favor of direct RPCs to all nodes in the cluster. It uses the
building blocks we've added thus far to replace the use of gossip to
disseminate the cluster version. It does so by sending out individual
RPCs to each node in the cluster, informing them of a version bump, all
the while retaining the same guarantees provided by our (now previous)
gossip-backed mechanism.

This diff has the following "pieces":
- We disconnect the version setting updates through gossip (by
  disconnecting the setting type within the updater process)
- We use the `Migration` service to send out RPCs to each node in the
  cluster, containing the payload that each node would otherwise receive
  through gossip. We do this by first introducing two primitives in
  pkg/migrations:
  - `RequiredNodes` retrieves a list of all nodes that are part of the
    cluster. It's powered by `pkg/../liveness`.
  - `EveryNode` is a shorthand that allows us to send out node-level
    migration RPCs to every node in the cluster.
  We combine these primitives with the RPCs introduced in cockroachdb#56476
  (`ValidateTargetClusterVersion`, `BumpClusterVersion`) to actually
  carry out the version bumps.
- We expand the `clusterversion.Handle` interface to allow setting the
  active version directly through it. We then make use of it in the
  implementation for `BumpClusterVersion`.
- Within `BumpClusterVersion`, we persist the cluster version received
  from the client node first, within `keys.StoreClusterVersionKey`, before
  bumping the version gate. This is a required invariant in the system
  in order for us to not regress our cluster version on restart. It was
  previously achieved by attaching callbacks on the version handle
  (`SetBeforeChange`).
- We no longer need the callbacks attached to gossip to persist cluster
  versions to disk. We're doing it as part of the `BumpClusterVersion`
  RPC. We remove them entirely.
- We use the active version provided by the join RPC to set the
  version setting directly (after having persisted it first). This too
  was previously achieved through gossip + the callback.

Release note: None
Member

@tbg tbg left a comment


Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @irfansharif, @knz, and @tbg)


pkg/server/migration.go, line 31 at r5 (raw file):

	server *Server

	// We used this mutex to serialize attempts to bump the cluster version.

We use


pkg/server/migration.go, line 48 at r5 (raw file):

	// We're validating the following:
	//
	//   node's minimum supported version <= target version <= node's binary version

Why does the min supported version even come into play? Aren't we also validating that we're not winding back the cluster version (I guess leaving it at the same place is necessary for idempotence), i.e. why isn't this

if targetVersion.Less(versionSetting.ActiveVersion())

The active version is >= the binary min supported, so this is just a stricter check.


pkg/server/migration.go, line 68 at r5 (raw file):

// BumpClusterVersion implements the MigrationServer interface. It's used to
// inform us a cluster version bump. Here we're responsible for durably

of a


pkg/server/migration.go, line 90 at r5 (raw file):

		}

		m.Lock()

Shouldn't BumpClusterVersion in its entirety be under this lock? prevCV might've changed by the time you get here.
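The race being flagged here (read prevCV, then write, without holding the lock across both) can be illustrated with a toy sketch; the type and field names are hypothetical, not the actual `pkg/server/migration.go` code:

```go
package main

import (
	"fmt"
	"sync"
)

// migrationServer is a toy stand-in for the server type under review.
type migrationServer struct {
	mu       sync.Mutex
	clusterV int
}

// bump reads the previous version and writes the new one inside a single
// critical section, so prevCV cannot change between the read and the write.
func (m *migrationServer) bump(target int) (prev int) {
	m.mu.Lock()
	defer m.mu.Unlock()
	prev = m.clusterV
	if target > m.clusterV {
		m.clusterV = target
	}
	return prev
}

func main() {
	m := &migrationServer{clusterV: 1}
	var wg sync.WaitGroup
	for i := 2; i <= 5; i++ {
		wg.Add(1)
		go func(v int) { defer wg.Done(); m.bump(v) }(i)
	}
	wg.Wait()
	// The highest target wins regardless of goroutine ordering, and the
	// version never regresses.
	fmt.Println(m.clusterV) // prints 5
}
```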


pkg/server/migration.go, line 94 at r5 (raw file):

		// TODO(irfansharif): We should probably capture this pattern of
		// "persist the cluster version first" and only then bump the

Do we? This will be the only place that adjusts the cluster version, so keeping it explicitly as two steps carried out in a certain order (with a comment) seems totally fine.


pkg/server/migration.go, line 107 at r5 (raw file):

		// gate here. On 21.1 nodes we'll no longer be using gossip to propagate
		// cluster version bumps. We'll probably still have to disseminate it
		// through gossip (do we actually have to?), but we won't listen to it.

I don't think we do! I mean, we kind of will anyway, at least as long as the version setting is hooked up and updated, simply because it's not worth the effort of avoiding it.


pkg/server/migration_test.go, line 37 at r5 (raw file):

		binaryMinSupportedVersion roachpb.Version
		targetVersion             roachpb.Version
		expErrMatch               string // Empty if expecting a nil error.

super nit: the old thing about lowercase and no punctuation in inline comments


pkg/server/serverpb/migration.proto, line 42 at r5 (raw file):

   // Specifically:
   //
   //   node's minimum supported version <= version <= node's binary version

see my other comment, this here may need an update as well if we change the impl


pkg/server/serverpb/migration.proto, line 53 at r5 (raw file):

   // that would be able to support the intended version bump.
   //
   // NB: We should note that this validation is a best-effort one. Albeit

This is why we discussed noop-versions before any real version. The first version bump will be into v21.2.0-1noop. If the 21.1 node sneaks in there, it won't do any harm because that version bump by definition does not change behavior in the 21.2 nodes. Any further bump will include the 21.1 node and will thus fail.
If the 21.1 node tries to show up after 21.2.0-1noop is active, it can't join.

If you don't mind, update that comment to reflect this.

This may also be a good place (for now) to write the comment about how the migration manager will load the list of nodes, roll out, load the list again, and roll out again if anything changed (until the node list stabilizes), which prevents a cluster version from becoming active without having been pushed to all nodes. This would reference the join RPC (which propagates the active version) and also the fact that the cluster version becomes "active" only after it has been "bumped" across the cluster using this RPC.

Perhaps the comment will eventually move elsewhere, but it does seem super relevant to write it early.
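The "load the node list, roll out, re-load, repeat until stable" loop described in the comment above could be sketched as follows; `untilClusterStable` and the callback shapes are hypothetical names for illustration only:

```go
package main

import "fmt"

// untilClusterStable pushes to every node in the current node list, then
// re-loads the list; only when two consecutive loads agree has the push
// provably reached all nodes (any node joining mid-rollout triggers another
// pass). This prevents a cluster version from becoming active without having
// been pushed to all nodes.
func untilClusterStable(load func() []int, push func([]int)) []int {
	nodes := load()
	for {
		push(nodes)
		again := load()
		if equal(nodes, again) {
			return nodes
		}
		nodes = again
	}
}

func equal(a, b []int) bool {
	if len(a) != len(b) {
		return false
	}
	for i := range a {
		if a[i] != b[i] {
			return false
		}
	}
	return true
}

func main() {
	// Simulate a node joining between the first push and the re-load.
	lists := [][]int{{1, 2}, {1, 2, 3}, {1, 2, 3}}
	i := 0
	load := func() []int {
		l := lists[i]
		if i < len(lists)-1 {
			i++
		}
		return l
	}
	var pushes int
	push := func([]int) { pushes++ }
	fmt.Println(untilClusterStable(load, push), pushes) // prints [1 2 3] 2
}
```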

@tbg tbg self-requested a review November 10, 2020 09:43
Member

@tbg tbg left a comment


Reviewed 2 of 3 files at r1, 1 of 5 files at r2, 2 of 3 files at r3, 1 of 3 files at r4, 9 of 9 files at r5.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @irfansharif and @knz)

Contributor

@knz knz left a comment


No further comments besides what Tobias already provided.

Reviewed 2 of 3 files at r1, 1 of 5 files at r2, 2 of 3 files at r3, 9 of 9 files at r5.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @irfansharif)


pkg/server/migration_test.go, line 65 at r5 (raw file):

	}

	//   node's minimum supported version <= target version <= node's binary version

nit: comment should be a sentence.

@irfansharif irfansharif force-pushed the 201109.migrations-service branch from 33c2588 to 1624652 on November 11, 2020 02:04
Contributor Author

@irfansharif irfansharif left a comment


Thanks for the reviews y'all.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @irfansharif, @knz, and @tbg)


pkg/server/migration.go, line 31 at r5 (raw file):

Previously, tbg (Tobias Grieger) wrote…

We use

Done.


pkg/server/migration.go, line 48 at r5 (raw file):

Previously, tbg (Tobias Grieger) wrote…

Why does the min supported version even come into play? Aren't we also validating that we're not winding back the cluster version (I guess leaving it at the same place is necessary for idempotence), i.e. why isn't this

if targetVersion.Less(versionSetting.ActiveVersion())

The active version is >= the binary min supported, so this is just a stricter check.

I don't really have a super good reason here, but I'll try anyway: I think my inclination to think in terms of binary versions comes from the fact that the errors here, if any, are going to be bubbled up to the user directly during improper cluster upgrade attempts. During those attempts it's much more understandable to talk in terms of the binaries they're running (and not running), rather than the active version any given binary happens to be running.

The structure here also resembles the validateBinaryVersions check we have for cluster version settings, and yes we're making sure that we're not regressing on the cluster version elsewhere.


pkg/server/migration.go, line 68 at r5 (raw file):

Previously, tbg (Tobias Grieger) wrote…

of a

Done.


pkg/server/migration.go, line 90 at r5 (raw file):

Previously, tbg (Tobias Grieger) wrote…

Shouldn't BumpClusterVersion in its entirety be under this lock? prevCV might've changed by the time you get here.

Done.


pkg/server/migration.go, line 94 at r5 (raw file):

Previously, tbg (Tobias Grieger) wrote…

Do we? This will be the only place that adjusts the cluster version, so keeping it explicitly as two steps carried out in a certain order (with a comment) seems totally fine.

It happens in two other spots during node startup:

  • when joining an existing cluster and receiving a target cluster version.
  • when bootstrapping, where we initialize at the binary version and also persist it.

I'm not sure that I'll do much here soon, but there's definitely some drive-by cleanup to be had.
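The "persist the cluster version first, only then bump the in-memory gate" pattern being discussed can be sketched as a toy; `store`, `handle`, and the function name below are hypothetical simplifications, not the actual CockroachDB types:

```go
package main

import "fmt"

// store models durable on-disk state (keys.StoreClusterVersionKey) and
// handle models the in-memory version gate (clusterversion.Handle); both
// are toy stand-ins.
type store struct{ persisted int }
type handle struct{ active int }

func bumpClusterVersion(s *store, h *handle, target int) {
	// Persist to disk first: if we crash after this point but before the
	// in-memory bump, we restart at the (higher) persisted version and
	// never regress below a version gate we may already have exposed.
	s.persisted = target
	h.active = target
}

func main() {
	s, h := &store{persisted: 1}, &handle{active: 1}
	bumpClusterVersion(s, h, 2)
	fmt.Println(s.persisted, h.active) // prints 2 2
}
```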


pkg/server/migration.go, line 107 at r5 (raw file):

Previously, tbg (Tobias Grieger) wrote…

I don't think we do! I mean, we kind of will anyway, at least as long as the version setting is hooked up and updated, simply because it's not worth avoiding to do so.

Yea, excited to see us introduce dedicated syntax for versions soon so we can strip it all out entirely.


pkg/server/migration_test.go, line 37 at r5 (raw file):

Previously, tbg (Tobias Grieger) wrote…

super nit: the old thing about lowercase and no punctuation in inline comments

Done.


pkg/server/serverpb/migration.proto, line 53 at r5 (raw file):

Previously, tbg (Tobias Grieger) wrote…

This is why we discussed noop-versions before any real version. The first version bump will be into v21.2.0-1noop. If the 21.1 node sneaks in there, it won't do any harm because that version bump does by definition not change behavior in the 21.2 nodes. Any further bump will include the 21.1 node and will thus fail.
If the 21.1 node tries to show up after 21.2.0-1noop is active, it can't join.

If you don't mind, update that comment to reflect this.

This may also be a good place (for now) to write the comment about how the migration manager will load the list of nodes, roll out, load the list again, and roll out again if anything changed (until the node list stabilizes), which prevents a cluster version from becoming active without having been pushed to all nodes. This would reference the join RPC (which propagates the active version) and also the fact that the cluster version becomes "active" only after it has been "bumped" across the cluster using this RPC.

Perhaps the comment will eventually move elsewhere, but it does seem super relevant to write it early.

Done. Thanks for spelling it out, I feel like it's the third time this has clicked for me; I keep forgetting it.

@irfansharif irfansharif force-pushed the 201109.migrations-service branch from 1624652 to 3ce0f35 on November 11, 2020 03:20
@irfansharif
Contributor Author

bors r+

@craig
Contributor

craig bot commented Nov 11, 2020

Build succeeded:

@craig craig bot merged commit df21e50 into cockroachdb:master Nov 11, 2020
@irfansharif irfansharif deleted the 201109.migrations-service branch November 11, 2020 04:04
irfansharif added a commit to irfansharif/cockroach that referenced this pull request Nov 11, 2020
irfansharif added a commit to irfansharif/cockroach that referenced this pull request Nov 11, 2020
craig bot pushed a commit that referenced this pull request Nov 11, 2020
56480: settings,migration: disconnect cluster version from gossip r=irfansharif a=irfansharif

...in favor of direct RPCs to all nodes in the cluster. It uses the
building blocks we've added thus far to replace the use of gossip to
disseminate the cluster version. It does so by sending out individual
RPCs to each node in the cluster, informing them of a version bump, all
the while retaining the same guarantees provided by our (now previously)
gossip-backed mechanism. This is another in the series of PRs to
introduce long running migrations (#48843), pulled out of our original
prototype in #56107.

This diff has the following "pieces":
- We disconnect the version setting updates through gossip (by
  disconnecting the setting type within the updater process)
- We use the `Migration` service to send out RPCs to each node in the
  cluster, containing the payload that each node would otherwise receive
  through gossip. We do this by first introducing two primitives in
  pkg/migrations:
  - `RequiredNodes` retrieves a list of all nodes that are part of the
    cluster. It's powered by `pkg/../liveness`.
  - `EveryNode` is a shorthand that allows us to send out node-level
    migration RPCs to every node in the cluster.
  
  We combine these primitives with the RPCs introduced in #56476
  (`ValidateTargetClusterVersion`, `BumpClusterVersion`) to actually
  carry out the version bumps.
- We expand the `clusterversion.Handle` interface to allow setting the
  active version directly through it. We then make use of it in the
  implementation for `BumpClusterVersion`.
- Within `BumpClusterVersion`, we persist the cluster version received
  from the client node first, within `keys.StoreClusterVersionKey`, before
  bumping the version gate. This is a required invariant in the system
  in order for us to not regress our cluster version on restart. It was
  previously achieved by attaching callbacks on the version handle
  (`SetBeforeChange`).
- We no longer need the callbacks attached to gossip to persist cluster
  versions to disk. We're doing it as part of the `BumpClusterVersion`
  RPC. We remove them entirely.
- We use the active version provided by the join RPC to set the
  version setting directly (after having persisted it first). This too
  was previously achieved through gossip + the callback.

Release note: None

---

Only the last commit is of interest. All prior commits should be reviewed
across #56476, #56474 and #56368.

Co-authored-by: irfan sharif <irfanmahmoudsharif@gmail.com>