Skip to content

migrations part 1/2: add version cluster setting#17216

Merged
tbg merged 1 commit intocockroachdb:masterfrom
tbg:settings
Jul 31, 2017
Merged

migrations part 1/2: add version cluster setting#17216
tbg merged 1 commit intocockroachdb:masterfrom
tbg:settings

Conversation

@tbg
Copy link
Copy Markdown
Member

@tbg tbg commented Jul 25, 2017

Implement (hopefully) more than 50% of the migration RFC1. Specifically,

  • add a StatemachineSetting to the settings package
  • use it for SET CLUSTER SETTING version = '<major>.minor'
  • pass a newly created *ExposedClusterVersion around
  • use (*ExposedClusterVersion).IsActive(targetVersion) to toggle existing
    features that so far used the stub util.IsMigrated()
  • remove util.IsMigrated().

The missing part is incorporating the remaining parts of 2, that is, storing
the seen version to disk.

@tbg tbg added this to the 1.1 milestone Jul 25, 2017
@tbg tbg requested review from bdarnell and dt July 25, 2017 18:35
@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

@tbg tbg force-pushed the settings branch 3 times, most recently from a2fe5b1 to 5979ae8 Compare July 25, 2017 21:23
@petermattis
Copy link
Copy Markdown
Collaborator

Looking good. Some nits below and confusion about the <major>.<minor>.<unstable> syntax. Perhaps we should use <major>.<minor>-unstable<unstable>.


Review status: 0 of 32 files reviewed at latest revision, 6 unresolved discussions, some commit checks failed.


pkg/base/cluster_version.go, line 43 at r1 (raw file):

	// VersionBase is the empty version and corresponds to any binary
	// older than 1.0.1, though these binaries won't know anything

This is mildly confusing. 1.0.1 doesn't correspond to the actual 1.0.1 release, right?


pkg/base/cluster_version.proto, line 32 at r1 (raw file):

  // Support for that functionality is guaranteed by the ratchet
  // of minimum_version.
  roachpb.Version use_version = 2 [(gogoproto.nullable) = false];

Bikeshed: should this be named inuse_version? The comment even uses the term in use.


pkg/migration/version.go, line 31 at r1 (raw file):

)

func parseAndValidateVersion(s string) (roachpb.Version, error) {

Should this method be roachpb.ParseVersion or something like that? Seems strange to have Version.String() in one package and the parsing in another.


pkg/migration/version.go, line 52 at r1 (raw file):

	c.Major = int32(ints[0])
	c.Minor = int32(ints[1])
	c.Unstable = int32(ints[2])

The Version proto has 4 fields, yet we only parse 3. I feel that this is part of my confusion mentioned in another comment.


pkg/settings/statemachine.go, line 62 at r1 (raw file):

// but not semantically. Users must call Validate with the authoritative
// previous state, the suggested state transition, and then overwrite the state
// machine with the result.

I haven't fully grok'ed the usage here, so apologies if this is misguided: did you consider adding a ValidationFn to the existing *Settings? I would think most of the above could be had by such a validation function hook. For the cluster version, I imagine a validation function on a StringSetting would be sufficient. It is very possible that I'm missing a constraint that is dictating this approach.


pkg/sql/show.go, line 96 at r1 (raw file):

					"SELECT value FROM system.settings WHERE name = $1",
					name,
				)

I thought SHOW CLUSTER SETTING was already a consistent read. @dt?


Comments from Reviewable

@tbg
Copy link
Copy Markdown
Member Author

tbg commented Jul 26, 2017

I like <maj>.<min>-unstable<x>. Will wait for some additional reviews before I get back to work on this.


Review status: 0 of 32 files reviewed at latest revision, 6 unresolved discussions, some commit checks failed.


pkg/base/cluster_version.go, line 43 at r1 (raw file):

Previously, petermattis (Peter Mattis) wrote…

This is mildly confusing. 1.0.1 doesn't correspond to the actual 1.0.1 release, right?

I think most of your confusion comes from the fact that version=major.minor.patch.unstable, but we always have patch==0 (RFC wants this) but don't want to print the patch version. I would be happy to turn the patch proto field into a reserved one and rip it out of the logic for now.


pkg/settings/statemachine.go, line 62 at r1 (raw file):

Previously, petermattis (Peter Mattis) wrote…

I haven't fully grok'ed the usage here, so apologies if this is misguided: did you consider adding a ValidationFn to the existing *Settings? I would think most of the above could be had by such a validation function hook. For the cluster version, I imagine a validation function on a StringSetting would be sufficient. It is very possible that I'm missing a constraint that is dictating this approach.

a validation doesn't cut it because the new value is the result of the validation. You also need the input to always be consistent or your output won't be (potentially breaking monotonicity of min_version). Plus you have a diverging internal (what's in the table) and external representation (what's showed to the client) which needs to be translated both ways. I initially tried exactly what you're describing but, yeah, it's more complicated to the point where it's easier to just make a new setting.


pkg/sql/show.go, line 96 at r1 (raw file):

Previously, petermattis (Peter Mattis) wrote…

I thought SHOW CLUSTER SETTING was already a consistent read. @dt?

You can see from the adjacent code that it isn't. It just reads the setting variable, but that isn't consistent. I think I would want to make show cluster setting X always read from the table, but wanted to keep the initial changes focused in case that's not desired.

I think though that if you show all cluster settings, you get a consistent read, maybe?

return p.delegateQuery(ctx, "SHOW CLUSTER SETTINGS",
			"TABLE crdb_internal.cluster_settings", nil, nil)

I also feel that I shouldn't be hooking into system.settings here but I think we're already below the level of abstraction at which crdb_internal is available.


Comments from Reviewable

@petermattis
Copy link
Copy Markdown
Collaborator

Review status: 0 of 32 files reviewed at latest revision, 6 unresolved discussions, some commit checks failed.


pkg/settings/statemachine.go, line 62 at r1 (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…

a validation doesn't cut it because the new value is the result of the validation. You also need the input to always be consistent or your output won't be (potentially breaking monotonicity of min_version). Plus you have a diverging internal (what's in the table) and external representation (what's showed to the client) which needs to be translated both ways. I initially tried exactly what you're describing but, yeah, it's more complicated to the point where it's easier to just make a new setting.

Ack. I wonder if there is a better term for this than "state machine". Yeah, it is a state machine, but so are so many other things. I guess the term feels to general purpose.


pkg/sql/show.go, line 96 at r1 (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…

You can see from the adjacent code that it isn't. It just reads the setting variable, but that isn't consistent. I think I would want to make show cluster setting X always read from the table, but wanted to keep the initial changes focused in case that's not desired.

I think though that if you show all cluster settings, you get a consistent read, maybe?

return p.delegateQuery(ctx, "SHOW CLUSTER SETTINGS",
			"TABLE crdb_internal.cluster_settings", nil, nil)

I also feel that I shouldn't be hooking into system.settings here but I think we're already below the level of abstraction at which crdb_internal is available.

Yeah, I'd want show cluster setting X to always read from the table. @dt Any reason it isn't already doing that?


Comments from Reviewable

@tbg
Copy link
Copy Markdown
Member Author

tbg commented Jul 26, 2017

Review status: 0 of 32 files reviewed at latest revision, 6 unresolved discussions, some commit checks failed.


pkg/settings/statemachine.go, line 62 at r1 (raw file):

Previously, petermattis (Peter Mattis) wrote…

Ack. I wonder if there is a better term for this than "state machine". Yeah, it is a state machine, but so are so many other things. I guess the term feels to general purpose.

I find it pretty spot on and I don't like bikeshedding enough to participate, but if you come up with a stellar suggestion I'll be happy to rename.


pkg/sql/show.go, line 96 at r1 (raw file):

Previously, petermattis (Peter Mattis) wrote…

Yeah, I'd want show cluster setting X to always read from the table. @dt Any reason it isn't already doing that?

There's a straightforward reason, which is that nothing is written to the table unless the default is changed. This is actually a problem I need to address, as every cluster should always have the version setting as a running node can't really guess a default. It could be made workable because all nodes that know about versioning could persist their running versions and use that as the default, but this only seems to invite further complication.

This means that cluster init has to populate at least that setting, at which point it might as well populate all of them.


Comments from Reviewable

@bdarnell
Copy link
Copy Markdown
Contributor

LGTM


Reviewed 35 of 35 files at r1.
Review status: all files reviewed at latest revision, 5 unresolved discussions, some commit checks failed.


pkg/settings/statemachine.go, line 62 at r1 (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…

I find it pretty spot on and I don't like bikeshedding enough to participate, but if you come up with a stellar suggestion I'll be happy to rename.

I also don't like StatemachineSetting, but my best alternative, TransitionSetting, isn't very good either. I would, however, spell StatemachineSetting with a capital M.


Comments from Reviewable

@dt
Copy link
Copy Markdown
Contributor

dt commented Jul 27, 2017

:lgtm: (only looked at the settings part)


Review status: all files reviewed at latest revision, 7 unresolved discussions, some commit checks failed.


pkg/settings/statemachine.go, line 31 at r1 (raw file):

// For (a nonsensical) example, a StatemachineSetting can be used to maintain an
// encoded protobuf containing an integer that the user can only increment by 3
// if the int is odd and by 2 if it is equal. More generally, the setting starts

nit: s/equal/even/ right?


pkg/settings/updater.go, line 58 at r1 (raw file):

// Set attempts to parse and update a setting and notes that it was updated.
func (u Updater) Set(key string, rawValue string, vt string) error {

huh, is this a style rule I missed?


Comments from Reviewable

@dt
Copy link
Copy Markdown
Contributor

dt commented Jul 27, 2017

Review status: all files reviewed at latest revision, 8 unresolved discussions, some commit checks failed.


pkg/sql/show.go, line 96 at r1 (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…

There's a straightforward reason, which is that nothing is written to the table unless the default is changed. This is actually a problem I need to address, as every cluster should always have the version setting as a running node can't really guess a default. It could be made workable because all nodes that know about versioning could persist their running versions and use that as the default, but this only seems to invite further complication.

This means that cluster init has to populate at least that setting, at which point it might as well populate all of them.

yeah, the settings design in the initial rfc basically sees the table as only for overrides of defaults:
The decision to only persist explicit overrides was to make it easier to tune defaults -- otherwise we'd need migrations to rewrite the settings table if we discover a better value for some default, making tuning more expensive (and we'd also need to, of course, track which settings had explicit vs default values to do that migration correctly).

select * from system.settings gets you a consistent read of the current non-default settings, but show cluster setting shows you whatever a node is currently using -- which is probably whatever its hardcoded default is, but might be its current, cached, override.


Comments from Reviewable

Implement (hopefully) more than 50% of the migration RFC[1]. Specifically,

- add a StateMachineSetting to the `settings` package
- use it for `SET CLUSTER SETTING version = '<major>.<minor>-<unstable>'`
- pass a newly created `*ExposedClusterVersion` around
- use `(*ExposedClusterVersion).IsActive(targetVersion)` to toggle existing
features that so far used the stub `util.IsMigrated()`
- remove `util.IsMigrated()`.

The missing part is incorporating the remaining parts of [2], that is, storing
the seen version to disk.

[1]: cockroachdb#16977
[2]: cockroachdb#17155
@tbg
Copy link
Copy Markdown
Member Author

tbg commented Jul 31, 2017

TFTRs! Merging on green.


Review status: 13 of 33 files reviewed at latest revision, 9 unresolved discussions, some commit checks pending.


pkg/base/cluster_version.proto, line 32 at r1 (raw file):

Previously, petermattis (Peter Mattis) wrote…

Bikeshed: should this be named inuse_version? The comment even uses the term in use.

I'll leave as is for now, keeping the wording congruent with the RFC.


pkg/migration/version.go, line 31 at r1 (raw file):

Previously, petermattis (Peter Mattis) wrote…

Should this method be roachpb.ParseVersion or something like that? Seems strange to have Version.String() in one package and the parsing in another.

Good point, done.


pkg/migration/version.go, line 52 at r1 (raw file):

Previously, petermattis (Peter Mattis) wrote…

The Version proto has 4 fields, yet we only parse 3. I feel that this is part of my confusion mentioned in another comment.

I changed the version to <major>.<minor>-<unstable> as you suggested. I think that works a lot better.


pkg/settings/statemachine.go, line 31 at r1 (raw file):

Previously, dt (David Taylor) wrote…

nit: s/equal/even/ right?

Yep, thanks. Fixed.


pkg/settings/statemachine.go, line 62 at r1 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

I also don't like StatemachineSetting, but my best alternative, TransitionSetting, isn't very good either. I would, however, spell StatemachineSetting with a capital M.

StateMachineSetting it is (until anyone feels strongly enough and swoops in for a rename, anyway).


pkg/settings/updater.go, line 58 at r1 (raw file):

Previously, dt (David Taylor) wrote…

huh, is this a style rule I missed?

Nope, just a spurious diff. Reverted.


pkg/sql/show.go, line 96 at r1 (raw file):

Previously, dt (David Taylor) wrote…

yeah, the settings design in the initial rfc basically sees the table as only for overrides of defaults:
The decision to only persist explicit overrides was to make it easier to tune defaults -- otherwise we'd need migrations to rewrite the settings table if we discover a better value for some default, making tuning more expensive (and we'd also need to, of course, track which settings had explicit vs default values to do that migration correctly).

select * from system.settings gets you a consistent read of the current non-default settings, but show cluster setting shows you whatever a node is currently using -- which is probably whatever its hardcoded default is, but might be its current, cached, override.

That distinction between system.settings and SHOW CLUSTER SETTINGS makes some sense, thanks @dt!
I'll punt on addressing this fully in this PR. @m-schneider and I will look at how to go about the default setting separately. Seems relatively clear to me that we'll just want to make sure there's always a persisted cluster version. What we'll have to figure out is if we're going to change anything beyond that.


Comments from Reviewable

@dt
Copy link
Copy Markdown
Contributor

dt commented Jul 31, 2017

Review status: 13 of 33 files reviewed at latest revision, 6 unresolved discussions, all commit checks successful.


pkg/sql/show.go, line 96 at r1 (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…

That distinction between system.settings and SHOW CLUSTER SETTINGS makes some sense, thanks @dt!
I'll punt on addressing this fully in this PR. @m-schneider and I will look at how to go about the default setting separately. Seems relatively clear to me that we'll just want to make sure there's always a persisted cluster version. What we'll have to figure out is if we're going to change anything beyond that.

For the diagnostic reporting setting, semantically we wanted a default of true, but since settings need to be allowed to return defaults until they are read and didn't want to risk ignoring an explicit false and betraying user trust, we went with a default of false with a startup migration writing an explicit true (optInToDiagnosticsStatReporting in migrations.go). This feels pretty similar.


Comments from Reviewable

@tbg tbg merged commit fcbbb9a into cockroachdb:master Jul 31, 2017
@tbg tbg deleted the settings branch July 31, 2017 16:02
tbg added a commit to tbg/cockroach that referenced this pull request Jul 31, 2017
The packages were rearranged in

cockroachdb#17216
@a-robinson
Copy link
Copy Markdown
Contributor

Does ExposedClusterVersion have to be part of a StoreConfig rather than a package-scoped variable? It's going to be fairly awkward to plumb into the allocator scorer methods.

@tbg
Copy link
Copy Markdown
Member Author

tbg commented Aug 2, 2017

I responded to this offline, but for posterity: I feel fairly strongly that the less singletons, the better. The cluster version setting var is a particularly annoying, because due to it it's impossible to run mixed version cluster tests unless we're going straight for acceptance tests (in which case we have to juggle binary versions, etc, and Docker pain). Plumbing can certainly be annoying, but I think it's the lesser of two evils almost always, and there are some ways to make it more pleasant which I'm happy to contribute to moving forward.

irfansharif added a commit to irfansharif/cockroach that referenced this pull request Oct 26, 2020
We introduced the custom StateMachineSetting type in cockroachdb#17216 to power the
underlying machinery for version upgrades:

  SET CLUSTER SETTING version = <major>-<minor>;

At the time we left it generalizable enough to make room for future
settings with arbitrary internal transitions. For cluster versions this
meant only allowing transitions from one major version to the immediate
next, all the while making sure that each binary in the cluster was able
to activate the targeted version (the proposed version fell within the
allowable range ["binary min supported version", "binary version"]).

In the three years since we haven't added any state-machine validated
settings that fit the bill, and the layers of abstractions to get to
validated cluster version updates are one too many for (my) comfort.
This PR introduces a new VersionSetting type that is more tightly
coupled with the ClusterVersion type itself.

VersionSetting uses language that only appropriate for cluster versions,
hopefully clarifying things as we go. We'll depend on this clarification
in future PRs when we remove the use of gossip in disseminating cluster
version bumps.

Release note (sql/cli change): The underlying type for the version
cluster setting has been changed. Previously it was of an internal type
representing "state machine", but now it's simply "version". This has no
operational implications, but it does reflect differently in a few
spots:
  - The `Type` column in `cockroach gen settings-list` will now show
    "version" instead of "custom validation"
  - The `setting_type` column for `version` in `SHOW CLUSTER SETTINGS`
    will now show a "v" instead of an "m"
  - The `valueType` column for `version` in `system.settings` will now
    show a "v" instead of an "m"
irfansharif added a commit to irfansharif/cockroach that referenced this pull request Oct 27, 2020
We introduced the custom StateMachineSetting type in cockroachdb#17216 to power the
underlying machinery for version upgrades:

  SET CLUSTER SETTING version = <major>-<minor>;

At the time we left it generalizable enough to make room for future
settings with arbitrary internal transitions. For cluster versions this
meant only allowing transitions from one major version to the immediate
next, all the while making sure that each binary in the cluster was able
to activate the targeted version (the proposed version fell within the
allowable range ["binary min supported version", "binary version"]).

In the three years since we haven't added any state-machine validated
settings that fit the bill, and the layers of abstractions to get to
validated cluster version updates are one too many for (my) comfort.
This PR introduces a new VersionSetting type that is more tightly
coupled with the ClusterVersion type itself.

VersionSetting uses language that only appropriate for cluster versions,
hopefully clarifying things as we go. We'll depend on this clarification
in future PRs when we remove the use of gossip in disseminating cluster
version bumps.

Release note (sql/cli change): The underlying type for the version
cluster setting has been changed. Previously it was of an internal type
representing "state machine", but now it's simply "version". This has no
operational implications, but it does reflect differently in a few
spots:
  - The `Type` column in `cockroach gen settings-list` will now show
    "version" instead of "custom validation"
  - The `setting_type` column for `version` in `SHOW CLUSTER SETTINGS`
    will now show a "v" instead of an "m"
  - The `valueType` column for `version` in `system.settings` will now
    show a "v" instead of an "m"
craig bot pushed a commit that referenced this pull request Oct 27, 2020
55877: rowflow: account for more memory usage of the row buffer r=yuzefovich a=yuzefovich

Previously, we accounted for the memory usage of the row buffer from
which the rows are pushed from in the routers only when copying the rows
from the row container, but we also have another in-memory row buffer
that we can move the rows from which was missing the memory accounting.
This is now fixed which also deflakes the router disk spill test.

Fixes: #55848.

Release note: None

55907: sql: remove vectorize=201auto option r=yuzefovich a=yuzefovich

`201auto` option of `vectorize` setting has been removed since it no
longer makes sense to keep (it was introduced as an escape hatch for
20.2 release). Note that we don't need to bump the distsql version
because of this change since it is backwards compatible - if the gateway
is running the old version that has `vectorize=201auto` set, then we
will check whether flows for all nodes don't have non-streaming
operators and possibly use `off` option on the flow setup request, then
the newer version remote node will check the vectorize setting on the
request whether it is not `off` and setup the vectorized flow if it is
not.

Release note (sql change): `201auto` value for `vectorize` session
variable and the corresponding cluster setting has been removed.

55982: roachtest: use IMPORT for all TPC-C tests r=nvanbenschoten a=nvanbenschoten

Fixes #55541.
Fixes #55580.

In #54880 and #55050, we switched to using IMPORT for most TPC-C tests,
in favor of BACKUP, which requires fixtures to be regenerated whenever
we changed the workload. In this PR, we switch over the remaining uses
of `fixtures load tpcc` to `fixtures import tpcc` to avoid compatibility
issues on older release branches.

55994: settings: delete StateMachineSetting, introduce VersionSetting r=irfansharif a=irfansharif

We introduced the custom StateMachineSetting type in #17216 to power the
underlying machinery for version upgrades:

  SET CLUSTER SETTING version = <major>-<minor>;

At the time we left it generalizable enough to make room for future
settings with arbitrary internal transitions. For cluster versions this
meant only allowing transitions from one major version to the immediate
next, all the while making sure that each binary in the cluster was able
to activate the targeted version (the proposed version fell within the
allowable range ["binary min supported version", "binary version"]).

In the three years since we haven't added any state-machine validated
settings that fit the bill, and the layers of abstractions to get to
validated cluster version updates are one too many for (my) comfort.
This PR introduces a new VersionSetting type that is more tightly
coupled with the ClusterVersion type itself.

VersionSetting uses language that only appropriate for cluster versions,
hopefully clarifying things as we go. We'll depend on this clarification
in future PRs when we remove the use of gossip in disseminating cluster
version bumps.

Release note (sql/cli change): The underlying type for the version
cluster setting has been changed. Previously it was of an internal type
representing "state machine", but now it's simply "version". This has no
operational implications, but it does reflect differently in a few
spots:
  - The `Type` column in `cockroach gen settings-list` will now show
    "version" instead of "custom validation"
  - The `setting_type` column for `version` in `SHOW CLUSTER SETTINGS`
    will now show a "v" instead of an "m"
  - The `valueType` column for `version` in `system.settings` will now
    show a "v" instead of an "m"


56006: logictest: deflake a test r=yuzefovich a=yuzefovich

We've recently merged a test that in very rare circumstances could
produce a float result that differs from the expected one by 1 in the
15th significant digit (after rounding). I believe that could occur,
e.g. when the 15th and 16th significant digits were `35`, and we matched
the spec of supporting 15 significant digits for floats, yet the
rounding makes us return an unexpected result. This commit rounds to the
precision of 1 digit less which should make the test non-flaky.

Release note: None

Co-authored-by: Yahor Yuzefovich <yahor@cockroachlabs.com>
Co-authored-by: Nathan VanBenschoten <nvanbenschoten@gmail.com>
Co-authored-by: irfan sharif <irfanmahmoudsharif@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants