Skip to content

sql, config: make voter_constraints inherit from parent by default#63079

Merged
craig[bot] merged 2 commits intocockroachdb:masterfrom
aayushshah15:handle-voter-constraints-legacy-zone-configs
Apr 14, 2021
Merged

sql, config: make voter_constraints inherit from parent by default#63079
craig[bot] merged 2 commits intocockroachdb:masterfrom
aayushshah15:handle-voter-constraints-legacy-zone-configs

Conversation

@aayushshah15
Copy link
Copy Markdown
Contributor

@aayushshah15 aayushshah15 commented Apr 5, 2021

Resolves #61316

This commit inverts the definition of the flag used to keep track of
whether voter_constraints were empty or inherited. This allows legacy
(pre-21.1) zone configs to be correctly interpreted by 21.1 nodes.

Release note (backward-incompatible change): the internal representation
of the voter_constraints zone configuration attribute (new in 21.1)
has been altered in a way that's partially incompatible with the
representation used by previous 21.1 betas (and the alphas that include
this attribute). This means that Users who directly set the
voter_constraints attribute to an empty list will lose those
constraints and will have to reset them.

@aayushshah15 aayushshah15 requested a review from a team as a code owner April 5, 2021 11:13
@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

@aayushshah15 aayushshah15 removed the request for review from a team April 5, 2021 11:14
@otan
Copy link
Copy Markdown
Contributor

otan commented Apr 5, 2021

i think this is backwards incompatible with the betas, which according to @ajstorm is something we care about :\
would it be easier to write the migrations?

@ajstorm
Copy link
Copy Markdown
Collaborator

ajstorm commented Apr 5, 2021

Nice to see this @aayushshah15. Can you talk a bit more about how it's backwards incompatible?

@otan
Copy link
Copy Markdown
Contributor

otan commented Apr 5, 2021

inherited_voter_constraints has been renamed and re-used as explicitly_set_voter_constraints in the protobuf. anyone who has used voter constraints in the past will have the meaning "swapped".

@aayushshah15
Copy link
Copy Markdown
Contributor Author

aayushshah15 commented Apr 5, 2021

i think this is backwards incompatible with the betas, which according to @ajstorm is something we care about

Is there a source for this on our wiki / somewhere else? I can't find any doc claiming this, but I can find some old slack conversations about how this is more of a best-effort thing.

Turns out we do care about this.

I discussed with Nathan and Andrei just now and we felt that, in this particular case, not writing a migration doesn't lead to any issues severe enough (since it just amounts to users essentially losing their existing voter_constraints) to warrant the extra work. Additionally, in this case its unclear exactly what the migration should do, since we're dealing with zone config hierarchies that might already be in an incorrect state.

@otan
Copy link
Copy Markdown
Contributor

otan commented Apr 6, 2021

if i understand correctly, would the migration be to set all InheritedVoterConstraints to true for every zone config that exists?
i sense some surprises if this is to be the case and the migration is not done, e.g. a user won't be able to migrate to MR if they set gc.ttlseconds on their table unless they discard the zone config (since we dont have CONFIGURE ZONE USING x = INHERIT FROM PARENT)

@aayushshah15
Copy link
Copy Markdown
Contributor Author

if i understand correctly, would the migration be to set all InheritedVoterConstraints to true for every zone config that exists?

Nope, this would essentially amount to wiping all voter_constraints on every zone.

I haven't thought through this deeply but any migration would be quite tricky to get right here.

user won't be able to migrate to MR if they set gc.ttlseconds on their table

I don't think I follow, why would gc.ttlseconds matter here?

@otan
Copy link
Copy Markdown
Contributor

otan commented Apr 6, 2021

It initialises a new zone config but with the default values for voter constraints in v20.2. If a user migrates to v21.1, on the first multi-region add, we'll detect inherit voter constraints = false on the zone config which is not expected and would hence be blocked

@aayushshah15
Copy link
Copy Markdown
Contributor Author

aayushshah15 commented Apr 6, 2021

@otan I see what you mean. That's what this PR is fixing though, right? If all we're concerned about is the move from 20.2 to 21.1.0 (beta3 and above) then this PR should be all we need.

Could you clarify what you're proposing, just so we're on the same page? What CRDB versions are we proposing to reconcile with a migration? 21.1 beta1 to beta3?

@aayushshah15
Copy link
Copy Markdown
Contributor Author

Let me clarify what I'm thinking.

If we care about not breaking existing zone configs for existing 21.1 beta users in addition to not breaking voter_constraints inheritance for 20.2 users, a simple migration for just 20.2 -> 21.1 will not cut it anymore because we've shipped a couple 21.1 betas already.

So, if I'm reading you correctly, your proposal is two sets of migrations: one that deals with 20.2 users moving to "21.1beta3 and above", and another that deals with 21.1beta1 users to "21.1beta3 and above". The latter is the reason for the push-back against this PR, is that correct? Note that this is the migration case that's very tricky cause the users might have already created zone configs with explicitly not inherited voter constraints.

@otan
Copy link
Copy Markdown
Contributor

otan commented Apr 6, 2021

Hmm, sorry let's try again!

I discussed with Nathan and Andrei just now and we felt that, in this particular case, not writing a migration doesn't lead to any issues severe enough (since it just amounts to users essentially losing their existing voter_constraints) to warrant the extra work.

Somehow connected the wrong dots and thought you meant "we're cancelling this PR and doing nothing". My bad.

The latter is the reason for the push-back against this PR, is that correct?

Yep, that's my pushback on this PR. Anyone who's already in 21.1beta1 (or whatever is out there as of speaking) will have the behaviour surprisingly flip on them with this PR merged if they moved to say v21.1.0.

If we care about not breaking existing zone configs for existing 21.1 beta users in addition to not breaking voter_constraints inheritance for 20.2 users, a simple migration for just 20.2 -> 21.1 will not cut it anymore because we've shipped a couple 21.1 betas already.

Ah yes, that's tricky - didn't think about that.

I guess my question then is can we do something simple to check like

if !zc.InheritParentZoneConfigs && len(voter_constraints) == 0 {
   zc.InheritParentZoneConfigs = true
}

(I would've thought if not inheriting then you must have some voter constraints, but I could be wrong).

Is there a way to check in a migration whether a cluster has ever been a certain version? I would've hope so, but realise I am asking for a lot more than is bargained for. Supporting compat between betas is really gross :\


In either case, both situations suck, but it seems like you have been weighing the options and I chimed in with insufficient context. I guess a release note summarising what could go wrong between betas would be nice to have.

@aayushshah15
Copy link
Copy Markdown
Contributor Author

"we're cancelling this PR and doing nothing"

yeah so just to confirm, my / our position was that we would go ahead with this PR since it's a very simple fix for the main issue that we care about, which is correct interpretation of 20.2 zone configs. Regarding the fact that we're mangling with zone configs from 21.1beta1, we would just call this out in a proper release note since it's not worth taking on all this complexity for a tiny set of non-production users (especially when the mitigation is to just discard some existing zone configs and re-add them -- the release note could call this out).

I guess my question then is can we do something simple to check like

I had wondered about this but it would be incorrect because the user could've just explicitly set voter_constraints to an empty list (a good reason to do that would be to avoid inheriting from parent). I'm actually not sure there even is a correct migration we can write here. This kinda brings me to my point below.

Supporting compat between betas is really gross :\

It's just that there's a ton of ways to get this whole ordeal wrong, all for the sake of a vague desire to maintain back-compat on new features between betas (which we don't seem to promise anywhere externally) which seems questionable.

@otan
Copy link
Copy Markdown
Contributor

otan commented Apr 6, 2021

I'm personally ok with going forwards with this PR, but also impartial to writing a migration that works only for 20.2->21.1 if we're going down this path :)

I guess it's not me the rules should be enforced by :D

@ajstorm
Copy link
Copy Markdown
Collaborator

ajstorm commented Apr 6, 2021 via email

@aayushshah15 aayushshah15 force-pushed the handle-voter-constraints-legacy-zone-configs branch 4 times, most recently from d87c43a to f1a3e46 Compare April 11, 2021 04:13
@aayushshah15 aayushshah15 requested a review from a team as a code owner April 11, 2021 04:13
@aayushshah15 aayushshah15 removed the request for review from a team April 11, 2021 04:13
@aayushshah15 aayushshah15 force-pushed the handle-voter-constraints-legacy-zone-configs branch from f1a3e46 to 7c07e48 Compare April 11, 2021 04:21
@aayushshah15
Copy link
Copy Markdown
Contributor Author

I've updated the release note with steps for current 21.1 beta users who upgrade to the next beta (here's a script that encapsulates the validation of these steps on my end). PTAL when you get a chance.

@aayushshah15 aayushshah15 requested review from ajstorm, nvb and otan April 11, 2021 04:34
Copy link
Copy Markdown
Collaborator

@ajstorm ajstorm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm probably missing something obvious, but I don't see the updated release notes. @aayushshah15 can you point me to them.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten and @otan)

@aayushshah15 aayushshah15 force-pushed the handle-voter-constraints-legacy-zone-configs branch from 7c07e48 to b86a79b Compare April 12, 2021 17:12
@aayushshah15
Copy link
Copy Markdown
Contributor Author

I had only updated the PR message, but now I updated the commit message as well.

Copy link
Copy Markdown
Contributor

@otan otan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes LGTM, letting a TL sign off for the beta incompatibility though.

Reviewed 3 of 3 files at r2, 12 of 12 files at r3, 12 of 12 files at r5.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten)

@aayushshah15 aayushshah15 force-pushed the handle-voter-constraints-legacy-zone-configs branch from b86a79b to e51faec Compare April 13, 2021 05:43
@blathers-crl blathers-crl bot requested a review from otan April 13, 2021 05:43
@aayushshah15 aayushshah15 force-pushed the handle-voter-constraints-legacy-zone-configs branch 2 times, most recently from b9a12aa to b4a2fe2 Compare April 13, 2021 06:09
@aayushshah15
Copy link
Copy Markdown
Contributor Author

@nvanbenschoten proposed an alternative approach today, which works out fantastically. The solution in the current revision does not break backwards compatibility with existing multi-region databases. It only breaks back-compat with 21.1 zone configs that explicitly set voter_constraints to an empty list. Even then, the mitigation for that subset of users is very straightforward.

Copy link
Copy Markdown
Collaborator

@ajstorm ajstorm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎉 it's great to see that ingenuity won out here. Nice job team!

Nit: You might want to consider dropping the last two bullets of the release note. I think if we describe who's impacted, we can leave out who's not impacted. Also, you might want to change the first bullet to "Users who directly set the voter_constraints attribute (a new attribute in 21.1) to an empty list in one of the previous 21.1 betas or alphas will lose those constraints and will have to reset them to an empty list". Just a small tweak to clearly indicate that this only impacts previous beta/alpha users.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten and @otan)

@aayushshah15 aayushshah15 force-pushed the handle-voter-constraints-legacy-zone-configs branch from b4a2fe2 to 081bd22 Compare April 13, 2021 19:49
@aayushshah15
Copy link
Copy Markdown
Contributor Author

Thanks for the latest suggestions @ajstorm. Updated the note with them.

Copy link
Copy Markdown
Collaborator

@ajstorm ajstorm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Nit: Now that there's only one item in the list, you might want to remove the colon and just continue the paragraph.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten and @otan)

@aayushshah15 aayushshah15 force-pushed the handle-voter-constraints-legacy-zone-configs branch from 081bd22 to ea5975f Compare April 13, 2021 20:13
Copy link
Copy Markdown
Contributor

@nvb nvb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: thanks for diligently working through the complexity of this change. I think it ended up in a good spot.

Reviewed 2 of 2 files at r1, 3 of 3 files at r2, 7 of 12 files at r3, 12 of 12 files at r4, 2 of 12 files at r5, 11 of 11 files at r6.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @aayushshah15)


pkg/config/zonepb/zone.go, line 278 at r6 (raw file):

// explicitly set on this zone or if it is to be inherited from its parent.
func (z *ZoneConfig) InheritedVoterConstraints() bool {
	if len(z.VoterConstraints) == 0 {

nit: this might be easier to read as return len(z.VoterConstraints) == 0 && !z.NullVoterConstraintsIsEmpty


pkg/config/zonepb/zone.go, line 554 at r6 (raw file):

		if !parent.InheritedVoterConstraints() {
			z.VoterConstraints = parent.VoterConstraints
			z.NullVoterConstraintsIsEmpty = true

Should this be z.NullVoterConstraintsIsEmpty = parent.NullVoterConstraintsIsEmpty? I don't think it matters in practice, but it seems like we're copying everything from our parent, so why not this field?


pkg/config/zonepb/zone.proto, line 141 at r6 (raw file):

  // below) because the non-nullable nature of `constraints` and
  // `voter_constraints` means that there is no other way to disambiguate
  // between an unset `constraints` attribute and an empty one.

voter_constraints


pkg/config/zonepb/zone.proto, line 145 at r6 (raw file):

  // NullVoterConstraintsIsEmpty specifies whether the VoterConstraints field
  // was explicitly set to be empty or if it was inherited from its parent.

I'd say again here what you say right above. "This field is needed because the non-nullable nature of voter_constraints means that there is no other way to disambiguate between an unset constraints attribute and an empty one."


pkg/server/diagnostics/reporter.go, line 378 at r6 (raw file):

	}
	dst.VoterConstraints = make([]zonepb.ConstraintsConjunction, len(src.VoterConstraints))
	dst.NullVoterConstraintsIsEmpty = src.NullVoterConstraintsIsEmpty

Should we be doing the same thing for InheritedConstraints up above?

@aayushshah15 aayushshah15 force-pushed the handle-voter-constraints-legacy-zone-configs branch from ea5975f to cd1efa2 Compare April 14, 2021 17:39
This commit anonymizes voter_constraints info in our server diagnostics
reporting.

Release note: None
Resolves cockroachdb#61316

This commit changes the definition of the flag used to keep track of
whether `voter_constraints` were empty or inherited. This allows legacy
(pre-21.1) zone configs to be correctly interpreted by 21.1 nodes.

Release note (backward-incompatible change): the internal representation
of the `voter_constraints` zone configuration attribute (new in 21.1)
has been altered in a way that's partially incompatible with the
representation used by previous 21.1 betas (and the alphas that include
this attribute). This means that Users who directly set the
`voter_constraints` attribute to an empty list _will lose_ those
constraints and will have to reset them.
@aayushshah15 aayushshah15 force-pushed the handle-voter-constraints-legacy-zone-configs branch from cd1efa2 to f361004 Compare April 14, 2021 17:53
Copy link
Copy Markdown
Contributor Author

@aayushshah15 aayushshah15 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the reviews!

bors r+

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @nvanbenschoten and @otan)


pkg/config/zonepb/zone.go, line 278 at r6 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

nit: this might be easier to read as return len(z.VoterConstraints) == 0 && !z.NullVoterConstraintsIsEmpty

Done.


pkg/config/zonepb/zone.go, line 554 at r6 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Should this be z.NullVoterConstraintsIsEmpty = parent.NullVoterConstraintsIsEmpty? I don't think it matters in practice, but it seems like we're copying everything from our parent, so why not this field?

Done.


pkg/config/zonepb/zone.proto, line 141 at r6 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

voter_constraints

Fixed.


pkg/config/zonepb/zone.proto, line 145 at r6 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

I'd say again here what you say right above. "This field is needed because the non-nullable nature of voter_constraints means that there is no other way to disambiguate between an unset constraints attribute and an empty one."

Done.


pkg/server/diagnostics/reporter.go, line 378 at r6 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Should we be doing the same thing for InheritedConstraints up above?

It definitely seems like an oversight (though I think we're just dumping these anonymized configs somewhere for ~telemetry reasons), fixed for InheritedConstraints as well as InheritedLeasePreferences.

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Apr 14, 2021

Build succeeded:

@craig craig bot merged commit e978edd into cockroachdb:master Apr 14, 2021
aayushshah15 added a commit to aayushshah15/cockroach that referenced this pull request Apr 15, 2021
A previous change (cockroachdb#63079) made this test flakey by (needlessly) making
one of the marshalled fields a value derived from two other fields
(as opposed to just one). This commit fixes the flake.

Release note: None
craig bot pushed a commit that referenced this pull request Apr 15, 2021
63342: ui: add button to reset SQL stats r=Azhng a=Azhng

Release note (ui change): User can now reset SQL stats from DB Console.

Follow up to: cockroachdb/ui#271
Closes #33315 

63401: ui: e2e tests, improved cypress test for tsx, overview and db page r=nanduu04 a=nanduu04

Previously, only visual regression tests were written

Wrote cypress tests using the cypress-testing-library

[CRDB-2388](https://cockroachlabs.atlassian.net/browse/CRDB-2388?atlOrigin=eyJpIjoiOWIzNzY0ZjkzY2RhNGU3ZTgyZTUyOTY5MjE4ZmM5YmIiLCJwIjoiaiJ9)

Release note: None

63713: config: deflake TestMarshalableZoneConfigRoundTrip r=aayushshah15 a=aayushshah15

A previous change (#63079) made this test flakey by (needlessly) making
one of the marshalled fields a value derived from two other fields
(as opposed to just one). This commit fixes the flake.

Release note: None

63726: install: fix log flags in roachprod start r=knz a=tbg

I broke this in #63472, twofold.

1. if the version switch returned false, we weren't passing log flags
   at all, so there weren't any log files.
2. the version switch returned false on current master, since 21.1 is
   not tagged yet (we're still in the prerelease versions, currently
   at 21.1.0-alpha.3).

I tested that this fix works.

Release note: None


Co-authored-by: Azhng <archer.xn@gmail.com>
Co-authored-by: Nandu Pokhrel <nandup@cockroachlabs.com>
Co-authored-by: Aayush Shah <aayush.shah15@gmail.com>
Co-authored-by: Tobias Grieger <tobias.b.grieger@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

sql, config: incorrect handling of pre-21.1 zone configs

5 participants