sql, config: make voter_constraints inherit from parent by default#63079
Conversation
|
i think this is backwards incompatible with the betas, which according to @ajstorm is something we care about :\ |
|
Nice to see this @aayushshah15. Can you talk a bit more about how it's backwards incompatible? |
|
|
Turns out we do care about this. I discussed with Nathan and Andrei just now and we felt that, in this particular case, not writing a migration doesn't lead to any issues severe enough (since it just amounts to users essentially losing their existing |
|
if i understand correctly, would the migration be to set all InheritedVoterConstraints to true for every zone config that exists? |
Nope, this would essentially amount to wiping all I haven't thought through this deeply but any migration would be quite tricky to get right here.
I don't think I follow, why would |
|
It initialises a new zone config but with the default values for voter constraints in v20.2. If a user migrates to v21.1, on the first multi-region add, we'll detect inherit voter constraints = false on the zone config which is not expected and would hence be blocked |
|
@otan I see what you mean. That's what this PR is fixing though, right? If all we're concerned about is the move from 20.2 to 21.1.0 (beta3 and above) then this PR should be all we need. Could you clarify what you're proposing, just so we're on the same page? What CRDB versions are we proposing to reconcile with a migration? 21.1 beta1 to beta3? |
|
Let me clarify what I'm thinking. If we care about not breaking existing zone configs for existing 21.1 beta users in addition to not breaking So, if I'm reading you correctly, your proposal is two sets of migrations: one that deals with 20.2 users moving to "21.1beta3 and above", and another that deals with 21.1beta1 users to "21.1beta3 and above". The latter is the reason for the push-back against this PR, is that correct? Note that this is the migration case that's very tricky cause the users might have already created zone configs with explicitly not inherited voter constraints. |
|
Hmm, sorry let's try again!
Somehow connected the wrong dots and thought you meant "we're cancelling this PR and doing nothing". My bad.
Yep, that's my pushback on this PR. Anyone who's already in 21.1beta1 (or whatever is out there as of speaking) will have the behaviour surprisingly flip on them with this PR merged if they moved to say v21.1.0.
Ah yes, that's tricky - didn't think about that. I guess my question then is can we do something simple to check like (I would've thought if not inheriting then you must have some voter constraints, but I could be wrong). Is there a way to check in a migration whether a cluster has ever been a certain version? I would've hope so, but realise I am asking for a lot more than is bargained for. Supporting compat between betas is really gross :\ In either case, both situations suck, but it seems like you have been weighing the options and I chimed in with insufficient context. I guess a release note summarising what could go wrong between betas would be nice to have. |
yeah so just to confirm, my / our position was that we would go ahead with this PR since it's a very simple fix for the main issue that we care about, which is correct interpretation of 20.2 zone configs. Regarding the fact that we're mangling with zone configs from 21.1beta1, we would just call this out in a proper release note since it's not worth taking on all this complexity for a tiny set of non-production users (especially when the mitigation is to just discard some existing zone configs and re-add them -- the release note could call this out).
I had wondered about this but it would be incorrect because the user could've just explicitly set
It's just that there's a ton of ways to get this whole ordeal wrong, all for the sake of a vague desire to maintain back-compat on new features between betas (which we don't seem to promise anywhere externally) which seems questionable. |
|
I'm personally ok with going forwards with this PR, but also impartial to writing a migration that works only for 20.2->21.1 if we're going down this path :) I guess it's not me the rules should be enforced by :D |
|
The resolution from the team meeting today was:
1) We need a path for customers to migrate from previous betas to the beta
which contains this fix. That migration doesn't need to be handled in the
code, but can be prescribed in a release note as a set of steps the
customer needs to run. We should test this in-house to ensure that it
works, and that we don't get tripped up by any validation scenarios.
2) We need better clarity on what types of backward incompatible changes
are acceptable in betas. I'll continue working to ensure that Isaac and
Nate define this for future releases.
…On Tue, Apr 6, 2021 at 4:23 PM Oliver Tan ***@***.***> wrote:
I'm personally ok with going forwards with this PR, but also impartial to
writing a migration that works only for 20.2->21.1 if we're going down this
path :)
I guess it's not me the rules should be enforced by :D
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#63079 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AEMXVOQWJK2RQAA7F6CMQ2TTHNUT5ANCNFSM42MURSUA>
.
|
d87c43a to
f1a3e46
Compare
f1a3e46 to
7c07e48
Compare
|
I've updated the release note with steps for current 21.1 beta users who upgrade to the next beta (here's a script that encapsulates the validation of these steps on my end). PTAL when you get a chance. |
ajstorm
left a comment
There was a problem hiding this comment.
I'm probably missing something obvious, but I don't see the updated release notes. @aayushshah15 can you point me to them.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten and @otan)
7c07e48 to
b86a79b
Compare
|
I had only updated the PR message, but now I updated the commit message as well. |
otan
left a comment
There was a problem hiding this comment.
Changes LGTM, letting a TL sign off for the beta incompatibility though.
Reviewed 3 of 3 files at r2, 12 of 12 files at r3, 12 of 12 files at r5.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten)
b86a79b to
e51faec
Compare
b9a12aa to
b4a2fe2
Compare
|
@nvanbenschoten proposed an alternative approach today, which works out fantastically. The solution in the current revision does not break backwards compatibility with existing multi-region databases. It only breaks back-compat with 21.1 zone configs that explicitly set |
ajstorm
left a comment
There was a problem hiding this comment.
🎉 it's great to see that ingenuity won out here. Nice job team!
Nit: You might want to consider dropping the last two bullets of the release note. I think if we describe who's impacted, we can leave out who's not impacted. Also, you might want to change the first bullet to "Users who directly set the voter_constraints attribute (a new attribute in 21.1) to an empty list in one of the previous 21.1 betas or alphas will lose those constraints and will have to reset them to an empty list". Just a small tweak to clearly indicate that this only impacts previous beta/alpha users.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten and @otan)
b4a2fe2 to
081bd22
Compare
|
Thanks for the latest suggestions @ajstorm. Updated the note with them. |
ajstorm
left a comment
There was a problem hiding this comment.
LGTM. Nit: Now that there's only one item in the list, you might want to remove the colon and just continue the paragraph.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten and @otan)
081bd22 to
ea5975f
Compare
nvb
left a comment
There was a problem hiding this comment.
thanks for diligently working through the complexity of this change. I think it ended up in a good spot.
Reviewed 2 of 2 files at r1, 3 of 3 files at r2, 7 of 12 files at r3, 12 of 12 files at r4, 2 of 12 files at r5, 11 of 11 files at r6.
Reviewable status:complete! 1 of 0 LGTMs obtained (waiting on @aayushshah15)
pkg/config/zonepb/zone.go, line 278 at r6 (raw file):
// explicitly set on this zone or if it is to be inherited from its parent. func (z *ZoneConfig) InheritedVoterConstraints() bool { if len(z.VoterConstraints) == 0 {
nit: this might be easier to read as return len(z.VoterConstraints) == 0 && !z.NullVoterConstraintsIsEmpty
pkg/config/zonepb/zone.go, line 554 at r6 (raw file):
if !parent.InheritedVoterConstraints() { z.VoterConstraints = parent.VoterConstraints z.NullVoterConstraintsIsEmpty = true
Should this be z.NullVoterConstraintsIsEmpty = parent.NullVoterConstraintsIsEmpty? I don't think it matters in practice, but it seems like we're copying everything from our parent, so why not this field?
pkg/config/zonepb/zone.proto, line 141 at r6 (raw file):
// below) because the non-nullable nature of `constraints` and // `voter_constraints` means that there is no other way to disambiguate // between an unset `constraints` attribute and an empty one.
voter_constraints
pkg/config/zonepb/zone.proto, line 145 at r6 (raw file):
// NullVoterConstraintsIsEmpty specifies whether the VoterConstraints field // was explicitly set to be empty or if it was inherited from its parent.
I'd say again here what you say right above. "This field is needed because the non-nullable nature of voter_constraints means that there is no other way to disambiguate between an unset constraints attribute and an empty one."
pkg/server/diagnostics/reporter.go, line 378 at r6 (raw file):
} dst.VoterConstraints = make([]zonepb.ConstraintsConjunction, len(src.VoterConstraints)) dst.NullVoterConstraintsIsEmpty = src.NullVoterConstraintsIsEmpty
Should we be doing the same thing for InheritedConstraints up above?
ea5975f to
cd1efa2
Compare
This commit anonymizes voter_constraints info in our server diagnostics reporting. Release note: None
Resolves cockroachdb#61316 This commit changes the definition of the flag used to keep track of whether `voter_constraints` were empty or inherited. This allows legacy (pre-21.1) zone configs to be correctly interpreted by 21.1 nodes. Release note (backward-incompatible change): the internal representation of the `voter_constraints` zone configuration attribute (new in 21.1) has been altered in a way that's partially incompatible with the representation used by previous 21.1 betas (and the alphas that include this attribute). This means that Users who directly set the `voter_constraints` attribute to an empty list _will lose_ those constraints and will have to reset them.
cd1efa2 to
f361004
Compare
aayushshah15
left a comment
There was a problem hiding this comment.
Thanks for the reviews!
bors r+
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @nvanbenschoten and @otan)
pkg/config/zonepb/zone.go, line 278 at r6 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
nit: this might be easier to read as
return len(z.VoterConstraints) == 0 && !z.NullVoterConstraintsIsEmpty
Done.
pkg/config/zonepb/zone.go, line 554 at r6 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Should this be
z.NullVoterConstraintsIsEmpty = parent.NullVoterConstraintsIsEmpty? I don't think it matters in practice, but it seems like we're copying everything from our parent, so why not this field?
Done.
pkg/config/zonepb/zone.proto, line 141 at r6 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
voter_constraints
Fixed.
pkg/config/zonepb/zone.proto, line 145 at r6 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
I'd say again here what you say right above. "This field is needed because the non-nullable nature of voter_constraints means that there is no other way to disambiguate between an unset
constraintsattribute and an empty one."
Done.
pkg/server/diagnostics/reporter.go, line 378 at r6 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Should we be doing the same thing for
InheritedConstraintsup above?
It definitely seems like an oversight (though I think we're just dumping these anonymized configs somewhere for ~telemetry reasons), fixed for InheritedConstraints as well as InheritedLeasePreferences.
|
Build succeeded: |
A previous change (cockroachdb#63079) made this test flakey by (needlessly) making one of the marshalled fields a value derived from two other fields (as opposed to just one). This commit fixes the flake. Release note: None
63342: ui: add button to reset SQL stats r=Azhng a=Azhng Release note (ui change): User can now reset SQL stats from DB Console. Follow up to: cockroachdb/ui#271 Closes #33315 63401: ui: e2e tests, improved cypress test for tsx, overview and db page r=nanduu04 a=nanduu04 Previously, only visual regression tests were written Wrote cypress tests using the cypress-testing-library [CRDB-2388](https://cockroachlabs.atlassian.net/browse/CRDB-2388?atlOrigin=eyJpIjoiOWIzNzY0ZjkzY2RhNGU3ZTgyZTUyOTY5MjE4ZmM5YmIiLCJwIjoiaiJ9) Release note: None 63713: config: deflake TestMarshalableZoneConfigRoundTrip r=aayushshah15 a=aayushshah15 A previous change (#63079) made this test flakey by (needlessly) making one of the marshalled fields a value derived from two other fields (as opposed to just one). This commit fixes the flake. Release note: None 63726: install: fix log flags in roachprod start r=knz a=tbg I broke this in #63472, twofold. 1. if the version switch returned false, we weren't passing log flags at all, so there weren't any log files. 2. the version switch returned false on current master, since 21.1 is not tagged yet (we're still in the prerelease versions, currently at 21.1.0-alpha.3). I tested that this fix works. Release note: None Co-authored-by: Azhng <archer.xn@gmail.com> Co-authored-by: Nandu Pokhrel <nandup@cockroachlabs.com> Co-authored-by: Aayush Shah <aayush.shah15@gmail.com> Co-authored-by: Tobias Grieger <tobias.b.grieger@gmail.com>
Resolves #61316
This commit inverts the definition of the flag used to keep track of
whether
voter_constraintswere empty or inherited. This allows legacy(pre-21.1) zone configs to be correctly interpreted by 21.1 nodes.
Release note (backward-incompatible change): the internal representation
of the
voter_constraintszone configuration attribute (new in 21.1)has been altered in a way that's partially incompatible with the
representation used by previous 21.1 betas (and the alphas that include
this attribute). This means that Users who directly set the
voter_constraintsattribute to an empty list will lose thoseconstraints and will have to reset them.