storage: kv.atomic_replication_changes=true #40464
craig[bot] merged 2 commits into cockroachdb:master from
Conversation
This is exciting!
Hitting "descriptor changed" errors again in mixed-headroom, which means another migration concern. I found two callers that didn't check the cluster version; it seems likely that they're used during IMPORT and caused this problem. Added a commit to address this. I saw this in 2 out of 10 runs, so it's going to take a bit of time to confirm it's gone, but I'm pretty sure this is it.
force-pushed from 901e38d to b6612c7
5/5 headrooms and 4/5 mixed-headrooms passed. The one failure is the above and is not related to this PR. Going to kick off another 10 mixed-headrooms with the presumed fix for the split error.
@nvanbenschoten could you give me a review here? This PR was trivial, but now it has a fix for the botched sticky bit migration I discovered while testing, and that patch won't apply cleanly on master.
PS: I don't want to withhold the satisfying mixed-headroom results:
nvb
left a comment
Reviewed 2 of 2 files at r1, 2 of 3 files at r2.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten and @tbg)
pkg/storage/replica_command.go, line 470 at r2 (raw file):
newDesc := *desc
newDesc.StickyBit = nil // can use &zero in 20.1
Is it possible to get here before all nodes update to 19.2, given the `if (desc.GetStickyBit() == hlc.Timestamp{}) {` check?
pkg/storage/bulk/sst_batcher.go, line 250 at r2 (raw file):
log.Warning(ctx, err)
} else {
	// NB: Passing 'hour' here is technically illegal until 19.2 is
Does all of the other complexity in this commit go away if we actually do this migration correctly and keep the use of `AdminSplitRequest.ExpirationTime` as the single migration boundary? It seems pretty easy to plumb a cluster setting into `SSTBatcher` and add logic like we have in `restore.go -> splitAndScatter`.
tbg
left a comment
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten)
pkg/storage/replica_command.go, line 470 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Is it possible to get here before all nodes update to 19.2, given the `if (desc.GetStickyBit() == hlc.Timestamp{}) {` check?
You're right, the early return prevents this. I added a comment instead of using &zero here because both are equivalent and nil is more obviously safe.
pkg/storage/bulk/sst_batcher.go, line 250 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Does all of the other complexity in this commit go away if we actually do this migration correctly and keep the use of
AdminSplitRequest.ExpirationTimeas the single migration boundary? It seems pretty easy to plumb a cluster setting intoSSTBatcherand add logic like we have inrestore.go -> splitAndScatter.
ExpirationTime is not encapsulated at all. We have 10 calls to settings.Version.IsActive(cluster.VersionStickyBit) before this PR, all at various callers to AdminSplit. The offending calls were added after the sticky bit was introduced (and don't handle the case in which the sticky bit is not available), simply because it's so hard to tell that there is a migration to worry about.
So yes, if nobody ever issued an incorrect AdminSplit, there wouldn't be a problem (except that I think something was still sometimes using &zero here instead of nil).
I can add the cluster version check here if it tickles your OCD that I omitted it, but I'd rather not make any changes on the storage end of things, because a) I don't want to rely on the callers, as outlined before, and b) the moment I make a substantial change here I'll have to spend hours re-validating the results, and I don't think that's worth it.
force-pushed from b6612c7 to 78e4c88
tbg
left a comment
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten)
pkg/storage/replica_command.go, line 74 at r4 (raw file):
} else if err := detail.ActualValue.GetProto(&actualDesc); err == nil &&
	desc.RangeID == actualDesc.RangeID && !desc.Equal(actualDesc) {
	return fmt.Sprintf("descriptor changed: [expected] %+#v != [actual] %+#v", desc, &actualDesc), true
This is unintentional debugging detritus. Reminder to self to remove it.
nvb
left a comment
if you don't think it's worth cleaning up the migration further.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @nvanbenschoten and @tbg)
pkg/storage/replica_command.go, line 246 at r4 (raw file):
}

func splitTxnStickyUpdateAttempt(
I think this was assuming that it could always set StickyBit to a non-nil value because it was only called in the `if desc.GetStickyBit().Less(args.ExpirationTime) {` case, which was only possible if the correct cluster version was enabled. That should have been documented, but it should still be the case.
pkg/storage/bulk/sst_batcher.go, line 250 at r2 (raw file):
Previously, tbg (Tobias Grieger) wrote…
`ExpirationTime` is not encapsulated at all. We have 10 calls to `settings.Version.IsActive(cluster.VersionStickyBit)` before this PR, all at various callers to `AdminSplit`. The offending calls were added after the sticky bit was introduced (and don't handle the case in which the sticky bit is not available), simply because it's so hard to tell that there is a migration to worry about.
So yes, if nobody ever issued an incorrect AdminSplit, there wouldn't be a problem (except that I think something was still sometimes using &zero here instead of nil).
I can add the cluster version check here if it tickles your OCD that I omitted it, but I'd rather not make any changes on the storage end of things, because a) I don't want to rely on the callers, as outlined before, and b) the moment I make a substantial change here I'll have to spend hours re-validating the results, and I don't think that's worth it.
If you feel comfortable with the degree of validation that this setup for the migration has gotten, then so am I. It just seemed easier to fix the botched migration with SplitAndScatter than to force the handling of AdminSplit to worry about non-empty ExpirationTime values.
force-pushed from 78e4c88 to 8c910ec
tbg
left a comment
bors r=nvanbenschoten
TFTR!
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @nvanbenschoten)
Flake was a DML txn retry in a logic test: #40542 (comment).
bors r=nvanbenschoten
Build failed (retrying...)
Bors, you liar.
bors r=nvanbenschoten
I wonder if this PR crashes bors every time?
bors r=nvanbenschoten
bors r+
Let's try this again.
bors r=nvanbenschoten
40464: storage: kv.atomic_replication_changes=true r=nvanbenschoten a=tbg

I ran the experiments in #40370 (comment) on (essentially) this branch and everything passed. Going to run another five instances of mixed-headroom and headroom with this change to shake out anything else that I might've missed.

Release note (general change): atomic replication changes are now enabled by default.

Co-authored-by: Tobias Schottdorf <tobias.schottdorf@gmail.com>
Before 19.2, callers to AdminSplit are only ever supposed to pass
`Expiration==hlc.Timestamp{}` as this triggers the legacy code path
necessary while nodes in the cluster may not know desc.StickyBit yet. To
avoid CPut problems when those nodes update the range descriptor, we
must not persist non-nil values on it in that case. It looked like we
would sometimes still persist a &zero, which could cause problems.
The bigger problem though was that there were also two callers that
straight-up didn't check the cluster version and passed nonzero values
into AdminSplit. These callers were added recently and I can't blame
anyone there; it is impossible to know that one argument needs to be
zero before 19.2.
Instead of trying to fix this invariant (which wasn't trivial in this
case) just ignore expiration times when the coordinator doesn't think
19.2 is already active. This could lead to sticky bits being ignored
right around the cluster transition, but that seems much better than
risking old nodes not being able to carry out any changes to the
descriptors any more (which is the consequence of writing non-nil
sticky bits before this is safe).
Release note (bug fix): Fix a cluster migration bug that could occur
while running in a mixed 19.1/19.2 cluster. The symptom would be
messages of the form:
```
X at key /Table/54 failed: unexpected value: ...
```
Affected clusters should be updated to 19.2 or, if 19.1 is desired,
recreated from a backup.
Release note (general change): atomic replication changes are now enabled by default.
force-pushed from 8c910ec to b93a4ee
Rebased on master and pushed again. Let's see if that helps.
bors r=nvanbenschoten
Build succeeded