storage: don't crash when applying ChangeReplicas trigger with DeprecatedNextReplicaID #41148

Merged
craig[bot] merged 1 commit into cockroachdb:master from nvb:nvanbenschoten/changeReplSM
Sep 27, 2019
Conversation

@nvb
Contributor

@nvb nvb commented Sep 26, 2019

Fixes #41145.

This bug was introduced in #40892.

This may force us to pick a new SHA for the beta. Without this fix, any node
that applies a ChangeReplicas Raft entry from 19.1 or before will crash.

Release justification: fixes a crash in mixed version clusters.

Release note: None

@nvb nvb requested a review from ajwerner September 26, 2019 22:39
Contributor

@bdarnell bdarnell left a comment

Reviewed 2 of 2 files at r1.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner)

Contributor

@ajwerner ajwerner left a comment

Nice test. Thanks for picking this up.

Reviewed 1 of 2 files at r1.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained


pkg/storage/replica_application_state_machine.go, line 675 at r1 (raw file):

		// providing a new range descriptor directly, which includes this info.
		var nextReplID roachpb.ReplicaID
		if change.Desc != nil {

I could see adding a method to the ChangeReplicasTrigger to hide this migration.

@nvb nvb force-pushed the nvanbenschoten/changeReplSM branch from 8f9657e to 15f5b81 Compare September 27, 2019 03:02
Contributor Author

@nvb nvb left a comment

bors r+

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @bdarnell)


pkg/storage/replica_application_state_machine.go, line 675 at r1 (raw file):

Previously, ajwerner wrote…

I could see adding a method to the ChangeReplicasTrigger to hide this migration.

This is only used here, so I think I'd rather call it out where it's needed.
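
For readers following the thread, here is a minimal sketch of the fallback being discussed, written as the helper method the reviewer proposed. It is an illustration under assumptions, not the code from the PR: the types are simplified local stand-ins for the `roachpb` definitions, the method name `nextReplicaID` is hypothetical, and the premise that `Desc` is nil on triggers written by 19.1 or earlier is inferred from the quoted `if change.Desc != nil` check and the PR description.

```go
package main

import "fmt"

// ReplicaID, RangeDescriptor, and ChangeReplicasTrigger are simplified
// stand-ins for the roachpb types; only the fields discussed in this
// thread are modeled.
type ReplicaID int32

type RangeDescriptor struct {
	NextReplicaID ReplicaID
}

type ChangeReplicasTrigger struct {
	// Desc is assumed to be nil on triggers proposed by older versions.
	Desc *RangeDescriptor
	// DeprecatedNextReplicaID is assumed to be set only on those older triggers.
	DeprecatedNextReplicaID ReplicaID
}

// nextReplicaID is a hypothetical helper that hides the migration: it
// prefers the descriptor when present and falls back to the deprecated
// field otherwise, so callers never dereference a nil Desc.
func (t ChangeReplicasTrigger) nextReplicaID() ReplicaID {
	if t.Desc != nil {
		return t.Desc.NextReplicaID
	}
	return t.DeprecatedNextReplicaID
}

func main() {
	legacy := ChangeReplicasTrigger{DeprecatedNextReplicaID: 4}
	current := ChangeReplicasTrigger{Desc: &RangeDescriptor{NextReplicaID: 7}}
	fmt.Println(legacy.nextReplicaID(), current.nextReplicaID()) // prints: 4 7
}
```

The author's preference was to keep the same branch inline at the single call site, which makes the migration visible where it matters; a helper like this only pays off once a second caller needs the same fallback.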

craig bot pushed a commit that referenced this pull request Sep 27, 2019
41148: storage: don't crash when applying ChangeReplicas trigger with DeprecatedNextReplicaID r=nvanbenschoten a=nvanbenschoten

Co-authored-by: Nathan VanBenschoten <nvanbenschoten@gmail.com>
@craig
Contributor

craig bot commented Sep 27, 2019

Build succeeded

@craig craig bot merged commit 15f5b81 into cockroachdb:master Sep 27, 2019
nvb added a commit to nvb/cockroach that referenced this pull request Sep 27, 2019
… trigger

Fixes cockroachdb#41155.

The fix in cockroachdb#41148 avoided a crash when staging a ChangeReplicas trigger with
a DeprecatedNextReplicaID in an application batch, but there was another bug
where applying the side-effects of such a command still caused a crash. This
commit fixes the crash and extends the test added in cockroachdb#41148 to go through the
whole process of applying the command (which would have caught the second
crash as well).

Release justification: fixes a crash in mixed version clusters.

Release note: None
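
The point of the follow-up commit message, that a test should drive a command through both staging and side-effect application, can be sketched with a toy model. Everything here is hypothetical: `toyReplica`, `stage`, `applySideEffects`, and `resolveNextReplicaID` are illustrative names, not CockroachDB APIs, and the nil-`Desc` premise for old triggers is an assumption drawn from the discussion above.

```go
package main

import "fmt"

// Simplified stand-ins for the roachpb types (assumption: old triggers
// carry DeprecatedNextReplicaID and leave Desc nil).
type ReplicaID int32

type RangeDescriptor struct{ NextReplicaID ReplicaID }

type ChangeReplicasTrigger struct {
	Desc                    *RangeDescriptor
	DeprecatedNextReplicaID ReplicaID
}

// resolveNextReplicaID is a hypothetical nil-safe lookup shared by both phases.
func resolveNextReplicaID(t *ChangeReplicasTrigger) ReplicaID {
	if t.Desc != nil {
		return t.Desc.NextReplicaID
	}
	return t.DeprecatedNextReplicaID
}

// toyReplica records what each phase observed so a test can check both.
type toyReplica struct {
	staged  ReplicaID
	applied ReplicaID
}

// stage models staging the trigger in an application batch.
func (r *toyReplica) stage(t *ChangeReplicasTrigger) {
	r.staged = resolveNextReplicaID(t)
}

// applySideEffects models the later phase; dereferencing t.Desc directly
// here would panic on an old trigger even though stage is nil-safe.
func (r *toyReplica) applySideEffects(t *ChangeReplicasTrigger) {
	r.applied = resolveNextReplicaID(t)
}

// applyCommand runs the whole pipeline, which is what an end-to-end test
// needs to exercise to catch a crash in either phase.
func applyCommand(r *toyReplica, t *ChangeReplicasTrigger) {
	r.stage(t)
	r.applySideEffects(t)
}

func main() {
	triggers := map[string]*ChangeReplicasTrigger{
		"legacy":  {DeprecatedNextReplicaID: 4},
		"current": {Desc: &RangeDescriptor{NextReplicaID: 7}},
	}
	for name, trig := range triggers {
		var r toyReplica
		applyCommand(&r, trig)
		fmt.Printf("%s: staged=%d applied=%d\n", name, r.staged, r.applied)
	}
}
```

A test that stops after `stage` would pass even if `applySideEffects` still assumed a non-nil `Desc`, which mirrors how the first fix could land while the second crash remained.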
craig bot pushed a commit that referenced this pull request Sep 27, 2019
41171: storage: don't crash when applying side-effects of old ChangeReplicas trigger r=nvanbenschoten a=nvanbenschoten

Fixes #41155.
Fixes #41147.

Co-authored-by: Nathan VanBenschoten <nvanbenschoten@gmail.com>
@nvb nvb deleted the nvanbenschoten/changeReplSM branch October 14, 2019 03:10
Development

Successfully merging this pull request may close these issues.

roachtest: version/mixed/nodes=5 failed

4 participants