Skip to content

kvserver: stop using deprecated snapshot fields#102629

Merged
craig[bot] merged 1 commit intocockroachdb:masterfrom
andrewbaptist:2023-04-28-cleanup-snapshot
Sep 26, 2023
Merged

kvserver: stop using deprecated snapshot fields#102629
craig[bot] merged 1 commit intocockroachdb:masterfrom
andrewbaptist:2023-04-28-cleanup-snapshot

Conversation

@andrewbaptist
Copy link
Copy Markdown

A number of fields in snapshot messages are no longer relevant. This PR stops using those fields completely so that in the next release they can be removed from the protocol completely.

Release note: None
Epic: none

@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

@andrewbaptist andrewbaptist force-pushed the 2023-04-28-cleanup-snapshot branch 2 times, most recently from 70c8d5c to 86ff39b Compare April 28, 2023 18:32
@andrewbaptist andrewbaptist force-pushed the 2023-04-28-cleanup-snapshot branch 2 times, most recently from e99d2c3 to 9ade6e7 Compare April 28, 2023 19:42
@andrewbaptist andrewbaptist requested a review from AlexTalks April 28, 2023 19:54
@andrewbaptist andrewbaptist force-pushed the 2023-04-28-cleanup-snapshot branch from 9ade6e7 to 6224e33 Compare April 28, 2023 21:55
@andrewbaptist andrewbaptist marked this pull request as ready for review April 29, 2023 14:02
@andrewbaptist andrewbaptist requested a review from a team April 29, 2023 14:02
@andrewbaptist andrewbaptist requested a review from a team as a code owner April 29, 2023 14:02
@erikgrinaker
Copy link
Copy Markdown
Contributor

erikgrinaker commented Apr 30, 2023

in the next release they can be removed from the protocol completely

Note that since 23.1 onwards we will support version-skipping upgrades, so we have to support upgrading directly from 23.1 to 24.1. It's thus no longer sufficient to simply stop using the field in 23.2 and then remove it in 24.1, because we may be upgrading directly from 23.1 to 24.1. There's some recommendations for how to deal with this in a more principled way here: https://cockroachlabs.atlassian.net/wiki/spaces/ENG/pages/2775876082/Guidelines+for+writing+cluster+version-specific+code+and+using+version+gates#7.-When-deprecating-a-field-that-is-still-referenced-by-older-versions,-add-a-piece-of-code-to-a-companion-.go-file-that-references-both-the-version-in-question-and-the-field.

@andrewbaptist andrewbaptist force-pushed the 2023-04-28-cleanup-snapshot branch from 6224e33 to 099413b Compare May 1, 2023 15:05
@andrewbaptist
Copy link
Copy Markdown
Author

in the next release they can be removed from the protocol completely

Note that since 23.1 onwards we will support version-skipping upgrades, so we have to support upgrading directly from 23.1 to 24.1. It's thus no longer sufficient to simply stop using the field in 23.2 and then remove it in 24.1, because we may be upgrading directly from 23.1 to 24.1. There's some recommendations for how to deal with this in a more principled way here: https://cockroachlabs.atlassian.net/wiki/spaces/ENG/pages/2775876082/Guidelines+for+writing+cluster+version-specific+code+and+using+version+gates#7.-When-deprecating-a-field-that-is-still-referenced-by-older-versions,-add-a-piece-of-code-to-a-companion-.go-file-that-references-both-the-version-in-question-and-the-field.

Thanks for pointing that page out. This PR doesn't actually change any protobuf structures or behavior, it simply removes the usage of the fields by using other redundant fields that are more clear. There is a "follow-on" PR that removes the fields, and I did update the names of the fields to mark them deprecated so we don't introduce new usages of them.

@erikgrinaker
Copy link
Copy Markdown
Contributor

erikgrinaker commented May 8, 2023

Thanks for pointing that page out. This PR doesn't actually change any protobuf structures or behavior, it simply removes the usage of the fields by using other redundant fields that are more clear. There is a "follow-on" PR that removes the fields, and I did update the names of the fields to mark them deprecated so we don't introduce new usages of them.

I understand, but the section I linked was specifically about linking deprecated fields to the last cluster version that relies on them, such that removing the field before support for that cluster version is dropped results in a compile error.

There were a couple of draft PRs discussing a Refer() method specifically for this, but looks like they never merged: #90818 #90428. There was also an email discussion. This might be a good opportunity to revive those discussions/PRs and establish best practices around this, considering 23.2 will be the first version that supports version-skipping upgrades (22.2 to 23.2).

Copy link
Copy Markdown
Contributor

@kvoli kvoli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Were there any further developments on linking deprecated fields to the last cluster version -- It doesn't look like it from the PR, but might be good to double check with @celiala.

I left some comments around the SnapshotRequest_OTHER and its metric accounting.

Reviewed 8 of 8 files at r2, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @AlexTalks and @andrewbaptist)


pkg/kv/kvserver/replica_command.go line 3190 at r2 (raw file):

			r.store.metrics.RangeSnapshotRecoverySentBytes.Inc(inc)
		} else if header.SenderQueueName == kvserverpb.SnapshotRequest_OTHER {
			r.store.metrics.RangeSnapshotRebalancingSentBytes.Inc(inc)

Should this be RangeSnapshotUnknownSentBytes? When it is SnapshotRequest_OTHER it should be coming from a AdminChangeReplicasRequest here.


pkg/kv/kvserver/replica_command.go line 3191 at r2 (raw file):

		} else if header.SenderQueueName == kvserverpb.SnapshotRequest_OTHER {
			r.store.metrics.RangeSnapshotRebalancingSentBytes.Inc(inc)
		} else { // replicate queue does both types, so need to split

nit: Move inline comment down below.


pkg/kv/kvserver/store_snapshot.go line 1390 at r2 (raw file):

			s.metrics.RangeSnapshotRecoveryRcvdBytes.Inc(inc)
		} else if header.SenderQueueName == kvserverpb.SnapshotRequest_OTHER {
			s.metrics.RangeSnapshotRebalancingRcvdBytes.Inc(inc)

Similar to above, should this be using Unknown... instead here?


pkg/kv/kvserver/store_snapshot.go line 1391 at r2 (raw file):

		} else if header.SenderQueueName == kvserverpb.SnapshotRequest_OTHER {
			s.metrics.RangeSnapshotRebalancingRcvdBytes.Inc(inc)
		} else { // replicate queue does both types, so need to split

nit: Move inline comment down a line.

A number of fields in snapshot messages are no longer relevant. This PR
stops using those fields completely so that in the next release they can
be removed from the protocol completely.

Release note: None
Epic: none
@andrewbaptist andrewbaptist force-pushed the 2023-04-28-cleanup-snapshot branch from 1cd1401 to beb4d8b Compare September 25, 2023 18:02
Copy link
Copy Markdown
Author

@andrewbaptist andrewbaptist left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TFTR - I'll check with @celiala and see if there is anything else I should do.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @AlexTalks and @kvoli)


pkg/kv/kvserver/replica_command.go line 3190 at r2 (raw file):

Previously, kvoli (Austen) wrote…

Should this be RangeSnapshotUnknownSentBytes? When it is SnapshotRequest_OTHER it should be coming from a AdminChangeReplicasRequest here.

The unknown was never used in practice before. ChangeReplicas sends snapshots with a type of OTHER. Previously, these were all counted as "rebalance":

desc, err := r.ChangeReplicas(ctx, &tArgs.ExpDesc, kvserverpb.SnapshotRequest_REBALANCE, kvserverpb.ReasonAdminRequest, "", chgs)

This preserves this behavior.


pkg/kv/kvserver/replica_command.go line 3191 at r2 (raw file):

Previously, kvoli (Austen) wrote…

nit: Move inline comment down below.

Done


pkg/kv/kvserver/store_snapshot.go line 1390 at r2 (raw file):

Previously, kvoli (Austen) wrote…

Similar to above, should this be using Unknown... instead here?

See comment above


pkg/kv/kvserver/store_snapshot.go line 1391 at r2 (raw file):

Previously, kvoli (Austen) wrote…

nit: Move inline comment down a line.

Done

@andrewbaptist
Copy link
Copy Markdown
Author

@celiala can you specifically look at beb4d8b#diff-77204b7d0f7ebab5263421e5e5a0bba1c4f77274d4bad0433923dacb89634c68

Is this the right approach for handling deprecated fields now that we are planning to support version skipping. This change doesn't remove the fields, but specifically marks them as needed in releases after v23.1. Thanks!

@andrewbaptist andrewbaptist requested a review from kvoli September 25, 2023 18:04
Copy link
Copy Markdown
Collaborator

@celiala celiala left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @AlexTalks, @andrewbaptist, and @kvoli)


pkg/kv/kvserver/kvserverpb/raft.proto line 217 at r3 (raw file):

@celiala can you specifically look at beb4d8b#diff-77204b7d0f7ebab5263421e5e5a0bba1c4f77274d4bad0433923dacb89634c68

Is this the right approach for handling deprecated fields now that we are planning to support version skipping. This change doesn't remove the fields, but specifically marks them as needed in releases after v23.1. Thanks!

This pattern LGTM, in terms of how to handle deprecated fields - thank you for checking!

Copy link
Copy Markdown
Contributor

@kvoli kvoli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 2 of 2 files at r3, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @AlexTalks and @andrewbaptist)


pkg/kv/kvserver/replica_command.go line 3190 at r2 (raw file):

Previously, andrewbaptist (Andrew Baptist) wrote…

The unknown was never used in practice before. ChangeReplicas sends snapshots with a type of OTHER. Previously, these were all counted as "rebalance":

desc, err := r.ChangeReplicas(ctx, &tArgs.ExpDesc, kvserverpb.SnapshotRequest_REBALANCE, kvserverpb.ReasonAdminRequest, "", chgs)

This preserves this behavior.

Ack, I looked around 23.1 and didn't see any meaningful usages of SnapshotRequest_UNKNOWN, aside from the replicate queue which would be overridden anyway.

@andrewbaptist
Copy link
Copy Markdown
Author

bors r=kvoli

TFTR!

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Sep 26, 2023

Build succeeded:

@craig craig bot merged commit 5d61611 into cockroachdb:master Sep 26, 2023
@andrewbaptist andrewbaptist deleted the 2023-04-28-cleanup-snapshot branch September 26, 2023 16:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants