storage,roachpb: proto changes for learners#38147
storage,roachpb: proto changes for learners#38147craig[bot] merged 2 commits intocockroachdb:masterfrom
Conversation
|
@tbg no hurry on this review, it can definitely wait until you're back from the trip |
|
I reviewed these two commits on the main PR, I think I had minor comments - perhaps just address them there, force-push it to this branch and then LGTM |
This requires some plumbing through SnapshotRequest, which didn't previously have a field for this. Release note (admin ui change): The snapshots graph on the Replication page in the web ui now includes learner snapshots.
Introducing this early in the process because I suspect I'll be doing a lot of rebasing and generated files make that more annoying. This commit adds a voter vs learner type to the ReplicaDescriptor proto. However, nothing actually creates these learner replicas yet. Release note: None
|
Sounds good, TFTR! bors r=tbg |
Build succeeded |
| (gogoproto.customname) = "ReplicaID", (gogoproto.casttype) = "ReplicaID"]; | ||
|
|
||
| // Type indicates which raft activities a replica participates in. | ||
| optional ReplicaType type = 4 [(gogoproto.nullable) = false]; |
There was a problem hiding this comment.
Hmm, wait a minute. This is non-nullable, which means it'll always be encoded on the wire. This means that the bytes before this PR and after this PR change even if learners aren't used. I think this is problematic because we use RangeDescriptor in CPut (for splits and merges). Unfortunately, I think this means that this needs to be nullable.
As a basic example, if you create a cluster with 19.1 and then switch to this binary, it shouldn't be able to carry out any splits -- the splits should fail on the CPut. Incidentally that's what @jordanlewis just saw over in #38183, maybe he ran into this?
There was a problem hiding this comment.
Assuming I'm not missing something here and things are peachy, two things should fall out of the fix:
- ideally a unit test that can just verify this CPut-compatibility of protos that we know need it - basically just hard-code the marshaled bytes of these protos' zero values in the test and fail if they change
- we have version upgrade tests, those ought to catch this. I think they may not because there aren't any splits done under the new version (and any queue-based splits failing doesn't fail the test). So we may just need to add an
ALTER TABLE SPLIT ATand that should either fail or hang (both we can detect).
There was a problem hiding this comment.
Ooof those allocations are pretty unfortunate, no? Isn't this used in a good number of places?
I was surprised gogo was even serializing this field when set to the default value, but sure enough: https://github.com/cockroachdb/cockroach/blob/master/pkg/roachpb/metadata.pb.go#L952-L970. Removing the nullable = false annotation does in fact do the right thing:
if m.Type != nil {
dAtA[i] = 0x20
i++
i = encodeVarintMetadata(dAtA, i, uint64(*m.Type))
}
I tried putting the same enum and proto into a proto3 syntax file and that also does the right thing. What are the issues embedding a proto3 struct into a proto2 one again?
func (m *ReplicaDescriptor3) MarshalTo(dAtA []byte) (int, error) {
var i int
_ = i
var l int
_ = l
if m.NodeID != 0 {
dAtA[i] = 0x8
i++
i = encodeVarintMetadata3(dAtA, i, uint64(m.NodeID))
}
if m.StoreID != 0 {
dAtA[i] = 0x10
i++
i = encodeVarintMetadata3(dAtA, i, uint64(m.StoreID))
}
if m.ReplicaID != 0 {
dAtA[i] = 0x18
i++
i = encodeVarintMetadata3(dAtA, i, uint64(m.ReplicaID))
}
if m.Type != 0 {
dAtA[i] = 0x20
i++
i = encodeVarintMetadata3(dAtA, i, uint64(m.Type))
}
return i, nil
}
Finally, I wonder if the real issue here is the CPut (I guess I take back my "Yay for CPut" from the other PR). The proto spec is pretty loud about not relying on the exact bytes used to serialize a message and IIRC this is not the first time someone has hit this issue. What's the downside to turning the CPut into a Get and a Put and doing the not modified check on the deserialized structs? A bit of contention, but anything else? That seems to me to be the best long term fix to this
There was a problem hiding this comment.
That seems to me to be the best long term fix to this
I'm not aware that there is any downside to this (and I agree with the upside). Wanna lead the charge? Probably want to coordinate with @jeffrey-xiao who introduced a similar problem (see #38020).
In cockroachdb#38147, a new non-nullable field was added to the ReplicaDescriptor proto that is serialized inside RangeDescriptors. RangeDescriptors are updated using CPut to detect races. This means that if a RangeDescriptor had been written by an old version of cockroach and we then attempt to update it, the CPut will fail because the encoded version of it is different (non-nullable proto2 fields are always included). A similar issue was introduced in cockroachdb#38004 which made the StickyBit field on RangeDescriptor non-nullable. We could keep the fields as nullable, but this has allocation costs (and is also more annoying to work with). Worse, the proto spec is pretty explicit about not relying on serialization of a given message to always produce exactly the same bytes: From https://developers.google.com/protocol-buffers/docs/encoding#implications Do not assume the byte output of a serialized message is stable. So instead, we've decided to stop CPut-ing protos and to change the interface of CPut to take the expected value as bytes. To CPut a proto, the encoded value will be read from kv, passed around with the decoded struct, and used in the eventual CPut. This work is upcoming, but the above two PRs are real breakages, so this commit fixes those first. Neither of these PRs made the last alpha so no release note. Release note: None
In cockroachdb#38147, a new non-nullable field was added to the ReplicaDescriptor proto that is serialized inside RangeDescriptors. RangeDescriptors are updated using CPut to detect races. This means that if a RangeDescriptor had been written by an old version of cockroach and we then attempt to update it, the CPut will fail because the encoded version of it is different (non-nullable proto2 fields are always included). A similar issue was introduced in cockroachdb#38004 which made the StickyBit field on RangeDescriptor non-nullable. We could keep the fields as nullable, but this has allocation costs (and is also more annoying to work with). Worse, the proto spec is pretty explicit about not relying on serialization of a given message to always produce exactly the same bytes: From https://developers.google.com/protocol-buffers/docs/encoding#implications Do not assume the byte output of a serialized message is stable. So instead, we've decided to stop CPut-ing protos and to change the interface of CPut to take the expected value as bytes. To CPut a proto, the encoded value will be read from kv, passed around with the decoded struct, and used in the eventual CPut. This work is upcoming, but the above two PRs are real breakages, so this commit fixes those first. Neither of these PRs made the last alpha so no release note. Touches cockroachdb#38308 Release note: None
In cockroachdb#38147, a new non-nullable field was added to the ReplicaDescriptor proto that is serialized inside RangeDescriptors. RangeDescriptors are updated using CPut to detect races. This means that if a RangeDescriptor had been written by an old version of cockroach and we then attempt to update it, the CPut will fail because the encoded version of it is different (non-nullable proto2 fields are always included). A similar issue was introduced in cockroachdb#38004 which made the StickyBit field on RangeDescriptor non-nullable. We could keep the fields as nullable, but this has allocation costs (and is also more annoying to work with). Worse, the proto spec is pretty explicit about not relying on serialization of a given message to always produce exactly the same bytes: From https://developers.google.com/protocol-buffers/docs/encoding#implications Do not assume the byte output of a serialized message is stable. So instead, we've decided to stop CPut-ing protos and to change the interface of CPut to take the expected value as bytes. To CPut a proto, the encoded value will be read from kv, passed around with the decoded struct, and used in the eventual CPut. This work is upcoming, but the above two PRs are real breakages, so this commit fixes those first. Neither of these PRs made the last alpha so no release note. Touches cockroachdb#38308 Release note: None
38302: storage: CPut bytes instead of a proto when updating RangeDescriptors r=tbg a=danhhz In #38147, a new non-nullable field was added to the ReplicaDescriptor proto that is serialized inside RangeDescriptors. RangeDescriptors are updated using CPut to detect races. This means that if a RangeDescriptor had been written by an old version of cockroach and we then attempt to update it, the CPut will fail because the encoded version of it is different (non-nullable proto2 fields are always included). A similar issue was introduced in #38004 which made the StickyBit field on RangeDescriptor non-nullable. We could keep the fields as nullable, but this has allocation costs (and is also more annoying to work with). Worse, the proto spec is pretty explicit about not relying on serialization of a given message to always produce exactly the same bytes: From https://developers.google.com/protocol-buffers/docs/encoding#implications Do not assume the byte output of a serialized message is stable. So instead, we've decided to stop CPut-ing protos and to change the interface of CPut to take the expected value as bytes. To CPut a proto, the encoded value will be read from kv, passed around with the decoded struct, and used in the eventual CPut. This work is upcoming, but the above two PRs are real breakages, so this commit fixes those first. Neither of these PRs made the last alpha so no release note. Touches #38308 Closes #38183 Release note: None Co-authored-by: Daniel Harrison <daniel.harrison@gmail.com>
As seen in cockroachdb#38004 and cockroachdb#38147, this first came up in the context of CPut with a proto expected value. If the serialization didn't exactly match, the CPut would get a condition failed error, even if the two protos were logically equivalent (missing field vs field with default value). We were hitting this in a mixed version cluster when trying to add a non-nullable field to RangeDescriptor, which is updated via CPut to detect update races. The new fields ended up having to be nullable, which has allocation consequences as well as ergonomic ones. In cockroachdb#38302, we changed the RangeDescriptor CPut to use a *roachpb.Value exactly as it was read from KV for the expected value, but no other callsites were updated. To prevent future issues, this commit changes the CPut signature to accept a *roachpb.Value for the expected value instead of an interface{}. The old method signature is left as CPutDeprecated for now. The CPut usages that are trivial to switch are switched and ones that require any thought are left for a followup PR. Release note: None
39714: client: force CPut callers to use a *roachpb.Value expected value r=tbg a=danhhz As seen in #38004 and #38147, this first came up in the context of CPut with a proto expected value. If the serialization didn't exactly match, the CPut would get a condition failed error, even if the two protos were logically equivalent (missing field vs field with default value). We were hitting this in a mixed version cluster when trying to add a non-nullable field to RangeDescriptor, which is updated via CPut to detect update races. The new fields ended up having to be nullable, which has allocation consequences as well as ergonomic ones. In #38302, we changed the RangeDescriptor CPut to use a *roachpb.Value exactly as it was read from KV for the expected value, but no other callsites were updated. To prevent future issues, this commit changes the CPut signature to accept a *roachpb.Value for the expected value instead of an interface{}. The old method signature is left as CPutDeprecated for now. The CPut usages that are trivial to switch are switched and ones that require any thought are left for a followup PR. Release note: None Co-authored-by: Daniel Harrison <daniel.harrison@gmail.com>
Getting this in early to avoid repeatedly rebasing generated files.