Skip to content

Store value protobuf directly for MVCC values#49

Merged
spencerkimball merged 2 commits intomasterfrom
spencerkimball/mvcc-value-proto
Sep 11, 2014
Merged

Store value protobuf directly for MVCC values#49
spencerkimball merged 2 commits intomasterfrom
spencerkimball/mvcc-value-proto

Conversation

@spencerkimball
Copy link
Copy Markdown
Member

Instead of storing byte strings manually and manually varint encoding
integer values for incrementable keys, we now marshal and unmarshal the
Value protobuf directly. This allows us to Get incrementable keys and
to scan them. It also allows us to correctly discern an error incrementing
a key which is not a varint encoded value.

Implemented value checksums and checksum verification, although only
doing the verification now through kv/rest. Probably want to do the
same elsewhere.

integer values for incrementable keys, we now marshal and unmarshal the
Value protobuf directly. This allows us to Get incrementable keys and
to scan them. It also allows us to correctly discern an error incrementing
a key which is not a varint encoded value.

Implemented value checksums and checksum verification, although only
doing the verification now through kv/rest. Probably want to do the
same elsewhere.
@spencerkimball
Copy link
Copy Markdown
Member Author

Also, @Tobstar87 and @bdarnell, please review!

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you wish to include the timestamp of Value into checksum?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From many of the comments here, I'm thinking it might make sense to rename the Checksum field "EndToEndChecksum". The idea is this checksum is set by the client -- perhaps as far back as from a Javascript client speaking over http, and then verified by any similar endpoint. It's meant to capture corruption anywhere in the communication stack, up to and including the final network interface and even the client's own potential memory stomping bugs.

As such, there's no reliable way to include the timestamp--remember that timestamps on writes can change if part of a transactions, and for non-transactional writes, the timestamp is taken as the receiving node's wall clock time, so can't be incorporated by the original client.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Computing the checksum in this way allows bytes to be shuffled between the key and value without changing the checksum. If we were using this checksum for security that would be a problem; I don't know if it matters here since we're using a weak checksum. If you want to do something about that I suggest hashing len(key) + key + len(value) + value.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, not sure this addition will add much value to justify the complexity. This end-to-end checksum is mainly about verifying round-trip transport integrity from client to persistent db storage and then back to client.

@jiangmingyang
Copy link
Copy Markdown
Contributor

it looks good to me in general. Two general comments:

  1. i hope to put InitCheckSum into a central place and we'd better to add VerifyCheckSum accordingly.
  2. value not-exist vs value is empty is a little bit confusing. I would suggest to add deleted flag in Value since we always marshall it. We may simplify the mvcc prefix logic as well.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this TODO can go, actually.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@bdarnell
Copy link
Copy Markdown
Contributor

LGTM

Added proto.Response.Verify method to verify checksums and now do verification
automatically on the responses of Get(), ConditionalPut() and Scan(). Also
now always initialize checksums on Put() and ConditionalPut().

Eventually we're going to want to fully expose the checksums to non-go
clients, but will have to evolve the KV API to include that capability.
spencerkimball added a commit that referenced this pull request Sep 11, 2014
Store value protobuf directly for MVCC values
@spencerkimball spencerkimball merged commit fe23b66 into master Sep 11, 2014
@spencerkimball spencerkimball deleted the spencerkimball/mvcc-value-proto branch September 11, 2014 19:09
soniabhishek pushed a commit to soniabhishek/cockroach that referenced this pull request Feb 15, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants