Skip to content

Versioning of bytes serializable protocol entities#2767

Merged
KPrasch merged 19 commits intonucypher:mainfrom
KPrasch:versioning
Sep 29, 2021
Merged

Versioning of bytes serializable protocol entities#2767
KPrasch merged 19 commits intonucypher:mainfrom
KPrasch:versioning

Conversation

@KPrasch
Copy link
Member

@KPrasch KPrasch commented Aug 4, 2021

Type of PR:
Feature

Required reviews:
2

What this does:
The goal of this PR is to gently introduce a unified version and branding scheme for the following entities (alphabetical):

Entity Prefix V Header
Arrangement b'AR' 1.0 b'AR\x00\x01\x00\x00'
AuthorizedFrag b'KF' 1.0 b'KF\x00\x01\x00\x00'
EncryptedTreasureMap b'EM' 1.0 b'EM\x00\x01\x00\x00'
MessageKit b'MK' 1.0 b'MK\x00\x01\x00\x00'
RetrievalKit b'RK' 1.0 b'RK\x00\x01\x00\x00'
ReencryptionRequest b'RQ' 1.0 b'RQ\x00\x01\x00\x00'
ReencryptionResponse b'RR' 1.0 b'RR\x00\x01\x00\x00'
RevocationOrder b'RV' 1.0 b'RV\x00\x01\x00\x00'
TreasureMap b'TM' 1.0 b'TM\x00\x01\x00\x00'

Related changes:

  • Introduces Versioned ABC (inspired by @fjarri 's original gist).
  • Subclasses Versioned and implement abstract methods on bytes-serializable entities listed above.
  • Prepends six-byte header to bytestrings (two-byte brand, two-byte major version, two-byte minor version).
  • Enforces Invalid or incompatible version and branding at deserialization time.
  • Renames Revocation -> RevocationOrder.
  • Adjusts size constant AuthorizedKeyFrag.ENCRYPTED_SIZE to accommodate headers.

Issues fixed/closed:
#343 #281

Why it's needed:

  • Required for management of future release backwards compatibility
  • Supports immediate backwards-incompatibility with "Treasure Map Con KFrags"

Notes for reviewers:

@KPrasch KPrasch changed the title [WIP] Versioning of bytes serializble protocol entities [WIP] Versioning of bytes serializable protocol entities Aug 4, 2021
@fjarri fjarri mentioned this pull request Aug 15, 2021
@KPrasch KPrasch force-pushed the versioning branch 2 times, most recently from 348eea8 to 51bc193 Compare September 20, 2021 20:45
KPrasch added a commit to KPrasch/nucypher that referenced this pull request Sep 20, 2021
@KPrasch KPrasch requested review from derekpierre and fjarri and removed request for derekpierre September 20, 2021 21:03
@KPrasch KPrasch changed the title [WIP] Versioning of bytes serializable protocol entities Versioning of bytes serializable protocol entities Sep 20, 2021
@KPrasch KPrasch marked this pull request as ready for review September 20, 2021 21:27
@KPrasch KPrasch requested a review from vepkenez September 20, 2021 21:33
@KPrasch KPrasch added API Enhancement New or improved features labels Sep 20, 2021
Copy link
Member

@derekpierre derekpierre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Supports immediate backwards-incompatibility with "Treasure Map Con KFrags"

Given the backwards incompatibility with TMapConKfrags, in addition to the feature newsfragment, should we also add a removal newsfragment to indicate deprecation of the non- TMapConKFrags (TMapSinKFrags)?

Provide backwards compatibility for unversioned entities?
Ursula subclasses Versioned too? This will cause an additional major backwards incompatibility in node discovery.

I don't think I understand enough about the future ramifications of not doing this. Could you clarify those for my benefit? Presumably, if we do some sort of Ursula level versioning, we may end up with a fragmented network that may also have implications with respect to sampling of Ursulas i.e. Alice's choice of Ursulas?

@KPrasch
Copy link
Member Author

KPrasch commented Sep 21, 2021

I don't think I understand enough about the future ramifications of not doing this. Could you clarify those for my benefit? Presumably, if we do some sort of Ursula level versioning, we may end up with a fragmented network that may also have implications with respect to sampling of Ursulas i.e. Alice's choice of Ursulas?

Let's start by understanding that new (de)serialization formats for the aforementioned entities will no longer work with previous versions of the code base. This creates several surfaces of backwards-incompatibility, for example:

  1. Existing treasure maps generated by previous versions will no longer be usable by Alice or Bob.
  2. Existing MessageKits will no longer be decryptable by Alice, Bob, or Ursula running an older version.
  3. WorkOrders (now ReencryptionRequest)` generated by older versions will no longer be usable by Ursulas.
  4. Offline revocations will no longer be deserializable by Alice or Ursula.

Provide backwards compatibility for unversioned entities?

Currently I expect the failure mode for using outdated bytes to be caused by a missing brand or invalid version bytes . We can potentially improve this behavior by detecting a missing header or invalid version bytes. One issue with this approach is that there is no way to determine if a given bytes blob is outdated or simply invalid since the start bytes will be random. We can potentially even attempt to maintain backwards-compatibility for old policy entities but think this is a bad idea right now since we can still make breaking changes without too much impact.

Ursula subclasses Versioned too? This will cause an additional major backwards incompatibility in node discovery.

To consider Ursula... if we implement the same versioning scheme we'll definitely need to provide backwards compatibility to prevent segmenting the network, sampling, etc. This will require some additional work in this PR if we decide that it belongs in this changeset.

@KPrasch
Copy link
Member Author

KPrasch commented Sep 21, 2021

An additional consideration that I'd prefer to address in a follow-up is supporting a range of backwards-compatible versions instead of just one. This approach might be implemented with semantic versioning but consider the trade off between complexity and fine-tuning.

def __init__(self,
ursula_checksum_address: ChecksumAddress, # TODO: Use staker address instead (what if the staker rebonds)?
ursula_checksum_address: ChecksumAddress,
# TODO: Use staker address instead (what if the staker rebonds)?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not really the topic of this PR, but I think it is the staker address - it comes from the treasure map

@fjarri
Copy link
Contributor

fjarri commented Sep 23, 2021

Some general notes:

  • the list in the description is missing RetrievalKit and AuthorizedKeyFrag, which are also versioned
  • we may want to make Arrangement versioned as well
  • Ursula info (see Ursula.payload_splitter()) is currently versioned independently - do we bring it to this scheme too?
  • Another possible protocol object is the stuff returned from server.py::all_known_nodes() (currently not even an object)
  • I am worried about possible transition to nucypher-core where we may want to use serde-specific versioning packages which will be hard to connect to this scheme

@KPrasch
Copy link
Member Author

KPrasch commented Sep 23, 2021

the list in the description is missing RetrievalKit and AuthorizedKeyFrag, which are also versioned
✔️

we may want to make Arrangement versioned as well
✔️

Ursula info (see Ursula.payload_splitter()) is currently versioned independently - do we bring it to this scheme too? Another possible protocol object is the stuff returned from server.py::all_known_nodes() (currently not even an object)

It certainly will not hurt. Shall I go ahead and also version Ursula and make a class for UrsulaInfo in this PR? I'm thinking "yes".

I am worried about possible transition to nucypher-core where we may want to use serde-specific versioning packages which will be hard to connect to this scheme.

It may be good to have a scheme that is compatible across different language implementations. Will serde add additional constraints in this regard, or can it be easily made compatible in python? I'm eager to get something delivered for versioning in the next release (hopefully as soon as this PR is merged) since many entities are already completely backwards incompatible.

This system is elegant and is working right now. Plus, it's a very simple scheme, why will it be hard to implement in nucypher-core?

@fjarri
Copy link
Contributor

fjarri commented Sep 23, 2021

It certainly will not hurt. Shall I go ahead and also version Ursula and make a class for UrsulaInfo in this PR? I'm thinking "yes".

I'm looking into it myself right now, to separate it into nucypher-core... and it's a massive can of worms. But perhaps if you just want to replace the existing versioning scheme, it will be easier to do.

It may be good to have a scheme that is compatible across different language implementations. Will serde add additional constraints in this regard, or can it be easily made compatible in python?

Well, that's why we wanted to have a core in Rust and bindings to other languages :) But really, it will depend on what we choose as the "reference" binary implementation, and on the specific versioning solution. My guess is it can be made compatible, but perhaps not with the specific scheme in this PR.

I'm eager to get something delivered for versioning in the next release (hopefully as soon as this PR is merged) since many entities are already completely backwards incompatible.

Yes, I agree with that. But I'm worried that if we do want to proceed with nucypher-core, we will have to make another major release for it.

@vepkenez
Copy link
Contributor

I think using this as a reference implementation and attempting to replicate it in nucypher-core is the way to go. Hopefully any tweaks will not be substantial.

Keeping this in Python and not doing it in nucypher-core means that we need to copy/rewrite it in nucypher-ts and then in every other future language/platform forever.

@fjarri
Copy link
Contributor

fjarri commented Sep 23, 2021

I think using this as a reference implementation and attempting to replicate it in nucypher-core is the way to go. Hopefully any tweaks will not be substantial.

We will have to write our own serializer for that and not use any machinery that serde provides.

@vepkenez
Copy link
Contributor

We will have to write our own serializer for that and not use any machinery that serde provides.

Do you mean we won't be able to use the serde versioning libraries? Or can't use serde at all?

@KPrasch
Copy link
Member Author

KPrasch commented Sep 24, 2021

Also, while this PR is not merged yet, and seeing how there's already talk about supporting "ranges" of versions, I would like to revisit the suggestion to use "major.minor" versions instead of a single number.

@fjarri, I've incorporated this suggestion in ba6a49e. I've made the assumption that when the major version changes, backwards-compatibility with older bytes strings is broken but maintained for minor versions. In this sense the minor version number is just added metadata. Do you have a different sense of how this might operate?

@fjarri
Copy link
Contributor

fjarri commented Sep 25, 2021

Do you have a different sense of how this might operate?

Basically what you implemented, but slightly more relaxed: I think, if bytestring.major == class.major and bytestring.minor > class.minor, we can pass the bytestring to _from_bytes_current() and expect it to handle it. In most cases, yes, it will mean that later versions only add some fields but don't change existing ones, but I do not exclude the possibility that sometimes it will be possible to add some "future-proofing" (most likely, default values for fields).

I am not sure that ideologically it makes sense to keep deserializers for old major versions in the class as opposed to failing right away if bytestring.major != class.major; is it really still a different major version if we continue to support it?

@KPrasch
Copy link
Member Author

KPrasch commented Sep 27, 2021

...if bytestring.major == class.major and bytestring.minor > class.minor, we can pass the bytestring to _from_bytes_current() and expect it to handle it...

I am not sure that ideologically it makes sense to keep deserializers for old major versions in the class as opposed to failing right away if bytestring.major != class.major; is it really still a different major version if we continue to support it?

@fjarri, Considering the above two points. if we expect a minor version increase to be compatible with the current major version (_from_bytes_current) and we do not maintain previous major version handlers... what's the point of having _old_version_handlers at all?

@KPrasch
Copy link
Member Author

KPrasch commented Sep 27, 2021

Basically what you implemented, but slightly more relaxed: I think, if bytestring.major == class.major and bytestring.minor > class.minor, we can pass the bytestring to _from_bytes_current() and expect it to handle it.

✔️

I am not sure that ideologically it makes sense to keep deserializers for old major versions in the class as opposed to failing right away if bytestring.major != class.major

✔️

@fjarri
Copy link
Contributor

fjarri commented Sep 27, 2021

if we expect a minor version increase to be compatible with the current major version (_from_bytes_current) and we do not maintain previous major version handlers... what's the point of having _old_version_handlers at all?

Ok, fair enough, you're right. I think the guideline is that X.Y can be deserialized by X.Z as long as Y >= Z, and when you can't guarantee that, you need to bump the major version. But we need some limit on how long the old major versions are supported. Maybe to the point that they can be dropped at any time without further notice.

Copy link
Member

@derekpierre derekpierre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎸

@KPrasch Any thoughts on also adding a removal newsfragment to indicate deprecation of non-versioned object serialization?

@KPrasch
Copy link
Member Author

KPrasch commented Sep 28, 2021

indicate deprecation of non-versioned object serialization?

I'm a bit hesitant to do so since there are still some unversioned (or differently versioned) entities.

I suppose it can say something like "Removes support for almost all unversioned bytestrings." I'm open to suggestions on how to word this accurately.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Enhancement New or improved features

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants