Rework new serialization to be based on explicit numberings. #5818
This PR changes the internal format of `Code` and `Value` to be parameterized on reference type, and the canonicalization methodology replaces all `Reference` values with explicit numberings.

With this refactoring, I've added a new class `Referential` that abstracts over types with traversable references. The difference from just being traversable is that the function gets an extra `Bool` argument indicating whether the reference is to a type or a term. This helps with writing some common functions for canonicalizing references in a nicer way than before.

Also, the `Canonicalizer` machinery is largely out of the loop in this new implementation. When you reflect a value, the reflection functions try to directly map from local numberings to global numberings. There is still `Canonicalizer` stuff involved, because some things directly store only `Reference`s, not local numberings. But, if you're not reflecting `Code`, or an already reflected value, the interaction is fairly minimal. This was the goal of this refactoring, because the `StableName` based `Canonicalizer` ended up being a bit unreliable. Now it is just a shortcut optimization that will not matter in common cases.

In some cases, I did as little as was necessary to make things work. For instance, we still store `Code Reference` in the code cache, basically, which means we need to de-number what we deserialize, and add numbers when we serialize. This means I didn't have to touch the code loader, but a little performance might be left on the table (although, maybe for a case that doesn't matter a lot).

Speaking of performance, this numbering-based method is a little faster than the one before. Nothing huge, though. Maybe 10% at most.

I've still got some tests to run through, but unless I find bugs, this is complete.
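
For readers unfamiliar with the idea, here is a rough, self-contained sketch of what a `Referential`-style class and the number/de-number round trip could look like. It is not the code in this PR: the method name `traverseRefs`, the toy `Expr` type, the `Tables` record, and the convention that `True` means "type reference" are all assumptions made for illustration, and the tiny `State` type is defined inline only to avoid extra dependencies.

```haskell
-- Hypothetical stand-in for the PR's Referential class.
-- Assumed convention: the Bool is True for type references, False for terms.
class Referential f where
  traverseRefs :: Applicative m => (Bool -> r -> m s) -> f r -> m (f s)

-- Toy expression type, parameterized on the reference representation.
data Expr r
  = Lit Int
  | Call r [Expr r]       -- refers to a term
  | Construct r [Expr r]  -- refers to a type
  deriving (Eq, Show)

instance Referential Expr where
  traverseRefs _ (Lit n) = pure (Lit n)
  traverseRefs f (Call r as) =
    Call <$> f False r <*> traverse (traverseRefs f) as
  traverseRefs f (Construct r as) =
    Construct <$> f True r <*> traverse (traverseRefs f) as

-- Minimal state applicative, so the sketch needs nothing beyond base.
newtype State s a = State { runState :: s -> (a, s) }

instance Functor (State s) where
  fmap f (State g) = State $ \s -> let (a, s') = g s in (f a, s')

instance Applicative (State s) where
  pure a = State $ \s -> (a, s)
  State f <*> State g =
    State $ \s -> let (h, s1) = f s; (a, s2) = g s1 in (h a, s2)

type Ref = String

-- Separate numbering tables for types and terms, in first-occurrence order.
data Tables = Tables { typeRefs :: [Ref], termRefs :: [Ref] }
  deriving (Eq, Show)

-- Return the index of a reference, appending it if it is not yet present.
intern :: Ref -> [Ref] -> (Int, [Ref])
intern r rs = case lookup r (zip rs [0 ..]) of
  Just i -> (i, rs)
  Nothing -> (length rs, rs ++ [r])

-- Replace every reference with an explicit number ("add numbers").
number :: Referential f => f Ref -> (f Int, Tables)
number x = runState (traverseRefs step x) (Tables [] [])
  where
    step isType r = State $ \(Tables tys tms) ->
      if isType
        then let (i, tys') = intern r tys in (i, Tables tys' tms)
        else let (i, tms') = intern r tms in (i, Tables tys tms')

-- Map numbers back to references ("de-number"); Nothing if out of range.
denumber :: Referential f => Tables -> f Int -> Maybe (f Ref)
denumber (Tables tys tms) = traverseRefs look
  where
    look isType i = nth i (if isType then tys else tms)
    nth i xs
      | i >= 0, (y : _) <- drop i xs = Just y
      | otherwise = Nothing
```

In this sketch, `number` followed by `denumber` with the same tables gives back the original structure, which mirrors the serialize/deserialize round trip described for the code cache. The list-based tables are only for brevity; a real implementation would presumably use a map.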