Rework new serialization to be based on explicit numberings. #5818

dolio · 2025-07-30T00:08:40Z

This PR changes the internal format of Code and Value to be parameterized on reference type, and the canonicalization methodology replaces all Reference values with explicit numberings.

With this refactoring, I've added a new class Referential that abstracts over types with traversable references. The difference from just being traversable is that the function gets an extra Bool argument indicating whether the reference is to a type or a term. This helps with writing some common functions for canonicalizing references in a nicer way than before.

Also, the Canonicalizer machinery is largely out of the loop in this new implementation. When you reflect a value, the reflection functions try to directly map from local numberings to global numberings. There is still Canonicalizer stuff involved, because some things directly store only References, not local numberings. But, if you're not reflecting Code, or an already reflected value, the interaction is fairly minimal. This was the goal of this refactoring, because the StableName based Canonicalizer ended up being a bit unreliable. Now it is just a short cut optimization that will not matter in common cases.

In some cases, I did as little as was necessary to make things work. For instance, we still store Code Reference in the code cache, basically, which means we need to de-number what we deserialize, and add numbers when we serialize. This means I didn't have to touch the code loader, but a little performance might be left on the table (although, maybe for a case that doesn't matter a lot).

Speaking of performance, this numbering-based method is a little faster than the one before. Nothing huge, though. Maybe 10% at most.

I've still got some tests to run through, but unless I find bugs, this is complete.

In many cases, code has simply been modified to supply `Reference` as the parameter, but some functions that don't use any concrete references have been given a more general type.

ceedubs · 2025-07-30T13:34:23Z

I've confirmed that the Unison Cloud integration tests pass as of f13a476

Expected from opposite reference list ordering

dolio · 2025-07-30T16:02:41Z

The tests in my all-tests project pass.

Transcripts pass.

v5 serialized test cases are a little different, because as part of doing this, I reversed the order that the references are serialized in. Before they were prepended to the list that got serialized and now they're appended, so that you can immediately figure out the numbering while the term is being traversed. This isn't any kind of format change. There are just multiple representations for any given value based on the order you write out the references.

pchiusano

Nice!

dolio added 3 commits July 28, 2025 16:36

Parameterize intermediate code by reference type

6622d4d

In many cases, code has simply been modified to supply `Reference` as the parameter, but some functions that don't use any concrete references have been given a more general type.

Parameterize Value by reference type and rework serialization

9da9997

Fix tests

6e9b703

dolio requested review from ChrisPenner, aryairani, ceedubs, pchiusano and stew July 30, 2025 00:08

automatically run ormolu

f13a476

Serialized test case changes changes

0e2da26

Expected from opposite reference list ordering

dolio marked this pull request as ready for review July 30, 2025 15:47

pchiusano approved these changes Jul 30, 2025

View reviewed changes

pchiusano merged commit 1e554ac into trunk Jul 30, 2025
31 checks passed

pchiusano deleted the topic/numbered-reflection branch July 30, 2025 17:07

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Rework new serialization to be based on explicit numberings. #5818

Rework new serialization to be based on explicit numberings. #5818

Uh oh!

dolio commented Jul 30, 2025

Uh oh!

ceedubs commented Jul 30, 2025

Uh oh!

dolio commented Jul 30, 2025

Uh oh!

pchiusano left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Rework new serialization to be based on explicit numberings. #5818

Rework new serialization to be based on explicit numberings. #5818

Uh oh!

Conversation

dolio commented Jul 30, 2025

Uh oh!

ceedubs commented Jul 30, 2025

Uh oh!

dolio commented Jul 30, 2025

Uh oh!

pchiusano left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants