Skip to content

Conversation

@dolio
Copy link
Contributor

@dolio dolio commented Dec 12, 2025

This changes the implementation of the type checking context to be based on finger trees instead of plain lists.

Previously a context was a list of Elements and Infos, where the latter aggregated information about all the previous Elements in the list. This new implementation uses a finger tree of Elements with a modified Info type as the "measure" of the finger tree. Each Element has a singleton-like Info measure, and the information is aggregated up the tree.

This has significant asymptotic advantages in some cases. The Info makes it possible to split the context in logarithmic time by looking at which variables are in play in subsequences. Also, by modifying a few parts of the algorithm to operate on these (possibly ill-formed) sequences rather than [Element v loc], the aggregate Info is largely maintained. Previously, the algorithm would often discard the info and recalculate it from the elements when appending to the context, which is linear in the number of elements.

Certain pathological type checking examples can result in repeatedly splitting and rejoining large portions of the context. E.G. some tracing of the problem:

foo a b c d e f g h i j k l m n o p q r s t u v w x y z =
  (a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,z)

indicates that sequences of thousands of elements are repeatedly split off and rejoined from the context. This is actually the majority of the context, so most of it was being rebuilt from scratch. In this patch, the split is more efficient, and quite a bit of aggregate information should be saved, with only a portion from the rejoined context needed to make the large ill-formed portion of the context back into a valid one. Hence, checking this example is now ~10x faster (but the speedup is actually asymptotic (probably something like n * (log n)^k rather than quadratic).

The type checking algorithm is overall the same. Just some core pieces where the context is directly manipulated were altered to take advantage of the new structures.

Transcripts seem to pass without change. I'm not sure how comprehensive those are w/r/t type checking changes for something at such a fundamental level. But it's something at least.

The new representation allows for splitting and rejoining contexts more
efficiently. The `Info` type that used to be built up linearly,
left-to-right, is now aggregated monoidally (and implicitly) from the
elements stored in the tree. The info allows for efficiently splitting
contexts at a specific variable (necessary for type checking), and
preserves aggregated info in segments. By not discarding this aggregate
info, it can just be merged back together when the overall context is
rebuilt, which saves linear amounts of work when e.g. solving variables.
This appears to actually be better than the strict fields.
@dolio dolio requested review from aryairani and pchiusano December 12, 2025 22:00
@aryairani
Copy link
Contributor

Awesome, thanks!

Copy link
Member

@pchiusano pchiusano left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice. Always great when a fancy data structure actually ends up being useful. :)

I think this is worth a merge and we dogfood it a bit.

@aryairani aryairani merged commit 43ba3f2 into trunk Dec 12, 2025
1 check passed
@aryairani aryairani deleted the topic/type-context branch December 12, 2025 22:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants