Skip to content

Zero alloc: symbolic summaries#2428

Closed
gretay-js wants to merge 20 commits intooxcaml:mainfrom
gretay-js:zero_alloc_symbolic7
Closed

Zero alloc: symbolic summaries#2428
gretay-js wants to merge 20 commits intooxcaml:mainfrom
gretay-js:zero_alloc_symbolic7

Conversation

@gretay-js
Copy link
Copy Markdown
Contributor

@gretay-js gretay-js commented Apr 10, 2024

This PR extends the existing abstract domain with a new abstract value that keeps track of unresolved dependencies as part of the abstraction. This allows us to compute precise summaries for recursive functions without keeping the Mach representation of these functions around until the end of the analysis of the entire compilation unit.

The new abstract value is "symbolic" in the sense that it refers to a summary of an unresolved callee using a "logical" variable.
The meaning of a logical variable can be any of of the abstract values we had before this PR, i.e., the "resolved" values Top, Safe, or Bot. A symbolic value can be any combination of symbolic "Join" or "Transform" operations on logical variables or "resolved" values.

Normal form

The analysis manipulates symbolic values in normal form that resembles DNF. This can be very large in the worst case (exponential worst case with the usual "diamond" examples). Intuitively, this normal form represents all possible paths
through the function and for each path it can only record if the path contains an allocation and which unresolved callees appear on the path. Commutativity and Associativity of Join and Transform help us keep this representation small.
In particular, the order in which the callees appear on each path does not matter. The analysis tries to eagerly resolve symbolic Join and Transform whenever possible (intuitively, partial evaluation when one of the arguments is resolved).

This DNF representation is usually much smaller than Mach IR of the function. However, this PR is naive in terms of memory representation of symbolic values. For example, there is no hash consing and many operations allocate even if the symbolic value does not change. This is intentional to make review easier. We can improve performance as a follow up PR.

Operations on symbolic values preserve normal form. This allows us to have special representation of normal form, without handling the general case of conversion of arbitrary formulas to DNF.

The advantage of DNF is that it is easy to check for fixed point (i.e., implement lessequal operation) using structural comparison on terms. An alternative to DNF is to have a hierarchical normal form (smaller representation due to more sharing) with a more expensive fixed point check. Another possiblity is to use BDD-like representation if performance becomes a bottleneck.

Tracking witnesses

It complicates the domain and makes the analysis of unresolved callees more expensive. Only functions that have [@zero_alloc ..] annotations track witnesses by default. We use -checkmach-details-cutoff flag to keep all witnesses for visualization. Tracking all witnesses can be infeasible on compilation units with large recursive functions (for example, see parsing/depend.ml). There is a flag to bale out in such a case -disable-precise-checkmach.
A separate PR will provide a way to limit the size of symbolic values with witnesses (likely to be a rework of #1472).

Precision

This PR gives exactly the same results as before, except for some recursive functions on which this PR can be more precise because previously we used Value.Safe as initial value for unresolved calls and now it is Value.Diverge. Existing test case updated. Manual comparison of summaries from .CMX files confirms that the results are correct for the few functions that have different summaries.

Code review

There is a small refactoring in Zero_alloc_utils to abstract the representation of resolved values, so we can hide the gory details of the symbolic domain in the backend. This refactoring also renames as "V" as "Component" as @ccasin suggested in another PR. I'm happy to move it to a separate PR if reviewers prefer.

Testing:

  • Existing compiler tests for checkmach pass.
  • Comparing summaries saved in .cmx files before and after the PR: on all the code of the compiler.
  • Measure compilation time overhead on a large codebase.

@gretay-js gretay-js requested review from ccasin and xclerc April 10, 2024 18:11
@gretay-js gretay-js force-pushed the zero_alloc_symbolic7 branch from 7b8c969 to 022e0a4 Compare April 10, 2024 18:17
Copy link
Copy Markdown
Contributor

@xclerc xclerc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As already discussed off-line, I kind of like this
design. My main concern is about how witnesses
play a role in the comparison used for sets and
maps. With the replacement of witnesses, values
that were distinct may be merged and break
invariants about the cardinality of a set or map.
I don't think it currently happens, but it is not that
easy to be confident it doesn't.

It may be valuable to gather the elements about
the "constructors" (e.g. join, transform) in one
place, with the theoretical properties (e.g. neutral
or absorbing elements, distribution), while keeping
the comments about implementation choices and/or
imperatives (e.g. why/how witnesses are tracked)
next to the code.

Not for this pull request, but I wonder whether
some concrete representations could be tweaked.
For instance, Join.t could be defined as a
record { args : Args.t; ... } with the ellipsis
encoding top/safe. It may simplify a couple of
functions in the Join module, and perhaps
lead to a type like { args : Args.t; rest : 'a; }
that could then be used in other modules
such as Transform.

I am not positive it would be beneficial, as
I have not tried to refactor the code and if
it does happen, it should happen in another
pull request. The fact that the current version
completely hides the concrete representations
would probably make the change easy to test.

Copy link
Copy Markdown
Contributor Author

@gretay-js gretay-js left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not for this pull request, but I wonder whether
some concrete representations could be tweaked.
For instance, Join.t could be defined as a
record { args : Args.t; ... } with the ellipsis
encoding top/safe. It may simplify a couple of
functions in the Join module, and perhaps
lead to a type like { args : Args.t; rest : 'a; }
that could then be used in other modules
such as Transform.

I had a version like this earlier with rest as boolean and it wasn't as easy to safely manipulate, but can to improve on it later.

It may be valuable to gather the elements about
the "constructors" (e.g. join, transform) in one
place, with the theoretical properties (e.g. neutral
or absorbing elements, distribution), while keeping
the comments about implementation choices and/or
imperatives (e.g. why/how witnesses are tracked)
next to the code.

good idea, thank you. i'll try separately.

@gretay-js
Copy link
Copy Markdown
Contributor Author

Regarding compilation time overhead, I built a very large code base with this PR and optimizations on. On most targets the time and memory overhead are within noise.

There were 10 files on which the overhead was excessive or compilation ran out of memory while computing symbolic summaries. These are huge generated files with many "forward functions" generated by ppxes. With -function-layout topological these files build successfully. We can turn off summary generation using -disable-precise-mach on these files. To cap the cost automatically, I added a widening operation that I will post as a follow up PR.

@mshinwell mshinwell added the zero alloc zero-alloc check label May 2, 2024
@gretay-js
Copy link
Copy Markdown
Contributor Author

Included in #2524

@gretay-js gretay-js closed this May 2, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backend zero alloc zero-alloc check

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants