Dissecting let-recs by AltGr · Pull Request #8956 · ocaml/ocaml

AltGr · 2019-09-20T13:56:56Z

This pull-request (started by Pierre Chambart) adds a pass in Lambda
that reorganises let-recs to lift all recursive value definitions. The
resulting lambda contains Lletrec only for recursive function
definitions, and initialisation + assignment sequences for recursive
values. The rewrite is made so that it preserves evaluation order for
side-effects (including for depdendent nested let-recs).

This is expected to be cleaner than the previous handling, and to make
the resulting lambda easier to work with. It's also our hope that it
will more easily be tied with the checks in Rec_check.

The PR contains:

a new module lambda/dissect_letrec.ml that does the translation
a tiny change to simplif.ml to trigger the rewrite
a tiny change in translmod.ml for the class handling case

I am now reasonably convinced that it complies with what Rec_check
considers valid or not, but we probably should go further in
justifying that part; I would very grateful if @gasche could have an
eye at it, which is the main reason for submitting already.

gasche · 2019-09-21T08:53:29Z

There is a very nice comment explaining the transform in dissect_let_rec.ml; pasting it below for reference:

(* Converts let-rec containing values into an initialization then
   assignment sequence.
   We assume that the typechecker correctly validated that the letrec
   is compilable (See typing/Rec_check.is_valid_recursive_expression).
   That is, for every expression to which a variable is bound in
   the let-rec verify:
   * It does not inspect any variable bound by the let-rec.
     This implies that the value of a variable bound by the let-rec
     does not need to be valid during the computation of the other
     bindings. Only the address is needed. This means that we can
     preallocate values for which we know the size.
   * If the value can't be preallocated (it can be because the
     size can't be statically known, or because the value is not
     allocated, like integers), then the value does not
     depend on any other rec-bound variables.
     This implies that any value that can't be preallocated can be
     computed before any other binding. Note that this does not mean
     that other variables can't be refered to by the expression,
     but if it happens, then the value is useless.
   We consider two cases for expressions that we consider of
   known size (the [Static] case in [is_valid_recursive_expression]):
   the makeblock primitive and function declarations. Makeblocks will
   be preallocated, while all the functions will be turned into a single
   letrec.
   The structure of the generated code will be:
   {[
     let c = const in
     (* [consts]: constants *)
     ...
     let v = caml_alloc_dummy n in
     (* [blocks]: With n the staticaly known size of blocks *)
     ...
     let p = expr in
     (* [pre]: Values that do not depend on any other from the let-rec in
        [Rec_check.Dereference] position *)
     ...
     let rec f x = ...
     and g x = ...
     and ...
     in
     (* [functions]: All the functions from the let-rec *)
     caml_update_dummy v v_contents;
     (* Initialisation ([effects]) *)
     ...
   ]}
   Special care is taken to handle nested [let rec]s:
   the recursive values of a letrec are often more than the ones
   bound by the letrec expression itself. For instance
   let rec f =
     let rec g x = h (x+1)
     and h x = i x
     in
     fun x -> g x
   and i x =
     if x > 33 then x
     else f x
   in this expression every function is recursively defined with
   any other. Hence this is equivalent to
   let rec f = fun x -> g x
   and i x =
     if x > 33 then x
     else f x
   and g x = h (x+1)
   and h x = i x
   However, there might be (a subset of) a local [let rec] that is indeed
   internally recursive, but that is used by the top-level [let rec] in
   [Dereference] positions. We carefully lift those within [pre], and handle
   them recursively in the same pass to preserve the order of evaluation.
   The analysis for which variables should remain in the inner [let rec], and
   which are indeed part of the outer [let rec] is equivalent to the
   [Rec_check.value_bindings] case.
*)

I have looked at nothing else in this file yet.

gasche · 2019-09-21T08:57:03Z

I think that this PR does two different (related) things:

it explicitates the backpatching semantics (initialize->evaluate->set) at the Lambda level
it tries harder to create larger immediate-functions-only blocks (which use the clever closure representation strategy), in fact it two different ways:
a. within a value-letrec block, isolating an immediate-functions-only sub-block
b. in let rec f = (let rec ... in ...) and ... scenarios, merging some of the inner-definitions into the outer-definitions to try to create larger closure blocks

From my current understanding (not having looked at the code yet) my judgment would be as follows:

(1) sounds like an excellent idea to me (currently the duplication of logics between bytecode and native is painful)
I'm less sold on (2). I think it adds a lot of complexity to the code, and the benefits are not clear to me (I haven't seen a discussion of them). More precisely, my impression is that (2.a) is reasonable because it is a purely local transformation, but (2.b) is a harder sell because it is in charge of moving things around, while still preserving evaluation order.

There is also the issue that recursive closures may worsen the lifetime of some captured values. This is rarely an issue, and expert users know about it, but from that perspective merging recursive-closures around does carry a part of risk.

I feel embarrassed by this first assessment because I haven't looked at the details yet, and I realize that (2.b) was a lot of work to get right, but there we go...

AltGr · 2019-09-23T15:53:37Z

More examples for anyone interested: AltGr@d311da3

chambart · 2019-09-23T15:57:53Z

@gasche The main reason for (2) is to avoid fixing the representation of closures at the lambda level. The generated lambda doesn't ever need to patch recursively defined closures. To reduce the lifetime of captured values, it would be possible to split that let-rec into smaller ones through a topological sort. It was not deemed important enough to warrant for the complexity.

lpw25 · 2019-09-26T10:31:59Z

Is there a particular reason that this code is triggered in simplif.ml rather than just called directly from translcore.ml? Doing it from translcore.ml would allow us to ban value letrecs at the lambda level entirely, perhaps adjusting the representation of Lletrec accordingly to only support functions. It also seems like that would make any future connection between the Rec_check code and the Dissect_let_rec code slightly easier.

mshinwell · 2019-09-26T10:34:21Z

Are potential producers of Lambda code other than the frontend of the OCaml compiler going to be upset if we made such a change?

lpw25 · 2019-09-26T10:40:30Z

I doubt anyone is relying on the ability of lambda code to support value recursion, but if they were they could just use Dissect_letrec themselves in place of their existing letrecs.

chambart · 2019-09-30T10:03:42Z

I'm not sure that the transformation is doable in translcore. The rec-check allows nested let-recs that I think can't be compiled out of context. This means that we need to build them top to bottom, I think this is not really doable in translcore where we don't have the whole expression.

lpw25 · 2019-09-30T15:17:50Z

I suspect that the nested ones will still work. I think that the inner let-recs need to be "dissectable" on their own to pass the rec-check. Then the outer let-recs just look at the results of "dissecting" the inner ones when "dissecting" themselves -- rather than looking at original inner let-recs as they do now.

I would expect that to basically involve adding cases for caml_alloc_dummy and caml_alloc_dummy_float to Dissect_letrec.prepare_letrec. That would probably be best done by making those C primitives into normal primitives -- but that would be an improvement anyway.

AltGr · 2019-10-07T12:34:01Z

One issue that would make handling this bottom-up difficult, I think, is that different levels of let-recs can be interleaved. More specifically, an inner let-rec could mix mutually recursive members, and independent members that are recursive with an outer let-rec.

E.g. a good rewrite should rewrite the following

let rec f () = g ()
and g () = f ()

and

let rec f =
  let rec h () = g () in h
and g () = f ()

in the same way. Untangling top-down allows us to do that.

Now, if we need to go bottom-up, I am afraid it would be very difficult to extract the remaining inner let-recs of functions and place them back at the outer level (if needed only, i.e. if mutually recursive). If we fail to do that, the inner let-rec could still contain calls to functions from the outer let-rec, thus cannot be lifted in an alloc + update phase.

lpw25 · 2019-10-09T12:33:44Z

I don't quite see why:

let rec f =
  let rec h () = g () in h
and g () = f ()

would be difficult bottom-up. Wouldn't running dissection on:

let rec h () = g () in h

leave it unchanged, and then the code would behave the same as it does now?

AltGr · 2019-10-15T10:16:27Z

Sorry if that wasn't a perfect example. The issue is, I don't think it's true that the lambda still respects the rec-check once it has been rewritten, making it more difficult for the outer letrec to merge the expected functions with the inner letrec.

While it may seem OK at first, it is possible to make this fail on more convoluted examples:

type t = T0 | T1 of t | Tf of (unit -> t)

let rec a = T0
and b =
  let rec f () = T1 a
  and x = Tf f
  in x

The key here is that we are using a value recursive with the outer let-rec to define a function in an inner let-rec.

The proposed top-down rewrite gives the following lambda:

(let (a/84 = 0a x/87 = (caml_alloc_dummy 1))
  (letrec (f/86 (function param/88 (makeblock 0 a/84)))
    (seq (seq (caml_update_dummy x/87 (makeblock 1 f/86)) 0a)
      (makeblock 0 a/84 x/87))))

where the let-recs are flattened and handled in a single pass (where the aliasing of b and x is detected).

Now, a bottom-up rewrite would first transform the inner let-rec into:

(let (x/87 = (caml_alloc_dummy 1))
  (letrec (f/86 (function param/88 (makeblock 0 a/84)))
    (seq (caml_update_dummy x/87 (makeblock 1 f/86)) x/87)))

A let-rec of functions remains here, but entangled between the allocation and updates. When rewriting the outer let-rec, we will then pull the allocation of x in thepre, and the functions remaining in the let-rec will be merged with the outer let-rec of functions (since there is mutual recursion).

But where to put the effects ? Putting them in pre, before the functions, as is normally done, would be wrong, since the effect refer to it:

(let (a/84 = 0a x/87 = (caml_alloc_dummy 1))
  (seq (caml_update_dummy x/87 (makeblock 1 f/86))
                                            ^^^^ referred to before being defined
    (letrec (f/86 (function param/88 (makeblock 0 a/84)))
      (makeblock 0 a/84 x/87))))

But even having a special case that would push the caml_update_dummy afterwards would be wrong in general: we just need to add a non-recursive element in the inner let-rec, and dereference it before returning the value for b:

let rec a = T0
and b =
  let rec f () = T1 a
  and x = Tf f
  and y = ref "foo"
  in !y, x

Now the rewritten inner let-rec has

(seq (caml_update_dummy x/87 (makeblock 1 f/86))
     (caml_update_dummy y/88 (makemutable 0 "foo")))

When trying to define the outer let-rec, we would need to put the second caml_update_dummy before the function definitions, and the first one, after. I don't think this would be possible to handle correctly without resorting to a topological sort at this point; this would complicate things a lot, and probably affect the evaluation order.

While I certainly agree that it would be nice to be able to encode this simplification directly in the lambda type, as far as making the connection with the rec-check is concerned, I think the proposed approach makes things easier since both are top-down.

which rewrites and lifts letrecs of values, and limits the Letrec construct to recursive functions.

that no longer apply with the dissected let-recs

lpw25 · 2019-10-17T08:59:01Z

Ok, IIUC you are basically saying that the following is not valid by the let-rec check:

type t = T0 | T1 of t | Tf of (unit -> t)
type s = { mutable t : (unit -> t) option }

let rec a = T0
  and b =
    let xr = { t = None } in
    let rec f () = T1 a in
    xr.t <- Some f;
    xr

and the algorithm implemented here could not handle such things in general. Since this is essentially the form of the output of let-rec dissection, that dissection cannot be applied bottom-up.

This makes sense. However, note that the above code obeys both:

It does not inspect any variable bound by the let-rec.

and:

If the value can't be preallocated (it can be because the
size can't be statically known, or because the value is not
allocated, like integers), then the value does not
depend on any other rec-bound variables.

So I think that means there is a missing condition in the comment which was copied out above. It's probably important to make sure we get all of those conditions right because they the interface between this code and the let-rec check.

AltGr · 2019-10-17T10:08:00Z

Ah, I see your point indeed. "Dereference" (or here "inspect") is to be used in a broader sense than usual here, as defined by the rec-check: we understand <- as dereferencing both sides in this context, which leads, in the analysis, b to transitively dereference a in your example.

We should probably be more careful in the comment explaining the pre-condition.

lthls · 2019-11-28T16:08:13Z

We're relying on this (to some extent) in our flambda2 work, and we would like to see some independent review of the code.
@gasche do you think you could find some time (or some other person) to look at this code ?
It is quite complex, but the main reason why it is so complex is that the simpler approaches we tried turned out to be wrong on some corner cases (these corner cases are added as tests by this PR).
Some of this complexity could probably be removed by restricting the let-rec check further, if you feel like it.

gasche · 2019-12-01T21:23:48Z

I would like to find some time to review this, but I failed to do so since the PR was posted. The most efficient way to advance would be to meet in person to discuss it. Can you grab people interested in doing so, and send me an email to organize this?

gasche

I met @lthls, @AltGr and @chambart today and we started going through the code, this is my first batch of review comments.

gasche · 2020-01-20T14:28:51Z

lambda/dissect_letrec.ml

+     ...
+     let p = expr in
+     (* [pre]: Values that do not depend on any other from the let-rec in
+        [Rec_check.Dereference] position *)


All Rec_check-accepted definitions should satisfy the property that no recursive binding Dereference-depends on any other, so it is not clear to me what this condition enforces.

Note: the neighbor functions are not bound yet (they are the f, g below), so in fact the expr can only reference (non-function) value neighbors, and cannot reference their function neighbors.

@AltGr and @chambart say that all (most of?) the user-visible side-effects of the letrec-bindings are moved to be in this group, and the v_contents mentioned below in the comment are in fact not doing observable effects (their effectful parts have been split/lifted into the [pre] block); so maybe the ([effects]) comment below is wrong/confusing.

gasche · 2020-01-20T14:43:31Z

lambda/dissect_letrec.ml

+     (* [consts]: constants *)
+     ...
+     let v = caml_alloc_dummy n in
+     (* [blocks]: With n the staticaly known size of blocks *)


Those must never receive functions (the v_contents cannot be functions) as the starting premise of the work is to avoid having an instruction to patch function values. One way to think of it is that functions do not have a static size. (The current code does consider that functions values have a known static size, by counting free variables, but this is not robust to different closure-representation or optimization approaches.)

gasche · 2020-01-20T15:00:03Z

lambda/dissect_letrec.ml

+     let rec g x = h (x+1)
+     and h x = i x
+     in
+     fun x -> g x


Here we want to lift g and h out as toplevel bindings, so that f can be part of the letrec of functions (we don't want to backpatch functions anymore). The definition of f might dereference g or h, but a nice property holds: if f dereferences a local binding, then this local binding cannot in turn refer (mention at any mode) one of the toplevel neighbors of f, otherwise f would dereference that neighbor transitively and the definition would invalid. So:

either f dereferences the local let, which can be lifted in the pre (it does not refer to function neighbors)

or f does not dereference it and it is valid to include it in the letrec nest (if it is a function)

Note: this is why we need to lift local-lets at the toplevel (the part of the transformation which intuitively I don't like so much), to expose that f is really a function.

gasche · 2020-01-20T15:05:21Z

lambda/dissect_letrec.ml

+   However, there might be (a subset of) a local [let rec] that is indeed
+   internally recursive, but that is used by the top-level [let rec] in
+   [Dereference] positions. We carefully lift those within [pre], and handle
+   them recursively in the same pass to preserve the order of evaluation.


Why is it possible to preserve the evaluation order? The generated code shape is

let c = const in (* [consts]: constants *) ... let v = caml_alloc_dummy n in (* [blocks]: With n the staticaly known size of blocks *) ... let p = expr in (* [pre]: Values that do not depend on any other from the let-rec in [Rec_check.Dereference] position *) ... let rec f x = ... and g x = ... and ... in (* [functions]: All the functions from the let-rec *) caml_update_dummy v v_contents; (* Backpatching *) ...

The recursive-function block does not perform any observable effect. The effects must happen in either [pre] or [v_contents], the trick is to make all effects happen in [pre] to preserve the evaluation order. Intuitively, this should be possible (@chambart claims) because all effect operations impose mode Dereference on their arguments, so they cannot really refer to the recursive functions. (They can be inside expressions that do, but the effectful subexpressions can be lifeted in [pre] in an order-preserving way. They claim.)

gasche · 2020-01-20T15:23:42Z

lambda/dissect_letrec.mli

+
+type dissected =
+  | Dissected of Lambda.lambda
+  | Unchanged


The Unchanged information matters because we are doing rewriting from the root to the leaves. If the input is exactly a letrec of functions, we return Unchanged, and then the caller knows that it has to recursively rewrite the subdefinitions. If Dissected is returned, you call yourself on the transformation result. If we turned Unchanged into Dissected <input>, there would be an infinite transformation loop.

gasche · 2020-01-20T16:43:46Z

lambda/dissect_letrec.ml

+      match current_let with
+      | Some cl -> build_block cl (List.length args) Boxed_float lam letrec
+      | None -> dead_code lam letrec
+    end


The two cases would be simpler written as:

| Lprim ((Pmakeblock _ | Pmakearray _) as prim, args, ) -> begin match current_lent with | None -> ... | Some cl -> build_block ... (match prim with ...) ... end

gasche · 2020-01-20T16:45:58Z

lambda/dissect_letrec.ml

+               assert false
+           | Types.Record_float ->
+               build_block cl size Boxed_float arg letrec)
+      | None -> dead_code lam letrec


this case could be first

gasche · 2020-01-20T16:57:11Z

lambda/dissect_letrec.ml

+      else
+      let recursive_set = Ident.Set.add id recursive_set in
+      let letrec = prepare_letrec recursive_set current_let body letrec in
+      let let_def = { let_kind; value_kind; ident = id } in


let_def => current_let

gasche · 2020-01-20T17:05:15Z

lambda/dissect_letrec.ml

+             recursive variable *)
+          let pre ~tail : Lambda.lambda =
+            Llet (current_let.let_kind, current_let.value_kind,
+                  current_let.ident, lam, letrec.pre ~tail)


Note: with this definition, the call that runs last performs its effects first. This explain the slightly-odd implementation of the Lsequence(lam1, lam2) for example, that looks essentially like

let letrec = dissect lam2 in let letrec = dissect lam1 in letrec

(state-passing-style, in reverse order)

If we had written instead

let pre ~tail = letrec.pre (Llet (..., ~tail))

then we could write the calls in the order of evaluation, instead of the reverse order.

(It might be nice to try this to see if it's a big change and/or if it really makes the code nicer.)

gasche · 2020-01-20T17:10:07Z

lambda/dissect_letrec.ml

+
+         One solution would be to handle now the outer bindings, and lift the
+         inner ones into [letrec.pre], to be handled in a second pass. That
+         would change the evaluation order, though, so we instead descend


It is not clear to me why this changes the evaluation order. Generally the rest of the code assumes that any code at the root of the current lam argument can be safely (evaluation-order-preservingly) moved into pre, and here it seems to not be an issue.

Maybe the problem comes from splitting these inner-bindings in two groups (one for pre and one to add to the current workset), which imposes an evaluation order between the two groups that is not the source order?

gasche · 2020-01-23T16:37:25Z

Between the first and second round of review, @lthls found a bug in the interaction between dissection (as in this PR) and the Return usage allowed by the recursive check (rec-check). We discussed what's a good solution for this.

We decided that the best approach is to change rec-check to forbid certain patterns, instead of making dissection more complex or rec-compilation less efficient in corner cases. We have a good idea of how to restrict the rec-check in this way, but this requires a bit of work, before which the current PR should not be merged (it is unsound with respect to the current check). This means that the current PR should be considered paused, as we work on the rec-check.

For the record the bug has the following reproduction case (currently works well, but fails if dissected with this PR):

let rec f =
  let g = if true then f else f in
  fun x -> g x

Here f is a static function, but the dissection would try to lift g in the [pre] set, which is wrong as it depends on f. For simpler forms of local definitions, recursive dissection and the alias mechanism of the PR would work, but this one (which coincides with the notion of "Dynamic"-size expression) makes it fail.

chambart and others added 15 commits October 15, 2019 14:58

Add the dissect_letrec module in lambda

6ffcfd2

which rewrites and lifts letrecs of values, and limits the Letrec construct to recursive functions.

Fix classes

c2cd21f

Typos

ff625de

Dissect_letrec is now working

6e03ee5

Restore letrec evaluation order

bca2ceb

Fixed nested letrec handling

6c52d35

A bit of cleanup & tests

b5b0d3f

dissect letrec: fix final assert check

df72def

Dissect cleanup

ee80cfd

Field renaming

632b639

Disabled parts of the 'is_static_flambda' test

6b9b936

that no longer apply with the dissected let-recs

Style fixes and Changes entry

c7f6a47

Remove temporary debug code

bed2508

Fixed dead code

fde9738

Add some more torture tests for dissect-letrec

3b34fd4

AltGr force-pushed the dissect-letrec branch from 9e0fe4b to 3b34fd4 Compare October 15, 2019 13:23

lpw25 mentioned this pull request Oct 17, 2019

Miscompilation of recursive value #9048

Closed

AltGr mentioned this pull request Oct 17, 2019

some accepted recursive values exhibit erroneous behavior with unexpected shallow copies #7768

Closed

gasche reviewed Jan 20, 2020

View reviewed changes

lthls mentioned this pull request Jan 24, 2020

Fix soundness bug in Rec_check with anonymous let module #9261

Merged

lthls mentioned this pull request May 7, 2020

Bad interraction between let-rec and no-naked-pointers #7718

Closed

stedolan mentioned this pull request May 18, 2020

Avoid creating ill-formed blocks in Cmm letrec #9577

Merged

lthls mentioned this pull request Mar 1, 2023

Correct size for recursive values with branching #12059

Merged

lthls mentioned this pull request Apr 1, 2023

Bytecode miscompilation of recursive definition of empty first class module #12153

Closed

lthls mentioned this pull request Sep 14, 2023

Ensure that Static recursive bindings always go through pre-allocation lthls/ocaml#8

Open

lthls mentioned this pull request Sep 26, 2023

Add Constant and Class classifications for recursive bindings #12608

Merged

Conversation

AltGr commented Sep 20, 2019

Uh oh!

gasche commented Sep 21, 2019

Uh oh!

gasche commented Sep 21, 2019

Uh oh!

AltGr commented Sep 23, 2019

Uh oh!

chambart commented Sep 23, 2019

Uh oh!

lpw25 commented Sep 26, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

mshinwell commented Sep 26, 2019

Uh oh!

lpw25 commented Sep 26, 2019

Uh oh!

chambart commented Sep 30, 2019

Uh oh!

lpw25 commented Sep 30, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

AltGr commented Oct 7, 2019

Uh oh!

lpw25 commented Oct 9, 2019

Uh oh!

AltGr commented Oct 15, 2019

Uh oh!

lpw25 commented Oct 17, 2019

Uh oh!

AltGr commented Oct 17, 2019

Uh oh!

lthls commented Nov 28, 2019

Uh oh!

gasche commented Dec 1, 2019

Uh oh!

gasche left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

gasche commented Jan 23, 2020

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

6 participants

lpw25 commented Sep 26, 2019 •

edited

Loading

lpw25 commented Sep 30, 2019 •

edited

Loading