Correct size for recursive values with branching#12059
Conversation
gasche
left a comment
There was a problem hiding this comment.
After having just a quick look: I agree that this is a good approach to fix the specific bug reported.
However the current state remains fairly fragile, with a later pass making assumptions about an earlier pass without checking them, and a potential soundness bug each time we get them wrong. This remains unsatisfying.
bytecomp/bytegen.ml
Outdated
| | None -> RHS_unreachable | ||
| | Some act -> size_of_lambda env act) | ||
| cases | ||
| | _ -> RHS_nonrec |
There was a problem hiding this comment.
should we maybe consider a non-fragile pattern here to see what we are missing out?
There was a problem hiding this comment.
I agree with you, but it takes a lot of time to understand if any given expression is safe or not so I thought submitting a PR with straightforward fixes for a known, reported bug should take precedence.
You can find some hints of why things are complicated in the very long PR description.
|
A very naive idea -- recursive values are completely out of my memory so this is probably silly. Could we solve the problem of mismatch between the two passes by performing a "partial dissection" (going in the direction of the dissect let-rec work) at Rec_check time, by splitting the non-recursive Dynamic definitions out of the recursive nest? Then in the backends when we see a recursive nest we know that they must all be handled as Static, and fails with a compile-time error if it looks wrong. |
It's not silly, I think it is a good idea. It doesn't solve the tricky issues but it's one less thing to worry about. |
|
@lthls I looked at this again today and believe that the approach is reasonable to fix the specific bug of "hidden exceptions" with the current design of separate frontend and backend analyses of recursive value definitions. You mentioned however that there is work in progress to send more information from the frontend analysis to the backend. If I understand correct, that work would obviate the need for the current PR, giving a system that is both simpler overall and more robust. Do you think we should proceed with the current PR, as a first step for the time being, or close and wait for the "store information about the frontend analysis" work to be proposed for upstreaming? |
I would prefer going forward with this PR first. Let me find a bit of time to write the Cmmgen part of the patch, and I'll un-draft the PR. |
8c556b7 to
ae85d78
Compare
|
I've updated this PR. It now contains both the bytecode and native code versions of the patch. I can't be sure that there are no more bugs like this (there are still no checks that I don't think I mentioned it before, but I'll also note that this PR might produce worse code in some cases: when a non-recursive definition is marked as recursive, it's better to compile it as a non-recursive binding, but currently it only happens for definitions that have no obvious size. Since this PR allows to compute sizes in more cases, we will end up using the less efficient scheme in more cases. |
Changes
Outdated
| design by Gabriel Scherer, Xavier Leroy) | ||
|
|
||
| - #12032, #12059: Bug fixes related to compilation of recursive definitions | ||
| (Vincent Laviron, report by Github user @lupjo, review by Gabriel Scherer) |
There was a problem hiding this comment.
Minor: lupjo advertises Louis Noizet as a full name, which would be more in style with the Changelog.
There was a problem hiding this comment.
Ah, I didn't see that. I'll update the entry.
| have treated it as Static. We rely on the fact that the compilation | ||
| of non-recursive values (RHS_nonrec) already handles that case | ||
| correctly, and return RHS_nonrec. *) | ||
| RHS_nonrec |
There was a problem hiding this comment.
This is a different bugfix, right? I would rather have it in a separate commit with its own testsuite change.
There was a problem hiding this comment.
It's not a bugfix. It's the same behaviour as before the PR, but I think this case deserved its own comment, so I split it out.
| | Llet(_str, _k, id, arg, body) -> | ||
| size_of_lambda (Ident.add id (size_of_lambda env arg) env) body | ||
| | Lmutlet (_kind, _id, _def, body) -> | ||
| size_of_lambda env body |
| comp_init (add_var id (sz+1) new_env) (sz+1) rem | ||
| | (id, _exp, RHS_unreachable) :: rem -> | ||
| Kconst(Const_base(Const_int 0)) :: Kpush :: | ||
| comp_init (add_var id (sz+1) new_env) (sz+1) rem |
There was a problem hiding this comment.
If this is exactly the same as the RHS_nonrec case above, should we use an or-pattern?
bytecomp/bytegen.ml
Outdated
| | (_id, exp, RHS_unreachable) | ||
| :: rem -> | ||
| comp_expr stack_info new_env exp sz | ||
| (discard_dead_code (comp_rec new_env sz (i-1) rem)) |
There was a problem hiding this comment.
In the native backend you use exactly the same code generation for RHS_nonrec and RHS_unreachable toplevel definitions. Here in bytecode you handle them closer to the recursive definitions. Which approach is the right one?
I wonder if handling unreachable cases (really: aborting, non-returning) like nonrec cases could create issues of scoping in generated code for examples like let rec foo = (ignore bar; assert false) and bar = ().
There was a problem hiding this comment.
I can't remember why I didn't just handle RHS_unreachable as RHS_nonrec when I wrote the bytecode patch. It does look safer that way, and I certainly don't want to introduce useless optimizations for these cases.
There was a problem hiding this comment.
I am not completely sure that it is okay to always treat _unreachable as a non-recursive binding: morally I could see someone defining a binding that first ignores a mutually-recursive name and then raises/aborts. But I cannot get any such example to work, because all constructions that only raise/abort (including matching on an un-inhabited GADT with only a refutation clause) seem to be marked Dynamic in the rec_check, and therefore reject any use of a recursive name.
Given this observation, it does seem reasonable to assume that all _unreachable toplevel values are non-recursive.
There was a problem hiding this comment.
Doesn't let rec x = let _ = x in let 0 = 1 in 0 match your definition ? It always raises, uses its recursive binding, and is not rejected by Rec_check.
But anyway, RHS_nonrec already contains a lot of things that were considered Static by Rec_check, so it actually has to handle those cases more or less correctly.
This PR is a proposal for fixing issue #12032.
I've started with the bytecode version only, for design discussions, but will update the native version when we have agreed on the design.
Overview of the issue
Recursive definitions are quite tricky in OCaml, because while there are obviously definitions that make no sense and are rejected, and definitions that definitively make sense and should be allowed, there are also a lot of weird corner cases that do not make much sense but are allowed anyway (and also some definitions that could be allowed but aren't, but this isn't going to be relevant in this discussion).
Basic compilation scheme
The basic strategy for compiling recursive values is to compute the amount of memory that each of the recursive values will occupy, then have the compiler generate code that pre-allocates memory for each recursive binding (using
caml_alloc_dummy*functions), computes the actual values using the pre-allocated values as placeholders, and finally copies the resulting values back into their expected slot.This strategy is only possible if the computation of the actual values doesn't actually inspect the contents of the values being defined (which are not type-sound yet at this point), so the
Rec_checkmodule implements an algorithm that rejects definitions that do not meet this criterion.Non-recursive/Dynamic bindings
However, OCaml also supports recursive definitions where one or several of the bindings cannot be pre-allocated, on the condition that those bindings are not actually recursive.
For example,
let rec x = if cond then 0 :: x else []is forbidden butlet rec x = if cond then [0] else []is allowed.To handle this,
Rec_checkalso classifies recursive bindings as eitherStaticorDynamic.Staticbindings are the ones where we know that we can compute the size at compile time, whileDynamicones are the others (we don't know if we will be able to compute the size or not).Staticbindings are expected to go through the compilation scheme described above, so they are allowed as long as they do not inspect the content of a recursively-bound variable during initialisation.Dynamicbindings, on the other hand, are not allowed to refer to any other recursive binding (with some exceptions, that are not relevant here). This means that the compiler is allowed to first generate regular let-bindings for these dynamic bindings, then generate the code for the static bindings (pre-allocation, computation, copy).The bug: mismatch between
Rec_checkand the actual compiling codeRec_checkonly returns a boolean value (more or less). In particular, the static/dynamic status is never stored anywhere. So when doing the actual compilation, the compilers have to redo this computation (the actual size needs to be computed too at this point). This corresponds toBytegen.size_of_lambdafor the bytecode compiler, andCmmgen.expr_sizefor the native compiler.But these computations are not done on the same expressions as the original ones: some have been expanded (typically by the pattern-matching compiler), others may have been removed or simplified by an optimisation.
In one direction, this is not problematic. If a
Dynamicbinding has been simplified so that it now looks like aStaticbinding, then it's fine; the code generated will be slightly different (and less efficient), but it's sound.Problems start appearing when
Rec_checkdecides that a binding isStatic, but the compiler can't compute a size and assumes it must be have beenDynamic. In that case, the compiler assumes that this dynamic binding can be evaluated prior to the actual recursive bindings, butRec_checkhas allowed it to refer to recursive variables (including itself) that are not being preallocated. This won't lead to an error at compile time, because for other reasons the compiler inserts a dummy binding anyway even for dynamic bindings, but at runtime this dummy binding can end up leaking to the user, and since it is not type-sound bad things happen.The solution(s)
This PR: handle non-returning branches
The example from #12032 is one of a number of cases where
Rec_checkdoesn't see any branching construction, but some non-returning branches are inserted later and the size computations assume that it must be dynamic.This PR introduces a new constructor
RHS_unreachablein the size computations, that is used for non-returning branches. Branching instructions then compute the sizes of all their branches, and returnRHS_nonrecif there are more than one returning branch (assuming thatRec_checkwould have attributed aDynamicmode), or the mode of the size of the only returning branch if there is one.Possible extension: branches with the same size
This example currently fails to compile:
let rec x = if cond then 0 :: x else 1 :: x. Given that all branches return a value of the same size, it would be possible to compile it using the current scheme. In the size computations, it would be relatively straightforward support (on top of this PR, it's just a small modification to thejoin_rhs_kindfunction).However, for this to be actually useful we would need
Rec_checkto be able to compare sizes, which adds again more complexity to the algorithm (for some constructions like functions, we cannot even compute the size inRec_check).Constants: static or dynamic ?
There is one other known case where
Rec_checkwill claimStaticmode but the compiler will assume a dynamic definition: constants. This can take several forms, from basic integer or numeric constants, to arbitrary combinations of constructors (like(Some []) :: None :: (Some (1 :: 2 :: 3))). This also includes unit-returning expressions like loops and assignments.For allocated constants the compiler could compute the size and pre-allocate, but since by definition constants cannot refer to other variables it is safe to compile them before (note that things like
let rec x = let _ = Some x in 42are still allowed, so we still need dummy bindings).These cases are also causing trouble with our work on alternative let-rec compilation (#8956), so @chambart is looking at some better ways to deal with that.
Storing and propagating the result of
Rec_checkFinally, I've also considered whether recursive bindings should store their mode (static or dynamic) so that the compilers know whether it is safe to use
RHS_nonrecor if it must be pre-allocated. As discussed above, this needs some care in the case of constants (we cannot pre-allocate them, but we can pre-define them). Apart from that I think it would be doable, but looks like a lot of tedious work so I won't do it unless we agree in advance that it's the right solution.