tentative fix for letrec check error (MPR#7706)#1565
Conversation
|
I think this approach is probably too restrictive. It's definitely a regression compared to OCaml before the new check was added. I think that a simple environment mapping local bindings to |
|
@lpw25 I wasn't sure whether that was more restrictive than the previous check, but the testsuite agrees with you. The following were accepted and fail with the restriction: (* letrec/allowed.ml *)
let rec x = let y = (x; ()) in y;;
(* letrec/nested.ml *)
let rec r = (let rec x = `A r and y = fun () -> x in y)Clearly for both of these I need to keep a environment to know, for each local binder, whether it was On the other hand, I'm not sure I understand the suggestion to consider non-local identifiers as |
I can see two ways to reason about this. One is that the condition is really that the size of the resulting value needs to be known before executing the recursive binding -- which holds for a non-local identifier since the size can be read from the header. Another way of looking at it is that the reason we need to know the size is so that we can pre-allocate the value and then copy the fields of the result of the binding into this pre-allocated value -- for a non-local identifier we can skip this pre-allocation and just use the existing value itself. |
Thinking about this more, the answer is a bit different: the "size" of an old binding is always 1. Indeed, all OCaml values have size 1; the only reason why we may have non-1 size is that the values currently being recursively defined share a single block, so we are talking about their in-block layout. Old references may be referenced but are never inlined within the block (that would make no sense), so their size is 1. |
|
I think what you're saying corresponds to my second argument. Although this:
seems slightly wrong to me: we must talk about their in-block layout because they need to be pre-allocated. They need to be pre-allocated so that their actual value -- the address of a block -- is available before executing the recursive binding, even though the contents of this block are not yet ready for inspection. |
|
I also have a slightly more subtle example that produces incorrect results: # let r = ref None;;
val r : '_weak1 option ref = {contents = None}
# let rec x =
let s = { contents = 5 } in
r.contents <- Some s;
let _ = x in
s;;
val x : int ref = {contents = 5}
# match !r with
| None -> assert false
| Some s -> s == x;;
- : bool = falseHere the problem is that |
|
I think that the result is incorrect not because of mutable variables escaping, but because of the compilation scheme that is wrong -- because it considers that the static size of |
I don't think the compilation scheme is wrong, it's just not complete. Not every possible recursive definition can be compiled using it, but doing better seems like a lot of effort. Note that the compilation scheme always allocates recursive definitions twice -- but that doesn't matter if the value is immutable or cannot escape. |
|
Right, I misunderstood above, the compilation is as expected. |
|
Ok, I pushed an implementation that carries a local environment. I believe that all tests should now pass correctly -- in particular, the dangerous example is rejected. I keep defaulting to @lpw25, I spent some time trying to find an example of a program that may be unfairly rejected by considering non-local identifiers as Unless I am missing something, guarded/delayed uses that are ignored can always be erased, so all such bodies are equivalent to just |
|
@lpw25, @garrigue, @yallop: I would be interested if one of you had a look at the proposed implementation, for which some details may still need to be fleshed out. Does the approach look correct and would you be willing to merge a fleshed-out version of the fix? I left three |
|
Thanks for looking at this, @gasche! I'll try to review later this week. |
|
This fix looks sensible to me. I'm looking forward to the day when the information produced from this analysis is passed through and used in compilation, rather than simply attempting a sound but unconnected approximation of the compilation scheme that is immediately discarded.
Does leaving module structures untraversed cause any concrete problems?
It would be useful to distinguish "There is opportunity for refinement/improvement here" and "This is somehow wrong and should be changed". |
| | Texp_letexception (_, e) -> classify_expression e | ||
| | Texp_ident _ | ||
| | Texp_letexception (_, e) -> | ||
| classify_expression env e |
There was a problem hiding this comment.
There are several changes here that are purely about style (indentation, etc.). Perhaps they could be put into a separate commit.
f4d78e3 to
b54e47a
Compare
|
I went back to this PR and made the suggested changes. I looked into dealing with module-level declarations, and gave up, because it's a lot of work and no motivated by an actual need so far. I removed the scary-looking FIXME wordings, and split the commit in two as suggested. As far as I am concerned, and barring any CI surprise, this should be good to go. |
|
The CI fails and I have no idea why: I cannot reproduce the issue on my machine. A CI log is available at https://api.travis-ci.org/v3/job/346054625/log.txt. Excerpt: |
|
@gasche you added a "pr7706.ml" entry to an |
|
That's certainly the issue, thanks. I rebased the PR and will look into printing an error message from ocamltest to help debuggability next time. |
|
The problem was showing in the log but I missed it: |
|
Well as far as I can tell, ocamltest does print messages both when no
file is given and when the file that is given does not exist.
Perhaps the message is lost when ocamltest is run by the makefiles. I
quickly checked but I'm not sure what's wrong.
One thing that may help would be to:
- Make sure ocamltest's stderr is properly redirected to a file
- If it fails, cat this file to stdout so that it is part of the build
log made available by CI
|
|
I proposed a PR fixing the user-interface issue in #1634. |
|
Would someone be available to do a formal review of this PR? ( I'm adding @chambart to the conversation, as he worked on recursive values recently, but don't expect him to actually have time. ) |
|
@lpw25 would you be able to review this PR? (I just rebased it.) |
lpw25
left a comment
There was a problem hiding this comment.
Looks good to me. It might be good to add my nastier example from an earlier comment as a test.
|
Actually, whilst I approve this fix for MPR#7706 I think my other example is still going to be wrong with it. |
|
The CI was broken (due to unrellated change), so I just rebased the PR again. @lpw25: could you open a new MPR to track this issue? I don't see a nice way to rule this example out without forbidding things that look like they should be allowed, such as let rec thunk = ref (fun () ->
let v = f () in
thunk := (fun () -> v);
v)Besides, if I understand correctly your example is in the category of "there is a shallow copy of the data that I didn't expect", which is different from an unsound generated code -- and I don't think can break type-soundness. |
The typechecker-level check for recursive value depends on whether recursive values use a memory size that is statically known or depends on their dynamic evaluation. An error in this check results in a potential segfault. The error is that the check currently considers all variables/identifiers to have a statically-known size. This is certainly wrong for locally-defined identifiers, that may be bound to dynamically-sized expressions. We rule this case out by carrying an environment during size-checking, that remembers the size (static or dynamic) of local bindings. The implementation is incomplete in certain ways, but safely defaults to Dynamic as the size of bindings that it does not track through the environment.
|
I think the fix is quite easy:
type sd = Static of Asttypes.mutable_flag | Dynamicmarking the static constructs that create mutable things as mutable. |
|
(You should feel free to merge this anyway as is, since it is a clear improvement) |
|
I'm not sure that your proposal is correct: I find it unpalatable to handle different kind of subexpressions differently. Maybe there is a wider design that does some more systematic escape analysis that you are proposing a sound sub-analysis of, but maybe this is just a patch for this one example and there are other broken examples of this nature that you are not preventing. |
|
It just relies on the fact that expressions which are not let bound are used linearly, which the current check is also relying on. |
|
(Although, my understanding is that Pierre is planning to change all of this |
It's easy for me to play the devil advocate because I don't have the details of the checks in memory. Is this true? What about applications |
|
I submitted a bugreport for Leo's issue at MPR#7768. |
Applications are always |
|
Re. your comment in the mantis issue:
I do agree this is getting somewhat out of hand. Probably we should just have a simpler more restrictive set of rules for "static" and break backwards compatibility. |
* Start of implicit-source-positions Expand [%src_pos] to hard-coded record (ocaml#1565) * Update codeowners * Revert "Update codeowners" This reverts commit 313cb2d1415a34971ea6e3cbd02d008e255ca888. * Start * backup * Hacky checkpoint with hard-coded record * Rename test * Shadowing works as intended * Simpler test case * Fix label layouts * Change name to lexing_position, fix record layout, add to shadowing test * Add to test * Formatting * Tests, fix predef lexing_position layout, add CR for untypeast shadowing * Fix test * Fix untracked file * Revert "Fix untracked file" This reverts commit 2a18a6aef66322e4e897aef29ace14eda25b7785. * reformat * Bootstrap, with predef lexing_position --------- Co-authored-by: Richard Eisenberg <reisenberg@janestreet.com> Separate `arg_label`s for Parse and Typed trees (ocaml#1589) * src_pos in fn decls and defns * Revert is_optional modifications for now * src_pos shouldn't work in arbitrary places * Tidy code * Whitespace * Add test * Comment on documentation * Whitespace * more whitespace * Amend test to use Position * Correctly print [%src_pos] rather than lexing_position and fix corresponding test * Store hard-coded string in variable * Separate parsed and typed labels * Delete fn_decl_and_defn.ml * Delete invalid_usages.ml * Minimize diffs in PR * Minimize diffs in PR * Update named_function.ml for less diffs * Clean up * Put arg_label in Outcome tree * Revert "Put arg_label in Outcome tree" This reverts commit cd5ace530c602975bac5f259e7d078a52dfabdaa. * is_optional for Parsetree arg_labels * Restore Asttypes.arg_label, re-export arg_label from Types * Document distinction between arg_labels * Fix whitespace * Add todo comment Add `arg_label` to `Otyp_arrow` (ocaml#1644) * Put arg_label in Outcome tree * Add comment for Outcometree.arg_label `[%src_pos]` in fn decls and defns (ocaml#1579) * src_pos in fn decls and defns * Revert is_optional modifications for now * src_pos shouldn't work in arbitrary places * Tidy code * Whitespace * Add test * Comment on documentation * Whitespace * more whitespace * Amend test to use Position * Correctly print [%src_pos] rather than lexing_position and fix corresponding test * Store hard-coded string in variable * Hacky outcometree printing without creating nontrivial node * Update documentation for Position labels * Update tests * Fix erroneously printing lexing_position for Position arguments * Clean up code * Reconstruct constraint for Position arguments in Untypeast * Update tests * Move comments around * Slightly prettify * Use Location.none in Untypeast reconstructed extensions * Add Ttyp_src_pos to Typedtree * Add CR * Comments * Clean up code * Update comment in printtyp.ml * Add documentation for Ttyp_src_pos, remove extraneous comments * *synonyms.ml * src_pos in fn decls and defns * Revert is_optional modifications for now * src_pos shouldn't work in arbitrary places * Tidy code * Whitespace * Add test * Comment on documentation * Whitespace * Correctly print [%src_pos] rather than lexing_position and fix corresponding test * Store hard-coded string in variable * Hacky outcometree printing without creating nontrivial node * Update documentation for Position labels * Update tests * Fix erroneously printing lexing_position for Position arguments * Clean up code * Reconstruct constraint for Position arguments in Untypeast * Update tests * Move comments around * Slightly prettify * Use Location.none in Untypeast reconstructed extensions * Add Ttyp_src_pos to Typedtree * Add CR * Comments * Clean up code * Add documentation for Ttyp_src_pos, remove extraneous comments * *synonyms.ml * Merge cleanup * Add label to Octy_arrow * Rename function * transl_label checks for Position * Pass None to transl_label for classes * Add comment * Delete comment * Consider Position when approximating type * Add tests for application, recursion * Error messages mention Position * Prettify * Comments * Rename function * Add comment * Remove extraneous calls to label translation * Test type-based disambiguation * Add comment for fn app labels * Add commuting tests * Remove duplicated src_pos match logic * Reduce instances of src_pos string * src_pos in fn decls and defns * src_pos shouldn't work in arbitrary places * Whitespace * Correctly print [%src_pos] rather than lexing_position and fix corresponding test * Store hard-coded string in variable * Hacky outcometree printing without creating nontrivial node * Clean up code * Slightly prettify * Add CR * Clean up code * Update comment in printtyp.ml * Merge cleanup * Add label to Octy_arrow * Rename function * transl_label checks for Position * Pass None to transl_label for classes * Add comment * Delete comment * Consider Position when approximating type * Add tests for application, recursion * Error messages mention Position * Prettify * Comments * Rename function * Add comment * Remove extraneous calls to label translation * Test type-based disambiguation * Add comment for fn app labels * Add commuting tests * Remove duplicated src_pos match logic * Reduce instances of src_pos string * Make things is_optional again * Add commuting tests * Add transl_label_from_pat * Whitespace * Parenthesize src_pos in error message * Fix test Create `Lexing.position` values when `[%src_pos]` in expressions (ocaml#1661) * Everything * checkpoint Construct lambda for src_pos directly * Revert now unneeded changes to predef * Clean up comments, whitespace Implicitly supply source position arguments (ocaml#1671) * Everything * Apply position arguments when expected type is nothing, refactor creating src_pos exprs * Add warning instead of modifying existing one * Fix test * Move test * Resolved comments Clearer classic mode label equivalence checks (ocaml#1692) * Everything * Apply position arguments when expected type is nothing, refactor creating src_pos exprs * Add warning instead of modifying existing one * Fix test * Move test * Resolved comments * Classic mode equivalence check * Refactor * Implicit source position merge conflicts (ocaml#2275) * Implicit Source Positions Conflict resolution This feature solely fixes merge conflicts that the implicit source positions project had after not having been rebased since the summer. Sadly, I mistakenly committed a new test! I meant to do it on a different branch, but I accidentally committed some more changes after that making it trickier to split... Please let me know if I should only keep this feature do "merge conflict resolution". Thanks! Testing ------- - I ran `make test`, and it passes. - `make install` works, I migrated bonsai to start using this and it works! - Added test on let operators. - Sadly it doesn't quite work, but I think it would be cool if it did work as it would allow codebases that use let+ and let* to get source code locations! I am unsure about its sound-ness though and left CR src_pos'es/question CR's. Please let me know if there is other additional testing/ci I should perform. Thanks! I also performed additional cleanup in this feature from self-questions I had during the merge. Please let me know if there are any changes I should perform. Thanks! * More cleanup after self-review * Made CI on Github pass - CI seems to run `make ci` instead of just `make test`. I've fixed more `make ci` changes, although not all, pushing to let ci run in case the current "compiler-libs.common" being missing is due to a misconfiguration on my environment/if it also fails on the github ci being missing is due to a misconfiguration on my environment/if it also fails on the github ci. - Performed a bootstrap - Fixed chamelon - Additionally turned my questions on let operator support into CR src_pos: * Removed empty file added accidentally * Updated let_operator support comment * Widened question on chamelon compatibility * Implicit source positions object system support (ocaml#2307) * Implicit Source Positions Conflict resolution This feature solely fixes merge conflicts that the implicit source positions project had after not having been rebased since the summer. Sadly, I mistakenly committed a new test! I meant to do it on a different branch, but I accidentally committed some more changes after that making it trickier to split... Please let me know if I should only keep this feature do "merge conflict resolution". Thanks! Testing ------- - I ran `make test`, and it passes. - `make install` works, I migrated bonsai to start using this and it works! - Added test on let operators. - Sadly it doesn't quite work, but I think it would be cool if it did work as it would allow codebases that use let+ and let* to get source code locations! I am unsure about its sound-ness though and left CR src_pos'es/question CR's. Please let me know if there is other additional testing/ci I should perform. Thanks! I also performed additional cleanup in this feature from self-questions I had during the merge. Please let me know if there are any changes I should perform. Thanks! * Moves changes from original class-type support branch into a rebased branch * Removed lingering merge conflict markers * More tests + manually moved a commit from the original class type branch - For some reason I originally missed a commit that typed the argument on classes from the original branch. This feature also grabs the tests. Some of my questions revolving `Principal` are no longer needed as they seem to have disappeared! I suspect that `make ci` now passing in the parent feature is partly/transitively responsible somehow for `Principal` now no longer showing up. * Fixed incorrect merging of invalid_usages.ml * Removes weird whitespace observed after self-review * More tests! Found out that application on an inheritance call is unhandled! * Explicit passing positional argument in a pcl_apply works, erasure still does not. * Added Pcl_apply support * More cleanup + removed a question cr from the parent feature. * More tests. Found another bug! Class type arrows seeem to be left untranslated... * Fixed type annotation bug + added more tests * Removed duplicated test * Added more tests + fixed weird whitespace * Added question on the two class system environments (val_env vs. met_env) * Deduplicated more code after self-review * minor whitespace change * Removed resolved question CR and addressed new CRs * Implicit source positions directory locations (ocaml#2346) * Implicit Source Positions Conflict resolution This feature solely fixes merge conflicts that the implicit source positions project had after not having been rebased since the summer. Sadly, I mistakenly committed a new test! I meant to do it on a different branch, but I accidentally committed some more changes after that making it trickier to split... Please let me know if I should only keep this feature do "merge conflict resolution". Thanks! Testing ------- - I ran `make test`, and it passes. - `make install` works, I migrated bonsai to start using this and it works! - Added test on let operators. - Sadly it doesn't quite work, but I think it would be cool if it did work as it would allow codebases that use let+ and let* to get source code locations! I am unsure about its sound-ness though and left CR src_pos'es/question CR's. Please let me know if there is other additional testing/ci I should perform. Thanks! I also performed additional cleanup in this feature from self-questions I had during the merge. Please let me know if there are any changes I should perform. Thanks! * More cleanup after self-review * Made CI on Github pass - CI seems to run `make ci` instead of just `make test`. I've fixed more `make ci` changes, although not all, pushing to let ci run in case the current "compiler-libs.common" being missing is due to a misconfiguration on my environment/if it also fails on the github ci being missing is due to a misconfiguration on my environment/if it also fails on the github ci. - Performed a bootstrap - Fixed chamelon - Additionally turned my questions on let operator support into CR src_pos: * Removed empty file added accidentally * Updated let_operator support comment * Widened question on chamelon compatibility * Moves changes from original class-type support branch into a rebased branch * Created branch with working changes for directory positions. * Added a test sanity checking being able to pass in the flag. * Changed test to properly test that the right basename is supplied * Removed merge conflict markers upon self review. * Removed lingering merge artifacts upon self review * Renamed flag from -dir to -directory * Fix typo * Removed directory flag from ocamldoc options * [Implicit Source Positions] - Better Error Messages (ocaml#2364) * Rename src_pos -> call_pos Also left some self notes regarding the remaining CR src_pos * Improved error messages for %call_pos * Addressed a CR src_pos and removed an already addressed CR src_pos * Cleanup after self-review * Updated missing rename after self-review * Removed addressed c r questions * Fixed merge conflicts. `make ci` passes locally * bootstrap --------- Co-authored-by: jose r <45022810+Enoumy@users.noreply.github.com> Co-authored-by: enoumy <enoumy@gmail.com>
See MPR#7706, this definition is unsafe and must be rejected (but is currently accepted):
The typechecker-level check for recursive value depends on whether
recursive values use a memory size that is statically known or depends
on their dynamic evaluation. An error in this check results in
a potential segfault.
The error is that the check currently considers variables/identifiable
to have a statically-known size. This is certainly wrong for
locally-defined identifiers (that may be bound to
dynamically-sized expressions), but it is not even clear why that
would be desirable for other expressions of the nested letrec, or for
variables from the context.
This patch considers all identifiers to have dynamic size.