non-exhaustive fix for the Pattern Matching Bug, take two#12556
non-exhaustive fix for the Pattern Matching Bug, take two#12556gasche wants to merge 8 commits intoocaml:trunkfrom
Conversation
…ompilation strategy
|
As I explain in #12555, these two PRs are alternative approaches to fix the same bug, and I prefer #12555. (I was not planning to submit this one but @smuenzel expressed interest.) In particular, my intuition is that #12555 could be merged without specific benchmarking, because it does not change the generated code in the most common cases. This one changes the generated code for any usage of records with mutable fields, in a way that could be beneficial, neglectible or negative. I don't have the energy right now to run an experimental evaluation of this. I would personally be in favor of going with #12555, and then fixing the "partiality information" issue (my fix for that is more complex than the two we are currently discussing, and will require more review time and energy) to fix #7241 for good. Then, if there is still appetite for fine-grained discussions of the code generation strategy for records with mutable fields, we can revisit the present PR and merge it later. |
|
In #12555 (comment), @smuenzel expressed himself in favor of the present approach ("strict bindings" rather than "context erasure"):
A note on the second paragraph: I like the idea of the "snapshot" semantics, as discussed earlier (in another monster comment from last June, #7241 (comment) ) -- the idea is that the pattern-matching code behaves as this the value was read in full at some point, and no part were re-evaluated after that. But it is important to note that the present PR only gets us closer to this snapshot semantics in some cases, and does not provide it in general. In particular, consider the two programs below, the first one will behave according to the "snapshot semantics", but the second will not -- it is currently unsound, but even with that other bug fixed it will raise Match_failure. match x with
| { a = false } -> 0
| { a = true; b = None } -> 1
| { a = _; b = _ } when guard () -> 2
| { a = _; b = Some n } -> n
match x with
| { a = false } -> 0
| { a = true; b = None } -> 1
| _ when guard () -> 2
| { a = _; b = Some n } -> nThe difference between the two programs is that, in the first case, all accesses to In the general case my intuition is that we probably do not want to provide the snapshot semantics. The rule of the game of pattern-matching compilation is to avoid code-size explosion -- if implemented naively, it can be exponential in the source code. The "matrix-based" algorithm used by the OCaml compiler is designed on the idea that we avoid code explosion by introducing code sharing at the expense of redundant reads. We could decide to try to share code for immutable reads and accept code blowups for mutable reads, but (1) I have no idea how to implement this and it could be very very invasive, and (2) this means that we would behave terribly on complex clauses with only mutable fields. Given that I don't think we can provide the snapshot semantics in general, I don't think that providing it in particular cases is an important goal. In fact, I would rather punish side-effects during matching with a |
|
I looked at the two PRs #12555 and #12556 again. They conflict and fix the same issue, and I still think that #12555 (which erases more context information) is the better choice -- if only because it is simpler to implement. I am closing this one (which binds mutable fields earlier) to help move forward. Feel free to keep discussing and/or ask to reopen if you think differently. |
This PR implements the "alternative approach of binding mutable fields" earlier to fix the "incorrect context information" half of the Pattern Matching Bug #7241. This approach was jointly designed with @trefis. See the detailed description of the situation in #12555, which implements another fix for the same issue which I prefer.
This PR is larger than #12555, it contains the following:
lower_bindlogic from the pattern-matching compiler. (I explicitly decided against being more aggressive than the current optimization.)The testsuite is updated at each commit, and looking at its diff for a given commit can be a good way to see the impact of code-generation changes in the commit.