non-exhaustive fix for the Pattern Matching Bug, take two by gasche · Pull Request #12556 · ocaml/ocaml

gasche · 2023-09-14T09:04:35Z

This PR implements the "alternative approach of binding mutable fields" earlier to fix the "incorrect context information" half of the Pattern Matching Bug #7241. This approach was jointly designed with @trefis. See the detailed description of the situation in #12555, which implements another fix for the same issue which I prefer.

This PR is larger than #12555, it contains the following:

a new testsuite file specifically designed to study the impact on generated code of this change
the bugfix itself, which works by special-casing intermediate arguments that have "let strictness" Strict or StrictOpt
refactorings to the pattern-matching compiler to remove some other sources of Strict and StrictOpt bindings, to clarify and simplify the codebase ; without this simplification, I personally find hard to convince myself that "the bugfix itself" actually fixes all instance of the issue.
finally, an optimization to push the "strict bindings" inside the branches in some cases where detect that it can avoid useless reads ; this reuses existing lower_bind logic from the pattern-matching compiler. (I explicitly decided against being more aggressive than the current optimization.)

The testsuite is updated at each commit, and looking at its diff for a given commit can be a good way to see the impact of code-generation changes in the commit.

…ompilation strategy

…gerly

gasche · 2023-09-14T09:09:12Z

As I explain in #12555, these two PRs are alternative approaches to fix the same bug, and I prefer #12555. (I was not planning to submit this one but @smuenzel expressed interest.)

In particular, my intuition is that #12555 could be merged without specific benchmarking, because it does not change the generated code in the most common cases. This one changes the generated code for any usage of records with mutable fields, in a way that could be beneficial, neglectible or negative. I don't have the energy right now to run an experimental evaluation of this.

I would personally be in favor of going with #12555, and then fixing the "partiality information" issue (my fix for that is more complex than the two we are currently discussing, and will require more review time and energy) to fix #7241 for good. Then, if there is still appetite for fine-grained discussions of the code generation strategy for records with mutable fields, we can revisit the present PR and merge it later.

gasche · 2023-09-14T09:19:27Z

In #12555 (comment), @smuenzel expressed himself in favor of the present approach ("strict bindings" rather than "context erasure"):

I think [useless mutable reads are] not as bad as it sounds. The read is not required to be completed until we actually need the data (in this case we don't), so, assuming a normal out-of-order processor, we only consume memory bandwidth, but do not block subsequent operations. We would only be limited once we hit either our maximum memory bandwidth, or the maximum number of reads we can track. (we could also end up using extra registers/stack space, but I think the user would expect that if they have very complex matches)

The "Bind mutable fields earlier" feels a little more principled -- we're examining a "snapshot" of the variables taken at the time the match scrutinee is evaluated. This means it's much easier to reason about the results -- no "Match_failure" is required, and we can ensure that we retain the full power of exhaustiveness analysis, allowing even strangely written code to prevent having additional exception paths.

A note on the second paragraph: I like the idea of the "snapshot" semantics, as discussed earlier (in another monster comment from last June, #7241 (comment) ) -- the idea is that the pattern-matching code behaves as this the value was read in full at some point, and no part were re-evaluated after that. But it is important to note that the present PR only gets us closer to this snapshot semantics in some cases, and does not provide it in general. In particular, consider the two programs below, the first one will behave according to the "snapshot semantics", but the second will not -- it is currently unsound, but even with that other bug fixed it will raise Match_failure.

match x with
| { a = false } -> 0
| { a = true; b = None } -> 1
| { a = _; b = _ } when guard () -> 2
| { a = _; b = Some n } -> n

match x with
| { a = false } -> 0
| { a = true; b = None } -> 1
| _ when guard () -> 2
| { a = _; b = Some n } -> n

The difference between the two programs is that, in the first case, all accesses to b are under the scope of the same binding (when deconstructing the record pattern shared by all clauses), while in the second program the clauses are split in three separate submatrices, and matrices 1 and 3 will both read b independently.

In the general case my intuition is that we probably do not want to provide the snapshot semantics. The rule of the game of pattern-matching compilation is to avoid code-size explosion -- if implemented naively, it can be exponential in the source code. The "matrix-based" algorithm used by the OCaml compiler is designed on the idea that we avoid code explosion by introducing code sharing at the expense of redundant reads. We could decide to try to share code for immutable reads and accept code blowups for mutable reads, but (1) I have no idea how to implement this and it could be very very invasive, and (2) this means that we would behave terribly on complex clauses with only mutable fields.

Given that I don't think we can provide the snapshot semantics in general, I don't think that providing it in particular cases is an important goal. In fact, I would rather punish side-effects during matching with a Match_failure exception than magically saving the day in a weird subset of cases.

gasche · 2023-12-08T21:14:55Z

I looked at the two PRs #12555 and #12556 again. They conflict and fix the same issue, and I still think that #12555 (which erases more context information) is the better choice -- if only because it is simpler to implement. I am closing this one (which binds mutable fields earlier) to help move forward. Feel free to keep discussing and/or ask to reopen if you think differently.

gasche added 8 commits September 14, 2023 10:40

add a testsuite for the upcoming 'strict bindings' pattern-matching c…

b982452

…ompilation strategy

matching: eagerly bind the strict parameters

fbe9ee7

matching: ensure that all toplevel generators bind their arguments ea…

560b7b1

…gerly

matching: pure bindings can be Alias

e371642

matching: tuple of variables can be Alias

ca4660c

[refactoring] matching: args are always Alias

cee82d1

[trivial] matching: move lower_bind higher in the file

d4b5211

matching: optimize strict field bindings with lower_bind

96967e8

This was referenced Sep 14, 2023

non-exhaustive fix for the Pattern Matching Bug #12555

Merged

Pattern matching with mutable and lazy patterns is unsound #7241

Closed

gasche closed this Dec 8, 2023

gasche mentioned this pull request Jun 14, 2024

The Pattern-Matching Bug: fix totality information #13152

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

non-exhaustive fix for the Pattern Matching Bug, take two#12556

non-exhaustive fix for the Pattern Matching Bug, take two#12556
gasche wants to merge 8 commits intoocaml:trunkfrom
gasche:matching-bug-strict-bindings-pr

gasche commented Sep 14, 2023 •

edited

Loading

Uh oh!

gasche commented Sep 14, 2023

Uh oh!

gasche commented Sep 14, 2023 •

edited

Loading

Uh oh!

gasche commented Dec 8, 2023

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

gasche commented Sep 14, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

gasche commented Sep 14, 2023

Uh oh!

gasche commented Sep 14, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

gasche commented Dec 8, 2023

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

gasche commented Sep 14, 2023 •

edited

Loading

gasche commented Sep 14, 2023 •

edited

Loading