Skip to content

non-exhaustive fix for the Pattern Matching Bug#12555

Merged
gasche merged 3 commits intoocaml:trunkfrom
gasche:matching-bug-2-fix-contexts
Apr 5, 2024
Merged

non-exhaustive fix for the Pattern Matching Bug#12555
gasche merged 3 commits intoocaml:trunkfrom
gasche:matching-bug-2-fix-contexts

Conversation

@gasche
Copy link
Copy Markdown
Member

@gasche gasche commented Sep 13, 2023

The infamous Pattern Matching Bug #7241 is a conjunction of two issues, two things that go wrong in the pattern-matching compiler when the scrutinee is mutated during matching:

  1. Context information can become incorrect.
  2. Totality information can become incorrect.

The present PR fixes issue 1: context information is fixed, and some instance of the bug are thus fixed. Issue 2 remains, so other instances of the bug remain unfixed.

People who are already familiar with the pattern-matching can review
the PR right away, commit by commit (there are only two).
For the others, the rest of the present PR description explains the
issue. It also contains a comparison with an alternative fix that
@trefis and myself wrote together -- I ended up preferring the
approach in the present PR.

A repro case

Consider the following program:

type t = { a : bool; mutable b : int option }

let f guard x = match x with
| { a = false; b = _ } -> 0
| { a = true; b = None } -> 1
| { a = _; b = _ } when guard () -> 2
| { a = _; b = Some n } -> n

Primer on compilation with pattern matrices

The pattern-matching compiler will see the match x with clauses as the following pattern matrix:

args: (x.(0)   x.(1))
clauses:
      (false   _)               -> 0
      (true    None)            -> 1
      (_       _) when guard () -> 2
      (_       Some n)          -> n

The general compilation strategy is to "switch" on the arguments of the matrix one after another, from left to right. However, in this example the first column cannot be turned into a single switch on x.(0) (with cases for true and false) without duplicating cases, and the third and fourth clauses would need to be repeated for both true and false.

When this happens the matrix is "split" into several matrices that are tried in sequence (in source order), stitched together using "static exceptions". The matrix above will be split as follows:

start:
       false   _                    -> 0
       true    None                 -> 1

exit 1:
       _       _      when guard () -> 2

exit 2:
       _       Some n               -> n

Then each of those sub-matrices can either "switch" on its first column or just ignore it, and recursively compile the second column. The final compiled code could be shown as this pseudo-code (try the -drawlambda flag to see the real thing):

static-catch
  let xa = x.(0) in
  switch xa with
  | false -> 0
  | true ->
    let xb = x.(1) in
    begin switch xb with
    | None -> 1
    | _ -> static-raise 1
    end
with
| 1 ->
  if guard () then 2 else static-raise 2
| 2 ->
  let xb' = x.(1) in
  switch xb' with
  | Some -> xb'.(0)

(switch is like a match on the head constructor, does not bind any variable -- fields and constructor arguments are accessed with x.(i) -- and static-catch and static-raise are like "static" exceptions, but the exception names are just number.)

In each sub-matrix, the argument expressions x.(0) and x.(1) are bound on-demand, when the corresponding column is switched on. This explains why x.(0) is bound once in the generated code (it is the leftmost column, switched on first), while x.(1) is bound inside the start submatrix and also in the exit 2 submatrix.

This code is compiled in an unsound way because the pattern-match compiler believes that the very last switch on xb' cannot fail: that necessarily it must be a Some constructor, so switch xb' with Some -> xb'.(0) is optimized into xb'.(0). This belief comes from the fact that we already checked x.(1) against None in the branche 1. But the value observed could have been
Some _ initially and then mutated into None by the guard () function call. The unchecked access to xb'.(0), that is None.(1), would then segfault.

Context information

The pattern-matching compiler maintains "context information", it
tracks some knowledge about the possible values of the sub-values of
the scrutinee. At the point where static-raise 1 is called, we have
matched x.(0) on true and x.(1) did not match None, so we can
deduce that we are in the following context:

args: (x.(0)    x.(1))
ctx:  (true     Some _)

This information can be invalidated by calling guard (), so the
mutable field lookup xb' = x.(1) can observe a different value for x.(1); but the pattern compiler missed this and assume that the last switch was exhaustive.

set_args_erase_mutable

The pattern-matching compiler contains code to prevent incorrect
information caused by side-effects: whenever it "combines" information from several arguments of a pattern into information on the whole pattern, it will "erase" its information on arguments in mutable position. This is the role of the set_args_erase_mutable function called in Context.combine.

For example, at the first point where we call static-raise 1, the context information is

args: (x.(0)    x.(1))
ctx:  (true     Some _)

If we asked for the context "from the point of view of x", that is outside the Bar -> ... branch, we would not get the anser

args: (x)
ctx:  ({ a = true; b = Some _ })

but instead the answer

args: (x)
ctx:  ({ a = true; b = _ })

as information on the field b, which may be invalidated by side-effects, is erased from the combined context when moving from sub-patterns on the fields x.(0) and x.(1) to the toplevel record pattern.

Unfortunately, this erasure occurs "too late", at the scope of the match on the whole record. Inside the match on the two-column submatrix, the context information for the field x.(1) (the second column of the pattern matrices) is preserved, but the access expression x.(1) may be evaluated several times and return different results.

Remark: if we changed the clauses from

| { a = false } -> 0
| { a = true; b = None } -> 1
| { a = _; b = _ } when guard () -> 2
| { a = _; b = Some n } -> n

to the apparently-equivalent

| { a = false } -> 0
| { a = true; b = None } -> 1
| _ when guard () -> 2
| { a = _; b = Some n } -> n

then the third clause _ when guard () is "at toplevel" and
set_args_erase_mutable gets called, providing the correct context to
the fourth clause. (The clause remains compiled incorrectly due to the
orthogonal "incorrect totality information" issue.)

Two approaches

We want to guarantee that context information on a column does not outlive the binding to its argument expression, when the argument expression is mutable.

There are two approaches to ensure this:

  • Bind mutable fields earlier.

    One could change the bindings of mutable fields to happen as soon as
    the record constructor with fields is matched on, instead of binding them on-demand.

    Instead of

    static-catch
      ..
      let xb = x.(1) in switch xb with ...
    with
    ...
    | 2 -> let xb' = x.(1) in switch xb' with ...

    we would have

    let xb = x.(1) in
    static-catch
      ..
      switch xb with ...
    with
    ...
    | 2 -> switch xb with ...

    If we do this, the binding of x.(1) scopes for as long as the context information about it. The context information on x.(1) is erased by set_args_erase_mutable at the toplevel of the whole code fragment, that is exactly when exiting the scope of the let xb = x.(1) binding.

    This is the approach that @trefis and myself proposed in Pattern matching with mutable and lazy patterns is unsound #7241 (comment).

  • Erase contexts more often.

    Instead of erasing the context information on the mutable fields when we leave the record-matching clauses, we could erase the context information of each split submatrix, whenever we leave the scope of the corresponding argument binding.

    This is the approach implemented in the current PR.

I have implemented both approaches, and I prefer the second one.

Binding fields earlier

The problem with binding mutable fields earlier is that we may end up reading mutable fields that are not necessary. Consider the following variant:

match x with
| {a = true; _} -> 0
| {a = false; None} -> -1
| {a = false; Some n} -> n

Binding fields earlier would give code looking like the following

let xb = x.(1) in
let xa = x.(0) in
switch xa with
| true -> 0
| false ->
  switch xb with
  | None -> -1
  | Some -> xb.(0)

Notice that in the xa = true case, we have performed an unnecessary read of x.(1) before returning 0.

Changing the pattern-matching compiler to bind mutable fields earlier than today (reduces code size and) risks adding useless mutable field reads in existing code paths.
I find it difficult to predict the performance at impact of the change, and so I would rather avoid it. (On the positive side: fields are close to each other in memory, so accessing extra fields is cheap, and we don't have a read barrier, even on mutable reads.)

In particular, note that even clauses that are not split in several submatrices -- as in the variant just above -- pay through extra bindings in some cases.

Erasing context information

Erasing context information means that we could be unable to prove that a given incomplete switch cannot fail. In particular the second exit of our running example:

  | 2 ->
    let xb' = x.(1) in
    switch xb' with
    | Some -> xb'.(0)

would become compiled as follows:

  | 2 ->
    let xb_3 = x.(1) in
    switch xb_3 with
    | Some -> xb_3.(1)
    | None -> raise (Match_failure ...)

which adds an extra test to account for the possibility of mutation between field accesses.

With this approach, the overall code-generation strategy of the pattern-matching compiler is unchanged -- the binding of arguments is delayed as late as possible, as intentionally designed -- but context information is weakened in some places that involve split submatrices. In particular, if there are not split submatrices (which is the common case), there is no performance penalty for mutable fields.

Getting rid of set_args_erase_mutable

With the "second approach" implemented in this PR, we erase the context information of mutable fields earlier, as soon as their argument bindings gets out of scope. By the time we "combine" all field arguments into a single record pattern, they have already been erased, and there is no need for a second erasure in set_args_erase_mutable. The second commit in the PR gets rid of the set_args_erase_mutable logic, which simplifies the code. We remove an attempt at avoiding issues with side-effects that proved insufficient, instead of piling more defenses on top of it.

@smuenzel
Copy link
Copy Markdown
Contributor

Notice that in the xa = true case, we have performed an unnecessary read of x.(1) before returning 0.

I think this is not as bad as it sounds. The read is not required to be completed until we actually need the data (in this case we don't), so, assuming a normal out-of-order processor, we only consume memory bandwidth, but do not block subsequent operations. We would only be limited once we hit either our maximum memory bandwidth, or the maximum number of reads we can track. (we could also end up using extra registers/stack space, but I think the user would expect that if they have very complex matches)

The "Bind mutable fields earlier" feels a little more principled -- we're examining a "snapshot" of the variables taken at the time the match scrutinee is evaluated. This means it's much easier to reason about the results -- no "Match_failure" is required, and we can ensure that we retain the full power of exhaustiveness analysis, allowing even strangely written code to prevent having additional exception paths.

@gasche
Copy link
Copy Markdown
Member Author

gasche commented Sep 14, 2023

I'll open a PR with the "bind mutable fields earlier" approach so that you can compare. (There is more to say about the comparison but I will quote your (informative!) post and reply there.)

Edit: done, see #12556.

@alainfrisch
Copy link
Copy Markdown
Contributor

@gasche : Thanks for the very detailed and useful context! Does the solution proposed here address the case of mutations only from explicit when guards, or also those possibly happening when (1) forcing lazy fields, or (2) boxing floats extracted from float record (or from "unboxed float arrays").

I find @smuenzel arguments convincing, esp. if we can measure that the overhead of unneeded read is only tiny. With the solution proposed in this PR, if I understand correctly, there is also a potential overhead if we need to read multiple times and add extra tests.

@gasche gasche force-pushed the matching-bug-2-fix-contexts branch from 85f4528 to f61a201 Compare September 14, 2023 08:14
@gasche
Copy link
Copy Markdown
Member Author

gasche commented Sep 14, 2023

The solution proposed fixes one aspect of the Pattern Matching Bug without making assumptions on where the source of mutation comes from, it merely assumes that reading a mutable field several times may return different results. It thus works for guards, lazy patterns, allocations, or multicore races. (There remain bugs due to the "incorrect partiality information", which is orthogonal and will be the topic of subsequent PRs, that also work for all mutation sources.)

@gasche
Copy link
Copy Markdown
Member Author

gasche commented Sep 14, 2023

I find @smuenzel arguments convincing, esp. if we can measure that the overhead of unneeded read is only tiny. With the solution proposed in this PR, if I understand correctly, there is also a potential overhead if we need to read multiple times and add extra tests.

I don't necessarily disagree, but note that:

  • I think that we will get un-needed mutable reads relatively often on records with mutable fields -- whenever at least one clause uses the field, and at least another clauses does not use it. (Caveat: the extra reads only occur if the mutable field is not the first field in the type declaration -- for first fields the generated code is the same as they are matched right away.) On the other hand, the duplicated reads occur relatively less often -- you have to have a "split" in the pattern matrices, which is less common.
  • The duplicated reads are already present in the code currently generated, so (they may be slower than other approaches, and I also wonder about the code-size impact but) they don't cause a performance regression that I would need to worry about.
  • The extra checks have a cost (probably more costly than the extra reads), but again they only occur when the same field is read across splits, which is relatively uncommon.

This being said, we may be splitting hairs here: the difference that we are talking about only concerns records with a mutable field that is not the first field (in particular, it does not make any difference for single-field records). Pattern matching on record with mutable fields is relatively rare. I had a hard time finding a relevant example in the OCaml compiler codebase -- there are many records with several mutable fields, but they are accessed by field instead of deconstructed by pattern-matching.

The only one example that I could locate is the following, which occurs in several <arch>/emit.mlp files:

  Reg.Set.iter
    (function
      | {typ = Val; loc = Reg r} ->
          live_offset := ((r lsl 1) + 1) :: !live_offset
      | {typ = Val; loc = Stack s} as reg ->
          live_offset := slot_offset env s (register_class reg) :: !live_offset
      | {typ = Addr} as r ->
          Misc.fatal_error ("bad GC root " ^ Reg.name r)
      | _ -> ()
    )
    live

Here loc is a mutable field, and the "strict bindings" strategy will read it in all cases, even in the typ = Addr case where it was not necessary. (Obviously on this example there is no problem as the Addr case is not supposed to ever happen.) There is a split in the pattern matrix, but the second matrix is just for the catch-all case, and it will not read loc again. On this example, the "strict bindings" strategy generates slightly worse code than today, but the "context erasure" strategy keeps the code unchanged.

@gasche
Copy link
Copy Markdown
Member Author

gasche commented Sep 14, 2023

(One should also keep in mind that the "context erasure" implementation is simpler.)

@gasche
Copy link
Copy Markdown
Member Author

gasche commented Sep 14, 2023

I asked Luc Maranget, who also has the gut feeling that the "erase more often" approach is the better choice. One point that he made is that useless bindings keep memory alive in bytecode, so binding "earlier" may increase the memory footprint. (Native code is unaffected as it uses a liveness analysis.)

Copy link
Copy Markdown
Contributor

@ncik-roberts ncik-roberts left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My overall thoughts:

  • The approach in this PR seems like it will work.
  • I don't think that the PR actually handles all the cases it claims to; I've left a review comment about a lingering unsoundness.
  • I'm unsure about the performance impact.

On performance: Users might reasonably be concerned that this change will slow down existing pattern matches that involve mutable fields. Some light experimentation shows that indeed some extra switches are inserted. Do you have a sense of how big of an effect this change should have? (Or some characterization of the class of programs affected?)

Comment on lines +3469 to +3532
(* This is somewhat of a hack: we notice that a pattern-matching
argument is mutable (its value can change if evaluated
several times) exactly when it is bound as StrictOpt. Alias
bindings are obviously pure, but Strict bindings are also only
used in the pattern-matching compiler for expressions that give
the same value when evaluated twice.
An alternative would be to track 'mutability of the field'
directly.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this hack is actually unsafe at the moment. This program segfaults:

(* An adapted example of unsoundness from #7421. *)
type t = {a: bool; b: int option ref }
let f x =
  match x with
  | {a = false; b = _} -> 0
  | {a = _;     b = { contents = None }} -> 1
  | {a = _;     b = _} when (x.b.contents <- None; false) -> 2
  | {a = true;  b = { contents = Some y }} -> y
;;

let _ = f { a = true; b = { contents = Some 5 } }

The issue appears to be that the get_expr_args_* family of functions (e.g. get_expr_args_record) "forgets" the mutability of its argument. (Observe the ignored second component of the tuple, _mut). I think we ought to track the mutability of the field read in the args field:

  args : (lambda * let_kind) list;

i.e. the above should instead be a three-field record of read : lambda, let_kind : let_kind, and is_mutable_read : bool. And the get_expr_args_* family of functions should propagate is_mutable_read.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have, in fact, implemented such a change in gasche@53e71e9, which is part of the matching-bug-mutable-positions branch that is planned as one of the follow-ups to the present PR as described in the overview comment #7241 (comment) .

In the overview comment I explain that not all cases of unsoundness are fixed by the present PR, some other are due to incorrect totality information, and tracking mutability of arguments is important to fix those extra cases. I don't have the information in brain memory and haven't looked at your example in details yet, so at this point I am not sure whether your example is exactly a case that should be fixed by the "second half" of the bug (incorrect totality information), or if it is a different instance of the "first half" of the bug (incorrect context information) that needs the present PR to be fixed.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At least I can confirm that my branch matching-bug-fix-totality, which is supposed to contain both "halves" of the fix (and to need a rebase), also fixes the segfault on your example.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My feeling when I wrote this comment was that it is a different instance of the "first half" of the bug, and is unrelated to incorrect totality information. The gasche@53e71e9 commit does not appear to be present as-is in matching-bug-fix-totality yet (you say elsewhere that branch needs a rebase) so I can't confirm yet.

Would you object to adding this as a test case in the matching-bug-fix-totality branch so we can be sure to remember this?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you object to adding this as a test case in the matching-bug-fix-totality branch so we can be sure to remember this?

Not at all. In case you are interested, my working repository uses the following test, which is relatively close in spirit but may be oversimplified and miss some aspects of your test:

(* This example exhibits unsoundness due to the totality issue *)

let r = ref (Some 3)

let _ =
  match Some r with
  | Some { contents = None } -> 0
  | _ when (r := None; false) -> 1
  | Some { contents = Some n } -> n
  | None -> 3

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. I find your explanation convincing: the segfaulting program in my original comment indeed seems to be "about totality" rather than about wrong contexts. However, I still have a vague concern about the fact that get_expr_args_record, etc., _-ignores the mutability of its argument. It will be easier for me to think about this question in matching-bug-fix-totality post-rebase, so I'll shelve this thought until then.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my perspective, "arguments" in the pattern-matching machinery are either variables (the typical case) or non-variable expressions that are produced by the get_expr_args* functions by projecting out (in some way) another argument that is itself a variable. So for example, match x with { f = { g = ... } } -> ... will have arguments x, then x.f that will in turn be bound to y, then y.g that will be bound to some z.

x.f.g as a complex expression is never used as an argument in the generated code -- but this is not so easy to check by reading the code, it comes from the fact compile_match_nonempty replaces non-variable arguments by variable arguments before calling combine_handlers, which in turn will eventually create the "division" out of the now-variable argument list in do_compile_matching.

So when I write in the comment that an argument is mutable (maybe "impure" would be the better term), defined as "its value can change if evaluated several time", well y.g can change if evaluated several times if g is a mutable field, but that property does not depend on whether f itself was a mutable field. So I do not expect that we need to transitively propagate mutability for this part of the bug, caused by sharing context information across several different evaluations of a same argument expression. (On the other hand, the transitive form of mutability that you have in mind shows up in the totality part, as a criterion for when to distrust the type-checker approximation of totality: we degrade Total into Partial when under a mutable context (transitively).)

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know how you want to proceed: we could try to prepare all changes at once for merging, so that you can have a detailed complete picture, or we could try to review the changes step by step, and try to merge a PR before moving to followup changes. The latter is less work for me (because I expect reviews to suggest further changes that may require non-trivial rebases of followup changes), but it makes it harder to have a global view. On the other hand personally I think that reviewing bits by bits is easier and more comfortable -- there are less chances of letting things slip by just because it is too large to be looked at properly.

(In either scenario, the precise process I would have in mind is to have you "Approve" of the PR once you are convinced that it is correct and you find the implementation reasonable/maintenable/nice, and then we would look for a maintainer willing to have a second look or to approve on your behalf.)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm in favor of reviewing in pieces as well. I intend on clicking "Approve" soon — I'm just verifying one or two more things.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And your explanation here covers what I was missing before, I think:

x.f.g as a complex expression is never used as an argument in the generated code -- but this is not so easy to check by reading the code, it comes from the fact compile_match_nonempty replaces non-variable arguments by variable arguments before calling combine_handlers, which in turn will eventually create the "division" out of the now-variable argument list in do_compile_matching.

@gasche
Copy link
Copy Markdown
Member Author

gasche commented Apr 1, 2024

@ncik-roberts here is how to confirm that the testcase you propose is unrelated to the present PR and belongs to the totality-information territory. In trunk, activate the debug output for lambda/matching.ml by changing let dbg = false into let dbg = true, and then compile your testcase. I used the following command-line myself:

./runtime/ocamlrun ./ocamlc -nostdlib -I stdlib -dsource -dlambda -dno-locations test_nick.ml -stop-after lambda 2>&1 | less

There is a fair amount of output and it is not easy to work one's way inside it. But here are the relevant parts:

COMPILE:
  SPLIT
    false _ 
    _ {contents=None} 
    _ _ 
    true {contents=Some y} 
  INTO:
    First matrix:
      PM:
        false _ 
    Default matrix 2:
      PM:
        _ {contents=None} 
        _ _ 
    Default matrix 1:
      PM:
        true {contents=Some y} 

The testcase will be split into three matrices, one of the first line, one for the second and third line, one for the fourth line.

    COMPILE:
      MATCH Partial
      PM:
        None 
      Default environment:
        Matrix for 3:
          <_> 
        Matrix for 1:
          <Some y> 
      CTX:
        LEFT <{ _ }> <true> <{ _ }> RIGHT <_> 
      COMPILE:
        empty matrix
        COMBINE (mk_failaction_pos Partial)
          Default environment:
            Matrix for 3:
              <_> 
            Matrix for 1:
              <Some y> 
          FAIL PATTERNS:
            Some _
          POSITIVE JUMPS:
            jump for 3
            LEFT <{ _ }> <true> <{ _ }> RIGHT <Some _> 
      JUMPS:
        jump for 3
        LEFT <{ _ }> <true> <{ _ }> RIGHT <Some _> 
    JUMPS:
      jump for 3
      LEFT <true> <{ _ }> RIGHT <{ _ }> 

This is the match on the None constructor in the { contents = None } part. That match contains a sub-match on the empty matrix where mk_failaction_pos will compute (erroneous) context information RIGHT <Some _> for the failure case ("jump for 3"). This erroneous information is repeated in the first JUMPS block corresponding to the jump summary for the compilation of the empty matrix. But then the second JUMPS block at the end of the quote, with one less level of indentation, correspond to the { contents = None } jump summary, and it has RIGHT <{ _ }>. I believe that what we are seeing here is set_args_erase_mutable in action: as the context moves up, some information is erased (even today in trunk) on mutable fields, and at this point the context information is not incorrect anymore.

But then, later in the output we have:

    COMPILE:
      MATCH Total
      PM:
        Some y 
      CTX:
        LEFT <{ _ }> <true> <{ _ }> RIGHT <_> 
      COMPILE:
        MATCH Total
        PM:
          _ 
        CTX:
          LEFT <Some _> <{ _ }> <true> <{ _ }> RIGHT <_> 
        COMPILE:
          empty matrix
        NO JUMPS
        COMBINE (mk_failaction_pos Total)
          Default environment:
            empty
          FAIL PATTERNS:
            None
          POSITIVE JUMPS:
            empty
      NO JUMPS

This is the compilation of the Some y submatrix that is responsible for the segfault in your example -- it generates a direct access to the field, convinced that only the Some constructor is possible here, when the input is in fact None. The bug here is that mk_failaction_pos found a "fail pattern" None as it should (an input that is not matched in this switch), but decided to ignore this fail pattern due to the default environment being empty, and so computed no positive jumps (Positive jumps: empty; No jumps) to a failure case. (The default environment is empty because this matching is erroneously believed to be Total, but the relation between those two things is decided much earlier in the caller.)

The bulk of the code to fix this issue is in the matching-bug-final-exit branch mentioned in the overview comment, specifically gasche@f2044ed#diff-059d6544bdc63e5f278818ea9f50bef7aa3f855c4d5ce6fd726abee18b187f55R2813 (but that change is against the current-in-trunk version of mk_failaction_pos, so it would be more pleasant to review, I hope, after we manage to merge #12534 .)

@gasche
Copy link
Copy Markdown
Member Author

gasche commented Apr 1, 2024

On performance: Users might reasonably be concerned that this change will slow down existing pattern matches that involve mutable fields. Some light experimentation shows that indeed some extra switches are inserted. Do you have a sense of how big of an effect this change should have? (Or some characterization of the class of programs affected?)

A program generates different code under this PR when several matches on a given mutable position of the scrutinee occur across several split submatrices in the matching problem, and we need to reason about the former matches to convince ourselves that some of the later matches are exhaustive -- that at this point we have covered all cases.

(I don't know how to give an intuition for how matrices are split without going in the details of how pattern-matching compilation work, so this is not a proper language-level specification.)

My personal intuition is two-fold:

  1. These cases where code is pessimized are going to be very rare.
  2. The code we currently generate for these cases is unsound for OCaml 5 anyway, as they are always cases where a race on the mutable position could invalidate the assumption currently made by the code generator.

(Some other changes down in the tree of branches have a larger potential for performance regression on real-world program, I suspect, because they degrade totality information which is used to generate good code for GADTs in particular.)

@ncik-roberts
Copy link
Copy Markdown
Contributor

These cases where code is pessimized are going to be very rare.

Some experimentation on a large code base appears to agree with this claim for the present PR.

Copy link
Copy Markdown
Contributor

@ncik-roberts ncik-roberts left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR makes sense to me and looks good. I'm also convinced it won't pessimize existing code.

One surprising thing is that the type-checker can now warn that a case is unused, but it actually can be reached:

type t = {a: bool; mutable b: int option }

let f x =
  match x with
  | {a = false; b = _} -> 0
  | {a = _;     b = None} -> 1
  | {a = _;     b = _} when (x.b <- None; false) -> 2
  | {a = true;  b = Some y } -> y
  | _ -> 100
;;
(* 9 |   | _ -> 100
        ^
Warning 11 [redundant-case]: this match case is unused. *)
let () = Printf.printf "%d\n" (f { a = true; b = Some 5 })

This program produces the output "100", contrary to the warning. I'm not sure the right way to address this, but it is arguably outside the scope of the current PR.

@gasche
Copy link
Copy Markdown
Member Author

gasche commented Apr 5, 2024

Thanks!

To summarize my understanding:

  • After reviewing the change carefully, @ncik-roberts was wondering if this PR missed some "bad cases". After discussion, we agree that the bad cases he had in mind are outside the scope of this PR, and are fixed by a following planned change (no PR submitted yet, but the code is in my branches).
  • @ncik-roberts was wondering about the performance impact, and confirms my intuition that very little real-world code is affected by the change -- he did not find any instance when experimenting with the PR on internal codebases.

Now that @ncik-roberts has approved the PR, I wonder if a maintainer would be willing to approve as well, either by doing their own review or by approving on @ncik-roberts' behalf. (Maybe @Octachron or @lthls would be interested?) I would be interested in having a maintainer look at this relatively quickly, so that I can submit follow-up PRs as soon as possible and strike while the iron is hot.

Copy link
Copy Markdown
Contributor

@lpw25 lpw25 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving on @ncik-roberts behalf

@gasche
Copy link
Copy Markdown
Member Author

gasche commented Apr 5, 2024

Thanks! I now have to rebase, merge, and send a follow-up PR.

gasche added 3 commits April 5, 2024 13:43
The infamous matching bug ocaml#7241 corresponds to two distinct issues
that arise when side-effects mutate the value being matched on:
1. Incorrect contexts: context information from the pre-mutation
   switches is retained.
2. Incorrect partiality information: type-checker-provided partiality
   information becomes incorrect.

This commit fixes the first problem: incorrect contexts.
The previous commit erases directly the columns of mutable
arguments. Those columns have now all been erased by the time
Context.combine is called, so we do not need another round of erasure
in Context.combine anymore.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants