Skip to content

The Pattern-Matching Bug: fix totality information#13152

Merged
gasche merged 6 commits intoocaml:trunkfrom
gasche:matching-bug-fix-totality-3
Aug 22, 2024
Merged

The Pattern-Matching Bug: fix totality information#13152
gasche merged 6 commits intoocaml:trunkfrom
gasche:matching-bug-fix-totality-3

Conversation

@gasche
Copy link
Copy Markdown
Member

@gasche gasche commented May 7, 2024

This PR is part of the #7241 fix; it is the first un-merged PR in the stack and is ripe for review.)

#13138 introduces information on the "transitive mutability" of argument positions in pattern-matching submatrices: an argument is transitively mutable if it is located transitively under a mutable field (from the root of the value).

The present PR uses this information to pessimize the compilation of switches generated by the pattern-matching compiler. All switches that are in a transitively mutable position are assumed to be Partial, even if the type-checker says that they are Total. (If you want, there is a longer explanation as a code comment in the PR itself.)

This change fixes all known remaining instances of the issue describe in #7241 (unsound interaction between pattern-matching and mutation), in particular all known-wrong behaviors in the testsuite.

(TODO: this is missing a Changes entry.)

Degradation in generated code

The code generated by the pattern-matching compiler will be degraded if the following conditions are all met:

  1. The position is transitively mutable.

  2. At the position that we are generating a switch for, a language construction is used for which Total improves the quality of generated code. This can happen if either (A) only a strict subset of possible constructors is handled at in this submatrix, with the rest having been handled before, or (B) all valid constructors are handled by this submatrix, but this relies on checking that GADT equations are unsatisfiable, and the pattern-matching compiler does not know this. (On the other hand, the pattern-matching compiler does know about the set of constructors possible for a non-GADT sum type or for closed polymorphic variants.)

So the affected code looks like one of those:

(* case (A) above *)
let f : bool option ref -> ... = function
| { contents = None } -> ...
| { contents = Some true } -> ...
| _ when guard () -> ...
| { contents = Some false } (* here *) -> ...

type _ gadt = Int : int t | Bool : bool t

(* case (B) above *)
let g : int gadt ref -> ... = function
| { contents = Int } (* here *) -> ...

In the case (A) (example f), the behavior was previously unsound (in particular if we consider concurrent mutations). There are instances of (B) (example g) where the compiler was sound, and now generates slightly worse code -- the function g above is one such example -- by including a test with a Match_failure case.

In a follow-up PR I implement a simple heuristic that re-optimizes the compilation of some GADT matchings in mutable position, so that the function g above is not pessimized anymore.

Copy link
Copy Markdown
Contributor

@ncik-roberts ncik-roberts left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main question in my head when reviewing this was performance, as the correctness seems clear. I've run the change over a large corpus and only turned up a few cases where extra checks are inserted. Two of these examples are worth discussing. It feels to me that Case 2 might just be a larger example of Case 1.

Case 1: the simplest example I've found that hits the extra case. Can we avoid it?

type 'a t =
  { mutable x : 'a
  ; mutable y: 'a
  }

let both_some t =
  match t with
  | { x = Some x; y = Some y } -> Some { x; y }
  | { x = None; y = _ } | { x = _; y = None } -> None

The compilation prior to this PR:

  (let
    (both_some/270 =
       (function t/272
         (catch
           (let (*match*/278 =o (field_mut 0 t/272))
             (if (isint *match*/278) (exit 1)
               (let (*match*/279 =o (field_mut 1 t/272))
                 (if (isint *match*/279) (exit 1)
                   (makeblock 0
                     (makemutable 0 (field_imm 0 *match*/278)
                       (field_imm 0 *match*/279)))))))
          with (1) 0)))
    ...)

and after:

  (let
    (both_some/273 =
       (function t/275
         (catch
           (let (*match*/281 =o (field_mut 0 t/275))
             (if (isint *match*/281) (exit 2)
               (let (*match*/282 =o (field_mut 1 t/275))
                 (if (isint *match*/282)
                   (let (*match*/284 =o (field_mut 1 t/275))
                     (if (isint *match*/284) (exit 2)
                       (raise
                         (makeblock 0 (global Match_failure/20!)
                           [0: "example2.ml" 7 2]))))
                   (makeblock 0
                     (makemutable 0 (field_imm 0 *match*/281)
                       (field_imm 0 *match*/282)))))))
          with (2) 0)))
    ...)

In the new result, two checks are performed in sequence on field_mut 1 t. It seems that the second check could be elided.

Case 2: sensitivity to order of field matching. In principle, the asymmetry here seems weird.

Example:

type ab = A | B

type t =
  { mutable field1 : ab
  ; mutable field2 : ab
  ; mutable field3 : ab
  }

let rank_of_last_a = function
  | { field3 = A; _ } -> 3
  | { field3 = B; field2 = A; _ } -> 2
  | { field3 = B; field2 = B; field1 = A } -> 1
  | { field3 = B; field2 = B; field1 = B } -> -1
;;

let rank_of_first_a = function
  | { field1 = A; _ } -> 1
  | { field1 = B; field2 = A; _ } -> 2
  | { field1 = B; field2 = B; field3 = A } -> 3
  | { field1 = B; field2 = B; field3 = B } -> -1
;;

These functions look very similar but the compilation is affected differently by this PR.

It's true that their compilation is already a little different prior to this PR, but not in a way that seems important:

(let
    (rank_of_last_a/274 =
       (function param/276 : int
         (let
           (*match*/284 =o (field 0 param/276)
            *match*/285 =o (field 1 param/276)
            *match*/286 =o (field 2 param/276))
           (if *match*/286 (if *match*/285 (if *match*/284 -1 1) 2) 3))))
  ...)
(let
    (rank_of_first_a/277 =
       (function param/279 : int
         (let (*match*/292 =o (field 0 param/279))
           (if *match*/292
             (let (*match*/295 =o (field 1 param/279))
               (if *match*/295
                 (let (*match*/297 =o (field 2 param/279))
                   (if *match*/297 -1 3))
                 2))
             1))))
    ...)

After this PR, rank_of_last_a gets a worse compilation. It first checks whether all three fields are B, and if they are, it re-reads the fields and checks them again! And, if the earlier-checked fields have since changed to A, an exception is raised. This seems suboptimal, as it would have been equally fine to just return -1 in that case.

New compilation for rank_of_last_a

  (let
    (rank_of_last_a/277 =
       (function param/279 : int
         (catch
           (let
             (*match*/287 =o (field_int 0 param/279)
              *match*/288 =o (field_int 1 param/279)
              *match*/289 =o (field_int 2 param/279))
             (if *match*/289
               (if *match*/288
                 (if *match*/287
                   (let (*match*/293 =o (field_int 1 param/279))
                     (if *match*/293
                       (let (*match*/294 =o (field_int 2 param/279))
                         (if *match*/294 -1 (exit 1)))
                       (exit 1)))
                   (let (*match*/291 =o (field_int 1 param/279))
                     (if *match*/291
                       (let (*match*/292 =o (field_int 2 param/279))
                         (if *match*/292 1 (exit 1)))
                       (exit 1))))
                 (let (*match*/290 =o (field_int 2 param/279))
                   (if *match*/290 2 (exit 1))))
               3))
          with (1)
           (raise
             (makeblock 0 (global Match_failure/20!) [0: "example1.ml" 9 21])))))
    ...)

The compilation of rank_of_first_a is unaffected.

The asymmetry between the effect on rank_of_first_a and rank_of_last_a suggests that there might be a way to make rank_of_last_a emerge unscatched from this PR.

Comment on lines +3711 to +3757
a reference is pessimized. (Matching on non-GADT constructors
inside a reference is also arguably pessimized, given that cases
where side-effects occur are exceedingly rare; but we cannot know
that they will not occur.)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This remark may appear mystifying given that the only example is of a case where there is a trivial syntactic criterion to observe that there might be side effects (there is a when guard). I believe this remark is about multithreaded mutation; maybe mention that?

(* It is important to distinguish:
- [arg_partial]: the partiality information that will
be used to compile the 'upcoming' switch on the first argument
- [partial:] the partiality information that will be used
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- [partial:] the partiality information that will be used
- [partial]: the partiality information that will be used

Comment on lines +2879 to +2905
(* a type of per-argument partiality information used by
[mk_failaction_*] functions to reason statically about which
partiality information is used for these per-argument functions. *)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment is easy enough to reason about in the context of this PR, but I feel it leaves out enough connective tissue that future readers will be confused. (Per-argument partiality information as opposed to what? Of course a type helps us reason statically -- what is that remark about?)

It may be more clarifying (and simplify the comment) to instead create a new sum type, type arg_partiality = Arg_partial | Arg_total.

@ncik-roberts
Copy link
Copy Markdown
Contributor

I am not highly confident that this is the right characterization, but it may be helpful anyway: If a pattern does not bind any variables, then there's no need to check that the relevant subfield of the scrutinee still matches that pattern before performing the action.

@ncik-roberts
Copy link
Copy Markdown
Contributor

ncik-roberts commented Jun 13, 2024

I checked that #13154 doesn't de-pessimize either case I describe, though you might want to confirm.

@gasche
Copy link
Copy Markdown
Member Author

gasche commented Jun 14, 2024

I'm afraid that I don't have an easy answer to your question. I don't see a feasible, non-invasive way to improve the pattern-matching compiler that would re-optimize these programs and not have other negative consequences.

Below is to explain the situation of this examples in details, and I list a few avenues for improvement that I considered -- I did not found any of them to be convincing.

My impression is that the right choice to do here is probably to favor correctness over (complexity or performance), and accept that those examples now generate slightly worse code.

Understanding what happens

In those two examples, the compiler generates a split during pattern-matching compilation, and the matrices below the split get slightly worse compilation.

First example:

let both_some t =
  match t with
  | { x = Some x; y = Some y } -> Some { x; y }
  | { x = None; y = _ } | { x = _; y = None } -> None

with dbg = true we see the following debug output:

    SPLIT
      Some x Some y 
      None _ 
      _ None 
    INTO:
      First matrix:
        PM:
          Some x Some y 
          None _ 
      Default matrix 3:
        PM:
          _ None 

The split comes from the pattern { x = _; y = None } which cannot be sent to a single branch of the decomposition of x into None or Some -- if we wanted to avoid a split we would have to include it in both submatrices, and this is precisely the sort of potential-code-size-explosion that the compilation scheme tries hard to avoid.

In this case it is a bit unfortunate as we know that in fact this {x = _; y = None } is equivalent to { x = Some _; y = None }, as the x = None case is already handled earlier. If we change the code to use this more precise pattern

let both_some t =
  match t with
  | { x = Some x; y = Some y } -> Some { x; y }
  | { x = Some _; y = None } | { x = None; _ } -> None

then you get the nice compilation you expect, without a split. The conceptual difference is that all cases now start with a head constructor for the first column.

Second example:

let rank_of_first_a = function
  | { field1 = A; _ } -> 1
  | { field1 = B; field2 = A; _ } -> 2
  | { field1 = B; field2 = B; field3 = A } -> 3
  | { field1 = B; field2 = B; field3 = B } -> -1
;;

let rank_of_last_a = function
  | { field3 = A; _ } -> 3
  | { field3 = B; field2 = A; _ } -> 2
  | { field3 = B; field2 = B; field1 = A } -> 1
  | { field3 = B; field2 = B; field1 = B } -> -1
;;

The rank_of_first_a function can be compiled from left to right
without any split.

  PM:
    A _ _ 
    B A _ 
    B B A 
    B B B 

The rank_of_last_a function needs to split on each column, so (because the record fields are mutable) it now generates worse code.

  SPLIT
    _ _ A 
    _ A B 
    A B B 
    B B B 
  INTO:
    First matrix:
      PM:
        _ _ A 
        _ A B 
    Default matrix 2:
      PM:
        A B B 
        B B B 

Re-optimizing the code?

The following ideas come to mind to re-optimize the generated code.

Column compilation order (second example)

The rank_of_last_a example is a textbook example of the fact that it is sometimes useful to choose another column decomposition order than pure left-to-right. Luc Maranget's research on pattern-matching compilation has come up with good heuristics to pick the next column of the matrix do decompose, but OCaml (and I think pattern-matching compilers for other languages as well) uses left-to-right always. This is simpler, and in fact it is important in presence of GADTs, because GADTs can propagate type equalities from left to right in the patterns, and therefore matching on columns in a different order may result in considering incoherent columns, that is, columns with patterns of different, incompatible types.

For this reason, I don't think it is realistic to hope to move from the left-to-right compilation order in the context of this PR / fixing this bug.

Avoiding unecessary splits (first example)

I pointed out that it is possible to manually specialize x = _; y = None into x = Some _; y = None, and get a better compiled output. Could the compiler figure this out by itself? I think that the answer is yes, because the information of which constructors may flow into the wildcard is related to the information computed by the usefulness analysis. When a wildcard may only correspond to one constructor (and not several), refining it into this construction should always improve the code. (Optimizing this might improve other examples that do not contain mutable fields and are not affected by this PR.) But computing this is non-trivial and it may have a negative impact on the performance of the pattern-matching compiler -- we probably don't want to attempt such an optimization on all wildcard patterns that the splitter encounters. I don't see an easy, clearly-a-good idea way to do this.

Binding mutable fields earlier

I think that the compilation of the first example would improve if we were to read mutable record fields eagerly, when matching on their record, instead of binding them only when they come up in the first column. In particular this would avoid the double read on (field_mut 1 t/272) that you noticed in the first example.

This is an approach that we proposed in #12556. I ultimately decided against it because it was a more invasive change to the compilation strategy, that came with the risk of pessimizing other examples by generating un-necessary mutable reads. In particular, in the second example, notice how the compilation of rank_of_first_a performs its mutable reads on-demand, and the value 1 is returned after reading just field1 = A and none of the other fields. With the compilation scheme of #12556 we would loose this good property, the code would start by reading all fields in all cases, just like rank_of_last_a.

Treating these failures as unspecified behavior

With this PR, the pattern-matching compiler downgrades some submatrices from Total to Partial, and it will compile them like any other Partial submatrix, that is, by ensuring that there is a catch-all case that goes to a Match_failure clause.

Another option would be to teach the compiler that (1) yes, the failure case is possible, but (2) we don't particularly care about getting a Match_failure exception here, fusing this with another case would be fine. In essence, we would treat the "the value changed under our feet during matching" as an "unspecified behavior" situation (any type-safe behavior is accepted) rather than a "clean dynamic failure" behavior.

For example, if we look at your first example, a part of the generated -drawlambda output was as follows before this PR:

  with (2) (let (*match*/284 =o (field_mut 1 t/275)) 0)))))

and now it looks like this:

  with (2)
   (let (*match*/284 =o (field_mut 1 t/275))
     (if *match*/284 (exit 1) 0))))

This corresponds to a sub-clause of the form { ... ; y = None } -> None.

Before the pattern-matching compiler would generate code that reads the field y, but then notice from totality information that the only possible constructor at this point is None, and thus elide the check that y is indeed None and return 0 (that is, None) right away.

With this PR, the compiler is aware that the value of y could have changed in the meantime, so it may be something else than None, and therefore it does not elide the check. If the check y = None fails, it goes to (exit 1) which is the Match_failure clause.

Instead we could say: okay, y may be something else than None, but in that case we are fine with Match_failure or something else that makes sense, for example 0, and so you can generate (if *match*/284 0 0), or maybe even just 0.

I have a vague idea of how we could try to do this:

  1. When we compile a submatrix, take note of whether it is empty, in which case it is an "immediate action" that does not rely on any extra condition or variable bound in its patterns.
  2. When we degrade Total due to mutable patterns, use a special degraded mode AlmostTotal instead of Partial.
  3. In mk_failaction_pos, if we need to generate a fail action but we are in AlmostTotal mode, instead of the final exit we can use any other exit that corresponds to an empty submatrix.

However, note that this optimization approach comes at the expense of weaker semantics. With the current approach, users know that if their program is affected by a value being mutated while it is matched, they will get a clean Match_failure so they will notice the issue and investigate. With the proposed "unspecified" semantics, we silently generate better code and users do not know that their program contains what is arguably a programming error.

Aside: a remark on redundant reads

The first example that you give has the frustrating property that there are splits, as you can see in the -drawlambda output, but the exit ... jumping to the split submatrices are inlined away by the "simplification" pass that follows, as can be seen in the -dlambda output.

RAWLAMBDA:
    (both_some/273 =
       (function t/275
         (catch
           (let (*match*/281 =o (field_mut 0 t/275))
             (catch
               (if *match*/281
                 (let
                   (x/276 =a (field_imm 0 *match*/281)
                    *match*/282 =o (field_mut 1 t/275))
                   (if *match*/282
                     (let (y/277 =a (field_imm 0 *match*/282))
                       (makeblock 0 (makemutable 0 x/276 y/277)))
                     (exit 2)))
                 (let (*match*/283 =o (field_mut 1 t/275)) 0))
              with (2)
               (let (*match*/284 =o (field_mut 1 t/275))
                 (if *match*/284 (exit 1) 0))))
          with (1)
           (raise
             (makeblock 0 (global Match_failure/20!) [0: "test1bis.ml" 7 2])))))

LAMBDA:
    (both_some/273 =
       (function t/275
         (let (*match*/281 =o (field_mut 0 t/275))
           (if *match*/281
             (let (*match*/282 =o (field_mut 1 t/275))
               (if *match*/282
                 (makeblock 0
                   (makemutable 0 (field_imm 0 *match*/281)
                     (field_imm 0 *match*/282)))
                 (let (*match*/284 =o (field_mut 1 t/275))
                   (if *match*/284
                     (raise
                       (makeblock 0 (global Match_failure/20!)
                         [0: "test1bis.ml" 7 2]))
                     0))))
             0))))

Notice that the exit (2), corresponding to the second submatrix (which follows a split), gets inlined in the code generated by the first submatrix. As a consequence, the second mutable read on field_mut 1 t, which occurs in the second submatrix, ends up within the lexical scope of the first read, which looks a bit silly.

(The redundant load gets eliminated by CSE (Common Subexpression Elimination) later down in the native-compiler pipeline, but the redundant branch is preserved.)

@ncik-roberts
Copy link
Copy Markdown
Contributor

Thanks for the explanation. That clarifies to me what's going on with these examples. (Yes, indeed, I should have looked at -drawlambda first.)

I agree with your conclusion about the first three alternatives for re-optimization: either it's not clear if there's a viable approach, or there's a clear downside and it's hard to decide the right tradeoff.

Treating these failures as unspecified behavior?

I'm mostly unmoved by the argument about weaker semantics. Whether or not the failure case is inserted is already sensitive to the pattern match compiler's splitting algorithm. I don't think the user can reliably predict these choices — as you note, splitting handles { x = _; y = None } differently than { x = Some _; y = None } even in contexts where the two patterns seem equivalent because { x = None; _ } was already matched.

In other words, I find the existing semantics to be weak already. (To be clear, I'm not complaining about this.) Even the type-checker is confused (I say somewhat facetiously): it doesn't produce a warning for both_some, despite the fact that the result of the pattern match compiler can raise Match_failure.

Here's a perhaps silly approach: If the final case's pattern can be replaced by a catch-all pattern (i.e. the match is total and the final case binds no variables), then do so before invoking the pattern match compiler. I think it does improve the compilation of both of these cases. It may be too limited in utility to add this special case, but it could be worth considering.

@ncik-roberts
Copy link
Copy Markdown
Contributor

ncik-roberts commented Jun 21, 2024

My last suggestion, to replace the last case in a “total” match with a wildcard case, doesn’t work in general because some wildcards can match a value that isn’t matched by the last pattern. (In particular, if the scrutinee is mutated during matching.)

The cases I posted about are tantalizingly close to staying untouched by this PR. But I can’t think of a small variation on the existing approach that would fix them. Such cases seem infrequent &; in a codebase with many millions of lines of code, I only found four affected pattern matches where a Match_failure case is newly inserted.

The apparent infrequency (if you believe my claim) makes me wonder whether a new warning should be issued when pattern matching inserts a Match_failure case but the type-checker insists the match is total. This would let affected users decide that the pattern match should be rewritten to avoid matching on a mutable field.

Do you have an opinion on a warning?

(this would also reduce the concern I posted about earlier, where now a match can raise Match_failure even in cases where the type-checker succeeds without warning and where adding a catchall case would actually raise a warning.)

@gasche
Copy link
Copy Markdown
Member Author

gasche commented Jun 22, 2024

My last suggestion, to replace the last case in a “total” match with a wildcard case, doesn’t work in general because some wildcards can match a value that isn’t matched by the last pattern. (In particular, if the scrutinee is mutated during matching.)

I don't understand, do you have an example?

My intuition is that if a match is known to be Total, and the current submatrix has a single row, then a pattern that does not bind any variable can indeed be replaced by a wildcard. Instead of weakening Total into Partial under mutable positions we could weaken it into a new Almost_total partiality status, and also wildcardize single-row patterns in this case (which would deviate from the current behavior, but be acceptable under the interpretation that the behavior in case of scrutinee mutation is unspecified).

Do you have an opinion on a warning?

My thinking:

  • This should be easy to implement, so I don't mind giving it a try.
  • The warning is not about the (high-level) semantics of the program, but about its performance characteristics. I think that it makes sense if we are convinced that some users will want to catch this pessimization, and try to avoid it, because it brings them performance benefits. Personally I am not really convinced that there would be performance benefits. (There are other ways to compile pattern-matching that result in similar code changes with slight optimizations and pessimizations of the generated control flow, and to my knowledge no one particularly cares about investigating those choices because the performance of pattern-matching is good enough in practice -- I think the current change preserves this property.)
  • In the very rare cases where expert users know they want an in-depth understanding of a pattern-matching performance, they can already inspect the -dlambda output. (I think we could improve on this slightly by turning the pattern-matching debug output into a new -dmatchcomp option, and possibly thinking about intermediate representations to make the compilation code nicer and improve its observability.)
  • So in a sense, the warning is more about managing the transition (letting people discover code whose performance profile might have changed, and decide to investigate) than something that will matter in the long term / for newly written code.
  • I don't think that non-expert users should be exposed to this warning, because it cannot be explained from the source language semantics, you have to teach people about the inner workings of the pattern-matching compiler to explain it.
  • Therefore, I think that the warning should be disabled by default.

To summarize: I agree that the pessimization will be observed very rarely in practice, and furthermore I think that when it does happen it will not degrade the program performance in practice. My intuition is that the proposed change is thus acceptable as a silent change. But if you want to be sure that performance-paranoid users have an easy way to evaluate the impact of the change on their codebase, I am happy to introduce a warning in this case, disabled by default.

@ncik-roberts
Copy link
Copy Markdown
Contributor

Example where my "replace with wildcard" suggestion produces dubious behavior

example below produces a different result if you replace the last pattern (which binds no variables) with _. The result of this rewrite (example_with_wildcard) produces an output, 6, despite the fact that the scrutinee at no point contains C in the mut field of b. This seems clearly undesirable, and I suggest abandoning my "replace with wildcard" suggestion.

type 'a myref = { mutable mut : 'a }
type abc = A | B | C
type t = {a: bool; b: abc myref }

let example () =
  let input = { a = true; b = { mut = A } } in
  match input with
  | {a = false; b = _} -> 1
  | {a = _;     b = { mut = B }} -> 2
  | {a = _;     b = _} when (input.b.mut <- B; false) -> 3
  | {a = true;  b = { mut = A }} -> 4
  | {a = _;     b = _} when (input.b.mut <- A; false) -> 5
  | {a = true;  b = { mut = C }} -> 6
;;

let example_with_wildcard () =
  let input = { a = true; b = { mut = A } } in
  match input with
  | {a = false; b = _} -> 1
  | {a = _;     b = { mut = B }} -> 2
  | {a = _;     b = _} when (input.b.mut <- B; false) -> 3
  | {a = true;  b = { mut = A }} -> 4
  | {a = _;     b = _} when (input.b.mut <- A; false) -> 5
  | _ -> 6
;;

let run f =
  match f () with
  | i -> string_of_int i
  | exception Match_failure _ -> "raised Match_failure"
;;

let () = Printf.printf "example: %s\n" (run example)
let () = Printf.printf "example_with_wildcard: %s\n" (run example_with_wildcard)

It produces the output:

example: raised Match_failure
example_with_wildcard: 6

I also suspect this approach can be unsound if the final variable-less pattern introduces type equations, but I haven't tried to produce an example given that the previous one seems compelling enough.

Clarifying my reason for suggesting for a warning

There are two reasons in my mind for suggesting a warning, but one reason seems much more important to me than the other:

  1. Performance. (As you explain,) the extra checks might be undesirable for users
  2. Semantics. It is surprising for the type-checker's totality checker to accept a program (warning free), and then for the resulting program to be able to raise Match_failure at runtime.

I am more concerned about semantics than performance. Expert users might care about the performance overhead of extra checks, but less expert users will still care about avoiding partial match exceptions.

Does that change how you feel about the warning? I would completely agree with your characterization if the warning was just about performance, but in light of the "semantics" reason, I feel differently about this point:

So in a sense, the warning is more about managing the transition [...] than something that will matter in the long term / for newly written code.

What kind of warning?

I suspect this warning would be mostly straightforward to implement: "Match may be partial if scrutinee is mutated during the match." The criteria: pattern match compilation inserts a Match_failure case in a match the type-checker claimed to be total.

There are at least two problems with such a warning, in my view:

  • No easy local rewrite: It's bothersome that the user can't disable the warning by adding the needed catchall case themselves. The type-checker will reject it with "redundant pattern". (The inability to add the catchall case is bothersome even without a new warning, but the new warning makes it so users will have to figure out how to work around it when it is raised.)
  • Noisy: The warning will raise even if the match doesn't involve any of the troublesome cases: lazy patterns, boxing (and thus the possibility of GC finalizers being run), when guards, or multithreaded mutation. The last of these is a property of the surrounding code, not the match itself, and thus not something that can be detected by the pattern match compiler.

These problems make me feel wary of recommending a warning that is enabled by default.

Fundamentally there is a mismatch between the type-checker's assumptions about pattern matching and the pattern match compiler's actual behavior. A warning would just paper over this difference.

The present PR is a clear improvement over the previous unsoundness, so I'm inclined to say that that any work on the warning could be left until after we merge this PR. But I will wait to hear what you think about the warning, in light of the "semantics" reason I explain above (and also the problems I note).

@gasche
Copy link
Copy Markdown
Member Author

gasche commented Jun 23, 2024

I agree that your dubious example is dubious. Indeed my thinking was that if a clause is taken because it is wildcardized, then we initially started with an input matching this clause before side-effects changed things. Your example demonstrates that this is not necessarily the case. I will add it to the testsuite right away.

Regarding semantics: I don't disagree that suddenly raising Match_failure is nasty. But I would also note that we have let this known soundness bug lie open for around a decade; I think it is because people are convinced that the bug in fact never occurs in the real world. (This perception could have been colored by the difficulty of fixing the bug, which encouraged us to minimize the real-world relevance.) I think that fixing it is the right thing to do, even if it pessimizes slightly a small-but-not-empty set of programs; but the warning has a different cost/benefits analysis.

Our intuition is that the segfault never occurs in practice, and that only a few programs are pessimized: that's infinitely more! Pessimizing very slightly some more programs is fine. But on the other hand, the warning is designed to pester their authors about a Match_failure exception, while we believe that they will never observe it, and we don't really know how they should fix their code.

Fixing the code: In a few cases authors could tweak their code with guidance from a pattern-matching-compilation expert. Otherwise they can decide to catch Match_failure around the match, or just decide that they audited the code and that they don't expect the issue to affect them; and then silence the warning.

My preference goes to adding the warning, disabled by default. The main value of the warning in my opinion is to let people that would worry about the compilation change understand what parts of their codebase are affected -- without the warning, it is fairly hard to do this analysis, you need to use a forked compiler over your codebase.

@gasche
Copy link
Copy Markdown
Member Author

gasche commented Jun 23, 2024

Let me write down an idea I had in passing: instead of raising Match_failure, we could jump back to the beginning of the pattern-matching logic. This approach does not introduce more failures, and it is arguably a better behavior in all cases other than when guards (lazy thunks, concurrent races, asynchronous callbacks): the pattern-logic restarts and then has the behavior expected from the source-level semantics. But when guards break the idea, I think, as (1) it could result in effectful when-guards being executed several times, and worse (2) a well-chosen when-guard that makes the value flicker may produce in an infinite loop of restarts, another form of partiality.

@ncik-roberts
Copy link
Copy Markdown
Contributor

Adding the warning, disabled by default, sounds good to me.

I see your argument that the performance change is what motivates the warning. I want to lodge one last argument in favor of mentioning the presence of the extra Match_failure case in the warning text: I know some OCaml users that refuse to use when-guards because they know about the possible unsoundness. So, even if the unsoundness is considered unlikely to happen in practice, it's still something that can factor into users' choices. I expect these users will want to enable this warning in perpetuity, as it will predictably flag when pattern matching can result in a behavior not listed in any of its cases (the Match_failure exception).

@gasche
Copy link
Copy Markdown
Member Author

gasche commented Jun 23, 2024

I started working on a warning, but this is proving more interesting than planned -- I can't get it to fire on a testsuite example which should, and it fires in the stdlib in places where it should not.

@maranget
Copy link
Copy Markdown
Contributor

Hi @gasche, there is something that puzzles me, for instance with Nick's first example:

type 'a t =
  { mutable x : 'a
  ; mutable y: 'a
  }

let both_some t =
  match t with
  | { x = Some x; y = Some y } -> Some { x; y }
  | { x = None; y = _ } | { x = _; y = None } -> None

Is there any reason for the mutable fields to be mutated? As there is no guard nor lazy pattern, do we assume that the mutable fields can be mutated by a concurrent thread, or something else?

@gasche
Copy link
Copy Markdown
Member Author

gasche commented Jun 24, 2024

@maranget yes, in this case mutations would come from concurrent mutations. My general approach is to assume that any mutable field may be mutated "during" matching, so that two reads on the same field may always yield different results.

@gasche
Copy link
Copy Markdown
Member Author

gasche commented Jul 30, 2024

I rebased the present PR on top of #13338, which fixes a code-quality regression introduced in #13076.

@gasche
Copy link
Copy Markdown
Member Author

gasche commented Jul 30, 2024

I implement and document a warning as discussed in #13341. That PR is on top of #13153, which removes the check_partial machinery (an older, incomplete heuristic to degrade partiality information, that is subsumed by the present PR).

The reason for being on top of #13153 rather than on top of the current PR directly is that the natural/simple implementation of the warning is after the place where the check_partial machinery is currently called, so the check_partial can (wrongly) silence the warning in some cases. In other words, the codebase is cleaner after #13153 and it is (slightly) simpler to implement the warning correctly.

@ncik-roberts
Copy link
Copy Markdown
Contributor

I am happy with this PR after doing some more testing with #13338. I'm interested to see it rebased over 4dd3c6a before I click "approve", but I don't have major comments at this point.

@gasche gasche force-pushed the matching-bug-fix-totality-3 branch from fd1f73d to ca82560 Compare July 31, 2024 17:15
@gasche
Copy link
Copy Markdown
Member Author

gasche commented Jul 31, 2024

I just rebased this PR on top of the (new) second half of #13338, and also there was a small conflict with #13342 (a nice-to-have conflict, the new version is shorter and simpler).

Copy link
Copy Markdown
Contributor

@ncik-roberts ncik-roberts left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. The careful separation between arg_partial and partial seems correct and useful. I'm satisfied that this PR limits its scope to matches on transitively-mutable positions, and that the new warning will help keep interested users from being surprised about performance or semantics. (And, as I mentioned elsewhere, I've tested this change and found indeed that it flags only complex matches on mutable fields — I found only 3 instances in a codebase with millions of lines, and it was possible to rewrite them without much trouble.)

gasche and others added 6 commits July 31, 2024 21:55
… default

This un-does a small pessimization of ocaml#13076, which was analyzed in
details in
ocaml#13076 (comment), but
at the time was believed to not affect any program in the wild.

Since then Nick Roberts found instances of this pattern involving
or-patterns and polymorphic variants.

Thanks to the new presentation of partiality information that
distingues the "global" and "current" information, the pessimization
is now easy to undo, as done in this commit.
Copy link
Copy Markdown
Contributor

@lpw25 lpw25 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving on the basis of @ncik-roberts review

@gasche
Copy link
Copy Markdown
Member Author

gasche commented Aug 22, 2024

Thanks! Merging.

@gasche gasche merged commit 0fa46d5 into ocaml:trunk Aug 22, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants