non-exhaustive fix for the Pattern Matching Bug#12555
Conversation
I think this is not as bad as it sounds. The read is not required to be completed until we actually need the data (in this case we don't), so, assuming a normal out-of-order processor, we only consume memory bandwidth, but do not block subsequent operations. We would only be limited once we hit either our maximum memory bandwidth, or the maximum number of reads we can track. (we could also end up using extra registers/stack space, but I think the user would expect that if they have very complex matches) The "Bind mutable fields earlier" feels a little more principled -- we're examining a "snapshot" of the variables taken at the time the match scrutinee is evaluated. This means it's much easier to reason about the results -- no "Match_failure" is required, and we can ensure that we retain the full power of exhaustiveness analysis, allowing even strangely written code to prevent having additional exception paths. |
|
I'll open a PR with the "bind mutable fields earlier" approach so that you can compare. (There is more to say about the comparison but I will quote your (informative!) post and reply there.) Edit: done, see #12556. |
|
@gasche : Thanks for the very detailed and useful context! Does the solution proposed here address the case of mutations only from explicit I find @smuenzel arguments convincing, esp. if we can measure that the overhead of unneeded read is only tiny. With the solution proposed in this PR, if I understand correctly, there is also a potential overhead if we need to read multiple times and add extra tests. |
85f4528 to
f61a201
Compare
|
The solution proposed fixes one aspect of the Pattern Matching Bug without making assumptions on where the source of mutation comes from, it merely assumes that reading a mutable field several times may return different results. It thus works for guards, lazy patterns, allocations, or multicore races. (There remain bugs due to the "incorrect partiality information", which is orthogonal and will be the topic of subsequent PRs, that also work for all mutation sources.) |
I don't necessarily disagree, but note that:
This being said, we may be splitting hairs here: the difference that we are talking about only concerns records with a mutable field that is not the first field (in particular, it does not make any difference for single-field records). Pattern matching on record with mutable fields is relatively rare. I had a hard time finding a relevant example in the OCaml compiler codebase -- there are many records with several mutable fields, but they are accessed by field instead of deconstructed by pattern-matching. The only one example that I could locate is the following, which occurs in several Reg.Set.iter
(function
| {typ = Val; loc = Reg r} ->
live_offset := ((r lsl 1) + 1) :: !live_offset
| {typ = Val; loc = Stack s} as reg ->
live_offset := slot_offset env s (register_class reg) :: !live_offset
| {typ = Addr} as r ->
Misc.fatal_error ("bad GC root " ^ Reg.name r)
| _ -> ()
)
liveHere |
|
(One should also keep in mind that the "context erasure" implementation is simpler.) |
|
I asked Luc Maranget, who also has the gut feeling that the "erase more often" approach is the better choice. One point that he made is that useless bindings keep memory alive in bytecode, so binding "earlier" may increase the memory footprint. (Native code is unaffected as it uses a liveness analysis.) |
ncik-roberts
left a comment
There was a problem hiding this comment.
My overall thoughts:
- The approach in this PR seems like it will work.
- I don't think that the PR actually handles all the cases it claims to; I've left a review comment about a lingering unsoundness.
- I'm unsure about the performance impact.
On performance: Users might reasonably be concerned that this change will slow down existing pattern matches that involve mutable fields. Some light experimentation shows that indeed some extra switches are inserted. Do you have a sense of how big of an effect this change should have? (Or some characterization of the class of programs affected?)
| (* This is somewhat of a hack: we notice that a pattern-matching | ||
| argument is mutable (its value can change if evaluated | ||
| several times) exactly when it is bound as StrictOpt. Alias | ||
| bindings are obviously pure, but Strict bindings are also only | ||
| used in the pattern-matching compiler for expressions that give | ||
| the same value when evaluated twice. | ||
| An alternative would be to track 'mutability of the field' | ||
| directly. |
There was a problem hiding this comment.
I think this hack is actually unsafe at the moment. This program segfaults:
(* An adapted example of unsoundness from #7421. *)
type t = {a: bool; b: int option ref }
let f x =
match x with
| {a = false; b = _} -> 0
| {a = _; b = { contents = None }} -> 1
| {a = _; b = _} when (x.b.contents <- None; false) -> 2
| {a = true; b = { contents = Some y }} -> y
;;
let _ = f { a = true; b = { contents = Some 5 } }The issue appears to be that the get_expr_args_* family of functions (e.g. get_expr_args_record) "forgets" the mutability of its argument. (Observe the ignored second component of the tuple, _mut). I think we ought to track the mutability of the field read in the args field:
args : (lambda * let_kind) list;i.e. the above should instead be a three-field record of read : lambda, let_kind : let_kind, and is_mutable_read : bool. And the get_expr_args_* family of functions should propagate is_mutable_read.
There was a problem hiding this comment.
I have, in fact, implemented such a change in gasche@53e71e9, which is part of the matching-bug-mutable-positions branch that is planned as one of the follow-ups to the present PR as described in the overview comment #7241 (comment) .
In the overview comment I explain that not all cases of unsoundness are fixed by the present PR, some other are due to incorrect totality information, and tracking mutability of arguments is important to fix those extra cases. I don't have the information in brain memory and haven't looked at your example in details yet, so at this point I am not sure whether your example is exactly a case that should be fixed by the "second half" of the bug (incorrect totality information), or if it is a different instance of the "first half" of the bug (incorrect context information) that needs the present PR to be fixed.
There was a problem hiding this comment.
At least I can confirm that my branch matching-bug-fix-totality, which is supposed to contain both "halves" of the fix (and to need a rebase), also fixes the segfault on your example.
There was a problem hiding this comment.
My feeling when I wrote this comment was that it is a different instance of the "first half" of the bug, and is unrelated to incorrect totality information. The gasche@53e71e9 commit does not appear to be present as-is in matching-bug-fix-totality yet (you say elsewhere that branch needs a rebase) so I can't confirm yet.
Would you object to adding this as a test case in the matching-bug-fix-totality branch so we can be sure to remember this?
There was a problem hiding this comment.
Would you object to adding this as a test case in the matching-bug-fix-totality branch so we can be sure to remember this?
Not at all. In case you are interested, my working repository uses the following test, which is relatively close in spirit but may be oversimplified and miss some aspects of your test:
(* This example exhibits unsoundness due to the totality issue *)
let r = ref (Some 3)
let _ =
match Some r with
| Some { contents = None } -> 0
| _ when (r := None; false) -> 1
| Some { contents = Some n } -> n
| None -> 3There was a problem hiding this comment.
Thanks. I find your explanation convincing: the segfaulting program in my original comment indeed seems to be "about totality" rather than about wrong contexts. However, I still have a vague concern about the fact that get_expr_args_record, etc., _-ignores the mutability of its argument. It will be easier for me to think about this question in matching-bug-fix-totality post-rebase, so I'll shelve this thought until then.
There was a problem hiding this comment.
In my perspective, "arguments" in the pattern-matching machinery are either variables (the typical case) or non-variable expressions that are produced by the get_expr_args* functions by projecting out (in some way) another argument that is itself a variable. So for example, match x with { f = { g = ... } } -> ... will have arguments x, then x.f that will in turn be bound to y, then y.g that will be bound to some z.
x.f.g as a complex expression is never used as an argument in the generated code -- but this is not so easy to check by reading the code, it comes from the fact compile_match_nonempty replaces non-variable arguments by variable arguments before calling combine_handlers, which in turn will eventually create the "division" out of the now-variable argument list in do_compile_matching.
So when I write in the comment that an argument is mutable (maybe "impure" would be the better term), defined as "its value can change if evaluated several time", well y.g can change if evaluated several times if g is a mutable field, but that property does not depend on whether f itself was a mutable field. So I do not expect that we need to transitively propagate mutability for this part of the bug, caused by sharing context information across several different evaluations of a same argument expression. (On the other hand, the transitive form of mutability that you have in mind shows up in the totality part, as a criterion for when to distrust the type-checker approximation of totality: we degrade Total into Partial when under a mutable context (transitively).)
There was a problem hiding this comment.
I don't know how you want to proceed: we could try to prepare all changes at once for merging, so that you can have a detailed complete picture, or we could try to review the changes step by step, and try to merge a PR before moving to followup changes. The latter is less work for me (because I expect reviews to suggest further changes that may require non-trivial rebases of followup changes), but it makes it harder to have a global view. On the other hand personally I think that reviewing bits by bits is easier and more comfortable -- there are less chances of letting things slip by just because it is too large to be looked at properly.
(In either scenario, the precise process I would have in mind is to have you "Approve" of the PR once you are convinced that it is correct and you find the implementation reasonable/maintenable/nice, and then we would look for a maintainer willing to have a second look or to approve on your behalf.)
There was a problem hiding this comment.
I'm in favor of reviewing in pieces as well. I intend on clicking "Approve" soon — I'm just verifying one or two more things.
There was a problem hiding this comment.
And your explanation here covers what I was missing before, I think:
x.f.g as a complex expression is never used as an argument in the generated code -- but this is not so easy to check by reading the code, it comes from the fact compile_match_nonempty replaces non-variable arguments by variable arguments before calling combine_handlers, which in turn will eventually create the "division" out of the now-variable argument list in do_compile_matching.
|
@ncik-roberts here is how to confirm that the testcase you propose is unrelated to the present PR and belongs to the totality-information territory. In trunk, activate the debug output for lambda/matching.ml by changing ./runtime/ocamlrun ./ocamlc -nostdlib -I stdlib -dsource -dlambda -dno-locations test_nick.ml -stop-after lambda 2>&1 | lessThere is a fair amount of output and it is not easy to work one's way inside it. But here are the relevant parts: The testcase will be split into three matrices, one of the first line, one for the second and third line, one for the fourth line. This is the match on the But then, later in the output we have: This is the compilation of the The bulk of the code to fix this issue is in the |
A program generates different code under this PR when several matches on a given mutable position of the scrutinee occur across several split submatrices in the matching problem, and we need to reason about the former matches to convince ourselves that some of the later matches are exhaustive -- that at this point we have covered all cases. (I don't know how to give an intuition for how matrices are split without going in the details of how pattern-matching compilation work, so this is not a proper language-level specification.) My personal intuition is two-fold:
(Some other changes down in the tree of branches have a larger potential for performance regression on real-world program, I suspect, because they degrade totality information which is used to generate good code for GADTs in particular.) |
Some experimentation on a large code base appears to agree with this claim for the present PR. |
ncik-roberts
left a comment
There was a problem hiding this comment.
This PR makes sense to me and looks good. I'm also convinced it won't pessimize existing code.
One surprising thing is that the type-checker can now warn that a case is unused, but it actually can be reached:
type t = {a: bool; mutable b: int option }
let f x =
match x with
| {a = false; b = _} -> 0
| {a = _; b = None} -> 1
| {a = _; b = _} when (x.b <- None; false) -> 2
| {a = true; b = Some y } -> y
| _ -> 100
;;
(* 9 | | _ -> 100
^
Warning 11 [redundant-case]: this match case is unused. *)
let () = Printf.printf "%d\n" (f { a = true; b = Some 5 })This program produces the output "100", contrary to the warning. I'm not sure the right way to address this, but it is arguably outside the scope of the current PR.
|
Thanks! To summarize my understanding:
Now that @ncik-roberts has approved the PR, I wonder if a maintainer would be willing to approve as well, either by doing their own review or by approving on @ncik-roberts' behalf. (Maybe @Octachron or @lthls would be interested?) I would be interested in having a maintainer look at this relatively quickly, so that I can submit follow-up PRs as soon as possible and strike while the iron is hot. |
lpw25
left a comment
There was a problem hiding this comment.
Approving on @ncik-roberts behalf
|
Thanks! I now have to rebase, merge, and send a follow-up PR. |
The infamous matching bug ocaml#7241 corresponds to two distinct issues that arise when side-effects mutate the value being matched on: 1. Incorrect contexts: context information from the pre-mutation switches is retained. 2. Incorrect partiality information: type-checker-provided partiality information becomes incorrect. This commit fixes the first problem: incorrect contexts.
The previous commit erases directly the columns of mutable arguments. Those columns have now all been erased by the time Context.combine is called, so we do not need another round of erasure in Context.combine anymore.
f61a201 to
5432ad1
Compare
The infamous Pattern Matching Bug #7241 is a conjunction of two issues, two things that go wrong in the pattern-matching compiler when the scrutinee is mutated during matching:
The present PR fixes issue 1: context information is fixed, and some instance of the bug are thus fixed. Issue 2 remains, so other instances of the bug remain unfixed.
People who are already familiar with the pattern-matching can review
the PR right away, commit by commit (there are only two).
For the others, the rest of the present PR description explains the
issue. It also contains a comparison with an alternative fix that
@trefis and myself wrote together -- I ended up preferring the
approach in the present PR.
A repro case
Consider the following program:
Primer on compilation with pattern matrices
The pattern-matching compiler will see the
match x withclauses as the following pattern matrix:The general compilation strategy is to "switch" on the arguments of the matrix one after another, from left to right. However, in this example the first column cannot be turned into a single switch on
x.(0)(with cases fortrueandfalse) without duplicating cases, and the third and fourth clauses would need to be repeated for bothtrueandfalse.When this happens the matrix is "split" into several matrices that are tried in sequence (in source order), stitched together using "static exceptions". The matrix above will be split as follows:
Then each of those sub-matrices can either "switch" on its first column or just ignore it, and recursively compile the second column. The final compiled code could be shown as this pseudo-code (try the
-drawlambdaflag to see the real thing):(
switchis like a match on the head constructor, does not bind any variable -- fields and constructor arguments are accessed withx.(i)-- andstatic-catchandstatic-raiseare like "static" exceptions, but the exception names are just number.)In each sub-matrix, the argument expressions
x.(0)andx.(1)are bound on-demand, when the corresponding column is switched on. This explains whyx.(0)is bound once in the generated code (it is the leftmost column, switched on first), whilex.(1)is bound inside thestartsubmatrix and also in theexit 2submatrix.This code is compiled in an unsound way because the pattern-match compiler believes that the very last switch on
xb'cannot fail: that necessarily it must be aSomeconstructor, soswitch xb' with Some -> xb'.(0)is optimized intoxb'.(0). This belief comes from the fact that we already checkedx.(1)againstNonein the branche1. But the value observed could have beenSome _initially and then mutated intoNoneby theguard ()function call. The unchecked access toxb'.(0), that isNone.(1), would then segfault.Context information
The pattern-matching compiler maintains "context information", it
tracks some knowledge about the possible values of the sub-values of
the scrutinee. At the point where
static-raise 1is called, we havematched
x.(0)ontrueandx.(1)did not matchNone, so we candeduce that we are in the following context:
This information can be invalidated by calling
guard (), so themutable field lookup
xb' = x.(1)can observe a different value forx.(1); but the pattern compiler missed this and assume that the last switch was exhaustive.set_args_erase_mutable
The pattern-matching compiler contains code to prevent incorrect
information caused by side-effects: whenever it "combines" information from several arguments of a pattern into information on the whole pattern, it will "erase" its information on arguments in mutable position. This is the role of the
set_args_erase_mutablefunction called inContext.combine.For example, at the first point where we call
static-raise 1, the context information isIf we asked for the context "from the point of view of
x", that is outside theBar -> ...branch, we would not get the anserbut instead the answer
as information on the field
b, which may be invalidated by side-effects, is erased from the combined context when moving from sub-patterns on the fieldsx.(0)andx.(1)to the toplevel record pattern.Unfortunately, this erasure occurs "too late", at the scope of the match on the whole record. Inside the match on the two-column submatrix, the context information for the field
x.(1)(the second column of the pattern matrices) is preserved, but the access expressionx.(1)may be evaluated several times and return different results.Remark: if we changed the clauses from
to the apparently-equivalent
then the third clause
_ when guard ()is "at toplevel" andset_args_erase_mutablegets called, providing the correct context tothe fourth clause. (The clause remains compiled incorrectly due to the
orthogonal "incorrect totality information" issue.)
Two approaches
We want to guarantee that context information on a column does not outlive the binding to its argument expression, when the argument expression is mutable.
There are two approaches to ensure this:
Bind mutable fields earlier.
One could change the bindings of mutable fields to happen as soon as
the record constructor with fields is matched on, instead of binding them on-demand.
Instead of
we would have
If we do this, the binding of
x.(1)scopes for as long as the context information about it. The context information onx.(1)is erased byset_args_erase_mutableat the toplevel of the whole code fragment, that is exactly when exiting the scope of thelet xb = x.(1)binding.This is the approach that @trefis and myself proposed in Pattern matching with mutable and lazy patterns is unsound #7241 (comment).
Erase contexts more often.
Instead of erasing the context information on the mutable fields when we leave the record-matching clauses, we could erase the context information of each split submatrix, whenever we leave the scope of the corresponding argument binding.
This is the approach implemented in the current PR.
I have implemented both approaches, and I prefer the second one.
Binding fields earlier
The problem with binding mutable fields earlier is that we may end up reading mutable fields that are not necessary. Consider the following variant:
Binding fields earlier would give code looking like the following
Notice that in the
xa = truecase, we have performed an unnecessary read ofx.(1)before returning0.Changing the pattern-matching compiler to bind mutable fields earlier than today (reduces code size and) risks adding useless mutable field reads in existing code paths.
I find it difficult to predict the performance at impact of the change, and so I would rather avoid it. (On the positive side: fields are close to each other in memory, so accessing extra fields is cheap, and we don't have a read barrier, even on mutable reads.)
In particular, note that even clauses that are not split in several submatrices -- as in the variant just above -- pay through extra bindings in some cases.
Erasing context information
Erasing context information means that we could be unable to prove that a given incomplete switch cannot fail. In particular the second exit of our running example:
would become compiled as follows:
which adds an extra test to account for the possibility of mutation between field accesses.
With this approach, the overall code-generation strategy of the pattern-matching compiler is unchanged -- the binding of arguments is delayed as late as possible, as intentionally designed -- but context information is weakened in some places that involve split submatrices. In particular, if there are not split submatrices (which is the common case), there is no performance penalty for mutable fields.
Getting rid of set_args_erase_mutable
With the "second approach" implemented in this PR, we erase the context information of mutable fields earlier, as soon as their argument bindings gets out of scope. By the time we "combine" all field arguments into a single record pattern, they have already been erased, and there is no need for a second erasure in
set_args_erase_mutable. The second commit in the PR gets rid of theset_args_erase_mutablelogic, which simplifies the code. We remove an attempt at avoiding issues with side-effects that proved insufficient, instead of piling more defenses on top of it.