Skip to content

Fix code duplication in pattern matching of polymorphic variants#11893

Merged
lthls merged 7 commits intoocaml:trunkfrom
lthls:dup-handler-fix-1
Jan 19, 2023
Merged

Fix code duplication in pattern matching of polymorphic variants#11893
lthls merged 7 commits intoocaml:trunkfrom
lthls:dup-handler-fix-1

Conversation

@lthls
Copy link
Copy Markdown
Contributor

@lthls lthls commented Jan 14, 2023

Fixes #11887.

I've left in the first commit a different approach, that always uses test sequences instead of switches for polymorphic variants. If all variant constructors are constant, we already force the use of test sequences, although I don't know why this case is special (the difference was introduced in 157c4e5; before that we used test sequences in all cases). So it looked like a simple and reasonable fix to revert to test sequences only in all those cases.
However, I think that with enough care we could create the same problem with regular variants (with lots of constructors), so I went with what I think is the right fix, which is to detect potential duplication in Matching.SArg.make_switch and introduce wrappers.
I think we could now remove the special case in Matching.as_interval_nofail that detects holes and shares the first action in this case, but I haven't done it here (and I haven't checked that it works).

@gasche
Copy link
Copy Markdown
Member

gasche commented Jan 14, 2023

We should ask @maranget for the change in switch strategy. @maranget: Vincent is asking about the changes in combine_variant in 157c4e5#diff-574a9ed85084e8419a9fdc29c147bd50cf338c63200992a97d330344062fba94R1736 ( you must remember well it was in 2001 ), where you moved from test sequences to calling the switcher for all polymorphic-variant cases except only-constants. In general calling the switcher should be same-or-better than forcing test sequences, but for polymorphic variants the values are so sparse that I don't see when we could possibly benefit.

@gasche
Copy link
Copy Markdown
Member

gasche commented Jan 14, 2023

@lthls your "second attempt" commit contains a revert of the first that is mixed with other things, I find this a bit confusing.

@gasche
Copy link
Copy Markdown
Member

gasche commented Jan 14, 2023

I think that we need @maranget to look at this patch anyway. I am surprised by the proposed fix, because there is already code in matching.ml to distinguish shared and non-shared action, for example

ocaml/lambda/matching.ml

Lines 2227 to 2239 in 4d932a8

(* Add handler, if shared *)
let handle_shared () =
let hs = ref (fun x -> x) in
let handle_shared act =
match act with
| Switch.Single act -> act
| Switch.Shared act ->
let i, h = make_catch_delayed act in
let ohs = !hs in
(hs := fun act -> h (ohs act));
make_exit i
in
(hs, handle_shared)

The fix seems redundant with this existing logic, so I think that are missing something.

@lthls
Copy link
Copy Markdown
Contributor Author

lthls commented Jan 14, 2023

Here is a more detailed trace of what is going on:

  • We enter combine_variant, for the main pattern matching (match s with).
  • We're manipulating closed variant types, so fail ends up as None.
  • split_cases ends up with one non-constant case (Other), and five constant cases (B0, CA, CH, CI, CL).
  • This leads us to the generic handling, where we call call_switcher_variant_constant which itself calls call_switcher.
  • In call_switcher, the parameter int_lambda_list matches each of the five variant hashes to its own, different action.
  • Then we call as_interval_nofail, which detects that there is a hole so makes the first action (the one associated to B0) shared, but keeps the other ones unshared. The result of this function is a list of five singleton intervals.
  • The we jump into Switcher.zyva, getting into switch.ml. We progress towards do_zyva, and call comp_clusters, which finds that B0 should have its own cluster but puts the remaining 4 elements into a single cluster.
  • Then we get into make_clusters, which is in charge of compiling each cluster. The cluster with one element gets compiled directly to the corresponding action, but the second cluster is compiled into a switch, using make_switch (the one in switch.ml).
  • This function creates a new { cases; actions } pair (the new cases is called tbl, and the new actions is called acts). The new cases has 12 entries, corresponding to the constructors CA through CL (which have consecutive hashes), and they all point to 0 (the action for CA) except CH, CI and CL which point to their respective actions. The new actions contains the four (non-shared) actions.
  • These inputs are then eventually passed to Arg.make_switch, jumping back to Matching.SArg.make_switch.
  • This functions then creates a lambda term Lswitch with the twelve constants, and results in nine of these constants being associated with the same term, despite the fact that this term can be arbitrarily complex.

My patch was written on Matching.SArg.make_switch because in all the above code, it is the only place where terms are actually duplicated. But of course there are other ways to fix the issue. For instance, we could compute clusters before calling abstract_shared in zyva, and force the first action of each non-trivial cluster to be shared. Or we could simply make all actions shared, as Simplif is probably going to remove any useless handlers.
And I didn't see share_actions_tree (it is not called in this particular trace), but I think that it wouldn't work here as it doesn't detect physical equality. For the example in #11887 it might work, as the corresponding terms may have a shareable version (in terms of Lambda.make_key), but it would be trivial to tweak the repro to use non-shareable terms.

@gasche
Copy link
Copy Markdown
Member

gasche commented Jan 14, 2023

Thanks! These walkthrough are very useful.

The new cases has 12 entries, corresponding to the constructors CA through CL (which have consecutive hashes),

So I guess here is an answer to my comment above: it is in fact not true that variants are very sparse. The hashing function is "bad" enough that we do get dense clusters so using the switcher is worth it.

@maranget
Copy link
Copy Markdown
Contributor

maranget commented Jan 17, 2023

I have reviewed the patch and it looks fine to me. As regards @gasche's question:

We should ask @maranget for the change in switch strategy. @maranget: Vincent is asking about the changes in combine_variant in 157c4e5#diff-574a9ed85084e8419a9fdc29c147bd50cf338c63200992a97d330344062fba94R1736 ( you must remember well it was in 2001 ), where you moved from test sequences to calling the switcher for all polymorphic-variant cases except only-constants. In general calling the switcher should be same-or-better than forcing test sequences, but for polymorphic variants the values are so sparse that I don't see when we could possibly benefit.

Frankly, I do not remember and have no idea. Indeed it looks a good idea to call the switcher in all situations. However, if it ain't broken, don't fix it...

@gasche
Copy link
Copy Markdown
Member

gasche commented Jan 17, 2023

@maranget could you "approve" the PR if you like the patch? ("Files changed" tab in the top right > "review changes" button in the top right > "Approve".)

@gasche
Copy link
Copy Markdown
Member

gasche commented Jan 17, 2023

@lthls I have one request before merging this: please have a comment to the code that explicitly points to this issue / the bug report, to explain why there is this handling here. Without your trace above I don't think that anyone could easily reverse-engineer why it is necessary.

@lthls
Copy link
Copy Markdown
Contributor Author

lthls commented Jan 18, 2023

I've tried to create another similar example that uses GADTs instead of polymorphic variants:

  type keep = private Keep
  type drop = private Drop

  type _ t =
  | C00 : keep t
  | C01 : drop t
  | C02 : drop t
  | C03 : drop t
  | C04 : drop t
  | C05 : drop t
  | C06 : drop t
  | C07 : drop t
  | C08 : drop t
  | C09 : drop t
  | C10 : drop t
  | C11 : drop t
  | C12 : drop t
  | C13 : drop t
  | C14 : drop t
  | C15 : drop t
  | C16 : drop t
  | C17 : drop t
  | C18 : drop t
  | C19 : drop t
  | C20 : keep t
  | C21 : drop t
  | C22 : drop t
  | C23 : drop t
  | C24 : keep t
  | C25 : keep t
  | C26 : drop t
  | C27 : drop t
  | C28 : keep t

  let not_at_toplevel b =
    let f (x: keep t) =
      match x with
      | C00 ->
        begin
          let[@local] handler x = x + 1 in
          if b then handler 0 else handler 1
        end
      | C20 ->
        begin
          let[@local] handler x = x + 2 in
          if b then handler 0 else handler 1
        end
      | C24 ->
        begin
          let[@local] handler x = x + 3 in
          if b then handler 0 else handler 1
        end
      | C25 ->
        begin
          let[@local] handler x = x + 4 in
          if b then handler 0 else handler 1
        end
      | C28 ->
        begin
          let[@local] handler x = x + 5 in
          if b then handler 0 else handler 1
        end
    in
    f

It doesn't trigger the exact same bug (the local optimisation is done after pattern-matching). Instead, it triggers another issue (a continuation used without a handler), which will crash the compiler even without -dcmm-invariants. It appears to be also fixed by this PR, but I don't know why yet.
I'll try to find more time to investigate this, and I think it would be better to wait until I've understood that problem before merging this PR.

@lthls
Copy link
Copy Markdown
Contributor Author

lthls commented Jan 19, 2023

So the problem is that Simplif.simplify_local_functions uses hash tables indexed on Ident.ts (and on Lambda.lambda using physical equality) so it's doing weird things when it encounters duplicated sub-terms.
That explains why this PR fixes the bug (no more duplicated terms).
I've added a version of this code in the testsuite.

@gasche
Copy link
Copy Markdown
Member

gasche commented Jan 19, 2023

Do I correctly understand that we would have detected the problem more easily if the calls to Hashtbl.add in

ocaml/lambda/simplif.ml

Lines 291 to 307 in 6b96b73

| Lstaticcatch (l1,(i,[]),(Lstaticraise (_j,[]) as l2)) ->
Hashtbl.add subst i ([],simplif ~try_depth l2) ;
simplif ~try_depth l1
| Lstaticcatch (l1,(i,xs),l2) ->
let {count; max_depth} = get_exit i in
if count = 0 then
(* Discard staticcatch: not matching exit *)
simplif ~try_depth l1
else if
count = 1 && max_depth <= try_depth then begin
(* Inline handler if there is a single occurrence and it is not
nested within an inner try..with *)
assert(max_depth = try_depth);
Hashtbl.add subst i (xs,simplif ~try_depth l2);
simplif ~try_depth l1
end else
Lstaticcatch (simplif ~try_depth l1, (i,xs), simplif ~try_depth l2)

were preceded by an assert (not (Hashtbl.mem subst i))? Should we consider doing this change?

@gasche
Copy link
Copy Markdown
Member

gasche commented Jan 19, 2023

(We could also be more careful in the get_exit / incr_exit functions of simplify_exits, with a value of type int ref option that is initialized when the exit is bound, checking for binder unicity of the fly.)

@lthls
Copy link
Copy Markdown
Contributor Author

lthls commented Jan 19, 2023

My preference would be to have a Lambda_invariants pass, like the Cmm_invariants one, where we check the generated terms for various potential issues including duplication of binders. This check could then be run (if enabled) on the Lambda code before and after simplification.
Sprinkling assertions in the existing code would have caught some issues, but probably not all of them.

@gasche
Copy link
Copy Markdown
Member

gasche commented Jan 19, 2023

Fair enough.

@lthls lthls merged commit f611b45 into ocaml:trunk Jan 19, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Duplicate Ccatch handler label

3 participants