parmatch: always check partiality by trefis · Pull Request #1227 · ocaml/ocaml

trefis · 2017-07-07T12:42:11Z

The partiality of a matching was only being computed when warning 8 is on.

This would be fine if all this function did was warn about the partiality of the match, but it actually returns a Total | Partial result which is then stored in the typedtree.
Currently when the warning is off Partial is always returned, while this seems safe it is inaccurate and might prevent "optimizations" relying on it from triggering, as is the case in #1195 for example.

This PR changes the implementation to always compute the correct information, and do the warning part only when the warning is on.

garrigue · 2017-07-10T03:31:29Z

One reason to disable the partiality check when the warning is off is that the check itself can be costly.
Before turning it off we should check that there is a simple way to avoid this check (for instance by adding a failing catch-all clause at the end, which is equivalent to assuming it is partial without warning).

trefis · 2017-07-10T11:02:24Z

Yes indeed.
IIUC get_mins will reduce any matrix containing a "wildcard row" to just that row.
However if the matrix contains n rows it will require O(n²) calls to le_pats to do that reduction, which might be costly.

We could make it quicker by explicitly looking for a wildcard row at the start of get_mins and reduce then. Such a check would be linear in the number of patterns in the matrix I believe (the worst case being when there is no such "wildcard row"/catch-all clause).
However Luc already considers such an optimization in section 7.2 of this paper before settling for the more generic, though more costly, one.
It could be argued that we should revisit this choice, since we previously had the option of avoiding all the cost of the check and we don't anymore. I'm not too keen on the idea, but someone might have a different/more informed opinion?

trefis · 2017-07-10T11:03:46Z

(N.B. by "revisit this choice" I mean "have both safe guards")

alainfrisch · 2017-07-10T12:53:22Z

One reason to disable the partiality check when the warning is off is that the check itself can be costly.

Is anyone aware of people actually disabling the partiality check (resp. doing so for compilation-time performance)?

gasche

As I mentioned otherwise, I think that the two commits should be squashed because the first one in isolation does not appear to be correct: it may generate an All_clauses_guarded warning even when Fragile_match is not set, while the latter would previously prevent the former.

The proposed code seems more correct than the previous one, because previously we would test whether Partial_match is active and assume that this implies that Fragile_match is active, if I understand correctly.

Why was this the case? I believe that this is related to the "8, used to be 25" comment, and I remember that we discussed this (myself included) recently, but I don't remember what was the discussion and what exactly this means.
Could someone point out to the old discussion and confirm that, based on the current relations between the two warnings, the PR indeed exactly preserves the compiler's behavior?

gasche · 2017-07-15T19:43:35Z

I reviewed the code and I believe that the proposed code is correct (I'm not sure the previous behavior was fully equivalent, see comment above). On the other hand, I find Jacques' performance concern interesting and I would like to have some information on that before considering a merge.

@trefis, could you measure the time of "make world" with the patch applied, versus the time of "make world" without the patch and with all exhaustivity warnings disabled? Is there a noticeable difference? (Note: this uses only bytecode compilation, as native code production would risk drowning out this cost, and this is assuming that the OCaml compiler code is somewhat representative of average code full of pattern matchings -- but it doesn't tell us anything about degenerate cases.)

I would be interested in having the heuristic you describe implemented (not doing any check when there is an all-omega row), but I don't really understand your comment on get_mins (or rather, cursorily looking at the code did not help me locate this function as the place to change). Do I correctly understand that only exhaust_gadt is ever used (the non-_gadt variants are commented out and never used), and that it is the function to change to apply the heuristic? I don't see how it relies on get_mins, how does that work?

trefis · 2017-07-17T08:06:28Z

Thanks for having a look :)

For the "benchmark" you propose I doubt we'd notice much difference on the compiler but I don't mind trying it. Question though: to test compiling the compiler with a compiler containing these patches, do I want to make bootstrap or something of the sort? ( ping @adrien-n who also knows about this sort of things).

Btw, #1195 is not the first bit of code using the partiality information stored in the typedtree: if I am not mistaken bytecomp/matching.ml also relies on this information to do some optimizations (Optimizing pattern matching (I love the anchor name btw) says for example that you can eliminate the branch raising Match_failure if you know the matching is exhaustive).

I wouldn't change exhaust: it does what it's supposed to do, why obfuscate it with unrelated heuristics?
get_mins gets called from check_partial just before we call exhaust. The function is already charged of reducing the matrix (see section 7 of Warnings for pattern matching on pages 26 to 28 for a detailed explanation of what get_mins does).
My previous comment only proposed to make that reduction quicker; the reason being that exhaust is (I believe) going to run fast on a matrix reduced to a single wildcard row, while performing the reduction to such a matrix will be expensive.

garrigue · 2017-07-19T08:14:38Z

Rather than the whole compiler, it would probably be sufficient to test compiling the stdlib, as it contains a lot of GADT code (this is where I saw most of the difference when I changed the exhaustivity check).

trefis · 2017-07-19T13:30:38Z

I compiled the stdlib a few times with trunk and with the this branch.
The results are as expected, i.e. there is no noticeable difference.

I then compiled camlinternalFormat.ml a bunch of times with both compilers, as it contains some GADT matching.
There is a difference here, but I would say it's negligeable.

Of course, there's AFAIK nothing in these test cases which would exhibit the degenerate behavior of parmatch...

stdlib

Warnings 4, 8, 25 disabled (using this patch).

$ make clean && time make -j1

trunk

	run 1	run 2	run 3	run 4
real	0m7.342s	0m7.286s	0m7.435s	0m7.334s
user	0m6.551s	0m6.511s	0m6.662s	0m6.598s
sys	0m0.620s	0m0.581s	0m0.598s	0m0.541s

GPR1227

	run 1	run 2	run 3	run 4
real	0m7.311s	0m7.331s	0m7.294s	0m6.391s
user	0m6.527s	0m6.540s	0m6.527s	0m5.601s
sys	0m0.604s	0m0.609s	0m0.566s	0m0.605s

camlinternalFormat

$ time for i in {1..20}; do ../boot/ocamlrun ../ocamlc -strict-sequence -absname -w +a-4-8-9-25-41-42-44-45-48 -g -warn-error A -bin-annot -nostdlib -safe-string -strict-formats -w Ae-4-8-25 -c camlinternalFormat.ml; done

	trunk	GPR 1227
real	0m31.695s	0m32.383s
user	0m31.213s	0m31.853s
sys	0m0.444s	0m0.485s

trefis · 2017-07-19T14:26:23Z

Btw, I think the second commit actually has no impact at all.
While the first commit does not appear to be correct, I believe it actually is as Location.prerr_warning actually checks that the warning is active before printing anything.

jberdine · 2017-07-26T22:34:23Z

FWIW, I asked internally at facebook and no one admitted to disabling the partiality check. The potential of missed optimization seemed to be considered a much more important "gotcha". So support for this PR on those grounds.

xavierleroy · 2017-07-28T09:47:49Z

@maranget should look at this PR and confirm, but as far as I remember the worst case is exponential in the size of the matching. A workaround that is quadratic is no big deal.

maranget · 2017-08-31T09:51:26Z

@garrigue and @xavierleroy are right: adding a catch all clause at the end of a matching will
reduce the cost from a (potential) exponential to a certain quadratic. Given experiments, It does not seem useful to add code for reducing this to constant cost (or linear, the catch all clause is at the end of a list...). As usual, some future PR may call for this extra cost decrease on an "example of interest".
Let us wait for this.

My real objection against @garrigue suggestion is: if the matching was exhaustive then we'll have a unsused match clause warning. But again, this is a potential problem not a reported one.
A real and complete solution would be a specific option that would disable PM analysis attogether.
Such an option might be useful for compiling generated code.

So I'll (squash and) merge the patch(es).

Co-authored-by: cuihtlauac <cuihtlauac@users.noreply.github.com>

parmatch: always check partiality

951d4da

trefis mentioned this pull request Jul 7, 2017

Merge functions based on partiality rather than Parmatch.irrefutable #1195

Merged

parmatch: only emit All_clauses_guarded warning when warning is active

b04a3fd

gasche approved these changes Jul 15, 2017

View reviewed changes

maranget merged commit d91ea93 into ocaml:trunk Aug 31, 2017

trefis deleted the partial-parmatch branch April 18, 2019 13:49

EmileTrotignon pushed a commit to EmileTrotignon/ocaml that referenced this pull request Jan 12, 2024

[create-pull-request] automated change (ocaml#1227)

2848f91

Co-authored-by: cuihtlauac <cuihtlauac@users.noreply.github.com>

Conversation

trefis commented Jul 7, 2017

Uh oh!

garrigue commented Jul 10, 2017

Uh oh!

trefis commented Jul 10, 2017 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

trefis commented Jul 10, 2017

Uh oh!

alainfrisch commented Jul 10, 2017

Uh oh!

gasche left a comment

Choose a reason for hiding this comment

Uh oh!

gasche commented Jul 15, 2017 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

trefis commented Jul 17, 2017

Uh oh!

garrigue commented Jul 19, 2017

Uh oh!

trefis commented Jul 19, 2017

stdlib

trunk

GPR1227

camlinternalFormat

Uh oh!

trefis commented Jul 19, 2017

Uh oh!

jberdine commented Jul 26, 2017

Uh oh!

xavierleroy commented Jul 28, 2017

Uh oh!

maranget commented Aug 31, 2017

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

7 participants

trefis commented Jul 10, 2017 •

edited

Loading

gasche commented Jul 15, 2017 •

edited

Loading