Skip to content

[matching.ml cleanup] enable the extra division in more cases#8869

Merged
trefis merged 2 commits intoocaml:trunkfrom
trefis:matching-split_no_or-last-row
Aug 13, 2019
Merged

[matching.ml cleanup] enable the extra division in more cases#8869
trefis merged 2 commits intoocaml:trunkfrom
trefis:matching-split_no_or-last-row

Conversation

@trefis
Copy link
Copy Markdown
Contributor

@trefis trefis commented Aug 12, 2019

The commit message explains what's going on.

One optimization in split_no_or is the insertion of a split before
a last matrix line that has only variables. Splitting a matrix there
creates two default environments (instead of one for the non-split
matrix), the first of which often gets specialized away by further
refinement, and the second one jumping directly to the catch-all
case -- this produces better code.

The code would detect this case (all-variable last row) by calling
group_var on all the patterns of the row, but the group_*
functions assume that their input is already simplified, and only the
first column of the row is simplified.

The present commit fixes it by defining a predicates that does not
assume the pattern is simple, and check that "all its heads" are
a variable. In the future we will refactor this code using
Parmatch.Simple_head.t, and the function invocation may change.

Note: two testcases are changed in tests/basic/patmatch_split_no_or,
the first is the actual optimization and the second is just
stamp-related noise.

Apart from that, I also add the comments I told @Octachron I would add in #8861, but was prevented from doing because of @gasche's eagerness to click merge. :)

| Tpat_var _ ->
true
| Tpat_alias (p, _, _) -> heads_are_var p
| Tpat_or (p1, p2, _) -> heads_are_var p1 && heads_are_var p2
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it necessary to check p2 here? After all, when simplifying Tpat_or, we replace ω | … by ω directly. It seems that we could implicitly apply this rule here and only have heads_are_var p1.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense to me, but given it's @gasche's patch, let's see what he thinks.
(I personally suggested replacing Tpat_or (p1, p2, _) by Tpat_or (p1, p2, None), since in the Some _ case we know the heads are polymorphic variant constructors; but @gasche didn't seem to like that "optimization")

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it is wise to change the code based on what we know simplification is going to do in the future -- it makes the code harder to understand by adding a dependency on invariants from some other part of the code (and: we discussed disabling this simplification anyway, assuming that it is uncommon as it fires the "unused subpattern" warning). The logic of that particular function is that it is a specialized instance of a generic "collect all the potential heads of this pattern" followed by an implicit List.for_all traversal; in general the heads of p | q are the heads of p and the heads of q, and following this general pattern gives a clear structure to the code.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After more in-person grumbling from @Octachron we agreed to change the definition to use ... p1 || ... p2, and I renamed the function (jamais deux sans trois) into omega_like.

@gasche gasche force-pushed the matching-split_no_or-last-row branch 2 times, most recently from dc9fbb8 to 5c5b98c Compare August 13, 2019 13:08
trefis and others added 2 commits August 13, 2019 15:10
One optimization in split_no_or is the insertion of a split before
a last matrix line that has only variables. Splitting a matrix there
creates two default environments (instead of one for the non-split
matrix), the first of which often gets specialized away by further
refinement, and the second one jumping directly to the catch-all
case -- this produces better code.

The code would detect this case (all-variable last row) by calling
`group_var` on all the patterns of the row, but the `group_*`
functions assume that their input is already simplified, and only the
first column of the row is simplified.

The present commit fixes it by defining a predicates that does not
assume the pattern is simple, and check that "all its heads" are
a variable. In the future we will refactor this code using
Parmatch.Simple_head.t, and the function invocation may change.

Note: two testcases are changed in tests/basic/patmatch_split_no_or,
the first is the actual optimization and the second is just
stamp-related noise.
@trefis
Copy link
Copy Markdown
Contributor Author

trefis commented Aug 13, 2019

I'm OK with the new name, not convinced by the change to ||.

@trefis trefis force-pushed the matching-split_no_or-last-row branch from 5c5b98c to b08c4e2 Compare August 13, 2019 13:28
@gasche
Copy link
Copy Markdown
Member

gasche commented Aug 13, 2019

The transformation is correct whether we use && or ||, and it applies in more cases if || is used, and we know that all those cases are going to be merged by further optimizations into the same thing (just an omega), and in fact we don't care about the Tpat_or case and we could just have Tpat_or _ -> false to get rid of the whole thing. But for now @Octachron wants || and @trefis wants &&, please make a decision :-)

@trefis
Copy link
Copy Markdown
Contributor Author

trefis commented Aug 13, 2019

The transformation is correct whether we use && or ||, and it applies in more cases if || is used

I agree with that.

we could just have Tpat_or _ -> false to get rid of the whole thing

I assume you mean "it's not important enough that I care about it anymore".
In which case I agree!

@trefis
Copy link
Copy Markdown
Contributor Author

trefis commented Aug 13, 2019

And yes, whatever, just put ||.

Copy link
Copy Markdown
Member

@Octachron Octachron left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After "grumbling" a bit, I agree that the change is correct (and that the Tpat_or case doesn't really matter (so we can afford to not transform "or"s into "and"s)).

@trefis
Copy link
Copy Markdown
Contributor Author

trefis commented Aug 13, 2019

Thanks @Octachron! I'll merge once appveyor finishes whatever it's doing.

@trefis trefis merged commit d266a32 into ocaml:trunk Aug 13, 2019
@trefis trefis deleted the matching-split_no_or-last-row branch August 13, 2019 15:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants