Skip to content

Matching: document and refactor mk_failaction_pos#12534

Merged
gasche merged 4 commits intoocaml:trunkfrom
gasche:matching-document-mk_failaction
Apr 2, 2024
Merged

Matching: document and refactor mk_failaction_pos#12534
gasche merged 4 commits intoocaml:trunkfrom
gasche:matching-document-mk_failaction

Conversation

@gasche
Copy link
Copy Markdown
Member

@gasche gasche commented Sep 6, 2023

mk_failaction_pos is one of the key operations in compilation of sum constructors. Its current code is fairly opaque -- it is hard to understand independently of the 2001 paper on pattern-matching compilation. I needed to understand the code in the context of fixing #7241, and it is reasonably likely that a future fix would touch this function -- so the fix would be difficult to review if the function is hard to understand.
The present PR documents this function and refactors it slightly for readability; it should not change its behavior in any way, and it should be reasonably easy to check this (this is just light, local code movement).

A possible review workflow would be as follows:

  1. try to read the current definition of mk_failaction_pos in trunk and see what you can make sense of
  2. read the definition in this PR (with all commits applied at once) and see if you find it easier to follow
  3. once this is done, review each commit independently for correctness (it should clearly preserve the current behavior)

Checking the accuracy of the new documentation (instead of believing it and reviewing the phrasing only) requires more expertise in the pattern-matching compiler, I will try to get @maranget or @trefis to have a look. Other reviewers should not feel responsible for this.

@gasche gasche force-pushed the matching-document-mk_failaction branch from 167f9e7 to c1b9a6a Compare September 27, 2023 13:13
unioning the specialized contexts of each failure pattern,
but more efficient -- the union would have a lot of
redundancy. *)
let i_fail_pat = list_as_pat i_fail_pats in
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

list_as_pat is not a very informative name, and it is only used here. You could rename it to something like or_pattern_from_list and/or move the definition here.

Copy link
Copy Markdown
Member Author

@gasche gasche Sep 27, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to please you and got pulled in a multi-file-change minor refactoring. I propose to send it as a separate PR to be evaluated on its own.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(The would-be PR is my branch at https://github.com/gasche/ocaml/commits/minor-patterns-refactoring . I am waiting for the present PR to be merged before submitting. )

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

@gasche gasche force-pushed the matching-document-mk_failaction branch from c1b9a6a to f4a8f55 Compare September 27, 2023 14:05
@gasche
Copy link
Copy Markdown
Member Author

gasche commented Sep 27, 2023

I rebased the PR on trunk (this was non-trivial due to the debug printing changes that went in independently).

@gasche gasche force-pushed the matching-document-mk_failaction branch from f4a8f55 to 1645553 Compare September 27, 2023 15:58
@gasche
Copy link
Copy Markdown
Member Author

gasche commented Sep 27, 2023

(I have gone through all of @lthls' comments. Thanks!)

Copy link
Copy Markdown
Contributor

@ncik-roberts ncik-roberts left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I read the old code and new code, and indeed I like the new code structure (and new comment) a lot more.

I've erred on the side on leaving nits about wording, as I'm a relative newcomer to this code and discussion may surface misunderstandings that I have.

unioning the specialized contexts of each failure pattern,
but more efficient -- the union would have a lot of
redundancy. *)
let i_fail_pat = list_as_pat i_fail_pats in
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1


(* In line with the article and simpler than before *)
(* In [mk_failaction_pos partial seen ctx defs],
- [partial] is Total if we know the current switch
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In mk_failaction_neg, we use partial to identify that we can just return None, Jumps.empty and don't bother computing fail patterns. Do you know why we don't do that short-circuiting in mk_failaction_pos as well in the case that fail_pats < match_context_rows? If we trust Total, then it seems that short-circuiting would save us some needless work and keep contexts a bit smaller.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know. It would also be possible to put this logic in the if sig_complete then check in the caller of mk_failaction_pos (if sig_complete || partial = Total then ...). (Something similar is done in combine_variant.) I would need to ask Luc if he considered this; and probably will. This could lead to a small code improvement in some cases of matching on an instance of a GADT type, where some cases can be eliminated by typing.

Comment on lines +2778 to +2782
Through its jump summary, [mk_failaction_pos] propagates "negative
information" about the constructors not taken. For example, if
a first submatrix contains a match on the [None] constructor, the
exit point selected for failure will get provided the context
information that its value must be [Some _].
Copy link
Copy Markdown
Contributor

@ncik-roberts ncik-roberts Mar 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The last sentence isn't quite accurate. Jumps from other handlers to that exit point might be None. It's more that jumps from this switch to that exit point must be Some _.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Attempted reformulation:

For example, if a switch only accepts the [None] constructor, [mk_failaction_pos] generates a failure clause along with context information that the value reaching the failure clause must be [Some _].

@gasche gasche force-pushed the matching-document-mk_failaction branch from 1645553 to b05c539 Compare April 1, 2024 20:45
@gasche
Copy link
Copy Markdown
Member Author

gasche commented Apr 1, 2024

@ncik-roberts thanks! I took your comments into account and rebased the PR.

(Hopefully this will encourage a maintainer, possibly @lthls, to approve and merge.)

@gasche gasche merged commit ed372d3 into ocaml:trunk Apr 2, 2024
@gasche
Copy link
Copy Markdown
Member Author

gasche commented Apr 2, 2024

Merged! Thank you both.

ncik-roberts added a commit to oxcaml/oxcaml that referenced this pull request Jun 10, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants