Improve location information in Lambda code#2294
Improve location information in Lambda code#2294mshinwell wants to merge 13 commits intoocaml:trunkfrom
Conversation
|
Two further notes: this patch includes GPR#2284. Also, it adds two new flags ( |
|
Shouldn't locations be added on all constructors (and then switch to the usual representation with a record containing the location and the "desc")? For instance, why is there no location on Lassign? I suspect that having locations everywhere, even if not strictly needed for the debugger support, could benefit to e.g. Bucklescript (to produce sourcemaps). |
|
They might not be needed in all cases---I would have to think about each one---but in principle, I agree with what you are saying. The patch is in its current form because I can't really afford to spend any more time on this right now---it has taken a very long time to get even this far---and this version seems to produce pretty good results. |
715a459 to
31ab859
Compare
trefis
left a comment
There was a problem hiding this comment.
I'm sympathetic to @alainfrisch's suggestion. There currently seems to be no obvious rule to decide which lambda constructors are given an extra location, or two, or none, etc.
Plus Lifthenelse is a mess.
Otherwise things look mostly reasonable, apart from some code in Matching.compile_match which, I believe, incorrectly handles locations in the presence of guarded clauses.
bytecomp/lambda.ml
Outdated
| let rec make_sequence fn = function | ||
| [] -> lambda_unit | ||
| let rec make_sequence loc fn = function | ||
| [] -> lambda_unit loc |
There was a problem hiding this comment.
I was sad to see make_sequence take an extra argument, and even sadder when I saw what it was used for.
I wonder if using Location.none here would be acceptable. Any opinion?
There was a problem hiding this comment.
I'm not exactly sure, but I'd rather leave it as-is, given that it's been tested fairly extensively. Why is the extra argument problematic?
There was a problem hiding this comment.
Well, the API feels weird... I might be happier with
let rec make_sequence ~nil fn = function
| [] -> nil
| ... (* unchanged *)There was a problem hiding this comment.
I've looked at this again, and I think we can actually just use Location.none here. If the generated code ends up having a move at the end (to load the unit value), then Selectgen should propagate the debuginfo from the previous instruction onto the move, which should be fine.
bytecomp/lambda.mli
Outdated
| lambda * (int * (Ident.t * value_kind) list) * lambda * Location.t | ||
| | Ltrywith of lambda * Ident.t * lambda * Location.t | ||
| | Lifthenelse of | ||
| lambda * Location.t * lambda * Location.t * lambda * Location.t |
There was a problem hiding this comment.
I think that one deserves to be recordified. I initially "parsed" it as
(lambda * Location.t) * (lambda * Location.t) * (lambda * Location.t)but it looks like it's actually:
lambda * (Location.t * lambda) * (Location.t * lambda) * Location.tThere was a problem hiding this comment.
I think I will turn this one into a record.
There was a problem hiding this comment.
This will be resolved in the next version.
bytecomp/matching.ml
Outdated
| let {pm=this_match ; ctx=this_ctx } = divide ctx to_match in | ||
| let lambda,total = compile_match repr partial this_ctx this_match in | ||
| lambda, jumps_map up_ctx total | ||
| let (_loc, lam), total = compile_match loc repr partial this_ctx this_match in |
There was a problem hiding this comment.
Why is _loc ignored instead of being returned? The pattern doesn't match what one can see in do_compile_matching above.
There was a problem hiding this comment.
This has been fixed.
bytecomp/matching.ml
Outdated
| end | ||
|
|
||
| let compile_matching repr handler_fun arg pat_act_list partial = | ||
| let pats_act_list_of_pat_act_list (pat_act_list : pat_act_list) = |
There was a problem hiding this comment.
What exactly is the complaint here?
There was a problem hiding this comment.
Well the function name pats_act_list_of_pat_act_list is not very pleasant to read on the caller side.
Maybe we could find better formulations for your types pats_act_list and pat_act_list.
In general I'm not convinced by foo_list, why not use the more flexible foo list instead? So we could find a name for "a single pattern and the action to take" and "many parallel patterns and the action to take". Suggestions:
simple_clauseversusmulti_clause("clause" means "something with the shapefoo -> bar)pattern_actionversusrow_actionpattern_clauseversusrow_clauseclause1versusclause(urgh)
Note that for the function itself you could just write row_clause_of_pattern_clause and then do List.map on it at the callsite.
There was a problem hiding this comment.
Well, it's foo_list because I think the whole data structure should have a single name; it's just I didn't choose a good one. If you and/or @trefis agree on a name, I will gladly change it.
There was a problem hiding this comment.
(To clarify, I think the naming should reflect what the thing is, not what structure is being used to represent it.)
There was a problem hiding this comment.
Another thing that may work is type 'a case = ('a * lambda * Location.t) (why not 'a * lambda Location.loc?), so pattern case and pattern list case, with type 'a cases = 'a case list.
There was a problem hiding this comment.
Well the function name
pats_act_list_of_pat_act_listis not very pleasant to read on the caller side.
Or at the definition point.
I like the 'a case proposition.
In the end, I'd go with the following definitions:
type row = pattern list
type 'a case = 'a * lambda * Location.t
(* ... *)
let mk_row_case (pat, act, loc) : row case = ([pat], act, loc)And then just List.map mk_row_case (I'm not a fan of that name, but it's less painful than the previous one) on the few places that need it.
There was a problem hiding this comment.
I've dealt with all of this.
|
@trefis Thanks for the review; I've pushed changes to fix some of your comments. Could you please resolve the relevant conversations, and then I will attend to the remainder, which I think are more substantial? (I may not be able to come back to this GPR until next week.) |
|
(I also rebased this PR, which involved resolving some conflicts.) |
|
I only looked at the The synonym |
|
We could do that, and generally I would be in favour of such things (except without the extra type alias). However in this case I'm not sure if it's worth it. Bear in mind the proposed change increases allocation. I have less of an opinion about using this structure in the matching code itself, but again I suspect the trade-off in increased allocation may not be worth it. @trefis knows far more about this code than I do, and will have a more informed opinion. As long as this GPR is not felt to be materially decreasing the quality of either |
|
To reply to @gasche suggestion of using I also think the whole thing would look nicer if that change was made, however Mark doesn't seem like he wants to do that work; which I guess is lucky for me, because I'm not sure I'd have the patience to review even more such tedious changes. |
|
I'm feeling sorry for having jumped into this discussion, because now I'm in the position of the person that says things nobody wants to hear... I totally get the idea of saying "I improved this thing a bit, I don't have time for the further improvements you suggest right now so let's stay at that". It's pragmatic and efficient. But I don't think that the part of this PR that I looked at would qualify for that classification. It is rather a case of making things worse in the process of some Important Feature, when we have the impression that a better way exists. Sure, we can leave the cleanup to some other PR but, just as no one wants to do it now, no one will want to do it later and, if we merge now, we will just stay with a codebase that is noticeably harder to evolve and maintain. If this was a smaller patch, on the road for a less important Important Feature, I'm super-confident that I would be the annoying person to ask for a cleaner rewrite. It is not actually that much work, and keeping the compiler codebase nice and maintainable is arguably Even More Important than most Important Features we add these days. (And I think we should try harder to make things nice from the start when we touch the compiler codebase, instead of aiming for whatever works.) Some ideas:
|
|
A minor point:
I don't think we care at all about the cost of allocating one extra block per Lambda node, or (in the case of my more restricted proposal) one extra block per pattern clause and right-hand-side. |
|
I thought some more about the placement of
Adding a record indirection for all nodes therefore seems overkill at the moment. I think that is also likely to be significantly to the detriment of readability, although depending on the rationale, that might be an acceptable trade-off in the future. I must emphasise that this patch has already received a substantial amount of work (about two solid weeks)---it was far from trivial to thread the necessary location information through the pattern matching compiler. I found it extremely difficult to understand some of that, admittedly very clever, code due to extensive use of partial application and lack of explicit types. If we're worried about the maintainability of this area of code, then I'm afraid the place to turn attention to is the existing code, rather than being over-concerned about the relatively minor additions in this patch. I do not believe those additions will materially affect maintainability, at least once the |
|
For the record, I don't know if using a record wrapper in Lambda.t is the best approach to make code readable. What is immediately clear in the present patch is that you are passing term fragments and locations (or debug information) for these fragments separately, while it would be nicer if they were passed together as a single object, as it would make clear/obvious what locates what. One aspect of this discussion I don't really like is the use of the argument "I already spent two weeks on this". You have ambitious goals regarding gdb which require a ton of work, and this is laudable. Writing good code is more work than writing meh code, which means that you have a ton more work, and we should budget for that from the start! Having a ton of work to do does not give anyone license to write bad code, and having discussions on how good the code is, and possibly asking for changes (to, yes, a lot of code), should be on the table for any pull request. |
|
Let's place a moratorium on any further comments until this is revised. |
|
By popular demand, I've pushed a new version of this, which should be much more satisfactory. There were so many additional changes that I've just squashed everything down to a single commit; the best starting point for review is the base revision. I sat down with @lpw25 and we examined the Whilst making these changes I'm afraid I felt bound to improve some of the formatting of various functions in I believe I have addressed all of @trefis 's other comments, as above. |
gasche
left a comment
There was a problem hiding this comment.
I haven't looked at all of parmatch.ml yet, but this looks much nicer. Thanks!
asmcomp/cmmgen.ml
Outdated
|
|
||
| and unbox_float_block block = | ||
| let dbg = block.block_dbg in | ||
| Cmm.block dbg (unbox_float dbg block.expr) |
There was a problem hiding this comment.
(I don't know if this was already in the previous patch but this two-layer recursion scheme is actually quite elegant. Well done!)
| | Levent of lambda * lambda_event | ||
| | Lifused of Ident.t * lambda | ||
|
|
||
| and block = private { |
There was a problem hiding this comment.
When the "block" is a lambda node with a toplevel Location.t annotation, do the two locations necessarily match?
Why not go the full way and attach the location in a generic way to all lambda nodes (renaming block to lambda and lambda to lambda_desc)?
There was a problem hiding this comment.
There's no constraint with regard to the two locations you identify. It may be preferable for these locations in some cases to be distinct -- for example if one arm of an Lifthenelse was a function application surrounded by begin and end, then it would be reasonable for the location on the block (in the Lifthenelse arm) to be that of the begin but the location on the application (inside the block) to be on the f x or whatever it is.
There's no need at present to attach the information to all nodes---at least for the main purpose of this pull request. I did look at this, and concluded that it seemed unnecessarily disruptive.
I see, but this is not what happens in this PR, right? Anyway, we won't be able to do something more fine-grained than what the Typedtree keeps, which is currently one location for each node. The natural approach, in my opinion, would be to keep the same scheme as for Parsetree or Typedtree, i.e. attach the location information at each node, in a uniform way. If only for the sake of design uniformity across the compile code base. The introduction of "blocks" goes a long way towards this approach, so I don't understand whether you prefer not to do that because (1) this would be more (invasive) work and you really want to move on -- which I can fully understand, or (2) there are intrinsic benefits of not doing it like that (in which case, I wonder if the same arguments would apply to Parsetree and/or Typedtree). |
|
@alainfrisch I'm not actually sure I know the answer to the first question in all cases -- but I don't think it really matters. The location information is in any case very likely to require tweaking, quite possibly in ways we have not yet foreseen, once we get this whole package of work released. I explained in the previous comment that I see no technical merit for having anything beyond the existing "blocks" approach at this stage. I don't think we should be cluttering up intermediate languages with unnecessary information, both because it can materially impact upon code clarity at sites where terms are constructed and deconstructed, and because it is wasteful at runtime. I do also really want to move on, so we can get the benefits of this work to users. To this end, I would very much appreciate feedback on the actual logic within this patch, in particular the pattern matching compilation parts. The idea is that the locations put on terms by the match compiler should reflect the progress of a pattern matching at a fine level to the fullest extent possible. |
In the middle-end languages (lambda, clambda, flambda, cmm) it doesn't make sense to talk about the locations of many of the constructs. Things like lets and variable uses are artefacts of the IRs themselves not constructs in the original source, and not things that produce instructions in the output. The things that locations do make sense for are essentially the operations (e.g. primitives and function applications) which should always correspond to a corresponding expression in the source and which directly produce assembly instructions, along with parts of the control structure (e.g. the branches) which also appear in the source and can be "blamed" for instructions in the output. The current design of this PR attempts to put locations on all of those places. |
bytecomp/matching.ml
Outdated
| x1::jumps_union rem1 env2 | ||
| else | ||
| x2::jumps_union env1 rem2 | ||
| type jumps_union = (int * ctx list) list |
There was a problem hiding this comment.
This type declaration is wrong (the function does a _union but the datatype is not a union); it should be jumps, which is already defined above, and you could use this to annotate env1 if you wanted to.
There was a problem hiding this comment.
I've fixed this and actually made a new Jumps module which cleans this up rather more satisfactorily. It turns out that the type can just be made abstract.
| fprintf ppf "@[<hv 1>case int %i:%a@ %a@]" | ||
| n | ||
| print_loc act.block_loc | ||
| lam act.expr) |
There was a problem hiding this comment.
you should have a block or print_block function that does the right thing so that this can be rewritten as
fprintf ppf "[@<hv 1>case int %i:%a@]" n block act|
Can people involved in this PR arrange to make progress? It is blocking related work, but also unrelated one such as #2165. So please don't let this PR stall. |
|
I don't think this PR should be considered blocking for the unrelated #2165. It would be blocking if the other PR depended on a change in this PR (this is not the case), and it would be "nice to wait" to avoid a rebase if the present PR was to be merged imminently (I don't think this is the case either). Let's set #2165 free to live its own exciting PR life. As to the status of the current PR: personally I think that having more location information in Lambda code is a desirable feature for a variety of use-cases (not in the sense that I can personally name any one use-case that I deeply care about right now, but in the sense that I am convinced reasonable use-cases exist), and I would be willing to finish the review work for this PR (getting Thomas to review my own changes to switch.ml; I also have one-fourth to one-third of the code still to review, which is also likely to require cleanup changes after review) and to get it merged. On the other hand, clearly @xavierleroy is not happy with the idea of adding more location information in Lambda code, independently of how invasive/ugly the change is (the first version was fairly ugly but I think it can be made a very reasonable, readability-preserving change). I think that it could be possible for me to sneak the commit inside the Treasure Cave without awakening the dragon -- or maybe to persuade the dragon to look elsewhere -- but I have not tried yet. I don't think waiting a few extra weeks would do us any harm. |
|
@gasche Let's please have some discussion between the various people involved in changes (or potential changes) to |
|
The dragon metaphor is no quite right. The duty of the dragon is to make sure that nothing is taken off the treasure cave. My duty is to make sure that only treasure, and not litter, is stored in the treasure cave. You don't change a part of the compiler as sensitive as the pattern-matching compiler just on a hunch that more location information must be good and may be useful in an indefinite future. |
|
Independently of arguments about dragons, one thing to note is that we could do large-scale testing on pull requests such as this to demonstrate that the generated code does not change. |
My understanding is that this PR does fix bugs in the locations that OCaml currently outputs for things (maybe in combination with another patch to propagate the data from cmm to the asm I'm not sure). Fixing these bugs seems worth at least the relevant parts of this code -- which I think is basically the non- |
I intend to keep my word on that part, however considering the recent discussions about this PR, I think that the changes to
If that part was to be taken out¹, would Heimdallr agree to it being merged? 1: And perhaps some examples were given of situations were this patch does indeed improve the reported locations? |
|
I've removed the Almost all the changes to I would like to argue again that fine-grained location information is of real value for applications other than single-stepping in debuggers. (In fact, the position of single-step points should be marked independently from the location information, but only now are debugging information formats evolving to the point of being able to support that.) This morning, I heard another potential application, which was recovery of accurate source locations as to where particular optimisations had applied. This was in the context of assessing the applicability of existing, mainly Cmm-based, optimisations and deciding whether some new optimisation was worthwhile. |
|
This patch also helps to fix existing problems with the compiler's line number information output (along with a small subsequent patch to
|
|
One use-case I have in mind for location/debug information in pattern-matching (including the Switch part) is profiling tools that let users study mispredicted branches in the binary, and use DWARF information to let users go back to the source code for each branch. I have seen real-world cases of users wondering about misprediction in pattern-matching-generated code (namely, a case where using a polymorphic variant was faster than a normal variant due to branch predictors fitting the dynamic distribution of input data in the tree-of-tests generated for the (non-consecutive) polymorphic variant tags, beating the jump-table implementation used for (consecutive) variant tags). |
|
@mshinwell why this was closed? There is some work the replaces this? I was rebasing this patch to 4.10 |
* Backport 12508. shape-index tests + cms not working * Fix tests format for shape-index * store occurrences as an array instead as of a list * Reimplement cms files * extensions have no shape for now * Update .depend * code review * fix build * restored objinfo * bootstrap * fix build_upstream_closure
This patch enhances the
Lambdalanguage so that finer-grained location information can be propagated to the middle end. Detailed location information of this form is required in order to produce a good user experience in the debugger.At the same time we aim to ensure that our generated
Lambdacode contains accurate location information. At present, there are some significant gaps in coverage, particularly in code generated by the pattern matching and switch compilers. This patch fixes that. It is possible (probable) that some more tweaking is required as this is tested on more examples, but this should be a good start. It's pretty nice being able to step through complicated matches in the debugger.This patch touches
CmmgenandStrmatcha bit: this is because they useSwitch. I also introduced a handful of new types in the pattern matching compilation code to help make it easier to understand.@trefis is on the hook to review this one, with thanks.