Skip to content

Decide unboxing of let-bound expressions based on their Cmm translation#2165

Merged
lthls merged 4 commits intoocaml:trunkfrom
alainfrisch:unbox_cmm
Sep 17, 2019
Merged

Decide unboxing of let-bound expressions based on their Cmm translation#2165
lthls merged 4 commits intoocaml:trunkfrom
alainfrisch:unbox_cmm

Conversation

@alainfrisch
Copy link
Copy Markdown
Contributor

Previously, for a Clambda let-binding let x = e1 in e2, Cmmgen would look at the e1 Clambda expression to decide whether it is worth/possible to unbox the binding. This requires to "predict" at the Clambda level if an expression produces a boxed number, while this is really a property of the Cmm code. The analysis tried to replicate the behavior of the Clamdba->Cmm translation, but missed some cases such as z in:

let f x y =
  let z =
    let a = x +. y in
    let b = x *. y in
    if a <= b then
      a
    else
      b
  in
  z +. 1.

(It turns out that, recursively,a and b will be unboxed, so accessing them will require a re-boxing, but when analyzing the let z, we still don't know that.)

The PR proposes to translate the bound Clambda expression to Cmm and analyze the result to decide if the binding should be unboxed. This avoids a duplication of logic and fixes the code above.

Side-note: it is a bit unsatisfactory to reverse engineer boxing (but this was already the case before), and one might want to discuss keeping a symbolic representations of the boxing/unboxing operations in Cmm and lowering them later (as a simple Cmm->Cmm rewriting pass, or during the selection pass).

@alainfrisch
Copy link
Copy Markdown
Contributor Author

I've also added a version of #2162 (unbox across static handlers), but now using the same unboxing strategy as for let bindings.

cc @chambart

@alainfrisch alainfrisch changed the title [WIP] Decide unboxing of let-bound expressions based on their Cmm translation [WIP] Decide unboxing of let-bound expressions based on their Cmm translation + unbox across static handlers Nov 26, 2018
@alainfrisch
Copy link
Copy Markdown
Contributor Author

@kayceesrk : would you be as kind as to add this PR to the benchmarking on http://ocamllabs.io/multicore/?

@alainfrisch alainfrisch force-pushed the unbox_cmm branch 2 times, most recently from bac4151 to 87c36b6 Compare November 27, 2018 17:18
@gasche
Copy link
Copy Markdown
Member

gasche commented Nov 28, 2018

This is no criticism of @alainfrisch, but I find it mildly inappropriate to use @kayceesrk's time for non-multicore-related benchmarks because the compiler team uses no easily-deployed compiler benchmark tool. Is there an easy way to use a clone of the ocamllabs.io/multicore system (or some other, but maintained by someone) that gives what-to-benchmark control to compiler developers, so that @alainfrisch could directly start the measurements he was requesting above?

@alainfrisch
Copy link
Copy Markdown
Contributor Author

@gasche cf #2143 (comment)

@alainfrisch
Copy link
Copy Markdown
Contributor Author

Mandatory stupid micro-benchmark:

let foo () =
  let r = ref 0. in
  let f x =
    r := !r +. x;
    if !r >= 1000. then r:= !r -. 1000.
  in
  for i = 1 to 100000 do
    match i with
    | 0 -> f 0.
    | 1 -> f 1.
    | n ->
        let a, b =
          if n > 100 then float n, float (n + 1)
          else float (n + 1), 42.
        in
        if n > 1000 then f (a *. b)
  done;
  !r


let () =
  for i = 1 to 1000 do
    ignore (foo ())
  done;
version timing
OCaml 4.07 1.2s
Trunk (after #2143) 0.40s
This PR 0.25s

With this PR, the main function foo does not allocate at all (except for boxing its result).

@alainfrisch alainfrisch changed the title [WIP] Decide unboxing of let-bound expressions based on their Cmm translation + unbox across static handlers Decide unboxing of let-bound expressions based on their Cmm translation + unbox across static handlers Nov 30, 2018
@alainfrisch
Copy link
Copy Markdown
Contributor Author

@chambart @mshinwell do you think you could give a look to this PR?

@lthls
Copy link
Copy Markdown
Contributor

lthls commented Nov 30, 2018

@gasche The flambda benchmarks could be used instead (its infrastructure was actually built upon and improved for the multicore benchmarks), and allowing compiler developers to add/remove benchmarks would be as easy as giving them commit/merge rights to OCamlPro/ocamlbench-repo. Alternatively, anybody can submit a pull request to add a branch to test (there are several PRs that have been merged doing just this, for inspiration).

Although benchmarking the official compiler against proposed changes is already the purpose of the flambda benchmarks server, if someone wants to host another benchmarking server more clearly managed by the official OCaml team, we would be willing to help with setting up the infrastructure too.

@gasche
Copy link
Copy Markdown
Member

gasche commented Nov 30, 2018

@lthls personally I would be very happy to use the flambda benchmarks. Where is the short documentation on how to add a new ocaml/ocaml PR there, with pointers to those pull requests you mention for inspiration?

("I would be happy to use" is more or less a figure of speech here: I don't personally submit many optimizations-related PRs, but others could be interested if the bar of entry wasn't too high. Clearly @alainfrisch doesn't think of your tool as something he could use easily, which suggests that something is slightly off right now. Thanks for replying here, which is a step to fix this!)

@gasche
Copy link
Copy Markdown
Member

gasche commented Nov 30, 2018

Maybe we could add this documentation (on how to submit a PR to the flambda benchmarks) to our CONTRIBUTING.md, or at least add a pointer to some documentation somewhere.

@lthls
Copy link
Copy Markdown
Contributor

lthls commented Dec 3, 2018

I'm working on a pull request to add some information to CONTRIBUTING.md, but it looks like there are a few issues at the moment with our scripts (the version of trunk being benchmarked points to a commit from July at the moment, which is not intended), so I'll wait until we've fixed these issues before publishing the PR.

@alainfrisch
Copy link
Copy Markdown
Contributor Author

@chambart @mshinwell : do you think you might be able to review this PR?

Copy link
Copy Markdown
Contributor

@lthls lthls left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed a few things that could be improved:

  • There is a slight difference in the handling of raise: it used to be classified as No_return in is_unboxed_number, whereas now it falls into the default No_unboxing case of is_unboxed_number_cmm. I don't know how much of a difference it would make in practice, but it should be easy to make a special case for Cop(Craise) anyway.
    Here is a micro-benchmark showing a difference:
let improbable () = Sys.opaque_identity false

let id x = x [@@inline]

let () =
  let x0 = Gc.minor_words () in
  let r = ref 0. in
  for i = 0 to 99999 do
    r := id (if improbable () then raise Exit else !r +. 1.)
  done;
  let x1 = Gc.minor_words () in
  Printf.printf "Minor words: %f\n%!" (x1 -. x0)
  • There is a small issue with assuming the body of a trywith block to be in tail position, but this is a matter of properly documenting the behaviour, not a bug in the code (see inline comment).

I didn't notice any other problem with the patch, but I haven't reviewed it in depth.

@alainfrisch
Copy link
Copy Markdown
Contributor Author

There is a slight difference in the handling of raise

Thanks. I've fixed that by extending the two new *_tail iterators in Cmm to let them know that Craise never return a value.

@alainfrisch
Copy link
Copy Markdown
Contributor Author

@lthls : thanks for the review, and sorry for the slow reaction. Now that you have Approver power, can you formally approve this PR if you are happy with the latest state?

I've rebased and fixed a number of conflicts, in particular with #2308.

@mshinwell
Copy link
Copy Markdown
Contributor

@alainfrisch Could we please hold off merging this until the debugging changes to Cmm and Cmmgen have settled down? I've had to write a new version of #2294, which includes changes to the Cmm language and Cmmgen again, and there is some urgency to get this resolved due to blocking many other patches.

@alainfrisch
Copy link
Copy Markdown
Contributor Author

Looking quickly at changes to Cmm/Cmmgen in #2294, there should be no (serious) conflict with this one. I'm not strongly opposed to waiting a bit more for this PR, but it gives an immediate improvement and it has been ready for four months, so I'm not sure that adding it to the list of things blocked by #2294 is the best approach, one should rather try to decouple issues and merge them when they are ready.

@mshinwell
Copy link
Copy Markdown
Contributor

mshinwell commented Mar 25, 2019

@alainfrisch You're probably not looking at the revised version of the code for #2294 though, which I think is going to conflict. I'm hoping it won't take too long to get this new version of #2294 reviewed and merged, and I'm certainly happy to help fix the conflicts in this one after that. (Edit: so we can make sure this gets in for 4.09 too.)

@alainfrisch
Copy link
Copy Markdown
Contributor Author

I've rewritten the history to have just two meaningful and standalone commits for the two parts. Do you really prefer two PRs?

@mshinwell
Copy link
Copy Markdown
Contributor

I think so -- that's our standard practice, no? These features I hope are orthogonal to each other...

I'm ready to start reviewing pretty soon, by the way.

@alainfrisch
Copy link
Copy Markdown
Contributor Author

These features I hope are orthogonal to each other...

The second commit is based on the first one, though.

I've kept only the first part in the PR, and move thed second part in #8735 (which also contains commits from this PR).

@alainfrisch alainfrisch requested a review from mshinwell June 14, 2019 13:30
@alainfrisch alainfrisch changed the title Decide unboxing of let-bound expressions based on their Cmm translation + unbox across static handlers Decide unboxing of let-bound expressions based on their Cmm translation Jun 14, 2019
@alainfrisch
Copy link
Copy Markdown
Contributor Author

Gentle ping to @mshinwell or whoever else (@lthls ?) interested in reviewing this PR and its companion #8735.

Copy link
Copy Markdown
Contributor

@lthls lthls left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Globally nice changes.
In addition to a few specific remarks, I'll note that this work makes it easier to separate unboxing from translation. I don't think it's worth pushing it further and make unboxing a separate pass at the moment, but it's a good thing in my opinion.

(I had this review done a while ago, but I forgot I didn't submit it. Sorry for the delay.)

@alainfrisch
Copy link
Copy Markdown
Contributor Author

Thanks @lthls for your positive review! I've integrated your comments (last commit). If you are ok, can you merge?

@lthls
Copy link
Copy Markdown
Contributor

lthls commented Sep 16, 2019

I'm going to merge this soon. On the kind of merge: I tend to prefer squash and merge, but I've seen that regular merges are quite common. Is there a global policy or are we free to choose based on personal taste ?

@alainfrisch
Copy link
Copy Markdown
Contributor Author

Thanks!

I'm not aware of a global policy but I also usually prefer squash-and-merge, and for this specific case, this seems to be right choice (there is only one "real" commit).

@gasche
Copy link
Copy Markdown
Member

gasche commented Sep 16, 2019

It's important for me that there is something that looks like a "merge commit" in the result, notably that contains the number of the Github PR for later reference (merge commits from github always have that). I think that squash-and-merge is fine in that regard -- rebase-and-merge is the one that should be avoided.

@lthls lthls merged commit 5c031d2 into ocaml:trunk Sep 17, 2019
@lthls
Copy link
Copy Markdown
Contributor

lthls commented Sep 17, 2019

Squashed and merged, then. Thanks @alainfrisch for the PR.

@alainfrisch
Copy link
Copy Markdown
Contributor Author

Great, thanks to you for the review!

If you still have some spare cycles, #8735 is waiting for you 😄 It is a spin-off from this PR (see the benchmark above).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants