SD-225 Use a "lzycompute" method for module initialization#5398
SD-225 Use a "lzycompute" method for module initialization#5398retronym merged 1 commit intoscala:2.12.0from
Conversation
2f20ab6 to
ee468c1
Compare
ee468c1 to
03de3db
Compare
|
Looks like it needs a junit test update. Otherwise LGTM |
|
LGTM, too |
The monitors and module instantation were inliuned into the module accessor method in b2e0911. However, this seems to have had a detrimental impact on performance. This might be because the module accessors are now above the "always inline" HotSpot threshold of 35 bytes, or perhaps because they contain monitor-entry/exit and exception handlers. This commit returns to the the 2.11.8 appraoch of factoring the the second check of the doublecheck locking into a method. I've done this by declaring a nested method within the accessor; this will be lifted out to the class level by lambdalift. This represents a slight deviation from the implementation strategy used for lazy accessors, which create a symbol for the slowpath method in the info transform and generate the corresponding DefDef as a class member. I don't believe this deviation is particular worrisome, though. I have bootstrapped the compiler through this commit and found that the drastic regression in compiling the shapeless test suite is solved.
03de3db to
a5bb6e0
Compare
|
👏 |
|
After this I'm seeing, 2.12.0 branch Which is a significant improvement over RC1 but still almost twice as long as 2.11.8. |
|
@milessabin Did you use a bootstrapped compiler when collecting those numbers? Until #5403 is merged, we're still using 2.12.0-RC1 as STARR. |
|
I did a publishLocal using @adriaanm's recipe, then rebuilt with |
|
If you have an alternative recipe for testing please let me know and I'll try it out. |
|
I've just tested against 2.12.0-d274405-SNAPSHOT (the artifact from #5403). I benchmarked the cold performance of compiling shapeless core with this version and 2.11.8. Results: https://gist.github.com/retronym/220345f697a18a0facc259556a8a73b2 This does show a performance regression of around ~15%, which I'm interested to investigate further. For the tests project (only one iteration this time) That suggests a similar slowdown. But neither of these matches with your report of a 2x slowdown. |
|
My latest timings ... For 2.12.0 + #5403 + dist/mkPack, latest from shapeless master, For 2.11.8, latest from shapeless master, So, I'm seeing about the same slowdown as you for coreJVM/compile but a much bigger slowdown for coreJVM/test:compile ... consistently 1.6x with my latest 2.12.0 build. Are we compiling the exact same tests? Is it possible that you excluded some to avoid the constant val elimination problem? Alternatively, am I doing something wrong? |
|
(nb. the |
|
@milessabin Just doing Here are the compiler command lines I used (includind selection of test source files) https://gist.github.com/retronym/8bca64252f68fbfb354593af7b53ae74 I was also compiling outside of SBT. It is conceivable that there is an additional regression in the incremental compiler (which adds two compiler phases to extract dependencies and APIs from the compiled files). |
|
I'm afraid I can't reproduce your results, even building with scalac outside SBT. I'm quite prepared to believe that -optimise (or rather the lack of it) is the issue here, but if it is then none of the instructions above seem to be enough to coax it into being applied. |
|
Could you publish a snapshot that I can test here? |
|
Compiling with plain scalac, command line similar to yours above I see, So 1.2x in this case ... does that line up with what you're seeing? In SBT I'm seeing, ie. 1.7x, which suggests that your hypothesis that there's a regression in the incremental compiler might be right. |
|
I have since found what might have been the root cause of this problem: our code gen from I have proposed a fix for that in #5417 I think this change makes sense in any case, but if anyone want to revisit the decision, we could do some benchmarking without the |

The monitors and module instantation were inliuned into the module accessor
method in b2e0911. However, this seems to have had a detrimental impact on
performance. This might be because the module accessors are now above the
"always inline" HotSpot threshold of 35 bytes, or perhaps because they
contain monitor-entry/exit and exception handlers.
This commit returns to the the 2.11.8 appraoch of factoring
the the second check of the doublecheck locking into a method.
I've done this by declaring a nested method within the accessor;
this will be lifted out to the class level by lambdalift.
This represents a slight deviation from the implementation strategy used
for lazy accessors, which create a symbol for the slowpath method in
the info transform and generate the corresponding DefDef as a class
member. I don't believe this deviation is particular worrisome, though.
I have bootstrapped the compiler through this commit and found that
the drastic regression in compiling the shapeless test suite is solved.
Fixes scala/scala-dev#225