Classpath improvements #5957

Closed
mkeskells wants to merge 2 commits into scala:2.12.x from rorygraves:2.12.x_classpath2

Conversation

@mkeskells
Contributor

@mkeskells mkeskells commented Jun 22, 2017

This PR provides a series of improvements to classpath handling.

12% reduced CPU and a 23% reduction in allocation overall, based on the figures in the comments below.

  • A clean implementation of ClassPath without AbstractFile dependencies in the implementation (still required in the specification, though)
  • Improvements to the data structures in the classpath implementation to facilitate efficient scanning and caching
  • The new ClassPath implementations allow for caching of lookups, avoiding expensive aggregation and/or file access
  • No locks maintained on ZipFiles, to aid Windows IDEs
  • Use of NIO to speed up file access
  • Ability to scan files and dirs on background threads, as this is not AST-affecting
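As a hedged sketch of the lookup-caching point above (the class and method names here are invented for illustration, not the PR's actual types): an aggregate classpath can memoise per-package results, so repeated lookups avoid re-walking and re-merging every underlying entry.

```scala
import scala.collection.concurrent.TrieMap

// Minimal stand-in for a classpath element that can list classes in a package.
trait ClassPathLike {
  def classesIn(pkg: String): Seq[String]
}

// Aggregates several elements, caching the merged result per package so each
// package is scanned and merged at most once.
final class AggregateClassPath(entries: Seq[ClassPathLike]) extends ClassPathLike {
  private val cache = new TrieMap[String, Seq[String]]
  def classesIn(pkg: String): Seq[String] =
    cache.getOrElseUpdate(pkg, entries.flatMap(_.classesIn(pkg)).distinct)
}
```

The thread-safe TrieMap also makes the cache usable from the background scanning threads mentioned above.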

I had intended to do some more work on this prior to the PR, but as @retronym opened #5956 it seems worthwhile to combine these 2 approaches.

@mkeskells
Contributor Author

mkeskells commented Jun 22, 2017

A slightly earlier version of this change had the following performance measurements

This benchmark included some other changes which are not related and are more WIP, so I have removed them from the PR.

I will run a benchmark on this overnight and post the results when I can, but here are the stats for a full compile, with a focus on the xsbt-dependency phase, which is the standout.

after 10 90%, full cycle

                  RunName                        AllWallMS                             CPU_MS                         Allocated
       00_linker_baseline  510        8,592.24 [-2.68% +2.96%]            8,467.01 [-2.38% +2.97%]            2,588.77 [-0.04% +0.04%]
        00_linker_cp_raw2  529        7,761.38 [-6.41% +9.11%]            7,558.16 [-6.56% +8.12%]            2,124.32 [-0.06% +0.06%]

after 10 90%, phase xsbt-dependency, no GC
                  RunName                        AllWallMS                             CPU_MS                         Allocated
       00_linker_baseline    16            756.67 [-1.54% +1.75%]                758.79 [-1.16% +2.96%]                518.43 [-0.01% +0.01%]
        00_linker_cp_raw2    18            183.19 [-10.51% +13.15%]              183.16 [-14.69% +10.90%]               57.00 [-0.45% +0.18%]

@mkeskells
Contributor Author

Fuller results

Full performance figures, for 100 iterations of the Akka actor benchmark.

As a highlight, the 3 rows represent the baseline, this change with no args, and this change with the extra args "-Yclasspath-cache-enabled", "-Yclasspath-top-prefetch", "-Yclasspath-raw-jar", "-Yclasspath-raw-dir",
which enable, respectively, the intra-JVM jar cache, background cache population for the aggregate classpath, the BasicClassPath implementation of a jar classpath, and the BasicClassPath implementation of a directory classpath.

The phases are (annoyingly) not ordered, but the highlights are (based on the last 90 runs):
12% reduced CPU, 23% reduction in allocation overall.
We see a large reduction in sbt usage, but also in typer, which is expected, and in jvm, which surprised me.

ALL 100 cycles

                  RunName                        AllWallMS                             CPU_MS                         Allocated
       00_linker_baseline   2700       9,224.06 [-13.60% +276.82%]      8,964.06 [-12.50% +248.27%]      2,585.83 [-0.49% +15.01%]
             00_cp-noargs   2700       9,098.14 [-14.30% +325.31%]      8,823.59 [-13.05% +280.37%]      2,753.42 [-0.41% +13.22%]
                00_cpargs   2700       8,198.62 [-12.88% +302.70%]      7,989.22 [-13.75% +290.17%]      2,100.26 [-0.95% +19.20%]

(no outliers, after warmup)
after 10 90% 

                  RunName                        AllWallMS                             CPU_MS                         Allocated
       00_linker_baseline   2211       8,472.77 [-5.93% +5.94%]            8,303.82 [-5.54% +5.56%]            2,578.64 [-0.21% +0.06%]
             00_cp-noargs   2211       8,281.12 [-5.84% +6.91%]            8,136.19 [-5.71% +6.01%]            2,746.31 [-0.15% +0.16%]
                00_cpargs   2211       7,543.67 [-5.31% +4.66%]            7,362.85 [-6.41% +4.83%]            2,090.69 [-0.49% +0.52%]

after 20 90%

                  RunName                        AllWallMS                             CPU_MS                         Allocated
       00_linker_baseline  1968       8,437.27 [-5.54% +6.39%]            8,268.88 [-5.14% +5.82%]            2,578.45 [-0.21% +0.06%]
             00_cp-noargs  1968       8,237.95 [-5.35% +6.25%]            8,094.84 [-5.23% +5.20%]            2,745.83 [-0.14% +0.17%]
                00_cpargs  1968       7,518.85 [-5.00% +4.65%]            7,336.59 [-6.08% +5.21%]            2,089.35 [-0.43% +0.56%]


after 10 90%, phase xsbt-dependency, no GC

                  RunName                        AllWallMS                             CPU_MS                         Allocated
       00_linker_baseline    61        871.55 [-5.52% +7.27%]                868.34 [-6.43% +7.96%]                516.33 [-0.01% +0.01%]
             00_cp-noargs    63        845.80 [-5.50% +4.91%]                843.50 [-5.53% +5.59%]                658.14 [-0.01% +0.01%]
                00_cpargs    81        180.69 [-11.27% +4.07%]               180.36 [-13.37% +3.96%]                56.52 [-0.88% +0.31%]

after 10 90%, phase typer, no GC

                  RunName                        AllWallMS                             CPU_MS                         Allocated
       00_linker_baseline    79         2,504.58 [-8.97% +9.99%]            2,485.17 [-8.83% +8.77%]               621.99 [-0.16% +0.20%]
             00_cp-noargs    77         2,430.84 [-8.88% +12.01%]           2,421.88 [-8.39% +11.61%]              624.65 [-0.16% +0.22%]
                00_cpargs    78         2,368.58 [-8.33% +10.61%]           2,359.58 [-7.95% +9.92%]               604.80 [-0.35% +0.34%]

after 10 90%, phase jvm, no GC

                  RunName                        AllWallMS                             CPU_MS                         Allocated
       00_linker_baseline    59         1,683.90 [-5.43% +6.15%]            1,653.87 [-5.52% +6.76%]               445.21 [-0.83% +0.96%]
             00_cp-noargs    68         1,656.25 [-5.71% +7.36%]            1,627.53 [-5.92% +6.57%]               460.99 [-0.09% +0.08%]
                00_cpargs    41         1,612.99 [-4.08% +4.35%]            1,583.08 [-5.25% +3.64%]               449.43 [-0.28% +0.17%]

@SethTisue SethTisue added the performance the need for speed. usually compiler performance, sometimes runtime performance. label Jun 27, 2017
@mkeskells
Contributor Author

Removed the default setting of the jar cache; that was failing a test.

@retronym
Member

retronym commented Jul 4, 2017

The speedups here look great, but it will take more time to polish and review than is available to include them in 2.12.3.

I'm not sure about the motivation for using WatchService here. As a point of comparison, the internal implementation of java.util.zip uses the file timestamp and inode to key its internal caches; this seems simpler and sufficient for our use case. Can you comment on your rationale?

@retronym retronym modified the milestones: 2.12.4, 2.12.3 Jul 4, 2017
@mkeskells
Contributor Author

mkeskells commented Jul 4, 2017

Hi @retronym,
I did have an implementation that worked using the file time. This has a couple of disadvantages as I see it, in that it doesn't work for directories and it requires a file access lookup whenever you compile.

I was looking for something that would work in an IDE, and with our internal, highly customised, build environment, and to ensure that it will cope when we build direct to jar in scalac or via zinc via sbt/zinc#305

If you have a jar/zip file then we don't want to hold it open, as this fails for updates to the jar (on Windows it blocks the update and on Unix it misses the update), so Watcher seemed a simple solution. It also seems (from code inspection only) that the current FlatClassPath caching implementation is broken, in that the internal cache can be invalid.

I did have an alternate implementation that uses last modified time and size to verify, but this requires verification at each reuse of the jar. It is not uncommon for us to have around 500 jars on a compile classpath, and several of these don't end up getting accessed in the actual compile (e.g. you know that you will use rt.jar, but probably not jsse etc). Watcher is efficient and a simple callback API.
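The WatchService approach described here can be sketched as follows (a hypothetical illustration with invented names, not the PR's code): register the jars' parent directory once, after which detecting a changed jar is a cheap poll of an event queue rather than a stat() of each of ~500 jars on every compile.

```scala
import java.nio.file.{FileSystems, Path, StandardWatchEventKinds => Kinds}
import java.util.concurrent.TimeUnit
import scala.collection.JavaConverters._

// Tracks changes in one directory; a cache could drop the index for any jar
// whose name appears in the changed set.
final class DirChangeTracker(dir: Path) {
  private val watcher = FileSystems.getDefault.newWatchService()
  dir.register(watcher, Kinds.ENTRY_CREATE, Kinds.ENTRY_MODIFY, Kinds.ENTRY_DELETE)

  /** Names of entries changed since the last call, waiting up to `seconds`. */
  def changedWithin(seconds: Long): Set[String] = {
    val key = watcher.poll(seconds, TimeUnit.SECONDS)
    if (key == null) Set.empty
    else
      try key.pollEvents().asScala.map(_.context.toString).toSet
      finally key.reset() // re-arm the key so later events are delivered
  }
}
```

One caveat of this approach: on some platforms (notably macOS) the default WatchService falls back to polling, so event delivery can be slow.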

If we wait until the time that we need the content of the jar, then there is less opportunity to spin this work onto a background thread and scan the jar in advance. All of this time is not currently captured by the benchmarks in this PR, but it should benefit from parallelisation, particularly when jars are on physical media or NAS.

We have had some discussion outside this forum, and I can see 4 reuse models; these could be selected based on settings and a pattern match on the classpath element. This would enable a comparison of the different mechanisms in terms of performance, complexity, and behaviour in different environments:

  1. No reuse
  2. watcher based
  3. size & time based (jar/zip only)
  4. known static

4 is marginally faster and useful in non-interactive environments where there is no external modification, like CI and command-line builds, or when some of the jars are read-only (e.g. maven/ivy).

As the classpath implementation is simpler than what I started with, it should be simple to provide the cache validation mechanism as a mixin trait.
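The mixin idea for the reuse models above might look something like this (a sketch under invented names; the real design was still under discussion): each strategy answers whether a cached index for a classpath element can still be reused.

```scala
import java.nio.file.{Files, Path}

// One strategy per reuse model; the classpath element would mix in (or be
// configured with) whichever check the settings select.
sealed trait ReuseCheck { def stillValid(): Boolean }

/** Model 4: known static -- never re-validate (CI, batch builds, read-only maven/ivy jars). */
final class StaticCheck extends ReuseCheck { def stillValid() = true }

/** Model 3: size & last-modified based (jar/zip only); re-checked on each reuse. */
final class SizeAndTimeCheck(path: Path) extends ReuseCheck {
  private val size0  = Files.size(path)
  private val mtime0 = Files.getLastModifiedTime(path)
  def stillValid(): Boolean =
    Files.exists(path) &&
      Files.size(path) == size0 &&
      Files.getLastModifiedTime(path) == mtime0
}
```

Models 1 (no reuse) and 2 (watcher based) would be further implementations of the same trait, so the mechanisms can be swapped and compared without touching the classpath code itself.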

Any additions to this will have to wait for three weeks, as I will have very limited network access during that time.

@rorygraves rorygraves deleted the 2.12.x_classpath2 branch August 5, 2017 11:23
@mkeskells mkeskells restored the 2.12.x_classpath2 branch August 10, 2017 20:25
@mkeskells
Contributor Author

/rebuild

}
}
}
//class CachedLazyIndexedMapping[V <: Named] (miss : (String => Iterable[V]) )
Contributor


The commented-out code. Did you overlook it, or is it expected (the WIP PR)?

Contributor Author

@mkeskells mkeskells Aug 10, 2017


This was an experiment, trying to optimise an access pattern, but I think it will be out of scope in this PR. I will delete it when we get close to agreement on what is viable.

Contributor Author

@mkeskells mkeskells Aug 10, 2017


It will be removed in the next push. It was to get around the access of a Seq[File/class etc] by name, e.g. for findClass.

Replaced by an optimisation in PackageInfo:

class PackageInfo(val packageName: String, val list: ClassPathEntries, val files: Seq[FileEntryType], val packages: Seq[PackageEntryImpl]) {
  lazy val filesByName: Map[String, AbstractFile] = files.map {
    file => file.name -> file.file
  }(collection.breakOut)
}
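A simplified, self-contained illustration of the same idea (the types here are stand-ins for the PR's FileEntryType/AbstractFile): building the map once turns a name lookup into an O(1) hash access instead of an O(n) scan of the Seq.

```scala
// Stand-in for a classpath file entry (real code holds an AbstractFile).
final case class FileEntry(name: String, file: String)

final class PackageIndex(files: Seq[FileEntry]) {
  // Built lazily, once; subsequent findClass calls avoid re-scanning the Seq.
  lazy val filesByName: Map[String, String] =
    files.map(f => f.name -> f.file).toMap
  def findClass(name: String): Option[String] = filesByName.get(name)
}
```

The PR's version uses `collection.breakOut` (Scala 2.12) to build the Map without an intermediate collection; `.toMap` above is the plainer equivalent.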

@retronym
Member

Moving to 2.12.5 as Mike and I are still working on the design for these changes.

@SethTisue SethTisue modified the milestones: 2.12.5, 2.12.6 Feb 22, 2018
@SethTisue
Member

needs rebase. tentatively moving to 2.12.6 milestone?

@mkeskells
Contributor Author

I will focus on this (or maybe a replacement) now that the backend changes are merged

@SethTisue SethTisue removed this from the 2.12.6 milestone Mar 23, 2018
@SethTisue SethTisue added this to the 2.12.7 milestone Mar 23, 2018
@adriaanm
Contributor

@mkeskells: ping -- shall we close this in favor of the future replacement PR you mentioned? I'm trying to decrease the PR queue depth a bit

@SethTisue
Member

closing for now. once activity resumes we're happy to reopen or pursue in a new PR, as appropriate.

@SethTisue SethTisue closed this Jun 6, 2018
@SethTisue SethTisue removed this from the 2.12.7 milestone Jun 6, 2018
@mkeskells
Contributor Author

We can reuse this in the work that we are about to start, but the PR will not merge from here, so we can restart with a new one when we have a proposal.
