Skip to content

WIP New classpath and sourcepath implementation#3962

Closed
afitz wants to merge 62 commits intoscala:2.11.xfrom
afitz:megatron-0.1
Closed

WIP New classpath and sourcepath implementation#3962
afitz wants to merge 62 commits intoscala:2.11.xfrom
afitz:megatron-0.1

Conversation

@afitz
Copy link

@afitz afitz commented Sep 3, 2014

This is WIP branch pushed to consult it with Grzegorz Kossakowski.

Current implementation of classpath is not very efficient and it's quite abstract because it was created when e.g. .NET was supported. Therefore one of goals of new implementation is to improve efficiency by creation of dedicated classes for various file types characteristic for Scala with JVM backend - what allows us to create more efficient, more low-level operations. The second goal is to reduce memory consumption. To do that there's added caching for jar/zip files. So that we can reuse existing instances, when we create entries for the same files once again. Hopefully it will lead to significant reduction of used memory also in the case of e.g. ScalaPresentationCompilers.

One of the most important things took into account during creating this implementation was adding it as optional feature, which can be marked as experimental and turned on using flag. At the beginning users will use old classpath implementation by default and they will be able to use new implementation when:

  • they will want to test it (feedback)
  • they will want to gain benefits which it provides

There are a lot of TODOs, some unused classes, temporary (?) benchmarks and maybe not very pretty solutions and names, but it's something, what will be discussed with Grzegorz. Definitely we need more unit tests. Right now existing tests (including partest) and also other, manually executed tests pass.

THIS PROGRAM IS SUBJECT TO THE TERMS OF THE BSD 3-CLAUSE LICENSE.
IN ADDITION, THE FOLLOWING DISCLAIMER APPLIES IN CONNECTION WITH THIS PROGRAM:
THIS SOFTWARE IS LICENSED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE AND ANY WARRANTY OF NON-INFRINGEMENT, ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR
ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA,
OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING
IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. THIS SOFTWARE MAY BE REDISTRIBUTED TO OTHERS ONLY BY EFFECTIVELY USING THIS OR
ANOTHER EQUIVALENT DISCLAIMER IN ADDITION TO ANY OTHER REQUIRED LICENSE TERMS.

THE FOLLOWING DISCLAIMER APPLIES TO ALL SOFTWARE CODE AND OTHER MATERIALS CONTRIBUTED IN CONNECTION WITH THIS SOFTWARE:
THIS SOFTWARE IS LICENSED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS “AS IS” AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE AND ANY WARRANTY OF NON-INFRINGEMENT, ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR
ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA,
OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING
IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. THIS SOFTWARE MAY BE REDISTRIBUTED TO OTHERS ONLY BY EFFECTIVELY USING THIS OR
ANOTHER EQUIVALENT DISCLAIMER IN ADDITION TO ANY OTHER REQUIRED LICENSE TERMS.
ONLY THE SOFTWARE CODE AND OTHER MATERIALS CONTRIBUTED IN CONNECTION WITH THIS SOFTWARE, IF ANY, THAT ARE ATTACHED TO (OR OTHERWISE ACCOMPANY) THIS SUBMISSION (AND ORDINARY
COURSE CONTRIBUTIONS OF FUTURES PATCHES THERETO) ARE TO BE CONSIDERED A CONTRIBUTION. NO OTHER SOFTWARE CODE OR MATERIALS ARE A CONTRIBUTION.

If you look at the implementation of that method and its usage
its clear that it should have been named `findClassFile` from the
beginning because that's what it does: find a class file and
not a source file.
Introduced ClassfileLookup interface which allows us to remove
dependency on particular classpath implementation for class file
lookups.
Since .NET backend got removed this method is a no-op.
We need to use full name of the package if we use flat classpath.
It was removed temporarily to make sure that flatClasspath is being
used. The better way is to have implementation for both and guard
them with runtime assertion.

Also, switched to using flat classpath by default.
So it's not recomputed on every access.
It's not being used yet because we have to hook into PathResolver
and construct those objects for each classpath element.

Also, this is just to get us started with the whole idea of replacing
classpath implementation. I expect we'll get specialized versions
for JarFlatClasspath and DirectoryFlatClasspath that are optimized
for the fact that we are dealing with a jar file or a directory.
Extracted methods responsible for classpath element creation into
separate trait that can be implemented to create either recursive
(old classpath) or flat (new classpath) elements.
We don't need path-dependent entries. Actually, we want to aggregate
entries from several classpath elements.
By overriding a method and setting its type to Nothing we make it
impossible to override.
Implement various lookup methods in ZipArchive. Most of the logic
was already there so I just had to put things together.

One noticeable change is that directory entry is put under
s"{dirName}/" name instead of just dirName. That's required by
lookup methods and it was probably an omission.
So they get good-looking toString representation.
Bugs are:

  * filter out directories which are not valid package names
  * filter out files which are not valid class files
  * entries shouldn't be inner classes of AbstractFileFlatClasspath
    class because that messes up their equality (due to path-dependence)
  * not every classpath element contains given package hence
    getDirectory returns an Option[AbstractFile]
Implement AggragetFlatClasspath which aggregates a sequence
of other FlatClasspaths.

It's vastly inefficient but that's ok for now.
Before, we used WrappingFlatClasspath which would simply wrap an old
implementation of a classpath and expose it through FlatClasspath
interface.

In order to do that, we had to implement an alternative
PathResolver which resolves classpath elements to FlatClasspath
implementations.
Also, we traverse underlaying zip file only once and keep the
index (entries) in memory.
This way we can have jars (instead of class dirs) on a classpath
for SymbolTable tests.
This way we can compare the old and the new classpath implementation
without distorting the results by relying on caching in Java's
ZipFileIndex.
It turned out that going through Path is expensive because
it creates a new java.io.File and makes filesystem calls.
Moved that implementation so it's not needed in any object.
Add FlatClasspath implementation that delegates (a little bit poorly)
to javac's ZipFileIndex. That implementation of zip file index is
really fast due two reasons:

  * inherently it's written for performance, it populates the index
    in single, focused pass
  * it caches the index per jar file, so the next time we ask for
    the same jar we don't need to rescan it
It's a very poor method of measuring performance but it gives
some rough idea.
mpociecha added 15 commits July 30, 2014 16:49
By the way code related to hacks has been commented out to test whether it's really and still needed.
…ds like "scala foo.MyMain"

Added some mock, temporary things like source path
- scalap can work also using flat classpath
- Added simple apps to examine classpath implementations
- Reorganised entries
It works partially. In general there are problems with jar files (for example underlying file of jar loaded as ManifestResources is null so we gen NullPointerExceptions). Many tests in partest fail because of these problems. There are still some test things and a lot of TODOs.
…y containing both

There are still problems with zip/jar abstract files without real underlying file (it has to be implemented).
Also manual test commited lately still doesn't work for flat classpath due to problems with BooleanUtils.java from jar with sources of apache commons (symbol not found), although packages and entries processed by symbol loader are exactly the same as in the case of old classpath implementation.
There was really stupid bug - we always was deleting last 6 chars from file name instead of real extension
Also set caching as a default behaviour.
@scala-jenkins scala-jenkins added this to the 2.11.3 milestone Sep 3, 2014
@mpociecha
Copy link
Contributor

Regarding conflicts - I wasn't rebasing this branch for some time. I'm going to rebase it in the case of normal (not WIP) pull request.

Edit: Or, looking at all these messages from scala-jenkins, before next update.

@gkossakowski
Copy link
Contributor

I'll have a look at this next week.

@retronym
Copy link
Member

For future reference, it would be better to perform this review in a different repository until you are ready to submit the branch for testing. This PR took a big chunk of our build resources for a few hours as it has a large number of commits.

That said, I'm excited to see this work resuming!

@gkossakowski
Copy link
Contributor

Overall is going in the right direction. We definitely need to squash a lot of commits from this branch. The 4cdec6c and 2734bd7 are independent from the rest of this PR so it can be cherry picked and submitted as separate PR. Could you do that?

Please rebase the PR on top of latest 2.11.x branch and squash many of the commits that make sense to be squashed. You don't need to achieve perfect commits in one step, but squashing the obvious candidates (those that fix up some other commits) is a good start.

Lastly, what's the status of partest tests? How many of them are passing and how many of them are failing when flat classpath is enabled by default? Can you bootstrap the compiler with new classpath implementation enabled?

@mpociecha
Copy link
Contributor

As I wrote via mail some time ago: All tests from partest pass except these ones:
java 6
[partest] !! 1435 - run/t7634.scala [output differs]
1 tests failed

java 7
[partest] !! 570 - neg/t6289 [output differs]
[partest] !! 581 - run/repl-javap-app.scala [output differs]
2 tests failed

which are not related to my changes. They were failing on my machine with Windows even on master. And I have commit which is alread merged to 2.11.x and everything passed on Jenkins while I also had problems with above tests on my machine when running tests before creating this previous, already merged PR.

"Can you bootstrap the compiler with new classpath implementation enabled?"
I think so. Didn't I have this in added benchmarks? And doesn't partest do that?

@gkossakowski
Copy link
Contributor

That's great to hear that tests pass already!

No, partest doesn't bootstrap compiler. The way to test bootstrapping is to change default classpath implementation to flat and then run ant all.clean && ant. If compiler builds that way (using new classpath), then you are good to go.

@mpociecha
Copy link
Contributor

I ran this with this additional "&& ant" and it works as always (BUILD SUCCESSFUL).

Edit: Ah, it was just about running full clean and build after that. Yes, I was doing that many times and even made some changes to make it working.

@gkossakowski
Copy link
Contributor

Excellent news!

I think the next step would be cleaning up the PR as mentioned earlier.

@som-snytt
Copy link
Contributor

@mpociecha Does "works as always" mean the tests you said failed actually passed?

I'm asking because those tests should work on Windows.

@mpociecha
Copy link
Contributor

@som-snytt:
I would have to check it but it seems to me that ant all.clean && ant didn't run tests. I meant that the build works.
But in general I tested these tests on two computers and there were problems on both.

Edit: And I'm not saying that problem is related to Windows for sure - just wanted to mark that I use Windows (different OS than probably most of you use).

Edit2: https://groups.google.com/forum/#!topic/scala-internals/P7mv3wmTSfs

@gkossakowski
Copy link
Contributor

Closing this PR. A new iteration of it is coming shortly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants