WIP New classpath and sourcepath implementation#3962
WIP New classpath and sourcepath implementation#3962afitz wants to merge 62 commits intoscala:2.11.xfrom
Conversation
If you look at the implementation of that method and its usage its clear that it should have been named `findClassFile` from the beginning because that's what it does: find a class file and not a source file.
Introduced ClassfileLookup interface which allows us to remove dependency on particular classpath implementation for class file lookups.
Since .NET backend got removed this method is a no-op.
We need to use full name of the package if we use flat classpath.
It was removed temporarily to make sure that flatClasspath is being used. The better way is to have implementation for both and guard them with runtime assertion. Also, switched to using flat classpath by default.
So it's not recomputed on every access.
It's not being used yet because we have to hook into PathResolver and construct those objects for each classpath element. Also, this is just to get us started with the whole idea of replacing classpath implementation. I expect we'll get specialized versions for JarFlatClasspath and DirectoryFlatClasspath that are optimized for the fact that we are dealing with a jar file or a directory.
Extracted methods responsible for classpath element creation into separate trait that can be implemented to create either recursive (old classpath) or flat (new classpath) elements.
We don't need path-dependent entries. Actually, we want to aggregate entries from several classpath elements.
By overriding a method and setting its type to Nothing we make it impossible to override.
Implement various lookup methods in ZipArchive. Most of the logic
was already there so I just had to put things together.
One noticeable change is that directory entry is put under
s"{dirName}/" name instead of just dirName. That's required by
lookup methods and it was probably an omission.
So they get good-looking toString representation.
Bugs are:
* filter out directories which are not valid package names
* filter out files which are not valid class files
* entries shouldn't be inner classes of AbstractFileFlatClasspath
class because that messes up their equality (due to path-dependence)
* not every classpath element contains given package hence
getDirectory returns an Option[AbstractFile]
Implement AggragetFlatClasspath which aggregates a sequence of other FlatClasspaths. It's vastly inefficient but that's ok for now.
Before, we used WrappingFlatClasspath which would simply wrap an old implementation of a classpath and expose it through FlatClasspath interface. In order to do that, we had to implement an alternative PathResolver which resolves classpath elements to FlatClasspath implementations.
Also, we traverse underlaying zip file only once and keep the index (entries) in memory.
This way we can have jars (instead of class dirs) on a classpath for SymbolTable tests.
This way we can compare the old and the new classpath implementation without distorting the results by relying on caching in Java's ZipFileIndex.
It turned out that going through Path is expensive because it creates a new java.io.File and makes filesystem calls.
Moved that implementation so it's not needed in any object.
Add FlatClasspath implementation that delegates (a little bit poorly)
to javac's ZipFileIndex. That implementation of zip file index is
really fast due two reasons:
* inherently it's written for performance, it populates the index
in single, focused pass
* it caches the index per jar file, so the next time we ask for
the same jar we don't need to rescan it
It's a very poor method of measuring performance but it gives some rough idea.
…hing, rename Classpaths and Classfiles
By the way code related to hacks has been commented out to test whether it's really and still needed.
…ds like "scala foo.MyMain" Added some mock, temporary things like source path
- scalap can work also using flat classpath - Added simple apps to examine classpath implementations - Reorganised entries
It works partially. In general there are problems with jar files (for example underlying file of jar loaded as ManifestResources is null so we gen NullPointerExceptions). Many tests in partest fail because of these problems. There are still some test things and a lot of TODOs.
…y containing both There are still problems with zip/jar abstract files without real underlying file (it has to be implemented). Also manual test commited lately still doesn't work for flat classpath due to problems with BooleanUtils.java from jar with sources of apache commons (symbol not found), although packages and entries processed by symbol loader are exactly the same as in the case of old classpath implementation.
There was really stupid bug - we always was deleting last 6 chars from file name instead of real extension
Also set caching as a default behaviour.
|
Regarding conflicts - I wasn't rebasing this branch for some time. I'm going to rebase it in the case of normal (not WIP) pull request. Edit: Or, looking at all these messages from scala-jenkins, before next update. |
|
I'll have a look at this next week. |
|
For future reference, it would be better to perform this review in a different repository until you are ready to submit the branch for testing. This PR took a big chunk of our build resources for a few hours as it has a large number of commits. That said, I'm excited to see this work resuming! |
|
Overall is going in the right direction. We definitely need to squash a lot of commits from this branch. The 4cdec6c and 2734bd7 are independent from the rest of this PR so it can be cherry picked and submitted as separate PR. Could you do that? Please rebase the PR on top of latest 2.11.x branch and squash many of the commits that make sense to be squashed. You don't need to achieve perfect commits in one step, but squashing the obvious candidates (those that fix up some other commits) is a good start. Lastly, what's the status of partest tests? How many of them are passing and how many of them are failing when flat classpath is enabled by default? Can you bootstrap the compiler with new classpath implementation enabled? |
|
As I wrote via mail some time ago: All tests from partest pass except these ones: java 7 which are not related to my changes. They were failing on my machine with Windows even on master. And I have commit which is alread merged to 2.11.x and everything passed on Jenkins while I also had problems with above tests on my machine when running tests before creating this previous, already merged PR. "Can you bootstrap the compiler with new classpath implementation enabled?" |
|
That's great to hear that tests pass already! No, partest doesn't bootstrap compiler. The way to test bootstrapping is to change default classpath implementation to flat and then run |
|
I ran this with this additional "&& ant" and it works as always (BUILD SUCCESSFUL). Edit: Ah, it was just about running full clean and build after that. Yes, I was doing that many times and even made some changes to make it working. |
|
Excellent news! I think the next step would be cleaning up the PR as mentioned earlier. |
|
@mpociecha Does "works as always" mean the tests you said failed actually passed? I'm asking because those tests should work on Windows. |
|
@som-snytt: Edit: And I'm not saying that problem is related to Windows for sure - just wanted to mark that I use Windows (different OS than probably most of you use). Edit2: https://groups.google.com/forum/#!topic/scala-internals/P7mv3wmTSfs |
|
Closing this PR. A new iteration of it is coming shortly. |
This is WIP branch pushed to consult it with Grzegorz Kossakowski.
Current implementation of classpath is not very efficient and it's quite abstract because it was created when e.g. .NET was supported. Therefore one of goals of new implementation is to improve efficiency by creation of dedicated classes for various file types characteristic for Scala with JVM backend - what allows us to create more efficient, more low-level operations. The second goal is to reduce memory consumption. To do that there's added caching for jar/zip files. So that we can reuse existing instances, when we create entries for the same files once again. Hopefully it will lead to significant reduction of used memory also in the case of e.g. ScalaPresentationCompilers.
One of the most important things took into account during creating this implementation was adding it as optional feature, which can be marked as experimental and turned on using flag. At the beginning users will use old classpath implementation by default and they will be able to use new implementation when:
There are a lot of TODOs, some unused classes, temporary (?) benchmarks and maybe not very pretty solutions and names, but it's something, what will be discussed with Grzegorz. Definitely we need more unit tests. Right now existing tests (including partest) and also other, manually executed tests pass.
THIS PROGRAM IS SUBJECT TO THE TERMS OF THE BSD 3-CLAUSE LICENSE.
IN ADDITION, THE FOLLOWING DISCLAIMER APPLIES IN CONNECTION WITH THIS PROGRAM:
THIS SOFTWARE IS LICENSED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE AND ANY WARRANTY OF NON-INFRINGEMENT, ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR
ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA,
OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING
IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. THIS SOFTWARE MAY BE REDISTRIBUTED TO OTHERS ONLY BY EFFECTIVELY USING THIS OR
ANOTHER EQUIVALENT DISCLAIMER IN ADDITION TO ANY OTHER REQUIRED LICENSE TERMS.
THE FOLLOWING DISCLAIMER APPLIES TO ALL SOFTWARE CODE AND OTHER MATERIALS CONTRIBUTED IN CONNECTION WITH THIS SOFTWARE:
THIS SOFTWARE IS LICENSED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS “AS IS” AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE AND ANY WARRANTY OF NON-INFRINGEMENT, ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR
ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA,
OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING
IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. THIS SOFTWARE MAY BE REDISTRIBUTED TO OTHERS ONLY BY EFFECTIVELY USING THIS OR
ANOTHER EQUIVALENT DISCLAIMER IN ADDITION TO ANY OTHER REQUIRED LICENSE TERMS.
ONLY THE SOFTWARE CODE AND OTHER MATERIALS CONTRIBUTED IN CONNECTION WITH THIS SOFTWARE, IF ANY, THAT ARE ATTACHED TO (OR OTHERWISE ACCOMPANY) THIS SUBMISSION (AND ORDINARY
COURSE CONTRIBUTIONS OF FUTURES PATCHES THERETO) ARE TO BE CONSIDERED A CONTRIBUTION. NO OTHER SOFTWARE CODE OR MATERIALS ARE A CONTRIBUTION.