Initial rewrite of MMapDirectory for JDK-16 preview (incubating) Panama APIs (>= JDK-16-ea-b32) #173
uschindler wants to merge 38 commits into apache:main from draft/jdk-foreign-mmap
Conversation
I moved this old pull request from apache/lucene-solr#2176 to the Lucene repository:
All tests still pass. The new Policeman Jenkins jobs are https://jenkins.thetaphi.de/view/Lucene/job/Lucene-jdk16panama-Linux/ (Linux) and https://jenkins.thetaphi.de/view/Lucene/job/Lucene-jdk16panama-Windows/ (Windows).
The JDK 17 version is now here: #177
I know this is early days, but I want to clarify: is the plan to work on this and #177 in parallel until we know which is the more sustainable option, or to abandon this one altogether with the expectation that JDK 17 will be better? I'm going to go through the pain of setting one up, but probably cannot do both.
I don't expect either of them to be in a stable release of Lucene yet. The version for JDK 16 (this one) is here for reference only. It has performance problems, because JDK 16 was not able to optimize loops with 64-bit indexes. With JDK 17 this should be better, but I wasn't able to test that with #177. So I'd recommend doing performance tests with #177.

Hopefully at some point this will land in the JDK so we can officially use it. The problem is that currently every major release changes the API in significant ways, so there is no way to include it in official builds. It also won't work without command-line settings, so if you'd like to use it with software like Solr or Elasticsearch, you would need to change the startup scripts, too. MR-JARs don't help, because MR-JARs resolve class files based on a minimal version, and you can't say "only use this class for JDK 16".

A plan might be (as this is quite isolated) to create a separate GitHub project with just the directory implementation, so it can be downloaded as a separate JAR file and included into projects, possibly with a DirectoryFactory for Solr or a similar plugin for Elasticsearch. My time is a bit limited at the moment, but that's obviously the best way to go. The setup as a draft pull request with hacked code inside Lucene was mainly done to run all Lucene tests easily against it and to compare performance with the old MMapDirectory.
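To illustrate the MR-JAR limitation mentioned above: a multi-release JAR selects a versioned class for the stated JDK *and every newer one*, so a hypothetical layout like the following would also be picked up by JDK 17+, where the incubating API has changed incompatibly (paths are illustrative only, not from this PR):

```
my-mmap.jar
├── org/apache/lucene/store/MMapDirectory.class          (fallback, ByteBuffer-based)
└── META-INF/versions/16/
    └── org/apache/lucene/store/MMapDirectory.class      (MemorySegment-based, only valid on JDK 16)
```

There is no way to express "use this class on JDK 16 but fall back again on JDK 17" in the MR-JAR mechanism.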
so cool!
This PR is no longer maintained!
INFO: This is a clone/update of apache/lucene-solr#2176 (for more detailed discussion see this old PR from the Lucene/Solr combined repository)
This is just a draft PR to give a first insight into memory mapping improvements in JDK 16+.
Some background information: Starting with JDK 14, there is a new incubating module "jdk.incubator.foreign" that has a new, not yet stable API for accessing off-heap memory (later it will also support calling functions located in libraries like .so or .dll files, using classical MethodHandles). This incubator module has gone through several versions.
This module more or less overcomes several problems:
With the legacy ByteBuffer API there is no official way to unmap; Lucene can only use `sun.misc.Unsafe` and forcefully unmap segments, but any IndexInput accessing the file from another thread will crash the JVM with SIGSEGV or SIGBUS. We learned to live with that and we happily apply the unsafe unmapping, but that's the main issue. @uschindler had many discussions with the team at OpenJDK, and finally, with the third incubator, we have an API that works with Lucene. They were very fruitful discussions (thanks to @mcimadamore!).
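As a stand-alone sketch of the kind of "forceful unmap" hack described above (not Lucene's actual code, which is more careful about guarding concurrent access), `sun.misc.Unsafe#invokeCleaner` frees the off-heap memory behind a direct or mapped ByteBuffer immediately:

```java
import java.lang.reflect.Field;
import java.nio.ByteBuffer;

public class UnmapHack {
  // Frees the native memory behind a direct/mapped ByteBuffer right away,
  // instead of waiting for GC. Any thread that still reads the buffer
  // afterwards can crash the JVM with SIGSEGV/SIGBUS -- this is exactly
  // the unsafety discussed above.
  public static void unmap(ByteBuffer directBuffer) throws Exception {
    Field f = sun.misc.Unsafe.class.getDeclaredField("theUnsafe");
    f.setAccessible(true);
    sun.misc.Unsafe unsafe = (sun.misc.Unsafe) f.get(null);
    unsafe.invokeCleaner(directBuffer);
  }

  public static void main(String[] args) throws Exception {
    ByteBuffer buf = ByteBuffer.allocateDirect(1024);
    unmap(buf); // memory is gone now; touching buf afterwards is undefined behavior
    System.out.println("unmapped");
  }
}
```

Note that `invokeCleaner` only accepts a buffer that owns its memory (not slices or duplicates), which is one reason this hack is so fragile in practice.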
With the third incubator we are now finally able to do some tests (especially performance). As this is an incubating module, this PR first changes the build system a bit:
- It disables `-Werror` for `:lucene:core`
- It adds `--add-modules jdk.incubator.foreign` to the compiler arguments of `:lucene:core` and enables it for all test builds. This is important, as you have to pass `--add-modules jdk.incubator.foreign` also at runtime!

The code basically just modifies
`MMapDirectory` to use LONG instead of INT for the chunk size parameter. In addition, it adds `MemorySegmentIndexInput`, which is a copy of our `ByteBufferIndexInput` (still there, but unused) that uses MemorySegment instead of ByteBuffer behind the scenes. It works in exactly the same way; only the try/catch blocks for supporting EOFException or moving to another segment were rewritten.

The openInput code uses
`MemorySegment.mapFile()` to get a memory mapping. This method is unfortunately a bit buggy in JDK-16-ea-b30, so I added some workarounds. See these JDK issues: https://bugs.openjdk.java.net/browse/JDK-8259027, https://bugs.openjdk.java.net/browse/JDK-8259028, https://bugs.openjdk.java.net/browse/JDK-8259032, https://bugs.openjdk.java.net/browse/JDK-8259034. The bugs with alignment and zero-byte mmaps are fixed in b32, and this PR was adapted (hacks removed).

It passes all tests, and it looks like you can use it to read indexes. The default chunk size is now 16 GiB (but you can raise or lower it as you like; tests are doing this). Of course you can set it to Long.MAX_VALUE; in that case every index file is always mapped into one big memory mapping. My testing on Windows 10 has shown that this is not a good idea! Huge mappings fragment the address space over time, and as we can only use something like 43 or 46 bits (depending on OS), the fragmentation will at some point kill you. So 16 GiB looks like a good compromise: most files will be smaller than 6 GiB anyway (unless you optimize your index down to one huge segment). For most Lucene installations the number of segments will then equal the number of open files, so heavy consumers like Elasticsearch users will be very happy. The sysctl max_map_count may not need to be touched anymore.
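For intuition, the multi-segment addressing described above boils down to power-of-two chunk arithmetic. This is a simplified, hypothetical sketch (names like `ChunkMath` are mine, not the PR's actual code):

```java
public class ChunkMath {
  // A file is split into mmap chunks of size 2^chunkSizePower bytes;
  // e.g. chunkSizePower = 34 gives the 16 GiB default mentioned above.
  final int chunkSizePower;
  final long chunkSizeMask;

  ChunkMath(int chunkSizePower) {
    this.chunkSizePower = chunkSizePower;
    this.chunkSizeMask = (1L << chunkSizePower) - 1;
  }

  // Which mapping a 64-bit file position falls into.
  int segmentIndex(long pos) {
    return (int) (pos >>> chunkSizePower);
  }

  // Offset of that position inside its mapping.
  long segmentOffset(long pos) {
    return pos & chunkSizeMask;
  }

  // How many mappings are needed to cover the whole file.
  static int numSegments(long fileLength, int chunkSizePower) {
    return (int) ((fileLength + (1L << chunkSizePower) - 1) >>> chunkSizePower);
  }
}
```

With 16 GiB chunks, a file up to 16 GiB needs exactly one mapping, which is why most index files end up as a single segment.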
In addition, this implements `readLongs` in a better way than @jpountz did (no caching of arbitrary objects). Nevertheless, as the new MemorySegment API relies on final, unmodifiable classes, and copying memory from a MemorySegment to an on-heap Java array requires us to wrap all those arrays in a MemorySegment each time (e.g. in `readBytes()` or `readLongs`), there may be some overhead due to short-lived object allocations (those are NOT reusable!). In short: in the future we should move away from copying/loading our stuff to heap, and maybe throw away IndexInput completely and base our code fully on random access. The new foreign-vector APIs will in future also be written with MemorySegment in focus. So you can allocate a vector view on a MemorySegment and let the vectorizer fully work outside the Java heap, inside our mmapped files! :-)

It would be good if you could check out this branch and try it in production.
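For comparison, the existing ByteBuffer-based code path can bulk-copy little-endian longs into a heap array without allocating a wrapper per call, roughly like this (a simplified sketch, not `ByteBufferIndexInput`'s actual code; with MemorySegment, the destination `long[]` would additionally need to be wrapped in a fresh segment for every copy, which is the allocation overhead discussed above):

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.LongBuffer;

public class ReadLongsSketch {
  // Bulk-read `len` little-endian longs starting at byte position `pos`
  // from a (possibly mapped) ByteBuffer into a heap array.
  public static void readLongs(ByteBuffer bb, int pos, long[] dst, int off, int len) {
    LongBuffer view = bb.duplicate()
        .order(ByteOrder.LITTLE_ENDIAN)  // order is not inherited by duplicate()
        .position(pos)
        .asLongBuffer();
    view.get(dst, off, len);             // single bulk copy, no per-call wrappers
  }
}
```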
But be aware:

- …point `JAVA_HOME` to it
- …point `RUNTIME_JAVA_HOME` to it
- …add `--add-modules jdk.incubator.foreign` to the command line of your Java program / Solr server / Elasticsearch server

It would be good to get some benchmarks, especially by @rmuir or @mikemccand. Take your time and enjoy the complexity of setting this up! ;-)
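As a config fragment, the runtime flag from the list above would look roughly like this (jar names and main class are placeholders):

```shell
# The incubator module must be resolved at startup, otherwise the
# MemorySegment classes cannot be loaded at runtime.
java --add-modules jdk.incubator.foreign \
     -cp lucene-core.jar:myapp.jar com.example.SearchApp
```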
My plan is the following:
In addition, there are some comments in the code talking about safety (e.g., we need `IOUtils.close()` to take `AutoCloseable` instead of just `Closeable`, so we can also enforce that all memory segments are closed after usage).

In addition, by default all VarHandles are aligned: the API refuses to read a LONG from an address which is not a multiple of 8. I had to disable this feature, as all our index files are heavily unaligned. We should in the meantime not only convert our files to little endian, but also make all non-compressed types (like `long[]` arrays or non-encoded integers) aligned to the correct boundaries in files. The most horrible thing I have seen is that our CFS file format starts the "inner" files totally unaligned. We should fix the CFSWriter to start new files always at multiples of 8 bytes. I will open an issue about this.
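The alignment fix suggested for CFSWriter amounts to simple padding arithmetic; a hypothetical sketch (names are mine):

```java
public class AlignSketch {
  // Bytes of padding needed so that `filePointer` becomes a multiple of 8,
  // which would allow aligned VarHandle reads of longs at that position.
  public static long paddingTo8(long filePointer) {
    return (8 - (filePointer & 7)) & 7;
  }
}
```

A writer would emit that many zero bytes before starting each inner file, so every inner file begins on an 8-byte boundary.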