Initial rewrite of MMapDirectory for JDK-16 preview (incubating) Panama APIs (>= JDK-16-ea-b32) by uschindler · Pull Request #173 · apache/lucene

uschindler · 2021-06-07T12:06:28Z

INFO: This is a clone/update of apache/lucene-solr#2176 (for more detailed discussion see this old PR from the Lucene/Solr combined repository)

This is just a draft PR to give a first look at memory mapping improvements in JDK 16+.

Some background information: Starting with JDK-14, there is a new incubating module "jdk.incubator.foreign" that provides a new, not yet stable API for accessing off-heap memory (later it will also support calling functions located in native libraries such as .so or .dll files through classical MethodHandles). This incubator module has gone through several versions:

first version: https://openjdk.java.net/jeps/370 (slow, very buggy, and thread-confined, which made it unusable with Lucene)
second version: https://openjdk.java.net/jeps/383 (still thread-confined, but now allows transfer of "ownership" to other threads; this is still impossible to use with Lucene)
third version in JDK 16: https://openjdk.java.net/jeps/393 (this version includes "Support for shared segments"). This now allows us to safely use the same external mmapped memory from different threads and also unmap it!

This module more or less overcomes several problems:

The ByteBuffer API is limited to 32 bit (in fact MMapDirectory has to chunk in 1 GiB portions)
There is no official way to unmap ByteBuffers when the file is no longer used. There is a way to use sun.misc.Unsafe and forcefully unmap segments, but any IndexInput accessing the file from another thread will then crash the JVM with SIGSEGV or SIGBUS. We learned to live with that and we happily apply the unsafe unmapping, but that's the main issue.
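To make the chunking limitation concrete, here is a minimal sketch of how a 64-bit file position can be split into a chunk index and an offset inside that chunk when each mapping covers 1 GiB. The class and method names are hypothetical and only illustrate the arithmetic; they are not taken from MMapDirectory.

```java
/** Hypothetical helper: splits a 64-bit file position into (chunk, offset) for 1 GiB chunks. */
final class ChunkAddressing {
  // 2^30 bytes = 1 GiB per chunk, small enough for an int-indexed ByteBuffer.
  static final int CHUNK_SIZE_POWER = 30;
  static final long CHUNK_SIZE_MASK = (1L << CHUNK_SIZE_POWER) - 1L;

  /** Index of the chunk that contains the given absolute position. */
  static int chunkIndex(long pos) {
    return (int) (pos >>> CHUNK_SIZE_POWER);
  }

  /** Offset of the given absolute position inside its chunk. */
  static int offsetInChunk(long pos) {
    return (int) (pos & CHUNK_SIZE_MASK);
  }

  public static void main(String[] args) {
    long pos = 5L * (1L << 30) + 123L; // 123 bytes into the sixth chunk
    System.out.println("chunk=" + chunkIndex(pos) + " offset=" + offsetInChunk(pos));
  }
}
```

Every read that crosses a chunk boundary has to handle the switch to the next buffer, which is the bookkeeping a single 64-bit addressable segment would avoid.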

@uschindler had many discussions with the team at OpenJDK, and with the third incubator we finally have an API that works with Lucene. These were very fruitful discussions (thanks to @mcimadamore!)

With the third incubator we are now finally able to do some tests (especially performance tests). As this is an incubating module, this PR first changes the build system a bit:

disable -Werror for :lucene:core
add the incubating module to the compiler of :lucene:core and enable it for all test builds. This is important, as you also have to pass --add-modules jdk.incubator.foreign at runtime!

The code basically just modifies MMapDirectory to use LONG instead of INT for the chunk size parameter. In addition it adds MemorySegmentIndexInput that is a copy of our ByteBufferIndexInput (still there, but unused), but using MemorySegment instead of ByteBuffer behind the scenes. It works in exactly the same way, just the try/catch blocks for supporting EOFException or moving to another segment were rewritten.
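The try/catch pattern mentioned above can be sketched as follows. This is a simplified illustration that uses plain heap ByteBuffers instead of MemorySegment so that it runs on any JDK; the class name and structure are hypothetical and are not the code from this PR.

```java
import java.io.EOFException;
import java.nio.BufferUnderflowException;
import java.nio.ByteBuffer;

/** Hypothetical, simplified reader over several chunks; not Lucene's actual class. */
final class ChunkedReader {
  private final ByteBuffer[] chunks;
  private int curIndex = 0;
  private ByteBuffer cur;

  ChunkedReader(ByteBuffer... chunks) {
    this.chunks = chunks;
    this.cur = chunks.length == 0 ? ByteBuffer.allocate(0) : chunks[0];
  }

  byte readByte() throws EOFException {
    try {
      // Fast path: no explicit bounds check, the buffer checks for us.
      return cur.get();
    } catch (BufferUnderflowException e) {
      // Slow path: move forward to the next chunk that still has bytes.
      do {
        curIndex++;
        if (curIndex >= chunks.length) {
          throw new EOFException("read past EOF");
        }
        cur = chunks[curIndex];
      } while (!cur.hasRemaining());
      return cur.get();
    }
  }

  public static void main(String[] args) throws EOFException {
    ChunkedReader in =
        new ChunkedReader(ByteBuffer.wrap(new byte[] {1, 2}), ByteBuffer.wrap(new byte[] {3}));
    for (int i = 0; i < 3; i++) {
      System.out.println(in.readByte());
    }
  }
}
```

The point of the design is that the common case pays for no extra bounds check; the exception handler only runs at chunk boundaries or at the end of the file.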

The openInput code uses MemorySegment.mapFile() to get a memory mapping. This method is unfortunately a bit buggy in JDK-16-ea-b30, so I added some workarounds. See JDK issues: https://bugs.openjdk.java.net/browse/JDK-8259027, https://bugs.openjdk.java.net/browse/JDK-8259028, https://bugs.openjdk.java.net/browse/JDK-8259032, https://bugs.openjdk.java.net/browse/JDK-8259034. The bugs with alignment and zero-byte mmaps are fixed in b32, and this PR was adapted accordingly (hacks removed).

It passes all tests and it looks like you can use it to read indexes. The default chunk size is now 16 GiB (but you can raise or lower it as you like; tests are doing this). Of course you can set it to Long.MAX_VALUE; in that case every index file is always mapped to one big memory mapping. My testing with Windows 10 has shown that this is not a good idea! Huge mappings fragment the address space over time, and as we can only use about 43 or 46 bits (depending on the OS), the fragmentation will at some point kill you. So 16 GiB looks like a good compromise: most files will be smaller than 6 GiB anyway (unless you optimize your index to one huge segment). So for most Lucene installations, the number of memory segments will equal the number of open files, which should make large Elasticsearch users very happy. The sysctl max_map_count may not need to be touched anymore.
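As a rough illustration of why a larger chunk size reduces the number of mappings, the following sketch computes how many chunks a file of a given length needs. It uses a plain ceiling division; the helper is hypothetical and MMapDirectory's exact formula may differ.

```java
/** Hypothetical helper: number of mappings needed for a file at a given chunk size. */
final class MappingCount {
  /** Ceiling of fileLength / chunkSize, written so that it cannot overflow. */
  static long mappingsFor(long fileLength, long chunkSize) {
    if (fileLength < 0 || chunkSize <= 0) {
      throw new IllegalArgumentException("fileLength must be >= 0 and chunkSize > 0");
    }
    return fileLength / chunkSize + (fileLength % chunkSize == 0 ? 0 : 1);
  }

  public static void main(String[] args) {
    long gib = 1L << 30;
    long fileLength = 40 * gib; // an assumed 40 GiB index file
    System.out.println("1 GiB chunks:  " + mappingsFor(fileLength, gib));
    System.out.println("16 GiB chunks: " + mappingsFor(fileLength, 16 * gib));
    System.out.println("one mapping:   " + mappingsFor(fileLength, Long.MAX_VALUE));
  }
}
```

With files below the chunk size, each open file needs exactly one mapping, which is why the mapping count then tracks the number of open files.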

In addition, this implements readLongs in a better way than @jpountz did (no caching or arbitrary objects). Nevertheless, because the new MemorySegment API relies on final, unmodifiable classes, copying memory from a MemorySegment to an on-heap Java array requires us to wrap that array in a MemorySegment each time (e.g. in readBytes() or readLongs()), so there may be some overhead due to short-lived object allocations (those are NOT reusable!). In short: in future we should move away from copying/loading our stuff to heap, and maybe throw away IndexInput completely and base our code fully on random access. The new foreign-vector APIs will in future also be written with MemorySegment as their focus. So you can allocate a vector view on a MemorySegment and let the vectorizer work fully outside the Java heap, inside our mmapped files! :-)

It would be good if you could check out this branch and try it in production.

But be aware:

You need JDK 11 to run Gradle (set JAVA_HOME to it)
You need JDK 16-ea-b32 (set RUNTIME_JAVA_HOME to it)
The lucene-core.jar will be JDK16 class files and requires JDK-16 to execute.
Also you need to add --add-modules jdk.incubator.foreign to the command line of your Java program/Solr server/Elasticsearch server
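A quick way to verify the runtime setup is to ask the boot module layer whether the incubator module was resolved. The sketch below uses only standard java.lang APIs (ModuleLayer, Runtime.version()); the class name is hypothetical. Incubator modules are not resolved by default, so without --add-modules jdk.incubator.foreign the check should report false.

```java
/** Hypothetical startup check for the JDK version and the incubator module. */
final class IncubatorCheck {
  /** Returns true if the named module was resolved into the boot layer. */
  static boolean isModuleAvailable(String moduleName) {
    return ModuleLayer.boot().findModule(moduleName).isPresent();
  }

  public static void main(String[] args) {
    System.out.println("Java feature version: " + Runtime.version().feature());
    System.out.println(
        "jdk.incubator.foreign resolved: " + isModuleAvailable("jdk.incubator.foreign"));
  }
}
```

Running it as `java --add-modules jdk.incubator.foreign IncubatorCheck.java` on a JDK that ships the module should flip the second line to true.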

It would be good to get some benchmarks, especially by @rmuir or @mikemccand. Take your time and enjoy the complexity of setting this up! ;-)

My plan is the following:

report any bugs or slowness, especially with Hotspot optimizations. The last time I talked to Maurizio, he talked about Hotspot not being able to fully optimize for-loops with long instead of int, so it may take some time until the full performance is there.
wait until the final version of project PANAMA-foreign goes into Java's Core Library (no module needed anymore)
add a MR-JAR for lucene-core.jar and compile the MemorySegmentIndexInput and maybe some helper classes with JDK 17/18/19 (hopefully?).

In addition, there are some comments in the code about safety (e.g., we need an IOUtils.close() that takes AutoCloseable instead of just Closeable, so we can also enforce that all memory segments are closed after usage). Also, by default all VarHandles are aligned: the API refuses to read a LONG from an address that is not a multiple of 8. I had to disable this feature, as all our index files are heavily unaligned. In the meantime we should not only convert our files to little endian, but also make sure all non-compressed types (like long[] arrays or non-encoded integers) are aligned to the correct boundaries in files. The most horrible thing I have seen is that our CFS file format starts the "inner" files totally unaligned. We should fix the CFSWriter to always start new files at multiples of 8 bytes. I will open an issue about this.
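The alignment rule proposed for the CFSWriter comes down to rounding each start offset up to the next multiple of 8. A minimal sketch of that arithmetic, with hypothetical names and no claim about how the writer would actually be changed:

```java
/** Hypothetical helper: rounds file offsets up to a power-of-two boundary. */
final class Alignment {
  /** Smallest value >= offset that is a multiple of alignment (a power of two). */
  static long alignUp(long offset, int alignment) {
    if (offset < 0 || alignment <= 0 || Integer.bitCount(alignment) != 1) {
      throw new IllegalArgumentException("offset must be >= 0, alignment a power of two");
    }
    return (offset + alignment - 1L) & ~(alignment - 1L);
  }

  /** True if offset is a multiple of alignment (a power of two). */
  static boolean isAligned(long offset, int alignment) {
    return (offset & (alignment - 1L)) == 0;
  }

  public static void main(String[] args) {
    long innerFileStart = 13; // an unaligned start offset inside a compound file
    long padded = alignUp(innerFileStart, Long.BYTES);
    System.out.println(innerFileStart + " -> " + padded
        + " (padding " + (padded - innerFileStart) + " bytes)");
  }
}
```

A writer that pads to these boundaries wastes at most 7 bytes per inner file, and in return a long at a multiple-of-8 position within that file is also 8-byte aligned in the mapping.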

uschindler · 2021-06-07T12:11:21Z

I moved this old pull request from apache/lucene-solr#2176 to the Lucene repository:

Removed the changes in Solr
Updated to Little Endian (see LUCENE-9047: Move the Directory APIs to be little endian (take 2) #107, LUCENE-9047)

All tests still pass. The new Policeman Jenkins jobs are: https://jenkins.thetaphi.de/view/Lucene/job/Lucene-jdk16panama-Linux/ (Linux), https://jenkins.thetaphi.de/view/Lucene/job/Lucene-jdk16panama-Windows/ (Windows)

uschindler · 2021-06-08T11:27:03Z

The JDK 17 version is now here: #177

MarcusSorealheis · 2021-06-10T21:21:01Z

I know this is early days but want to clarify:

Is the plan to work on this and #177 in parallel until we know which is the more sustainable option, or abandon this one altogether with expectations that JDK 17 will be better? I'm going to go through the pain of setting one up but probably cannot do both.

uschindler · 2021-06-11T14:15:39Z

Is the plan to work on this and #177 in parallel until we know which is the more sustainable option, or abandon this one altogether with expectations that JDK 17 will be better? I'm going to go through the pain of setting one up but probably cannot do both.

I wouldn't expect either of them to be in a stable release of Lucene yet. The version for 16 (this one) is here for reference only. It has performance problems, because JDK 16 was not able to optimize loops with 64-bit (long) counters. With JDK 17 this should be better, but I wasn't able to test this with #177. So I'd recommend doing performance tests with #177.

Hopefully at some point this will land in the JDK so we can officially use it. The problem currently is that every major release changes the API in significant ways, so there is no way to include it in official builds. It also won't work without command line settings, so if you'd like to use it with software like Solr or Elasticsearch, you would need to change startup scripts, too. MR-JARs don't help, because MR-JARs resolve class files based on a minimum version and you can't make it "only use this class for JDK 16".
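The MR-JAR limitation can be illustrated with a simplified model of the multi-release lookup rule (JEP 238): the runtime uses the highest versioned directory that is not newer than its own feature version, and falls back to the JAR root otherwise. The class below is a hypothetical model for illustration, not an API of the JDK.

```java
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

/** Hypothetical model of how a multi-release JAR picks a versioned class file. */
final class MrJarLookup {
  /**
   * Returns the META-INF/versions/N directory that would be used for a class,
   * or 0 if the base entry in the JAR root is used.
   */
  static int selectVersion(int runtimeFeature, NavigableSet<Integer> versionedDirs) {
    Integer best = versionedDirs.floor(runtimeFeature); // highest N <= runtime version
    return (best == null || best < 9) ? 0 : best; // versioned entries start at 9
  }

  public static void main(String[] args) {
    NavigableSet<Integer> dirs = new TreeSet<>(List.of(16));
    for (int runtime : new int[] {15, 16, 17}) {
      System.out.println("JDK " + runtime + " -> versions/" + selectVersion(runtime, dirs));
    }
  }
}
```

In this model a class placed under versions/16 is also selected on JDK 17, where the incubating API has already changed, which is exactly the "only use this class for JDK 16" case that cannot be expressed.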

A plan might be (as this is quite isolated) to create a separate GitHub project with just the directory implementation, so it can be downloaded as a separate JAR file and included in projects, possibly with a DirectoryFactory for Solr or a similar plugin for Elasticsearch. My time is a bit limited at the moment, but that's obviously the best way to go. The setup as a draft pull request with hacked code inside Lucene was mainly done to run all Lucene tests easily against it and to compare performance with the old MMapDirectory.

xiaoshi2013 · 2021-09-08T16:00:26Z

so cool!

uschindler · 2022-02-14T17:15:40Z

This PR is no longer maintained!

uschindler added 23 commits January 2, 2021 14:25

Initial state of new jdk-foreign MMAP API

190a853

Workaround to prevent incorrect test files from being executed (copie…

00d01a7

…d from ANT build)

Fix the remaining TODOs: make sure we unmap all segments if exception…

22c3c4b

…s occur! Remove useless slicing if aligned.

Cleanup code duplication mess exception handling and rename all remai…

f9ca335

…ning "buffer" to "segment"; also make the segments array final (curSegment == null when closed)

add missing ensureOpen() as NPE can't happen here

1a8a354

Cleanup messy duplicate methods

27fce4f

Add workaround for JDK-8259028

efcfccc

Make the JVM crush detector ready for heavy prime time!

8ee976a

Remove incorrect assert (won't work if page size is used like on linux)

fed48bd

Apply @dweiss improvement

50d9300

Merge branch 'master' into draft/jdk-foreign-mmap

8dd5d90

Add readLEFloats() introduced by LUCENE-9652 / apache#2175

0245d3f

Improve test to allow the following exception: "java.lang.IllegalStat…

ea188c1

…eException: Cannot close while another thread is accessing the segment"

Add a new interface to Lucene's core to mark classes which are wrappi…

01aca07

…ng objects to extend their functionality (like asserting in tests)

Split and rewrite getBytes() and remove useless try-with-resources (h…

60200e8

…eap segments don't need this)

Add static final boolean IS_LITTLE_ENDIAN and cleanup if statements

ba61072

Merge branch 'master' into draft/jdk-foreign-mmap

ec304ab

Remove hacks: JDK-16 EA b32 has now fixed the horrible bugs with zero…

5edcdf4

… length mappings and offsets

Merge branch 'master' into draft/jdk-foreign-mmap

b0eec7a

Improve close method to also null out the segments, so positional API…

7a3cf53

… can correctly throw AlreadyClosedEx; TODO: add a test

Merge branch 'master' into draft/jdk-foreign-mmap

d2c0be5

Merge branch 'draft/jdk-foreign-mmap' of ../lusolr into draft/jdk-for…

5542f2c

…eign-mmap

Update to little endian (LUCENE-9047)

e76bdc0

uschindler self-assigned this Jun 7, 2021

uschindler marked this pull request as draft June 7, 2021 12:06

uschindler added the enhancement label Jun 7, 2021

uschindler requested review from dweiss, jpountz and msokolov June 7, 2021 12:12

uschindler mentioned this pull request Jun 8, 2021

Initial rewrite of MMapDirectory for JDK-17 preview (incubating) Panama APIs (>= JDK-17-ea-b25) #177

Closed

Disable test which does not work without extra JVM options

0764f4c

uschindler added 2 commits June 14, 2021 11:47

Merge branch 'main' of https://gitbox.apache.org/repos/asf/lucene int…

dfad2a3

…o draft/jdk-foreign-mmap

Rename typo in Unwrappable

7a9a536

# Conflicts: # lucene/core/src/java/org/apache/lucene/store/MMapDirectory.java

uschindler closed this Aug 6, 2021

uschindler deleted the draft/jdk-foreign-mmap branch August 6, 2021 16:43

uschindler restored the draft/jdk-foreign-mmap branch August 6, 2021 16:45

uschindler reopened this Aug 6, 2021

uschindler mentioned this pull request Oct 8, 2021

LUCENE-10158: Add a new interface Unwrappable to the utils package to ease migration to new MMAPDirectory and its testing #369

Merged

uschindler and others added 3 commits October 11, 2021 00:33

Merge branch 'main' of https://gitbox.apache.org/repos/asf/lucene int…

523d0c1

…o draft/jdk-foreign-mmap # Conflicts: # gradle/testing/defaults-tests.gradle # lucene/core/src/java/org/apache/lucene/store/MMapDirectory.java # lucene/core/src/java/org/apache/lucene/util/Unwrappable.java

Merge branch 'apache:main' into draft/jdk-foreign-mmap

949e720

Merge branch 'apache:main' into draft/jdk-foreign-mmap

b8ed0a5

uschindler mentioned this pull request Dec 5, 2021

Initial rewrite of MMapDirectory for JDK-18 preview (incubating) Panama APIs (>= JDK-18-ea-b26) #518

Closed

uschindler and others added 7 commits December 20, 2021 13:19

Merge branch 'main' of https://gitbox.apache.org/repos/asf/lucene int…

bb77bab

…o draft/jdk-foreign-mmap

Update to module system branch

8476a52

# Conflicts: # gradle/java/javac.gradle

Remove unneeded module

11cf45c

Merge branch 'draft/jdk-foreign-mmap' of https://github.com/uschindle…

d679759

…r/lucene into draft/jdk-foreign-mmap

Merge branch 'main' of https://gitbox.apache.org/repos/asf/lucene int…

c5aea7c

…o draft/jdk-foreign-mmap # Conflicts: # lucene/core/src/java/org/apache/lucene/store/MMapDirectory.java # lucene/core/src/test/org/apache/lucene/store/TestMmapDirectory.java

fix formatting

ca20fff

Fix test after modularization

b63001e

uschindler force-pushed the draft/jdk-foreign-mmap branch from a795bc1 to b63001e Compare December 22, 2021 08:49

Merge branch 'main' into draft/jdk-foreign-mmap

c5a028a

# Conflicts: # lucene/core/src/java/module-info.java # lucene/core/src/java/org/apache/lucene/store/MMapDirectory.java

uschindler closed this Feb 14, 2022

This was referenced May 20, 2022

Outdated: Initial rewrite of MMapDirectory for JDK-19 preview Panama APIs (>= JDK-19-ea+23) #911

Closed

MR-JAR rewrite of MMapDirectory with JDK-19 preview Panama APIs (>= JDK-19-ea+23) #912

Merged

uschindler wants to merge 38 commits into apache:main from uschindler:draft/jdk-foreign-mmap
