Add a preload hint for preloading mmap data on specific open calls #14604
ChrisHegarty merged 3 commits into apache:main
    FileTypeHint.DATA,
    FileDataHint.KNN_VECTORS,
    DataAccessHint.RANDOM,
    PreloadHint.INSTANCE));
We've not preloaded this before. But I think it is ok, since it's just a hint to the IndexInput, which by default will be ignored, since one needs to provide a preLoadOverride to MMapDirectory.
That's not an override, but there is a new constant for the preexisting mechanism: mmapdir.setPreload(PRELOAD_HINT)
@uschindler correct (my comments are out of date). Your words are better than my clumsy ones above. Thank you.
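The opt-in behaviour discussed in this thread can be sketched with a small toy model. This is not the Lucene code: `Hint`, `SketchDirectory`, `wouldPreload` and the file name are hypothetical stand-ins, and only the idea (a hint on the open call that is ignored unless the directory is configured with a matching preload predicate) comes from the comments above.

```java
import java.util.EnumSet;
import java.util.Set;
import java.util.function.BiPredicate;

/** Toy model of an opt-in preload hint. These are NOT the Lucene classes. */
class PreloadHintSketch {

  /** Hypothetical stand-in for the hints a codec passes when opening a file. */
  enum Hint { DATA, KNN_VECTORS, RANDOM, PRELOAD }

  /** Default policy: never preload, so a PRELOAD hint has no effect. */
  static final BiPredicate<String, Set<Hint>> NO_FILES = (name, hints) -> false;

  /** Opt-in policy: preload exactly those opens that carry the PRELOAD hint. */
  static final BiPredicate<String, Set<Hint>> PRELOAD_HINT =
      (name, hints) -> hints.contains(Hint.PRELOAD);

  /** Hypothetical directory that holds the configured preload policy. */
  static final class SketchDirectory {
    private BiPredicate<String, Set<Hint>> preload = NO_FILES;

    void setPreload(BiPredicate<String, Set<Hint>> preload) {
      this.preload = preload;
    }

    /** Returns true if this particular open call would be preloaded. */
    boolean wouldPreload(String fileName, Set<Hint> hints) {
      return preload.test(fileName, hints);
    }
  }

  public static void main(String[] args) {
    // Hints similar in spirit to the ones in the diff above (names are assumed).
    Set<Hint> hnswOpen =
        EnumSet.of(Hint.DATA, Hint.KNN_VECTORS, Hint.RANDOM, Hint.PRELOAD);

    SketchDirectory dir = new SketchDirectory();
    // Default policy: the hint is present but ignored.
    System.out.println("default policy preloads: " + dir.wouldPreload("hnsw-graph.bin", hnswOpen));

    // After opting in, only opens that carry the hint are preloaded.
    dir.setPreload(PRELOAD_HINT);
    System.out.println("opt-in policy preloads: " + dir.wouldPreload("hnsw-graph.bin", hnswOpen));
  }
}
```

In this model the codec can always attach the hint, and whether it has any effect is decided entirely by the policy set on the directory, which is why attaching it does not change default behaviour.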
…14604) Add a PreloadHint to tell MMapDirectory to preload data, and use it to preload the HNSW index into memory.
Can you add a CHANGES entry for this? Similar changes have not always been well received in the past, because it increases the cost of opening a reader, even though the reader may not need the field that is pre-loaded. I personally think that we should do this, because Lucene is a search library and we want to move costs to open-time rather than run-time whenever possible, but this is a notable change in runtime behavior, so I'd like to make sure it's visible in the release notes.
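To make the open-time versus run-time trade-off concrete, here is a minimal sketch using only the JDK's `FileChannel.map` and `MappedByteBuffer.load()`. It is an illustration of the general technique, not MMapDirectory's implementation; the class name `MmapPreloadSketch` and the `open` helper are made up for this example.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Sketch of paying page-in cost at open time; not MMapDirectory's code. */
class MmapPreloadSketch {

  /**
   * Maps the whole file read-only. With preload=true the pages are touched
   * up front (best effort), so the cost moves from first access to open.
   */
  static MappedByteBuffer open(Path file, boolean preload) throws IOException {
    try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
      MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
      if (preload) {
        buf.load(); // best-effort request to bring the mapping into memory
      }
      return buf; // the mapping stays valid after the channel is closed
    }
  }

  public static void main(String[] args) throws IOException {
    Path tmp = Files.createTempFile("preload-sketch", ".bin");
    try {
      Files.write(tmp, new byte[1 << 20]); // 1 MiB of zeros as sample data

      long start = System.nanoTime();
      MappedByteBuffer buf = open(tmp, true);
      long openNanos = System.nanoTime() - start;

      System.out.println("mapped " + buf.capacity() + " bytes, open took " + openNanos + " ns");
    } finally {
      Files.deleteIfExists(tmp); // may fail on platforms that lock mapped files
    }
  }
}
```

With `preload` set to `false` the open call returns sooner and the first reads pay the page faults instead, which is the cost a reader would otherwise incur at search time.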
After comments from @ChrisHegarty, this PR does not change the default behaviour. You need to use the hint by calling
This enables the `PreloadHint` introduced in apache#14604 on completion fields and memory terms dictionaries, which are both expected to fit in the page cache in practice. I don't have specific interest in these two file formats, I was more interested in having more than one file format that uses `PreloadHint` to make sure it's generally useful and not only to KNN vectors.
Ahh, I saw the constant on
We can of course choose to change this and enable preload hints by default, but I'd rather separate that out, since the default behaviour is far more controversial than the hint itself.
…es. (#14634) This enables the `PreloadHint` introduced in #14604 on completion fields and memory terms dictionaries, which are both expected to fit in the page cache in practice. I don't have specific interest in these two file formats, I was more interested in having more than one file format that uses `PreloadHint` to make sure it's generally useful and not only to KNN vectors.
Add a `PreloadHint` to tell MMapDirectory to preload data, and use it to preload the HNSW index into memory.