LUCENE-8599: Use sparse bitset to store docs in SingleValueDocValuesFieldUpdates #522
Merged
asfgit merged 1 commit into apache:master on Dec 10, 2018
jpountz
approved these changes
Dec 10, 2018
Contributor
jpountz
left a comment
We could even use DocIdSetBuilder to record docs in this case, which should be more memory-efficient and maybe also faster.
Contributor
For the record, DocIdSetBuilder needs to sort as well if the number of collected docs is low, but it does it with a pretty fast radix sort.
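To illustrate why a radix sort can be cheap for doc IDs, here is a minimal sketch of a least-significant-digit radix sort over non-negative ints. It is not DocIdSetBuilder's actual implementation; the class name `DocIdRadixSort` and the choice of four 8-bit passes are assumptions made only for this example.

```java
import java.util.Arrays;

/** Hypothetical sketch, not Lucene code: LSD radix sort for non-negative doc IDs. */
final class DocIdRadixSort {

  /** Sorts non-negative ints in place using four counting passes of 8 bits each. */
  static void sort(int[] docs) {
    int[] src = docs;
    int[] dst = new int[docs.length];
    for (int shift = 0; shift < 32; shift += 8) {
      // counts[b + 1] holds how many values have byte b at this position.
      int[] counts = new int[257];
      for (int doc : src) {
        counts[((doc >>> shift) & 0xFF) + 1]++;
      }
      // Prefix sums turn counts[b] into the start offset of bucket b.
      for (int i = 0; i < 256; i++) {
        counts[i + 1] += counts[i];
      }
      // Stable scatter into the destination buffer.
      for (int doc : src) {
        dst[counts[(doc >>> shift) & 0xFF]++] = doc;
      }
      int[] tmp = src;
      src = dst;
      dst = tmp;
    }
    // Four passes means an even number of buffer swaps, so the
    // sorted result ends up back in the caller's array.
  }

  public static void main(String[] args) {
    int[] docs = {42, 7, 1000, 7, 3, 70000};
    sort(docs);
    System.out.println(Arrays.toString(docs));
  }
}
```

Each pass is linear in the number of docs, so the total work is proportional to the input size rather than to n log n comparisons.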
6d8a814 to 36ad246
Member
Author
@jpountz I will get this in as is and then explore what we can do as a followup with DocIdSetBuilder. We do need additional stats to make efficient use of it IMO and this already yields a significant improvement.
Contributor
Sure!
…ieldUpdates

Using a sparse bitset in SingleValueDocValuesFieldUpdates allows storing which documents have an update much more efficiently and avoids sorting the docs array altogether, which proved to be a significant bottleneck in LUCENE-8598. Using the sparse bitset yields another 10x performance improvement in applying updates compared with the changes proposed in LUCENE-8598.
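The idea in the commit message can be sketched roughly as follows. This is not the code from the PR: the class and method names are hypothetical, and `java.util.BitSet` (a dense bitset from the JDK) stands in for Lucene's sparse bitset so that the example runs without any dependencies. It only shows why recording docs as bits removes the sorting step: iterating set bits already yields doc IDs in increasing order, with duplicates collapsed.

```java
import java.util.Arrays;
import java.util.BitSet;

/** Hypothetical sketch, not Lucene code: two ways to track which docs have updates. */
final class DocUpdateTracking {

  /** Approach A: collect doc IDs in an array, then sort and de-duplicate before applying. */
  static int[] docsViaSort(int[] updatedDocs) {
    int[] sorted = updatedDocs.clone();
    Arrays.sort(sorted);
    int n = 0;
    for (int i = 0; i < sorted.length; i++) {
      if (i == 0 || sorted[i] != sorted[i - 1]) {
        sorted[n++] = sorted[i];
      }
    }
    return Arrays.copyOf(sorted, n);
  }

  /** Approach B: set one bit per updated doc; iteration is already in doc order. */
  static int[] docsViaBitSet(int[] updatedDocs, int maxDoc) {
    // Assumption: a dense JDK BitSet stands in for a sparse bitset here.
    BitSet bits = new BitSet(maxDoc);
    for (int doc : updatedDocs) {
      bits.set(doc);
    }
    int[] out = new int[bits.cardinality()];
    int n = 0;
    for (int doc = bits.nextSetBit(0); doc >= 0; doc = bits.nextSetBit(doc + 1)) {
      out[n++] = doc;
    }
    return out;
  }

  public static void main(String[] args) {
    int[] updates = {42, 7, 1000, 7, 3};
    System.out.println(Arrays.toString(docsViaBitSet(updates, 1024)));
  }
}
```

A sparse bitset additionally avoids allocating memory for long runs of docs without updates, which a dense `BitSet` does not do; that memory aspect is outside what this sketch demonstrates.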
36ad246 to ef61b54