Improve refresh speed with softdelete enable #12557
easyice wants to merge 5 commits into apache:main
|
Update: when we call Benchmark code: |
|
@jpountz Would you please take a look when you get a chance? |
8094eb0 to 1d6c7f3
1d6c7f3 to ecdb17a
|
Update: revert changes about |
|
I understand the idea, but I'm a bit less happy about special-casing the soft deletes field in our doc values file format. I don't have a better suggestion though... |
|
Thanks @jpountz. It isn't perfect, but there seems to be no better way to avoid probing the data and still know early on that all values are the same. I'm just trying to improve this, and I haven't thought of a better way yet. |
|
Thanks @easyice ! As we are exposing API like |
|
This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the dev@lucene.apache.org list. Thank you for your contribution! |
I found a flame graph from my production environment in which the DocValuesConsumer for the ___soft_deletes field accounted for a large proportion of the time: flame (12).html.zip

The doc values for the ___soft_deletes field are usually sparse, so they are stored with an IndexedDISI data structure. IndexedDISI provides operations like advance() and index(), but for the ___soft_deletes field index() is never used, because the value of the field is always 1. We can skip the calculation of index() in this case.

I ran a benchmark using IndexWriter#softUpdateDocument for the IndexedDISI.Method.DENSE case, which resulted in a ~16% improvement in performance.
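
To illustrate why index() can be skipped when every value is the same, here is a minimal, self-contained sketch. It is not Lucene's code: DenseBlockSketch, valueGeneral and valueConstant are hypothetical names, and the rank computation is a naive loop, whereas the real IndexedDISI is more elaborate. It only models a dense block as a 65536-bit bitset and compares a general lookup, which needs the rank of the doc, with a constant-value lookup, which needs only an existence check.

```java
import java.util.OptionalLong;

/** Toy model of one dense block: a 65536-bit bitset recording which docs have a value. */
public class DenseBlockSketch {
    // 1024 words * 64 bits = 65536 doc slots in the block (assumed block size for this sketch).
    private final long[] bits = new long[1024];

    /** Marks doc (0..65535, relative to the block) as having a value. */
    public void set(int doc) {
        bits[doc >>> 6] |= 1L << (doc & 63);
    }

    /** Existence check: one word read and one mask. */
    public boolean exists(int doc) {
        return (bits[doc >>> 6] & (1L << (doc & 63))) != 0;
    }

    /** Rank: number of docs with a value strictly before doc (naive popcount loop). */
    public int index(int doc) {
        int word = doc >>> 6;
        int count = 0;
        for (int i = 0; i < word; i++) {
            count += Long.bitCount(bits[i]);
        }
        long below = (1L << (doc & 63)) - 1; // bits lower than doc within its word
        return count + Long.bitCount(bits[word] & below);
    }

    /** General case: values are packed densely, so the rank is needed to locate the value. */
    public OptionalLong valueGeneral(int doc, long[] packedValues) {
        return exists(doc) ? OptionalLong.of(packedValues[index(doc)]) : OptionalLong.empty();
    }

    /** Constant case (e.g. a field whose value is always 1): no rank is needed. */
    public OptionalLong valueConstant(int doc, long constant) {
        return exists(doc) ? OptionalLong.of(constant) : OptionalLong.empty();
    }

    public static void main(String[] args) {
        DenseBlockSketch block = new DenseBlockSketch();
        block.set(3);
        block.set(64);
        block.set(200);
        long[] packedValues = {1L, 1L, 1L}; // one entry per doc that has a value

        // Both lookups agree, but only the general one pays for index().
        System.out.println(block.valueGeneral(64, packedValues));
        System.out.println(block.valueConstant(64, 1L));
        System.out.println(block.valueConstant(5, 1L));
    }
}
```

In this sketch both lookups return the same result for a field whose values are all 1, so the rank work in the general path is pure overhead for that field.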