Reduce overhead of disabling scoring on `BooleanScorer`. by jpountz · Pull Request #12475 · apache/lucene

jpountz · 2023-07-31T05:53:32Z

This is a subset of #12415, which I'm extracting to its own pull request in order to have separate data points in nightly benchmarks.

Results on `wikimedium10m` and `wikinightly` counting tasks:

                            Task    QPS base      StdDev    QPS cand      StdDev                Pct diff p-value
                       CountTerm     4624.91      (6.4%)     4581.34      (6.4%)   -0.9% ( -12% -   12%) 0.640
                 CountAndHighMed      280.03      (4.5%)      280.15      (4.4%)    0.0% (  -8% -    9%) 0.974
                     CountPhrase        7.22      (3.0%)        7.24      (1.8%)    0.3% (  -4% -    5%) 0.728
                CountAndHighHigh       52.84      (4.9%)       53.12      (5.6%)    0.5% (  -9% -   11%) 0.755
                        PKLookup      232.01      (3.6%)      235.45      (2.8%)    1.5% (  -4% -    8%) 0.144
                 CountOrHighHigh       42.37      (6.1%)       56.04      (9.1%)   32.3% (  16% -   50%) 0.000
                  CountOrHighMed       30.56      (6.5%)       40.46      (9.8%)   32.4% (  15% -   52%) 0.000
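
For readers checking the numbers, the percentage column appears to be the relative change of the second QPS column over the first. The sketch below recomputes it for three rows. It assumes the first QPS column is the baseline and the second is this change; `PctDiff` is a hypothetical helper and not part of the benchmark tooling.

```java
import java.util.Locale;

// Hypothetical helper for illustration only; not part of the benchmark tooling.
final class PctDiff {

    /** Relative change of candidate QPS over baseline QPS, in percent. */
    static double pctDiff(double baselineQps, double candidateQps) {
        return 100.0 * (candidateQps - baselineQps) / baselineQps;
    }

    /** Formats the change with one decimal, matching the precision used in the table. */
    static String format(double baselineQps, double candidateQps) {
        return String.format(Locale.ROOT, "%.1f%%", pctDiff(baselineQps, candidateQps));
    }

    public static void main(String[] args) {
        // QPS values copied from the table above (assumed order: baseline, candidate).
        System.out.println("CountOrHighHigh " + format(42.37, 56.04));     // 32.3%
        System.out.println("CountOrHighMed  " + format(30.56, 40.46));     // 32.4%
        System.out.println("CountTerm       " + format(4624.91, 4581.34)); // -0.9%
    }
}
```

Only the two disjunction counting tasks move by more than their reported spread; the other rows stay within noise.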


jpountz · 2023-07-31T06:10:58Z

The failure is suspicious; I'll look into it.

jpountz · 2023-08-01T08:25:10Z

It is an unrelated but real bug. `BooleanScorer` sometimes forwards directly to an inner bulk scorer when that scorer's clause is the only one matching in a range. This may cause the collector's competitive iterator to be advanced to a document that is outside of the scored range (which feels like the root cause of the issue) and greater than a match of another clause of the disjunction.
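
The interaction described above can be sketched with a toy model. Nothing below is Lucene code: `ForwardIterator`, `scoreRange`, the 2048-doc windows, and the doc IDs are all made up for illustration, and the `clampToRange` flag is just one conceivable remedy, not a claim about how the actual fix works. In this model, the consequence of advancing the forward-only iterator too far is that a match from another clause is silently dropped.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: none of these classes or methods are Lucene APIs.
final class CompetitiveIteratorSketch {

    /** Forward-only cursor over sorted doc IDs, standing in for a competitive iterator. */
    static final class ForwardIterator {
        private final int[] docs;
        private int idx = 0;

        ForwardIterator(int... sortedDocs) {
            this.docs = sortedDocs;
        }

        /** Moves to the first doc >= target and returns it; it can never move backwards. */
        int advance(int target) {
            while (idx < docs.length && docs[idx] < target) {
                idx++;
            }
            return idx < docs.length ? docs[idx] : Integer.MAX_VALUE;
        }
    }

    /**
     * Scores a single clause over [min, max), collecting docs that are also competitive.
     * With clampToRange == false, the competitive iterator may be advanced to the
     * clause's next doc even when that doc lies at or beyond max.
     */
    static void scoreRange(int[] clauseDocs, ForwardIterator competitive,
                           int min, int max, boolean clampToRange, List<Integer> hits) {
        for (int doc : clauseDocs) {
            if (doc < min) {
                continue;
            }
            if (clampToRange && doc >= max) {
                break; // never touch the iterator for docs outside the window
            }
            int next = competitive.advance(doc);
            if (doc >= max) {
                break; // too late: the iterator has already left the window
            }
            if (next == doc) {
                hits.add(doc);
            }
        }
    }

    /** Runs a two-clause disjunction window by window, delegating to the only matching clause. */
    static List<Integer> run(boolean clampToRange) {
        int[] clauseA = {5, 5000};
        int[] clauseB = {3000};
        ForwardIterator competitive = new ForwardIterator(5, 3000, 5000); // every match is competitive
        List<Integer> hits = new ArrayList<>();
        scoreRange(clauseA, competitive, 0, 2048, clampToRange, hits);    // only A matches here
        scoreRange(clauseB, competitive, 2048, 4096, clampToRange, hits); // only B matches here
        scoreRange(clauseA, competitive, 4096, 6144, clampToRange, hits); // only A matches here
        return hits;
    }

    public static void main(String[] args) {
        System.out.println("unclamped: " + run(false)); // [5, 5000]: doc 3000 is lost
        System.out.println("clamped:   " + run(true));  // [5, 3000, 5000]
    }
}
```

In the unclamped run, scoring clause A over the first window pushes the iterator to doc 5000, so clause B's match at doc 3000 can no longer be checked when its window is scored.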

jpountz · 2023-08-01T11:44:46Z

Opened #12481.


jpountz added this to the 9.8.0 milestone Jul 31, 2023

jpountz merged commit acffcfa into apache:main Aug 3, 2023

jpountz deleted the reduced_no_scoring_overhead_booleanscorer branch August 3, 2023 05:17

This was referenced Aug 4, 2023

Optimize disjunction counts. #12415

Merged

Stop aligning windows in BooleanScorer. #12488

Merged

jpountz merged 1 commit into apache:main from jpountz:reduced_no_scoring_overhead_booleanscorer
