
Bump the window size of disjunctions from 2,048 to 4,096. #13605

Merged
jpountz merged 1 commit into apache:main from jpountz:bump_disjunction_window_size on Jul 25, 2024

Conversation

Contributor

@jpountz jpountz commented Jul 24, 2024

It's been pointed out multiple times that one difference between Tantivy and Lucene is that Tantivy uses windows of 4,096 docs while Lucene uses a 2x smaller window of 2,048 docs, and that this might explain part of the performance difference. luceneutil suggests that bumping the window size to 4,096 does indeed improve performance for counting queries (CountOrHighMed +21.9%, CountOrHighHigh +30.6%), but not for top-k queries. I'm still suggesting that we bump the window size across the board to keep our disjunction scorers consistent.

```
                            Task    QPS baseline      StdDev   QPS my_modified_version      StdDev                Pct diff p-value
                     CountPhrase        3.27     (11.6%)        3.14      (8.0%)   -4.1% ( -21% -   17%) 0.189
               HighTermMonthSort     3521.28      (3.5%)     3481.74      (2.8%)   -1.1% (  -7% -    5%) 0.262
                        PKLookup      289.42      (1.3%)      286.47      (2.2%)   -1.0% (  -4% -    2%) 0.075
                      TermDTSort      352.01      (6.5%)      348.89      (5.6%)   -0.9% ( -12% -   11%) 0.642
                          Phrase       11.85      (5.3%)       11.76      (5.0%)   -0.8% ( -10% -    9%) 0.634
                       OrHighLow      772.82      (2.4%)      767.24      (2.1%)   -0.7% (  -5% -    3%) 0.313
                 CountAndHighMed      120.78      (2.3%)      120.10      (2.5%)   -0.6% (  -5% -    4%) 0.449
           HighTermDayOfYearSort      821.48      (3.5%)      818.62      (2.7%)   -0.3% (  -6% -    6%) 0.724
               HighTermTitleSort      148.84      (2.9%)      148.33      (2.8%)   -0.3% (  -5% -    5%) 0.700
                     AndHighHigh       62.36      (1.7%)       62.17      (1.8%)   -0.3% (  -3% -    3%) 0.584
                CountAndHighHigh       41.41      (2.5%)       41.34      (2.6%)   -0.2% (  -5% -    5%) 0.836
                          Fuzzy1       96.24      (1.0%)       96.09      (1.2%)   -0.2% (  -2% -    2%) 0.667
                      AndHighLow      827.59      (2.7%)      826.89      (2.4%)   -0.1% (  -5% -    5%) 0.918
                      AndHighMed       93.35      (1.6%)       93.29      (1.7%)   -0.1% (  -3% -    3%) 0.903
            HighTermTitleBDVSort       16.30      (4.2%)       16.29      (6.7%)   -0.0% ( -10% -   11%) 0.984
                       OrHighMed      153.42      (2.6%)      153.41      (2.2%)   -0.0% (  -4% -    4%) 0.994
                         Respell       46.72      (1.3%)       46.72      (1.4%)    0.0% (  -2% -    2%) 0.975
                       And3Terms      155.73      (2.2%)      155.95      (1.4%)    0.1% (  -3% -    3%) 0.805
                          Fuzzy2       58.66      (0.9%)       58.77      (1.1%)    0.2% (  -1% -    2%) 0.566
                      OrHighHigh       75.70      (2.6%)       75.90      (2.3%)    0.3% (  -4% -    5%) 0.733
                       CountTerm     9110.00      (4.3%)     9142.10      (3.2%)    0.4% (  -6% -    8%) 0.768
                    AndStopWords       29.47      (2.6%)       29.57      (1.3%)    0.4% (  -3% -    4%) 0.579
             And2Terms2StopWords      150.30      (2.1%)      150.86      (1.1%)    0.4% (  -2% -    3%) 0.487
                      OrHighRare      237.33      (5.7%)      238.26      (6.2%)    0.4% ( -10% -   13%) 0.837
                         MedTerm      553.55      (6.0%)      555.97      (7.7%)    0.4% ( -12% -   15%) 0.841
                        Wildcard       34.08      (3.2%)       34.25      (3.4%)    0.5% (  -5% -    7%) 0.630
                    OrNotHighLow      761.70      (3.2%)      766.33      (2.6%)    0.6% (  -5% -    6%) 0.511
              Or2Terms2StopWords      156.10      (3.2%)      157.14      (1.8%)    0.7% (  -4% -    5%) 0.416
                        Or3Terms      156.59      (3.0%)      157.70      (1.9%)    0.7% (  -4% -    5%) 0.374
                        HighTerm      440.27      (5.6%)      443.89      (7.5%)    0.8% ( -11% -   14%) 0.695
                         LowTerm      892.27      (5.2%)      900.48      (6.8%)    0.9% ( -10% -   13%) 0.632
                     OrStopWords       31.88      (4.7%)       32.29      (2.6%)    1.3% (  -5% -    9%) 0.276
                         Prefix3      214.22      (3.4%)      217.48      (2.8%)    1.5% (  -4% -    8%) 0.124
                   OrHighNotHigh      247.52      (4.8%)      254.52      (5.1%)    2.8% (  -6% -   13%) 0.071
                          IntNRQ      144.53     (17.2%)      148.66     (17.9%)    2.9% ( -27% -   45%) 0.607
                    OrNotHighMed      330.23      (6.5%)      340.12      (5.4%)    3.0% (  -8% -   15%) 0.114
                    OrHighNotMed      285.11      (5.2%)      293.82      (6.2%)    3.1% (  -7% -   15%) 0.092
                    OrHighNotLow      429.94      (5.4%)      443.15      (6.8%)    3.1% (  -8% -   16%) 0.113
                   OrNotHighHigh      189.30      (5.9%)      195.25      (5.4%)    3.1% (  -7% -   15%) 0.079
                  CountOrHighMed       99.90     (22.5%)      121.78     (20.0%)   21.9% ( -16% -   83%) 0.001
                 CountOrHighHigh       53.76     (35.1%)       70.24     (32.5%)   30.6% ( -27% -  151%) 0.004
```
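For readers unfamiliar with what a "window" means here, the following is a minimal sketch of window-based disjunction counting, assuming sorted `int[]` arrays as stand-ins for postings lists. The class `WindowedDisjunctionCount` and its `count` method are hypothetical names for illustration and are not Lucene's actual scorer code. It marks the docs of each clause in a bit set covering one window of doc IDs, counts the set bits, then moves on to the next window. The window size only changes how often the bit set is scanned and reset, not the result.

```java
import java.util.Arrays;

/** Hypothetical sketch of window-based disjunction counting; not Lucene's actual code. */
public class WindowedDisjunctionCount {

    /**
     * Counts the distinct doc IDs that appear in at least one posting list.
     * Assumes each list is sorted ascending and every doc ID is in [0, maxDoc).
     */
    public static int count(int[][] postings, int maxDoc, int windowSize) {
        long[] bits = new long[(windowSize + 63) / 64]; // one bit per doc in the window
        int[] pos = new int[postings.length];           // read cursor per posting list
        int total = 0;
        for (int windowStart = 0; windowStart < maxDoc; windowStart += windowSize) {
            int windowEnd = Math.min(windowStart + windowSize, maxDoc);
            // Mark every doc of every clause that falls in [windowStart, windowEnd).
            for (int i = 0; i < postings.length; i++) {
                int[] list = postings[i];
                while (pos[i] < list.length && list[pos[i]] < windowEnd) {
                    int offset = list[pos[i]++] - windowStart;
                    bits[offset >>> 6] |= 1L << (offset & 63);
                }
            }
            // Count the matches, then reset the bit set for the next window.
            for (long word : bits) {
                total += Long.bitCount(word);
            }
            Arrays.fill(bits, 0L);
        }
        return total;
    }

    public static void main(String[] args) {
        int[][] postings = {
            {1, 5, 2047, 2048, 5000},
            {5, 2048, 4095, 4096, 9000}
        };
        int maxDoc = 10_000;
        // Both window sizes must report the same number of matching docs.
        System.out.println(count(postings, maxDoc, 2048));
        System.out.println(count(postings, maxDoc, 4096));
    }
}
```

In this sketch, a larger window halves the number of per-window passes over the clauses and the bit set, which is one plausible reason counting queries would benefit; the benchmark numbers above remain the actual evidence.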
@jpountz jpountz added this to the 9.12.0 milestone Jul 24, 2024
@jpountz jpountz changed the title from "Bump the window size of disjunction from 2,048 to 4,096." to "Bump the window size of disjunctions from 2,048 to 4,096." Jul 24, 2024
@jpountz jpountz merged commit 8d4f7a6 into apache:main Jul 25, 2024
@jpountz jpountz deleted the bump_disjunction_window_size branch July 25, 2024 13:38
jpountz added a commit that referenced this pull request Jul 31, 2024