Stop using SlowImpactsEnum for terms whose docFreq is less than 128. #14017

Merged: jpountz merged 1 commit into apache:main from jpountz:stop_using_slow_impacts_enum_under_128_doc_freq on Nov 25, 2024

jpountz commented Nov 25, 2024

We currently use `SlowImpactsEnum` for terms whose `docFreq` is less than 128.
This is convenient because these terms don't have impacts anyway. But a recent
slowdown on nightly benchmarks suggests that it contributes to making some
hot calls more polymorphic than we'd like, so this PR moves such terms back to
the regular impacts enums.
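
To illustrate what "more polymorphic" means here, the following is a minimal, self-contained Java sketch. It does not use Lucene's classes: `DocIterator`, `RegularIterator`, `WrapperIterator` and `CallSiteSketch` are hypothetical names, with `WrapperIterator` standing in for the role `SlowImpactsEnum` played for low-`docFreq` terms. The sketch only counts how many concrete receiver classes reach the `nextDoc()` call in a hot loop. JIT compilers such as HotSpot can typically inline calls at sites that see one or two receiver types and tend to fall back to slower dispatch beyond that, so the actual speedup depends on the JVM.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a postings iterator; not Lucene's actual API.
interface DocIterator {
  int NO_MORE_DOCS = Integer.MAX_VALUE;

  int nextDoc();
}

// Stand-in for the "regular" enum that every term uses after the change.
final class RegularIterator implements DocIterator {
  private final int[] docs;
  private int i = -1;

  RegularIterator(int[] docs) {
    this.docs = docs;
  }

  @Override
  public int nextDoc() {
    return ++i < docs.length ? docs[i] : NO_MORE_DOCS;
  }
}

// Stand-in for a wrapper class that only low-docFreq terms used before the change.
final class WrapperIterator implements DocIterator {
  private final DocIterator in;

  WrapperIterator(DocIterator in) {
    this.in = in;
  }

  @Override
  public int nextDoc() {
    return in.nextDoc();
  }
}

final class CallSiteSketch {
  static final int THRESHOLD = 128; // docFreq cutoff mentioned in the PR

  // wrapRareTerms = true mimics the baseline, false mimics the modified version.
  static DocIterator open(int[] docs, boolean wrapRareTerms) {
    DocIterator it = new RegularIterator(docs);
    return (wrapRareTerms && docs.length < THRESHOLD) ? new WrapperIterator(it) : it;
  }

  // Hot loop: it.nextDoc() is the call site whose receiver types matter.
  static long sumDocs(DocIterator it) {
    long sum = 0;
    for (int doc = it.nextDoc(); doc != DocIterator.NO_MORE_DOCS; doc = it.nextDoc()) {
      sum += doc;
    }
    return sum;
  }

  // Counts the distinct concrete classes that were passed to sumDocs.
  static int receiverTypes(List<int[]> terms, boolean wrapRareTerms) {
    Set<Class<?>> seen = new HashSet<>();
    for (int[] docs : terms) {
      DocIterator it = open(docs, wrapRareTerms);
      seen.add(it.getClass());
      sumDocs(it);
    }
    return seen.size();
  }
}
```

For an input that mixes one term below the threshold and one above it, `receiverTypes(terms, true)` reports two classes and `receiverTypes(terms, false)` reports one, which mirrors the bimorphic-to-monomorphic case described for the run without combined tasks.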
jpountz commented Nov 25, 2024

With Combined tasks in the file (call is 3-polymorphic in the baseline, bimorphic in the modified version):

                            Task  QPS baseline      StdDev  QPS my_modified_version      StdDev                Pct diff p-value
                        PKLookup      277.89      (2.3%)      274.07      (2.6%)   -1.4% (  -6% -    3%) 0.079
             FilteredOrStopWords       33.44      (2.3%)       33.42      (1.9%)   -0.1% (  -4% -    4%) 0.935
            FilteredAndStopWords       48.48      (2.6%)       48.48      (1.8%)   -0.0% (  -4% -    4%) 0.994
     FilteredAnd2Terms2StopWords      195.64      (1.8%)      195.89      (1.4%)    0.1% (  -3% -    3%) 0.806
             FilteredAndHighHigh       62.16      (2.2%)       62.25      (1.7%)    0.1% (  -3% -    4%) 0.813
                  FilteredPhrase       25.15      (2.7%)       25.22      (2.5%)    0.3% (  -4% -    5%) 0.720
                          OrMany       18.96      (4.5%)       19.02      (5.6%)    0.3% (  -9% -   10%) 0.853
                FilteredOr3Terms      149.27      (2.0%)      149.75      (1.6%)    0.3% (  -3% -    4%) 0.574
                    CombinedTerm       34.03      (2.8%)       34.17      (2.2%)    0.4% (  -4% -    5%) 0.623
              FilteredOrHighHigh       52.50      (2.3%)       52.71      (1.8%)    0.4% (  -3% -    4%) 0.531
               FilteredOrHighMed      114.91      (2.3%)      115.46      (1.7%)    0.5% (  -3% -    4%) 0.465
                  FilteredOrMany        9.91      (2.8%)        9.96      (3.4%)    0.6% (  -5% -    6%) 0.572
               FilteredAnd3Terms      189.21      (1.8%)      190.35      (1.6%)    0.6% (  -2% -    4%) 0.257
      FilteredOr2Terms2StopWords      109.76      (1.6%)      110.43      (1.2%)    0.6% (  -2% -    3%) 0.169
                    FilteredTerm      159.61      (2.9%)      160.69      (2.2%)    0.7% (  -4% -    5%) 0.401
             CombinedAndHighHigh       15.58      (1.0%)       15.81      (0.8%)    1.5% (   0% -    3%) 0.000
              CombinedAndHighMed       57.04      (1.3%)       57.92      (0.8%)    1.5% (   0% -    3%) 0.000
              FilteredAndHighMed      123.14      (2.9%)      125.29      (3.0%)    1.8% (  -4% -    7%) 0.060
                       And3Terms      162.00      (4.3%)      167.29      (4.5%)    3.3% (  -5% -   12%) 0.019
             And2Terms2StopWords      152.25      (4.1%)      158.02      (4.2%)    3.8% (  -4% -   12%) 0.004
               CombinedOrHighMed       74.69      (0.7%)       78.15      (2.3%)    4.6% (   1% -    7%) 0.000
              CombinedOrHighHigh       19.55      (0.7%)       20.60      (2.5%)    5.4% (   2% -    8%) 0.000
                      AndHighMed      116.33      (4.0%)      122.79      (1.5%)    5.6% (   0% -   11%) 0.000
                     AndHighHigh       39.43      (4.2%)       41.74      (1.9%)    5.9% (   0% -   12%) 0.000
                    AndStopWords       27.82      (5.9%)       29.51      (6.7%)    6.1% (  -6% -   19%) 0.002
                      OrHighRare      265.91      (7.3%)      284.27      (5.1%)    6.9% (  -5% -   20%) 0.000
                        Or3Terms      155.29      (3.8%)      168.35      (5.7%)    8.4% (  -1% -   18%) 0.000
              Or2Terms2StopWords      146.73      (3.2%)      160.57      (5.7%)    9.4% (   0% -   18%) 0.000
                     OrStopWords       28.40      (5.6%)       32.31      (9.5%)   13.8% (  -1% -   30%) 0.000
                       OrHighMed      168.44      (3.9%)      197.35      (1.6%)   17.2% (  11% -   23%) 0.000
                      OrHighHigh       44.17      (4.6%)       53.19      (1.7%)   20.4% (  13% -   28%) 0.000
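
As a reading aid for the table, the sketch below assumes that the "Pct diff" column is the relative change of the modified run's QPS over the baseline's QPS, which is consistent with the rows shown. The class and method names are hypothetical, and the sketch does not attempt to reproduce the range in parentheses or the p-value.

```java
final class PctDiff {
  // Relative change of the modified QPS over the baseline QPS, in percent.
  static double pctDiff(double baselineQps, double modifiedQps) {
    return 100.0 * (modifiedQps - baselineQps) / baselineQps;
  }

  public static void main(String[] args) {
    // QPS values copied from the OrHighMed and OrHighHigh rows.
    System.out.printf("OrHighMed:  %.1f%%%n", pctDiff(168.44, 197.35));
    System.out.printf("OrHighHigh: %.1f%%%n", pctDiff(44.17, 53.19));
  }
}
```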

Without combined tasks in the file (call is bimorphic in the baseline, monomorphic in the modified version):

                            Task  QPS baseline      StdDev  QPS my_modified_version      StdDev                Pct diff p-value
             FilteredOrStopWords       33.95      (1.7%)       33.73      (1.8%)   -0.7% (  -4% -    2%) 0.245
                  FilteredOrMany       10.23      (2.7%)       10.22      (2.9%)   -0.1% (  -5% -    5%) 0.928
              FilteredOrHighHigh       53.10      (1.9%)       53.11      (2.0%)    0.0% (  -3% -    3%) 0.962
                FilteredOr3Terms      150.83      (2.0%)      150.98      (2.3%)    0.1% (  -4% -    4%) 0.879
               FilteredOrHighMed      116.00      (2.0%)      116.30      (2.1%)    0.3% (  -3% -    4%) 0.690
                    FilteredTerm      161.02      (2.7%)      161.57      (2.3%)    0.3% (  -4% -    5%) 0.668
      FilteredOr2Terms2StopWords      110.97      (1.3%)      111.36      (1.3%)    0.4% (  -2% -    3%) 0.399
                  FilteredPhrase       25.49      (3.3%)       25.59      (2.8%)    0.4% (  -5% -    6%) 0.693
                       OrHighMed      195.25      (2.5%)      196.56      (1.9%)    0.7% (  -3% -    5%) 0.336
                      OrHighHigh       52.92      (2.6%)       53.32      (2.2%)    0.7% (  -3% -    5%) 0.323
              Or2Terms2StopWords      165.87      (3.3%)      167.13      (3.3%)    0.8% (  -5% -    7%) 0.471
                        Or3Terms      177.06      (3.5%)      178.44      (3.5%)    0.8% (  -6% -    8%) 0.484
                     OrStopWords       34.25      (5.2%)       34.64      (5.2%)    1.1% (  -8% -   12%) 0.492
                        PKLookup      274.02      (1.4%)      277.21      (1.4%)    1.2% (  -1% -    4%) 0.009
                      OrHighRare      282.75      (6.6%)      286.27      (6.9%)    1.2% ( -11% -   15%) 0.559
             And2Terms2StopWords      167.40      (3.0%)      170.65      (3.6%)    1.9% (  -4% -    8%) 0.064
                          OrMany       19.64      (4.0%)       20.22      (3.7%)    2.9% (  -4% -   11%) 0.017
                      AndHighMed      131.41      (1.8%)      135.53      (1.8%)    3.1% (   0% -    6%) 0.000
                     AndHighHigh       44.99      (1.9%)       46.55      (2.2%)    3.5% (   0% -    7%) 0.000
            FilteredAndStopWords       50.36      (2.8%)       52.27      (1.9%)    3.8% (   0% -    8%) 0.000
     FilteredAnd2Terms2StopWords      202.89      (2.1%)      210.87      (1.5%)    3.9% (   0% -    7%) 0.000
                    AndStopWords       32.04      (4.8%)       33.33      (5.9%)    4.0% (  -6% -   15%) 0.018
                       And3Terms      178.11      (3.8%)      185.46      (4.2%)    4.1% (  -3% -   12%) 0.001
             FilteredAndHighHigh       65.05      (2.8%)       68.05      (1.9%)    4.6% (   0% -    9%) 0.000
              FilteredAndHighMed      132.31      (3.0%)      139.26      (3.0%)    5.2% (   0% -   11%) 0.000
               FilteredAnd3Terms      195.92      (1.8%)      209.14      (1.8%)    6.7% (   3% -   10%) 0.000

jpountz merged commit a5bf8a5 into apache:main Nov 25, 2024
jpountz deleted the stop_using_slow_impacts_enum_under_128_doc_freq branch November 25, 2024 14:48
jpountz added a commit that referenced this pull request Nov 25, 2024
benchaplin pushed a commit to benchaplin/lucene that referenced this pull request Dec 31, 2024