Speed up terms agg when not force merged#71241
Conversation
This speeds up the `terms` aggregation when it can't take the fancy `filters` path, there is more than one segment, and any of those segments have only a single value for the field. These three things are super common. Here are the performance change numbers: ``` | 50th percentile latency | date-histo-string-terms-via-global-ords | 3414.02 | 2632.01 | -782.015 | ms | | 90th percentile latency | date-histo-string-terms-via-global-ords | 3470.91 | 2756.88 | -714.031 | ms | | 100th percentile latency | date-histo-string-terms-via-global-ords | 3620.89 | 2875.79 | -745.102 | ms | | 50th percentile service time | date-histo-string-terms-via-global-ords | 3410.15 | 2628.87 | -781.275 | ms | | 90th percentile service time | date-histo-string-terms-via-global-ords | 3467.36 | 2752.43 | -714.933 | ms | 20%!!!! | 100th percentile service time | date-histo-string-terms-via-global-ords | 3617.71 | 2871.63 | -746.083 | ms | ``` This works by hooking global ordinals into `DocValues.unwrapSingleton`. Without this you could unwrap singletons *if* the segment's ordinals aligned exactly with the global ordinals. If they didn't we'd return an doc values iterator that you can't unwrap. Even if the segment ordinals were singletons. That speeds up the terms aggregator because we have a fast path we can take if we have singletons. It was previously only working if we had a single segment. Or if the segment's ordinals lined up exactly. Which, for low cardinality fields is fairly common. So they might not benefit from this quite as much as high cardinality fields. Closes elastic#71086
|
Pinging @elastic/es-analytics-geo (Team:Analytics) |
imotov
left a comment
There was a problem hiding this comment.
Looks reasonable to me in general, but I think @not-napoleon should review it as well since he is much more familiar with this code.
| new SingletonGlobalOrdinalMapping(ordinalMap, singleton, atomicLookups, context.ord) | ||
| ); | ||
| } | ||
| } |
There was a problem hiding this comment.
This feels a bit brittle. Basically, SingletonGlobalOrdinalMapping seems to be a very specialized class that can only function if its constructor parameters satisfy these very specific criteria that are neither documented nor enforced in the class itself. I wonder if we could make this less brittle if we made SingletonGlobalOrdinalMapping constructor private and moved the logic and comment above into a factory method that would return null in case the conditions are not satisfied.
There was a problem hiding this comment.
Yeah. The second test - that you can unwrap - is encoded in the ctor's signature. But the first bit - the value count - that totally should be guarded like you say.
not-napoleon
left a comment
There was a problem hiding this comment.
LGTM overall. Would like that constructor guard before merging.
| SingletonGlobalOrdinalMapping(OrdinalMap ordinalMap, SortedDocValues values, TermsEnum[] lookups, int segmentIndex) { | ||
| this.values = values; | ||
| this.lookups = lookups; | ||
| this.ordinalMap = ordinalMap; |
There was a problem hiding this comment.
I agree with Igor. At a minimum we should throw here if ordinalMap.getValueCount() >= MAX_INT. I think it's okay if we require callers to pre-check it rather than building a factory method (although a factory method would also be fine), but we shouldn't allow the construction of invalid objects.
| writer.addDocument(d); | ||
|
|
||
| d = new Document(); | ||
| addField(d, "_id", "1"); |
There was a problem hiding this comment.
Should the _id value here be 2? or are you intentionally duplicating IDs?
| random(), | ||
| directory, | ||
| // Attempt to disable merging. | ||
| LuceneTestCase.newIndexWriterConfig(random(), new StandardAnalyzer()).setMergePolicy(NoMergePolicy.INSTANCE) |
There was a problem hiding this comment.
This is handy. I wouldn't mind a method in AggregatorTestCase to get a non-merging(ish) index writer.
| } | ||
| } | ||
|
|
||
| public void testManySegmentsStillSingleton() throws IOException { |
There was a problem hiding this comment.
I'm a little unhappy that this test doesn't run through AggregatorTestCase.testCase, or at least use AggregatorTestCase.searchAndReduce. I think that's because you want to assert on the debug info which we don't currently make easy. If that's correct, I don't think we need to address it here, but I would like to leave a todo and maybe a ticket.
There was a problem hiding this comment.
I've added something for this in #71503 and will update the PR when that one lands.
|
@imotov do you want to have another look at this one? |
|
Thanks for all the reviews! |
This speeds up the `terms` aggregation when it can't take the fancy `filters` path, there is more than one segment, and any of those segments have only a single value for the field. These three things are super common. Here are the performance change numbers: ``` | 50th percentile latency | date-histo-string-terms-via-global-ords | 3414.02 | 2632.01 | -782.015 | ms | | 90th percentile latency | date-histo-string-terms-via-global-ords | 3470.91 | 2756.88 | -714.031 | ms | | 100th percentile latency | date-histo-string-terms-via-global-ords | 3620.89 | 2875.79 | -745.102 | ms | | 50th percentile service time | date-histo-string-terms-via-global-ords | 3410.15 | 2628.87 | -781.275 | ms | | 90th percentile service time | date-histo-string-terms-via-global-ords | 3467.36 | 2752.43 | -714.933 | ms | 20%!!!! | 100th percentile service time | date-histo-string-terms-via-global-ords | 3617.71 | 2871.63 | -746.083 | ms | ``` This works by hooking global ordinals into `DocValues.unwrapSingleton`. Without this you could unwrap singletons *if* the segment's ordinals aligned exactly with the global ordinals. If they didn't we'd return an doc values iterator that you can't unwrap. Even if the segment ordinals were singletons. That speeds up the terms aggregator because we have a fast path we can take if we have singletons. It was previously only working if we had a single segment. Or if the segment's ordinals lined up exactly. Which, for low cardinality fields is fairly common. So they might not benefit from this quite as much as high cardinality fields. Closes elastic#71086
This speeds up the `terms` aggregation when it can't take the fancy `filters` path, there is more than one segment, and any of those segments have only a single value for the field. These three things are super common. Here are the performance change numbers: ``` | 50th percentile latency | date-histo-string-terms-via-global-ords | 3414.02 | 2632.01 | -782.015 | ms | | 90th percentile latency | date-histo-string-terms-via-global-ords | 3470.91 | 2756.88 | -714.031 | ms | | 100th percentile latency | date-histo-string-terms-via-global-ords | 3620.89 | 2875.79 | -745.102 | ms | | 50th percentile service time | date-histo-string-terms-via-global-ords | 3410.15 | 2628.87 | -781.275 | ms | | 90th percentile service time | date-histo-string-terms-via-global-ords | 3467.36 | 2752.43 | -714.933 | ms | 20%!!!! | 100th percentile service time | date-histo-string-terms-via-global-ords | 3617.71 | 2871.63 | -746.083 | ms | ``` This works by hooking global ordinals into `DocValues.unwrapSingleton`. Without this you could unwrap singletons *if* the segment's ordinals aligned exactly with the global ordinals. If they didn't we'd return an doc values iterator that you can't unwrap. Even if the segment ordinals were singletons. That speeds up the terms aggregator because we have a fast path we can take if we have singletons. It was previously only working if we had a single segment. Or if the segment's ordinals lined up exactly. Which, for low cardinality fields is fairly common. So they might not benefit from this quite as much as high cardinality fields. Closes #71086
This speeds up the
termsaggregation when it can't take the fancyfilterspath, there is more than one segment, and any of thosesegments have only a single value for the field. These three things are
super common.
Here are the performance change numbers:
This works by hooking global ordinals into
DocValues.unwrapSingleton.Without this you could unwrap singletons if the segment's ordinals
aligned exactly with the global ordinals. If they didn't we'd return an
doc values iterator that you can't unwrap. Even if the segment ordinals
were singletons.
That speeds up the terms aggregator because we have a fast path we can
take if we have singletons. It was previously only working if we had a
single segment. Or if the segment's ordinals lined up exactly. Which,
for low cardinality fields is fairly common. So they might not benefit
from this quite as much as high cardinality fields.
Closes #71086