Fix a bug with missing fields in sig_terms#57757
Conversation
When you run a `significant_terms` aggregation on a field and it *is* mapped but there aren't any values for it then the count of the documents that match the query on that shard still have to be added to the overall doc count. I broke that in elastic#57361. This fixes that. Closes elastic#57402
|
Pinging @elastic/es-analytics-geo (:Analytics/Aggregations) |
|
Two little notes:
|
| } | ||
| } | ||
|
|
||
| public void testAllDocsWithoutStringFieldviaGlobalOrds() throws IOException { |
There was a problem hiding this comment.
These actually reproduce the issue.
| } | ||
| } | ||
|
|
||
| public void testSomeDocsWithoutStringFieldviaGlobalOrds() throws IOException { |
There was a problem hiding this comment.
I thought this would reproduce the issue but it doesn't. It still feels useful to add.
| } | ||
| } | ||
|
|
||
| public void testThreeLayerStringViaGlobalOrds() throws IOException { |
There was a problem hiding this comment.
These are coming in #57758 and I thought they might reproduce the issue so I added them as well. No dice. Either way, they are nice to have.
| IndexReader topReader = searcher.getIndexReader(); | ||
| int supersetSize = topReader.numDocs(); | ||
| return new SignificantStringTerms(name, bucketCountThresholds.getRequiredSize(), bucketCountThresholds.getMinDocCount(), | ||
| metadata(), format, 0, supersetSize, significanceHeuristic, emptyList()); |
not-napoleon
left a comment
There was a problem hiding this comment.
LGTM. Left a few asks around clarifying tests.
| testAllDocsWithoutStringField("map"); | ||
| } | ||
|
|
||
| private void testAllDocsWithoutStringField(String executionHint) throws IOException { |
There was a problem hiding this comment.
This test (or it's mapped/global wrappers, if you'd rather) needs javadoc. It's not at all clear from the name what path this is intended to exercise.
| try (Directory dir = newDirectory()) { | ||
| try (RandomIndexWriter writer = new RandomIndexWriter(random(), dir)) { | ||
| Document d = new Document(); | ||
| d.add(new SortedDocValuesField("f", new BytesRef("f"))); |
There was a problem hiding this comment.
Does d not get added to the index? Why are we even creating this document?
There was a problem hiding this comment.
That'd be a copy and paste error. Sorry!
| public void testAllDocsWithoutNumericField() throws IOException { | ||
| try (Directory dir = newDirectory()) { | ||
| try (RandomIndexWriter writer = new RandomIndexWriter(random(), dir)) { | ||
| Document d = new Document(); |
There was a problem hiding this comment.
As above, why do we even need this document?
|
Thanks @not-napoleon! I'll push a cleanup from your review comments in a moment and merge when CI is happy. |
When you run a `significant_terms` aggregation on a field and it *is* mapped but there aren't any values for it then the count of the documents that match the query on that shard still have to be added to the overall doc count. I broke that in elastic#57361. This fixes that. Closes elastic#57402
When you run a
significant_termsaggregation on a field and it ismapped but there aren't any values for it then the count of the
documents that match the query on that shard still have to be added to
the overall doc count. I broke that in #57361. This fixes that.
Closes #57402