Search optimisation - add canMatch early aborts for queries on "_index" field#48681
Search optimisation - add canMatch early aborts for queries on "_index" field#48681markharwood merged 4 commits intoelastic:masterfrom
Conversation
|
Pinging @elastic/es-search (:Search/Search) |
jimczi
left a comment
There was a problem hiding this comment.
The logic looks good to me. The original issue also mentions terms query and it should also be possible to add the logic to the prefix query ?
server/src/main/java/org/elasticsearch/index/query/TermQueryBuilder.java
Outdated
Show resolved
Hide resolved
d7487dc to
b8338e2
Compare
jpountz
left a comment
There was a problem hiding this comment.
The approach looks good to me in general, can you add tests with aliases too?
There was a problem hiding this comment.
We should try to use the indexNameMatcher so that it works with aliases too?
There was a problem hiding this comment.
I did wonder about supporting aliases but wouldn't that give false negatives? I assume the indexed term in the _index field is only the physical index name so it would be a mistake to fail the canMatch phase if a query term happens to match an alias? My assumption was this issue is about optimising existing index name matching behaviour, not changing it to add alias support.
There was a problem hiding this comment.
Should it be:
| if (shardContext != null && shardContext.index().getName().startsWith(value) == false) { | |
| if (shardContext != null && shardContext.indexMatches(value + "*") == false) { |
?
b8338e2 to
64814e9
Compare
|
I've added tests but a couple of anomalies exist:
|
|
test this please |
f8f37d5 to
dde93fd
Compare
jimczi
left a comment
There was a problem hiding this comment.
I left more comments.
Prefix and wildcard queries on "_index" field work with physical index names but not alias names (term and terms queries work with both physical and alias names). A known limitation?
It should work the same. I added a comment on the prefix query rewrite, maybe that's the reason you see it not working ?
"Skipped" counting logic in results is a bit of a mystery - always looks to be one less than what I would expect to see. The 70_skip_shards yaml test index needed 2 shards before it would count just one as skipped. What is the expected behaviour?
When all shards are skipped we use one to produce an empty result. The logic is explained here:
I guess we could avoid doing this if there are no aggregations but that's out of scope for this pr.
There was a problem hiding this comment.
Should it be:
| if (shardContext != null && shardContext.index().getName().startsWith(value) == false) { | |
| if (shardContext != null && shardContext.indexMatches(value + "*") == false) { |
?
There was a problem hiding this comment.
BytesRefs.toString is not needed ?
| if (shardContext != null && shardContext.indexMatches(BytesRefs.toString(value)) == false) { | |
| if (shardContext != null && shardContext.indexMatches(value) == false) { |
There was a problem hiding this comment.
Unlike wildcardquery value is an object not a string. I remember having a test failure where I assumed String and it was a BytesRef so had to add this
There was a problem hiding this comment.
The prefix change suggestion looks to have fixed things, thanks
1b3e0b3 to
9fe82da
Compare
jpountz
left a comment
There was a problem hiding this comment.
I left two minor comments but it looks good to me otherwise.
There was a problem hiding this comment.
this parameter is supposed to be for bw compat of the response format only, can you use track_total_hits=true instead?
There was a problem hiding this comment.
any reason not to reuse Regex.simpleMatch?
961344f to
e21a561
Compare
|
test this please |
…n index that doesn’t match the query expression. Part of the “canMatch” phase optimisations. Closes elastic#48473
* rest_total_hits_as_int -> track_total_hits * Use Regex.simpleMatch
5a79628 to
6ac08b1
Compare
|
test this please |
…x" field (elastic#48681) Make queries on the “_index” field fast-fail if the target shard is an index that doesn’t match the query expression. Part of the “canMatch” phase optimisations. Closes elastic#48473
Make queries on the “_index” field fast-fail if the target shard is an index name that doesn’t match the query expression. Part of the “canMatch” phase optimisations.
Closes #48473