Mask wildcard query special characters on keyword queries (#53127)#53512
Mask wildcard query special characters on keyword queries (#53127)#53512cbuescher merged 4 commits intoelastic:7.xfrom
Conversation
) Wildcard queries on keyword fields get normalized, however this normalization step should exclude the two special characters * and ? in order to keep the wildcard query itself intact. Closes elastic#46300
|
Pinging @elastic/es-search (:Search/Search) |
server/src/test/java/org/elasticsearch/index/mapper/TypeFieldTypeTests.java
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/index/mapper/ConstantFieldType.java
Outdated
Show resolved
Hide resolved
158edb8 to
0b8e662
Compare
| WildcardQueryBuilder builder = QueryBuilders.wildcardQuery("_type", "doc*"); | ||
| builder.doToQuery(createShardContext()); | ||
| assertWarnings(QueryShardContext.TYPES_DEPRECATION_MESSAGE); | ||
| } |
There was a problem hiding this comment.
I removed this test for wildcard queries on "_type" fields. These don't return anything useful at the moment already, however before the change in this PR we still got an empty response while now we get an error because "_type" is not indexed. I don't know if we can tolerate this, otherwise I need to add a very simple wildcard implementation to TypeFieldType again.
There was a problem hiding this comment.
I'm not following, _type should still be indexed in 7x? It's only in 8x that it will be gone as a field.
There was a problem hiding this comment.
_typeshould still be indexed in 7x
Thats true, the test above only checks for the deprecation message though. Currently if you do a wildcard query on a "_type" field, it returns no hits (even for exact values like "value": "_doc"). Thats why I think nobody should really do that kind of query on a "_type" field. With the changes in this PR we would error in those cases though. If you want I can try to work around that.
There was a problem hiding this comment.
Erroring is better than silent weirdness, let's leave it as is.
There was a problem hiding this comment.
Sorry, I meant to write "except for exact values like "_doc" above. So that works on 7.x, but I hardly believe anybody relies on that? If you change your mind after this clarification let me know.
There was a problem hiding this comment.
… starting to think better safe than sorry, I think its a small addition that will keep the existing (weird) behaviour
I’ll look into that for a bit, leave it if it gets too ugly
|
@jimczi @romseygeek this is the backport from #53127, no real changes to the implementation in StringFieldType that do the actual wildcard-related work, but I had to revert the changes for the TypeFieldType, going back to it extending StringFieldType because otherwise I had several errors with nested fields etc... that were hard to work around. I removed on test I commented on above, not sure if thats acceptable, otherwise I can add a simple wildcard impl. to TypeFieldType that at least won't throw an error. Wdyt? |
Yes, in 7x the |
|
What's the latest with this PR? I'm dependent on this to backport #53851 for my wildcard field. |
|
@romseygeek @markharwood sorry for dropping the ball here, I left a comment around the removed tests, explaining why I think throwing an error there now wouldn't matter that much. Let me know if you would like me to change anything around that. |
romseygeek
left a comment
There was a problem hiding this comment.
LGTM, thanks @cbuescher
Wildcard queries on keyword fields get normalized, however this normalization
step should exclude the two special characters * and ? in order to keep the
wildcard query itself intact.
Backport of #53127
Changes wrt. the commit on master: also changed TypeFieldType to extend ConstantFieldType, but we still need to support matching "_type" values other than "_doc" on 7.x. Also there were tests checking that range queries on types work that are not there anymore on master, but this still seems to work on 7.x so I left the
rangeimplementation in TypeFieldType but had to adapt it slightly after changing the supertype.