Mask wildcard query special characters on keyword queries #53127
cbuescher merged 12 commits into elastic:master
Wildcard queries on keyword fields get normalized. However, this normalization step should exclude the two special characters * and ? in order to keep the wildcard query itself intact. Closes elastic#46300
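The idea in the description above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the class and method names are hypothetical, the normalizer is modelled as a plain `UnaryOperator<String>` rather than an Elasticsearch analyzer, and it assumes that a backslash escapes the following character, as in Lucene's wildcard syntax. Only the literal segments between wildcard characters are passed to the normalizer; `*`, `?` and escaped characters are copied through unchanged.

```java
import java.util.Locale;
import java.util.function.UnaryOperator;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not the Elasticsearch implementation.
final class WildcardNormalizationSketch {

    // Matches either an escaped character (backslash plus one char)
    // or a run of the wildcard characters '?' and '*'.
    private static final Pattern SPECIAL = Pattern.compile("(\\\\.)|([?*]+)");

    // Applies the normalizer to every literal segment of the pattern,
    // leaving wildcards and escaped characters exactly as they were.
    static String normalizeWildcardPattern(String pattern, UnaryOperator<String> normalizer) {
        StringBuilder out = new StringBuilder();
        Matcher m = SPECIAL.matcher(pattern);
        int last = 0;
        while (m.find()) {
            if (m.start() > last) {
                // Literal text before the special token gets normalized.
                out.append(normalizer.apply(pattern.substring(last, m.start())));
            }
            // The special token itself is copied verbatim.
            out.append(m.group());
            last = m.end();
        }
        if (last < pattern.length()) {
            // Trailing literal text after the last special token.
            out.append(normalizer.apply(pattern.substring(last)));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Stand-in normalizer: plain lowercasing.
        UnaryOperator<String> lowercase = s -> s.toLowerCase(Locale.ROOT);
        System.out.println(normalizeWildcardPattern("Foo*Ba?", lowercase));
    }
}
```

Splitting the pattern before normalizing matters because a normalizer is free to drop or rewrite any character it sees, including `*` and `?`; keeping those tokens out of its input is what leaves the wildcard semantics intact.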
Pinging @elastic/es-search (:Search/Analysis)
server/src/main/java/org/elasticsearch/index/mapper/KeywordFieldMapper.java
@jimczi pushed some changes moving the code like you suggested
@elasticmachine update branch
merge conflict between base and head
jimczi left a comment
The StringFieldType looks good to me.
I am less sure about the modifications in the _type field. I left a comment on how we can address it but that can be done in a follow up if you prefer.
server/src/main/java/org/elasticsearch/index/mapper/TypeFieldMapper.java
@jimczi @romseygeek thanks for the review, I added a commit that changes TypeFieldType to extend ConstantFieldType like you suggested and adapted tests where necessary. Hope this looks okay now.
We can have only one type in 7x, but it may be something other than |
server/src/main/java/org/elasticsearch/index/mapper/TypeFieldMapper.java
server/src/main/java/org/elasticsearch/index/mapper/ConstantFieldType.java
jimczi left a comment
The change looks good to me. I left one comment regarding the handling of prefix and wildcard queries for _type field. Feel free to address it without further review.
result = new MatchNoDocsQuery("[_type] was lexicographically greater than upper bound of range");
}
}
protected boolean matches(String pattern, QueryShardContext context) {
I don't think we need to handle wildcards and prefixes here. We don't support prefix and wildcard queries on the _type field today, and since _type is now a thing of the past I don't think we should add this ability. Just checking that the pattern exactly matches the internal type should be enough.
@elasticmachine run elasticsearch-ci/2
@elasticmachine update branch
@cbuescher I removed the |
Wildcard queries on text fields should not apply the field's analyzer to the search query. However, we accidentally enabled this in elastic#53127 by moving the query normalization to the StringFieldType super type. This change fixes this by separating the notion of normalization from case insensitivity (as implemented by the `case_insensitive` flag). This is done because we still need to maintain normalization of the query string when the wildcard query method on the field type is requested from the `query_string` query parser. Wildcard queries on keyword fields should also continue to apply the field's normalizer, regardless of whether `case_insensitive` is set, because normalization could involve something other than lowercasing (e.g. substituting umlauts like in the GermanNormalizationFilter). Closes elastic#71403
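To illustrate why normalization and case insensitivity are treated as separate concerns, here is a small sketch. Everything in it is an assumption made for illustration: the class and method names are hypothetical, and `foldUmlauts` is only a simplified stand-in for the kind of substitution the message mentions (it maps ä, ö, ü to a, o, u and nothing else, which is far less than a real normalization filter does).

```java
import java.util.Locale;

// Hypothetical illustration; not Elasticsearch code.
final class NormalizationVsCaseSketch {

    // Simplified stand-in for an umlaut-substituting normalizer:
    // replaces the lowercase umlauts with their base vowels, keeps case.
    static String foldUmlauts(String s) {
        return s.replace('\u00e4', 'a')
                .replace('\u00f6', 'o')
                .replace('\u00fc', 'u');
    }

    // Plain lowercasing, independent of any normalizer.
    static String lowercase(String s) {
        return s.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        String term = "M\u00fcller";
        // Normalizer only: umlaut folded, case preserved.
        System.out.println(foldUmlauts(term));
        // Lowercasing only: case changed, umlaut preserved.
        System.out.println(lowercase(term));
        // Both applied: neither step can stand in for the other.
        System.out.println(lowercase(foldUmlauts(term)));
    }
}
```

The two transformations change different things, so a flag that only controls case handling cannot decide whether a keyword field's normalizer runs.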
Wildcard queries on keyword fields should get normalized. However, this normalization
should exclude the two special characters * and ? in order to keep the wildcard query
itself intact.
Closes #46300