Skip to content

Refactor simple_query_string to handle text part like multi_match and query_string#26145

Merged
jimczi merged 3 commits intoelastic:masterfrom
jimczi:simple_query_string_fields
Aug 21, 2017
Merged

Refactor simple_query_string to handle text part like multi_match and query_string#26145
jimczi merged 3 commits intoelastic:masterfrom
jimczi:simple_query_string_fields

Conversation

@jimczi
Copy link
Copy Markdown
Contributor

@jimczi jimczi commented Aug 10, 2017

This change is a continuation of #25726 that aligns field expansions and text analysis for the simple_query_string with the query_string and multi_match query.
The main changes are:

  • For exact field name, the new behavior is to rewrite to a matchnodocs query when the field name is not found in the mapping.

  • For partial field names (with * suffix), the expansion is done only on keyword, text, date, ip and number field types. Other field types are simply ignored.

  • For all fields (*), the expansion is done on accepted field types only (see above) and metadata fields are also filtered.

The use_all_fields option is deprecated in this change and can be replaced by setting * in the fields parameter.
This commit also changes how text fields are analyzed. Previously the default search analyzer (or the provided analyzer) was used to analyze every text part
, ignoring the analyzer set on the field in the mapping. With this change, the field analyzer is used instead unless an analyzer has been forced in the parameter of the query.

Finally now that all full text queries can handle the special "*" expansion (all_fields mode), the index.query.default_field is now set to * for indices created in 6.

@jimczi jimczi added :Search/Search Search-related issues that do not fall into other categories >bug v6.0.0 v7.0.0 labels Aug 10, 2017
@jimczi jimczi requested a review from dakrone August 10, 2017 15:33
@dakrone dakrone added the v6.1.0 label Aug 10, 2017
Copy link
Copy Markdown
Member

@dakrone dakrone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left some comments on this, thanks for tackling this!

The use_all_fields option is deprecated in this change and can be replaced by setting * in the fields parameter.

I think you mean the all_fields option?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is technically breaking, because prior to this change the following would work:

builder.field("foo", 1.5);
builder.useAllFields(true);
builder.useAllFields(false);

And since setting it to true permanently changes the fieldsAndWeights. While it seems contrived, I could see user code doing this passing the builder off to independent functions, the builder should be idempotent in this case

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It also breaks what I would expect to happen with:

builder.useAllFields(true);
builder.field("foo", 1.0);

Where I would expect it to use all_fields mode, but instead it does not work since the order matters.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the latter case it would fail the query early because useAllFields freeze the fieldsAndWeights map. I think it's better than the current behavior which accepts this query but fails when the query is actually built (fields and useAllFields are mutually exclusive) ?

For the first example I don't know if it's really a problem. Maybe we should just disallow useAllFields(false) ? I don't see a good reason to unset it especially now that the index default field defaults to * ?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the latter case it would fail the query early because useAllFields freeze the fieldsAndWeights map. I think it's better than the current behavior which accepts this query but fails when the query is actually built (fields and useAllFields are mutually exclusive) ?

Sure, that sounds fine, as long as we're okay with this failing (esoteric) use-case.

For the first example I don't know if it's really a problem. Maybe we should just disallow useAllFields(false) ? I don't see a good reason to unset it especially now that the index default field defaults to * ?

Yeah, I don't either, the only argument I can think of keeping useAllFields(boolean) is that then it's apparent it's a "setter" and not a "getter"

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this does need to calculate whether all fields mode is going to be used, otherwise a mixed 5.6 - 6.0 cluster will have the behavior correct on 6.0 and incorrect on 5.6, since 5.6 calculates the decision for whether to use all_fields using different criteria than 6.0 will?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. I pushed 456ceac to fix this.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add javadocs for this class please?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And also javadocs for all methods in this class too please

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this no longer needs to be fully qualified since the class name is not the same any more

@jimczi
Copy link
Copy Markdown
Contributor Author

jimczi commented Aug 10, 2017

Thanks for looking @dakrone
I pushed some changes to address your comment.

I think you mean the all_fields option?

Yes, I don't know why but I always get the name wrong ;)

Copy link
Copy Markdown
Member

@dakrone dakrone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks @jimczi

jimczi added 3 commits August 21, 2017 12:13
… query_string

This change is a continuation of elastic#25726 that aligns field expansions for the simple_query_string with the query_string and multi_match query.
The main changes are:

 * For exact field name, the new behavior is to rewrite to a matchnodocs query when the field name is not found in the mapping.

 * For partial field names (with * suffix), the expansion is done only on keyword, text, date, ip and number field types. Other field types are simply ignored.

 * For all fields (*), the expansion is done on accepted field types only (see above) and metadata fields are also filtered.

The use_all_fields option is deprecated in this change and can be replaced by setting `*` in the fields parameter.
This commit also changes how text fields are analyzed. Previously the default search analyzer (or the provided analyzer) was used to analyze every text part
, ignoring the analyzer set on the field in the mapping. With this change, the field analyzer is used instead unless an analyzer has been forced in the parameter of the query.

Finally now that all full text queries can handle the special "*" expansion (`all_fields` mode), the `index.query.default_field` is now set to `*` for indices created in 6.
@jimczi jimczi force-pushed the simple_query_string_fields branch from 456ceac to 28d92b6 Compare August 21, 2017 11:11
@jimczi jimczi merged commit 4bce727 into elastic:master Aug 21, 2017
@jimczi jimczi deleted the simple_query_string_fields branch August 21, 2017 11:12
jimczi added a commit that referenced this pull request Aug 21, 2017
… query_string (#26145)

This change is a continuation of #25726 that aligns field expansions for the simple_query_string with the query_string and multi_match query.
The main changes are:

 * For exact field name, the new behavior is to rewrite to a matchnodocs query when the field name is not found in the mapping.

 * For partial field names (with * suffix), the expansion is done only on keyword, text, date, ip and number field types. Other field types are simply ignored.

 * For all fields (*), the expansion is done on accepted field types only (see above) and metadata fields are also filtered.

The use_all_fields option is deprecated in this change and can be replaced by setting `*` in the fields parameter.
This commit also changes how text fields are analyzed. Previously the default search analyzer (or the provided analyzer) was used to analyze every text part
, ignoring the analyzer set on the field in the mapping. With this change, the field analyzer is used instead unless an analyzer has been forced in the parameter of the query.

Finally now that all full text queries can handle the special "*" expansion (`all_fields` mode), the `index.query.default_field` is now set to `*` for indices created in 6.
jimczi added a commit that referenced this pull request Aug 21, 2017
… query_string (#26145)

This change is a continuation of #25726 that aligns field expansions for the simple_query_string with the query_string and multi_match query.
The main changes are:

 * For exact field name, the new behavior is to rewrite to a matchnodocs query when the field name is not found in the mapping.

 * For partial field names (with * suffix), the expansion is done only on keyword, text, date, ip and number field types. Other field types are simply ignored.

 * For all fields (*), the expansion is done on accepted field types only (see above) and metadata fields are also filtered.

The use_all_fields option is deprecated in this change and can be replaced by setting `*` in the fields parameter.
This commit also changes how text fields are analyzed. Previously the default search analyzer (or the provided analyzer) was used to analyze every text part
, ignoring the analyzer set on the field in the mapping. With this change, the field analyzer is used instead unless an analyzer has been forced in the parameter of the query.

Finally now that all full text queries can handle the special "*" expansion (`all_fields` mode), the `index.query.default_field` is now set to `*` for indices created in 6.
jimczi added a commit that referenced this pull request Aug 21, 2017
jimczi added a commit that referenced this pull request Aug 21, 2017
jimczi added a commit that referenced this pull request Aug 21, 2017
jimczi added a commit that referenced this pull request Aug 21, 2017
@lcawl lcawl removed the v6.1.0 label Dec 12, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

>bug :Search/Search Search-related issues that do not fall into other categories v6.0.0-beta2 v7.0.0-beta1

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants