Add index_prefix option to text fields #28222

Closed
romseygeek wants to merge 5 commits into elastic:master from
romseygeek:topic/27049-prefix-index-field

@romseygeek
Contributor

This adds the ability to index term prefixes into a hidden subfield, so that prefix queries can run without MultiTermQuery rewrites. The subfield reuses the analysis chain of its parent text field and appends an EdgeNGramTokenFilter. It can be configured with minimum and maximum ngram lengths. Query terms whose length falls outside this min-max range fall back to prefix queries against the parent text field.

The mapping looks like this:

"my_text_field" : {
    "type" : "text",
    "analyzer" : "english",
    "index_prefix" : { "min_chars" : 1, "max_chars" : 10 }
}
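
As a rough illustration of the behaviour described above, the sketch below uses plain Java with no Lucene or Elasticsearch classes. The class `PrefixRoutingSketch` and its methods are hypothetical names, not code from this PR, and the input term is assumed to have already passed through the parent field's analyzer. It shows which edge n-grams a term would contribute to the hidden subfield for a given `min_chars`/`max_chars`, and how a prefix is routed depending on its length.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: hypothetical names, no Lucene or Elasticsearch classes.
class PrefixRoutingSketch {

    // Edge n-grams of one already-analyzed term, clipped to [minChars, maxChars].
    static List<String> edgeNGrams(String term, int minChars, int maxChars) {
        List<String> grams = new ArrayList<>();
        int limit = Math.min(term.length(), maxChars);
        for (int len = minChars; len <= limit; len++) {
            grams.add(term.substring(0, len));
        }
        return grams;
    }

    // Assumed meaning of the length check: only prefixes whose length lies
    // inside the configured range can be answered from the hidden subfield.
    static boolean accept(int length, int minChars, int maxChars) {
        return length >= minChars && length <= maxChars;
    }

    // Describes, as a string, which kind of query a prefix would be routed to.
    static String route(String field, String prefix, int minChars, int maxChars) {
        if (accept(prefix.length(), minChars, maxChars)) {
            return "term query on " + field + "._prefix:" + prefix;
        }
        return "prefix query on " + field + ":" + prefix + "*";
    }

    public static void main(String[] args) {
        System.out.println(edgeNGrams("quick", 1, 10));
        System.out.println(route("my_text_field", "qui", 1, 10));
        System.out.println(route("my_text_field", "extraordinarily", 1, 10));
    }
}
```

With `min_chars` 1 and `max_chars` 10, a three-character prefix is served by a single term lookup on the subfield, while a fifteen-character prefix falls back to an ordinary prefix query on the parent field.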

romseygeek added the >enhancement, :Search/Search, v7.0.0, and v6.3.0 labels on Jan 15, 2018
romseygeek self-assigned this on Jan 15, 2018
romseygeek requested review from jimczi and jpountz on January 15, 2018, 14:05
@romseygeek
Contributor Author

This is still a work-in-progress, and needs more comprehensive tests + docs, but I'd like to get some feedback on whether or not this is a sensible implementation.

if (prefixAnalyzer == null || prefixAnalyzer.accept(value.length()) == false) {
    return super.prefixQuery(value, method, context);
}
TermQuery q = new TermQuery(new Term(name() + "._prefix", indexedValueForSearch(value)));

Member

I don't think anything prevents a user from creating an explicit field with the same name?

Contributor Author

Not yet, no. Do we have a way of reserving field names elsewhere?

@jpountz
Contributor

jpountz commented Jan 16, 2018

I think you are on the right track.

@rjernst raises a good point that there could be conflicts if a user configures a multi-field that also has _prefix as a name.

> Do we have a way of reserving field names elsewhere?

I don't think we do. We only reserve fields that start with _ on the top level. I think the only restriction that we put on inner levels is that fields cannot contain a dot. Thinking out loud: would calling the field ${field_name}..prefix be a viable option? Such a field name should be illegal for regular fields.
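
The naming argument can be sketched in plain Java as follows. This is a hypothetical illustration, not Elasticsearch code: `HiddenFieldNameSketch` and its methods are invented names, and the only rule it encodes is the one assumed in the comment above, that inner field names cannot contain a dot.

```java
// Hypothetical sketch of the naming argument; not Elasticsearch code.
class HiddenFieldNameSketch {

    // Assumed rule: an inner (multi-field) name may not contain a dot.
    static boolean isLegalSubfieldName(String name) {
        return !name.isEmpty() && !name.contains(".");
    }

    // Full path of a user-defined multi-field under a parent field.
    static String multiFieldPath(String parent, String subfield) {
        if (!isLegalSubfieldName(subfield)) {
            throw new IllegalArgumentException("illegal subfield name: " + subfield);
        }
        return parent + "." + subfield;
    }

    public static void main(String[] args) {
        // A user multi-field named "_prefix" is legal under this rule and
        // produces the same path as the hidden field name used in the PR.
        System.out.println(multiFieldPath("my_text_field", "_prefix"));

        // Producing "my_text_field..prefix" would need a subfield named
        // ".prefix", which the rule rejects.
        System.out.println(isLegalSubfieldName(".prefix"));
    }
}
```

Under that assumption, a user can legally create `my_text_field._prefix`, but no legal multi-field name yields `my_text_field..prefix`, which is why the double-dot name would avoid the conflict.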

@romseygeek
Contributor Author

Closing in favour of #28290

romseygeek closed this on Jan 29, 2018
romseygeek deleted the topic/27049-prefix-index-field branch on January 29, 2018, 10:01
jimczi added the v7.0.0-beta1 label and removed the v7.0.0 label on Feb 7, 2019
Labels

>enhancement, :Search/Search, v6.3.0, v7.0.0-beta1

4 participants