LUCENE-9578: TermRangeQuery empty string lower bound edge case#1976
Merged
jpountz merged 4 commits intoapache:masterfrom Oct 16, 2020
Merged
LUCENE-9578: TermRangeQuery empty string lower bound edge case#1976jpountz merged 4 commits intoapache:masterfrom
jpountz merged 4 commits intoapache:masterfrom
Conversation
Currently a TermRangeQuery with the empty String ("") as lower bound and
includeLower=false leads internally constructs an Automaton that doesn't match
anything. This is unexpected expecially for open upper bounds where any string
should be considered to be "higher" than the empty string.
This PR changes "Automata#makeBinaryInterval" so that for an empty string lower
bound and an open upper bound, any String should match the query regardless or
the includeLower flag.
jpountz
reviewed
Oct 12, 2020
Contributor
Author
|
@jpountz thanks for the review, I added a commit that rejects the empty string when "includeMin == false" and added tests for that edge case as well. |
jpountz
reviewed
Oct 14, 2020
lucene/core/src/java/org/apache/lucene/util/automaton/Automata.java
Outdated
Show resolved
Hide resolved
Contributor
Author
|
@jpountz thanks for the review, I added another commit that should fix the problem you detected. |
jpountz
reviewed
Oct 15, 2020
| * Returns a new (deterministic) automaton that accepts all binary terms except | ||
| * the empty string. | ||
| */ | ||
| public static Automaton makeAnyBinaryExceptEmpty() { |
Contributor
There was a problem hiding this comment.
Maybe call makeNonEmptyBinary?
Contributor
Contributor
Author
|
@jpountz thanks, I renamed the method. btw. would this change also be backported to the latest 8.7 branch so we can use it in ES to fix elastic/elasticsearch#63386? |
Contributor
|
The 8.7 branch has not been cut yet so this will be in 8.7. This change looked safe to me so I felt free to merge in spite of the branch being imminently cut. |
jpountz
pushed a commit
that referenced
this pull request
Oct 16, 2020
Currently a TermRangeQuery with the empty String ("") as lower bound and
includeLower=false leads internally constructs an Automaton that doesn't match
anything. This is unexpected expecially for open upper bounds where any string
should be considered to be "higher" than the empty string.
This PR changes "Automata#makeBinaryInterval" so that for an empty string lower
bound and an open upper bound, any String should match the query regardless or
the includeLower flag.
asfgit
pushed a commit
that referenced
this pull request
Oct 26, 2020
Currently a TermRangeQuery with the empty String ("") as lower bound and
includeLower=false leads internally constructs an Automaton that doesn't match
anything. This is unexpected expecially for open upper bounds where any string
should be considered to be "higher" than the empty string.
This PR changes "Automata#makeBinaryInterval" so that for an empty string lower
bound and an open upper bound, any String should match the query regardless or
the includeLower flag.
msfroh
pushed a commit
to msfroh/lucene-solr
that referenced
this pull request
Nov 18, 2020
…e#1976) Currently a TermRangeQuery with the empty String ("") as lower bound and includeLower=false leads internally constructs an Automaton that doesn't match anything. This is unexpected expecially for open upper bounds where any string should be considered to be "higher" than the empty string. This PR changes "Automata#makeBinaryInterval" so that for an empty string lower bound and an open upper bound, any String should match the query regardless or the includeLower flag.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.

Description
Currently a TermRangeQuery with the empty String ("") as lower bound and
includeLower=false leads internally constructs an Automaton that doesn't match
anything. This is unexpected expecially for open upper bounds where any string
should be considered to be "higher" than the empty string.
Solution
This PR changes "Automata#makeBinaryInterval" so that for an empty string lower
bound and an open upper bound, any String should match the query regardless or
the includeLower flag.
Tests
Added two new tests to
TestAutomaton.Checklist
Please review the following and check all that apply:
masterbranch../gradlew check