Conversation
|
Pinging @elastic/es-search |
jimczi
left a comment
There was a problem hiding this comment.
The change looks good overall @romseygeek , I left some comments.
| @@ -0,0 +1,58 @@ | |||
| setup: | |||
| - skip: | |||
There was a problem hiding this comment.
Is this working ? I was not aware that the skip section can be set in the setup part but if it's working... ;)
| if (posAtt.getPositionIncrement() == 1) { | ||
| if (synonyms.size() == 1) { | ||
| terms.add(synonyms.get(0)); | ||
| } |
There was a problem hiding this comment.
nit: can you add an else to desambiguate ?
| import java.util.Objects; | ||
|
|
||
| import static org.elasticsearch.common.xcontent.ConstructingObjectParser.constructorArg; | ||
|
|
There was a problem hiding this comment.
Can you add javadocs to explain the kind of queries that this builder handles ?
|
|
||
| import static org.elasticsearch.common.xcontent.ConstructingObjectParser.constructorArg; | ||
| import static org.elasticsearch.common.xcontent.ConstructingObjectParser.optionalConstructorArg; | ||
|
|
There was a problem hiding this comment.
Same here, this is the user API so I'd expect some explanations regarding the different options ?
server/src/main/java/org/elasticsearch/search/SearchModule.java
Outdated
Show resolved
Hide resolved
|
I've added some better YAML tests and some basic documentation in the query DSL reference. I'm still a bit unsure of the DSL interface and some of the terms I'm using (eg |
rest
Outdated
|
|
||
| ./gradlew :distribution:archives:integ-test-zip:integTest \ | ||
| -Dtests.class="org.elasticsearch.test.rest.*Yaml*IT" \ | ||
| -Dtests.method="test {p0=$1}" |
There was a problem hiding this comment.
I wonder what is the role of this file?
There was a problem hiding this comment.
It makes running rest tests easier, but I didn't mean to commit it :) Will remove
|
After conferring with @clintongormley I've reworked the API somewhat. My one reservation is the name of the query, which is still |
jpountz
left a comment
There was a problem hiding this comment.
This is very clean overall. I'm curious why you decided to support a filter element in every intervals source instead of eg. having a filter intervals source?
Regarding the name, I don't dislike "intervals" as this is their name in the paper that introduced them. If we give them a more general name, I'm afraid that users would be even more surprised eg. by the behavior of any_of as it only returns minimal intervals?
| favourite food is porridge` would not match, because the interval matching | ||
| `cold porridge` starts before the interval matching `my favourite food`. | ||
|
|
||
| [[intervals-match]] |
There was a problem hiding this comment.
Huge +1 to exposing a match and no term
| @Override | ||
| public IntervalsSource intervals(String text, int maxGaps, boolean ordered, NamedAnalyzer analyzer) throws IOException { | ||
| if (indexOptions().compareTo(IndexOptions.DOCS_AND_FREQS_AND_POSITIONS) < 0) { | ||
| throw new IllegalArgumentException("Cannot create source against field [" + name() + "] with no positions indexed"); |
There was a problem hiding this comment.
maybe say "intervals source" instead of just source?
| /** | ||
| * Constructs an IntervalsSource based on analyzed text | ||
| */ | ||
| public class IntervalBuilder { |
There was a problem hiding this comment.
Can we make it pkg-private?
There was a problem hiding this comment.
It's called from TextFieldMapper which is in a different package, so unfortunately we can't.
There was a problem hiding this comment.
should we move it to the same package to keep visibility to a minimum?
There was a problem hiding this comment.
It's also called from IntervalsSourceProvider, which doesn't really fit in the mapper package. I think it has to stay public...
| default: | ||
| provider = IntervalsSourceProvider.fromXContent(parser); | ||
|
|
||
| } |
There was a problem hiding this comment.
it seems that this won't fail if multiple providers are provided?
| MappedFieldType fieldType = context.fieldMapper(field); | ||
| if (fieldType == null) { | ||
| throw new IllegalArgumentException("Cannot create IntervalQuery over non-existent field [" + field + "]"); | ||
| } |
There was a problem hiding this comment.
in general we are lenient with unmapped fields because of cross-index search, should we be lenient here too?
| The `or` rule will match any of its nested sub-rules. | ||
|
|
||
| [horizontal] | ||
| `intervals`:: |
There was a problem hiding this comment.
Maybe we need to add documentation here about the fact that this only returns minimal intervals and consequences?
There was a problem hiding this comment.
I've added a section on minimization at the end of the doc, with some examples of queries that can produce surprising results, and how to deal with that.
|
I've pushed some changes addressing your comments @jpountz. For the filtering, it seemed to make more sense to add it as an option on to each rule, because the intervals produced by a filter all belong to the rule being filtered, and Clint and I thought that this was the best way of making that obvious. |
|
@elasticmachine retest this please |
1 similar comment
|
@elasticmachine retest this please |
jpountz
left a comment
There was a problem hiding this comment.
Thanks @romseygeek. I thought having some sort of filtering intervals would be more consistent with how we deal with queries but I don't feel too strongly about it either.
LGTM
| /** | ||
| * Constructs an IntervalsSource based on analyzed text | ||
| */ | ||
| public class IntervalBuilder { |
There was a problem hiding this comment.
should we move it to the same package to keep visibility to a minimum?
| The `or` rule will match any of its nested sub-rules. | ||
|
|
||
| [horizontal] | ||
| `intervals`:: |
@jpountz I didn't have strong reasons for adding filtering the way it is here. Happy to discuss if you think it makes sense doing it in a different way. |
This commit exposes the lucene intervals query in elasticsearch, as a replacement for the
Span query family. Instead of building query structures up from individual terms, the intervals query
uses a tree of sources, the leaves of which are formed of
match-type queries that are passedthrough analysis. These can then be combined using
or,combineorrelatesources to testfor proximity or position.
Replaces #32406.
Closes #29636