Search in ES|QL: Add MATCH operator#110971
Conversation
x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/parser/ExpressionBuilder.java
Show resolved
Hide resolved
x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/analysis/Verifier.java
Outdated
Show resolved
Hide resolved
x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/analysis/Verifier.java
Show resolved
Hide resolved
x-pack/plugin/esql/src/test/java/org/elasticsearch/xpack/esql/analysis/VerifierTests.java
Show resolved
Hide resolved
...ql/src/test/java/org/elasticsearch/xpack/esql/optimizer/LocalPhysicalPlanOptimizerTests.java
Show resolved
Hide resolved
|
Pinging @elastic/es-analytical-engine (Team:Analytics) |
|
Pinging @elastic/es-search-relevance (Team:Search Relevance) |
...rc/internalClusterTest/java/org/elasticsearch/xpack/esql/plugin/MatchOperatorPushDownIT.java
Outdated
Show resolved
Hide resolved
...rc/internalClusterTest/java/org/elasticsearch/xpack/esql/plugin/MatchOperatorPushDownIT.java
Outdated
Show resolved
Hide resolved
| """; | ||
|
|
||
| var error = expectThrows(VerificationException.class, () -> run(query)); | ||
| assertThat(error.getMessage(), containsString("Unknown column [something]")); |
There was a problem hiding this comment.
This doesn't feel like and IT on the pushdown - more like on MATCH itself. I might rename the class.
If you want to test the pushdown I might try using the profile option on the request and looking for the LuceneSourceOperator's status. It should have a list of queries it run and that should tell you that you pushed down.
I think you could do all of this with an ESRestTestCase if you wanted which gives you the advantage of being able to run it against many versions later.
For some of these failures I think I'd prefer them in the yaml test cases. They run against multiple versions by default and they have to be pretty simple so it can be easier to read them. Just because you know they can't surprise you.
There was a problem hiding this comment.
I would, also, be ok with the tests in VerifierTests (the VerificationException ones).
There was a problem hiding this comment.
I added yaml tests as well - I liked these these tests because I can easily enable logging and debug in the IDE - I find it easier to do that here than in yaml or CSV specs.
x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/analysis/Verifier.java
Show resolved
Hide resolved
|
Hi @ioanatia, I've created a changelog YAML for you. |
astefan
left a comment
There was a problem hiding this comment.
Even if it takes longer, I would like this feature to be merged with csv-spec tests as well. It's something we usually do with features, bug-fixes, enhancements in ESQL.
x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/analysis/Verifier.java
Outdated
Show resolved
Hide resolved
| """; | ||
|
|
||
| var error = expectThrows(VerificationException.class, () -> run(query)); | ||
| assertThat(error.getMessage(), containsString("Unknown column [something]")); |
There was a problem hiding this comment.
I would, also, be ok with the tests in VerifierTests (the VerificationException ones).
I think we only have two options for csv-spec tests:
I opened #111157 to see if we could run these at runtime, but that's going to take some time to review and integrate into this. Also, it still needs a lucene index. I think we probably should skip the csv-spec tests in |
nik9000
left a comment
There was a problem hiding this comment.
If we want CSV tests, I think this is the shortest route to them: https://gist.github.com/nik9000/38946d2ba2639db64fe5ee28d0b805a7
It adds a capability that's not supported in older versions and so the tests can be skipped.
| - "No limit defined, adding default limit of \\[.*\\]" | ||
| esql.query: | ||
| body: | ||
| query: 'FROM test | WHERE content MATCH "fox" | KEEP id | SORT id' |
There was a problem hiding this comment.
This is going to fail without a capability, I think.
There was a problem hiding this comment.
Wait. You check the cluster feature already. I think we're better off checking capabilities. We're just adding too many things to do it all with features.
There was a problem hiding this comment.
This has been updated to use a capability. 👍
Wait. Ignore this. you did it. |
x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/plugin/EsqlFeatures.java
Outdated
Show resolved
Hide resolved
x-pack/plugin/esql/qa/testFixtures/src/main/resources/where-match.csv-spec
Show resolved
Hide resolved
| - "No limit defined, adding default limit of \\[.*\\]" | ||
| esql.query: | ||
| body: | ||
| query: 'FROM test | WHERE content MATCH "fox" | KEEP id | SORT id' |
There was a problem hiding this comment.
Wait. You check the cluster feature already. I think we're better off checking capabilities. We're just adding too many things to do it all with features.
|
|
||
| - match: { status: 400 } | ||
| - match: { error.type: verification_exception } | ||
| - match: { error.reason: "Found 1 problem\nline 1:19: MATCH requires a text or keyword field, but [id] has type [integer]" } |
There was a problem hiding this comment.
Checking my understanding - do we want MATCH to skip non-string fields in the future? Like, if I do WHERE int MATCH "1" should that fail? Or match nothing? Or match the literal 1? It's all good to match whatever makes sense in this PR though.
There was a problem hiding this comment.
Our aim is to add full text search capabilities and not necessarily have full parity with the match query - this is why I restricted this to text and keyword for now - if we want to add support for data, boolean and numeric fields we can definitely do that later. not in this PR though
There was a problem hiding this comment.
Matching on keyword and text seems reasonable. It's not clear to me how much value there is matching on more field types, but I agree that that can be discussed separately.
x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/action/EsqlCapabilities.java
Outdated
Show resolved
Hide resolved
|
leaving this open for a while longer unless there are any other comments - otherwise I will be merging this later today. |
x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/action/EsqlCapabilities.java
Outdated
Show resolved
Hide resolved
…action/EsqlCapabilities.java Co-authored-by: Iván Cea Fontenla <ivancea96@outlook.com>
Part of #110131
Adds a
MATCHoperator, e.g.:A few notes:
I did not add csv tests because the push down of the match predicate to Lucene is done withAndrei pointed out we have 2 different types of CSV tests - tracking this as a follow up in Full-text search in ES|QL #110131LocalPhysicalPlanOptimizerand csv tests do no execute the rules fromLocalPhysicalPlanOptimizer. To test the push down ofMATCHwe used integration tests inMatchOperatorPushDownITMATCHand catch them early at theVerifierlevel so that we can return aVerificationExceptionrather than have this fail at the compute engine level.LocalPhysicalPlanOptimizerMatchOperatorPushDownITand marked with aTODOso we don't forget about it. Full-text search in ES|QL #110131 has a separate item to ensure that we have graceful failure modes for both theMATCHoperator andMATCHcommand