Support match_phrase filter function in SQL and PPL#604
Support match_phrase filter function in SQL and PPL#604joshuali925 merged 58 commits intoopensearch-project:mainfrom
Conversation
Signed-off-by: MaxKsyunz <maxk@bitquilltech.com>
Signed-off-by: MaxKsyunz <maxk@bitquilltech.com>
Signed-off-by: Yury Fridlyand <yuryf@bitquilltech.com>
Signed-off-by: MaxKsyunz <maxk@bitquilltech.com>
…ents. Signed-off-by: Yury Fridlyand <yuryf@bitquilltech.com>
Added a couple test for match_test in PPL. The tests are currently @ignore'd until more of the issue is complete. Signed-off-by: MaxKsyunz <maxk@bitquilltech.com>
Signed-off-by: MaxKsyunz <maxk@bitquilltech.com>
Signed-off-by: MaxKsyunz <maxk@bitquilltech.com>
Signed-off-by: MaxKsyunz <maxk@bitquilltech.com>
Signed-off-by: MaxKsyunz <maxk@bitquilltech.com>
Signed-off-by: MaxKsyunz <maxk@bitquilltech.com>
Signed-off-by: Yury Fridlyand <yuryf@bitquilltech.com>
Signed-off-by: MaxKsyunz <maxk@bitquilltech.com>
Signed-off-by: Yury Fridlyand <yuryf@bitquilltech.com>
Create a parametrized test for PPLSyntax parser. Signed-off-by: MaxKsyunz <maxk@bitquilltech.com>
Signed-off-by: Yury Fridlyand <yuryf@bitquilltech.com>
Signed-off-by: Yury Fridlyand <yuryf@bitquilltech.com>
Signed-off-by: Yury Fridlyand <yuryf@bitquilltech.com>
…for `MATCH` are also updated. Signed-off-by: Yury Fridlyand <yuryf@bitquilltech.com>
Signed-off-by: Yury Fridlyand <yuryf@bitquilltech.com>
Signed-off-by: Yury Fridlyand <yuryf@bitquilltech.com>
Signed-off-by: MaxKsyunz <maxk@bitquilltech.com>
Signed-off-by: MaxKsyunz <maxk@bitquilltech.com>
Signed-off-by: MaxKsyunz <maxk@bitquilltech.com>
Signed-off-by: MaxKsyunz <maxk@bitquilltech.com>
1. Use PPL instead of SQL as samples. 2. Use data that doctest runs with. Signed-off-by: MaxKsyunz <maxk@bitquilltech.com>
… samples. This fixes doctest. Signed-off-by: Yury Fridlyand <yuryf@bitquilltech.com>
Signed-off-by: Yury Fridlyand <yuryf@bitquilltech.com>
Codecov Report
@@ Coverage Diff @@
## main #604 +/- ##
=============================================
- Coverage 94.59% 62.76% -31.83%
=============================================
Files 276 10 -266
Lines 7457 658 -6799
Branches 552 119 -433
=============================================
- Hits 7054 413 -6641
+ Misses 349 192 -157
+ Partials 54 53 -1
Flags with carried forward coverage won't be shown. Click here to find out more. Continue to review full report at Codecov.
|
Add support for matchphrase as an alternative spelling of match_phrase. Mention in both SQL and PPL documentation that matchphrase is accepted as well as match_phrase. Signed-off-by: MaxKsyunz <maxk@bitquilltech.com>
…update-v2 Support legacy syntax for match_phrase in the new SQL engine.
| | Bond | | ||
| +------------+ | ||
|
|
||
| MATCH_PHRASE |
There was a problem hiding this comment.
not sure what should be done but should we distinguish between this one and https://github.com/opensearch-project/sql/blob/main/docs/user/beyond/fulltext.rst#match-phrase-query
There was a problem hiding this comment.
Is that match_phrase from the legacy SQL engine?
There was a problem hiding this comment.
i believe so, i'm thinking maybe add a note to the old one, or link it after "For backward compatibility" in the new one, so users know what the difference is
There was a problem hiding this comment.
@joshuali925 that's a good point.
Distinction between old and new SQL engines was explicitly removed as part of #58
I'd like to add a note to fulltext.rst that the document describes the legacy SQL engine and to refer to functions.rst for the new engine.
Once the new engine is on par with the old one, we can update fulltext.rst to describe the default engine.
What do you think?
| @@ -0,0 +1,40 @@ | |||
| package org.opensearch.sql.ppl.antlr; | |||
| @@ -0,0 +1,112 @@ | |||
| package org.opensearch.sql.opensearch.storage.script.filter.lucene; | |||
| public QueryBuilder build(FunctionExpression func) { | ||
| List<Expression> arguments = func.getArguments(); | ||
| if (arguments.size() < 2) { | ||
| throw new SemanticCheckException("match_phrase requires at least two parameters"); |
There was a problem hiding this comment.
ANTLR will check the syntax, right? If you want to assert that, it should be syntax exception.
| /** | ||
| * Lucene query that builds a match_phrase query. | ||
| */ | ||
| public class MatchPhraseQuery extends LuceneQuery { |
There was a problem hiding this comment.
Looks like MatchPhraseQuery and MatchQuery have duplicate code, could it been simplified?
There was a problem hiding this comment.
Please see RelevanceQuery added in d3667ce
Signed-off-by: MaxKsyunz <maxk@bitquilltech.com>
Signed-off-by: MaxKsyunz <maxk@bitquilltech.com>
Also, use getWriteableName() to get the name of the query in the error message. Signed-off-by: MaxKsyunz <maxk@bitquilltech.com>
This dummy instance of MatchPhraseQueryBuilder is used in RelevanceQuery when creating an exception to get the name of the query. Signed-off-by: MaxKsyunz <maxk@bitquilltech.com>
Signed-off-by: MaxKsyunz <maxk@bitquilltech.com>
This branch changed RelevanceQuery.build to throw SyntaxCheckException if it is called with less than two arguments. Previously, build would through SemanticCheckException. Signed-off-by: MaxKsyunz <maxk@bitquilltech.com>
| List<Expression> arguments = func.getArguments(); | ||
| if (arguments.size() < 2) { | ||
| String queryName = createQueryBuilder("dummy_field", "").getWriteableName(); | ||
| throw new SyntaxCheckException( |
There was a problem hiding this comment.
add a UT to assert the exception message is expected?
| /** | ||
| * Base class for query abstraction that builds a relevance query from function expression. | ||
| */ | ||
| public abstract class RelevanceQuery<T extends QueryBuilder> extends LuceneQuery { |
Signed-off-by: MaxKsyunz <maxk@bitquilltech.com>
| assertThrows(SemanticCheckException.class, () -> query.build(expr)); | ||
| SemanticCheckException exception = | ||
| assertThrows(SemanticCheckException.class, () -> query.build(expr)); | ||
| assertEquals("Parameter wrongArg is invalid for mock_query function.", exception.getMessage()); |
There was a problem hiding this comment.
I'm not sure you want to tie the test that closely to the code (so that if someone changes a single character in the string, the test fails).
Maybe you can check that the messages contains "wrongArg" and "mock_query" and leave it at that?
assertThat(exception.getMessages(), containsString("wrongArg"));
assertThat(exception.getMessages(), containsString("mock_query"));
There was a problem hiding this comment.
@acarbonetto that's in line with pre-existing convention in this codebase.
| assertThrows(SyntaxCheckException.class, () -> query.build(createCall(arguments))); | ||
| SyntaxCheckException exception = assertThrows(SyntaxCheckException.class, | ||
| () -> query.build(createCall(arguments))); | ||
| assertEquals("mock_query requires at least two parameters", exception.getMessage()); |
See opensearch-project#604 (comment) for reference Signed-off-by: MaxKsyunz <maxk@bitquilltech.com>
See opensearch-project#604 (comment) for reference Signed-off-by: MaxKsyunz <maxk@bitquilltech.com>
See opensearch-project#604 (comment) for reference Signed-off-by: MaxKsyunz <maxk@bitquilltech.com>
- Update QueryStringTest to check for SyntaxCheckException. SyntaxCheckException is correct when incorrect # of parameters See opensearch-project#604 (comment) for reference. - Introduce MultiFieldQuery and SingleFieldQuery base classes. - Extract FunctionResolver interface. FunctionResolver is now DefaultFunctionResolver. RelevanceFunctionResolver is a simplified function resolver for relevance search functions. - Removed tests from FilterQueryBuilderTest that verified exceptions thrown for invalid function calls. These scenarios are now handled by RelevanceQuery::build. Signed-off-by: MaxKsyunz <maxk@bitquilltech.com>
- Update QueryStringTest to check for SyntaxCheckException. SyntaxCheckException is correct when incorrect # of parameters See #604 (comment) for reference. - Introduce MultiFieldQuery and SingleFieldQuery base classes. - Extract FunctionResolver interface. FunctionResolver is now DefaultFunctionResolver. RelevanceFunctionResolver is a simplified function resolver for relevance search functions. - Removed tests from FilterQueryBuilderTest that verified exceptions thrown for invalid function calls. These scenarios are now handled by RelevanceQuery::build. Signed-off-by: MaxKsyunz <maxk@bitquilltech.com> Signed-off-by: MaxKsyunz <maxk@bitquilltech.com>
Description
match_phrasefilter function in SQL.match_phrasefilter function in PPL.matchphrasefilter function in SQL to maintain backwards compatibility with the legacy engine.Issues Resolved
Resolves #185.
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.