[PoC] ESQL - Add scoring for full text functions disjunctions #121153
Closed
carlosdelest wants to merge 44 commits into elastic:main from
…unctions-disjunctions' into non-issue/esql-full-text-functions-disjunctions
# Conflicts: # x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/expression/function/fulltext/FullTextFunction.java # x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/optimizer/rules/physical/local/PushFiltersToSource.java
…unctions-disjunctions' into non-issue/esql-full-text-functions-disjunctions
…unctions-disjunctions' into non-issue/esql-full-text-functions-disjunctions
… vector returned pluggable
…viour and vector returned pluggable
…aluator and BooleanLogicExpressionEvaluator
Contributor
Documentation preview:
Collaborator
Hi @carlosdelest, I've created a changelog YAML for you.
…g-full-text-functions-disjunctions # Conflicts: # docs/reference/esql/esql-limitations.asciidoc # x-pack/plugin/build.gradle # x-pack/plugin/esql/compute/src/main/java/org/elasticsearch/compute/lucene/LuceneQueryExpressionEvaluator.java # x-pack/plugin/esql/compute/src/test/java/org/elasticsearch/compute/lucene/LuceneQueryExpressionEvaluatorTests.java # x-pack/plugin/esql/qa/testFixtures/src/main/resources/match-function.csv-spec # x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/action/EsqlCapabilities.java # x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/evaluator/EvalMapper.java # x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/evaluator/mapper/EvaluatorMapper.java # x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/evaluator/mapper/ExpressionMapper.java # x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/expression/function/fulltext/FullTextFunction.java # x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/expression/predicate/operator/comparison/InsensitiveEqualsMapper.java # x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/planner/LocalExecutionPlanner.java # x-pack/plugin/esql/src/test/java/org/elasticsearch/xpack/esql/optimizer/LocalPhysicalPlanOptimizerTests.java
…l-text-functions-disjunctions' into non-issue/esql-scoring-full-text-functions-disjunctions
carlosdelest
commented
Jan 29, 2025
static class BooleanLogic extends ExpressionMapper<BinaryLogic> {
    @Override
    public ExpressionEvaluator.Factory map(FoldContext foldCtx, BinaryLogic bc, Layout layout, List<ShardContext> shardContexts) {
Member
Author
Refactored this logic to BooleanLogicExpressionEvaluator
Member
Author
Closing this in favour of #121551
See another approach in #121322
#120291 added support for using full text functions in disjunctions, by using the `LuceneQueryExpressionEvaluator` to evaluate queries that couldn't be pushed down to Lucene at the compute engine level.

The `LuceneQueryExpressionEvaluator` runs the query and outputs a `BooleanVector` with `true` or `false` depending on whether the docs received as input have been identified as a result of the query.

This provides no support for scoring, but we could adapt the `LuceneQueryExpressionEvaluator` behaviour when scoring is used, so that the query it runs collects the scores and the evaluator returns a `DoubleVector` with the scores:

- A special negative value is returned for docs that do not match the query, so they can be later identified even if we apply boosting to the query.
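To make the proposed output shape concrete, here is a minimal sketch in plain Java. It is not the PR's code: a `double[]` stands in for `DoubleVector`, a `Map` stands in for the scores collected by the query, and the class name, method name and the `-1.0` sentinel are all hypothetical (the description only says the value is negative).

```java
import java.util.Map;

// Illustrative stand-in only; none of these names come from the Elasticsearch code base.
final class ScoreVectorSketch {
    // Assumed sentinel for "doc did not match". The PR only states that it is negative.
    static final double NO_MATCH = -1.0;

    // docs: doc ids received as input.
    // queryScores: score collected for each doc id that matched the query.
    // Returns one entry per input doc: its score, or NO_MATCH if it did not match.
    static double[] scoresFor(int[] docs, Map<Integer, Double> queryScores) {
        double[] out = new double[docs.length];
        for (int i = 0; i < docs.length; i++) {
            Double score = queryScores.get(docs[i]);
            out[i] = score == null ? NO_MATCH : score;
        }
        return out;
    }
}
```

Because the sentinel is negative and real scores are assumed to be non-negative, a boost applied to the query changes the magnitude of matching scores but cannot make a match look like a non-match.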
The `FilterOperator` will then receive a `DoubleVector` instead of a `BooleanVector`. It can check whether scoring is being used, and thus filter a `DoubleVector` for non-negative scores.

It will also change the score block that was retrieved from Lucene, using the scores retrieved from the query as the new score block. This way the `_score` metadata will be updated.

The problem is how to combine the result of the `LuceneQueryExpressionEvaluator` with other conditions via boolean operators like AND, OR, NOT.

There are new versions of operator evaluators that work with `DoubleVector`, so boolean operations can combine scores:

- `AND` sums the scores when both scores are positive, otherwise returns a negative score
- `OR` sums the scores when one of them is positive, otherwise returns a negative score
- `NOT` returns 0 when score is positive, a negative score otherwise

As there will be non-scoring boolean expressions, we need to convert a `BooleanVector` to a `DoubleVector` so results can be correctly computed. A `BooleanToScoringExpressionEvaluator` class is used to wrap evaluators used as input on conditional expressions, so they convert booleans to constant scores. They are a passthrough in case a `DoubleVector` of scores is already provided.

As the boolean logic was inlined into EvalMapper, two classes have been created to run scoring and non-scoring logic: `BooleanLogicExpressionEvaluator` and `BooleanScoringLogicExpressionEvaluator`.

This allows combining scored queries with boolean conditions, which will be evaluated as constant queries.
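The combination rules described above can be sketched in plain Java as follows. This is an illustration under stated assumptions, not the PR's implementation: arrays stand in for `BooleanVector` and `DoubleVector`, every name is hypothetical, the sentinel is assumed to be `-1.0`, `true` is assumed to map to a constant score of `0.0`, a doc is treated as matching when its score is non-negative (mirroring the filter step), and a non-matching side of an `OR` is assumed to contribute nothing to the sum. `NOT` is left out of the sketch.

```java
import java.util.function.DoubleBinaryOperator;
import java.util.stream.IntStream;

// Illustrative stand-in only; none of these names come from the Elasticsearch code base.
final class ScoringLogicSketch {
    static final double NO_MATCH = -1.0;       // assumed negative sentinel
    static final double CONSTANT_SCORE = 0.0;  // assumed score for a plain boolean "true"

    // Assumption: a non-negative score means the doc matched.
    static boolean matches(double score) {
        return score >= 0.0;
    }

    // Converts non-scoring boolean results to scores so they can be combined.
    static double[] fromBooleans(boolean[] values) {
        double[] out = new double[values.length];
        for (int i = 0; i < values.length; i++) {
            out[i] = values[i] ? CONSTANT_SCORE : NO_MATCH;
        }
        return out;
    }

    // AND: sum when both sides match, otherwise no match.
    static double and(double left, double right) {
        return matches(left) && matches(right) ? left + right : NO_MATCH;
    }

    // OR: sum of the matching sides, no match only when neither side matches.
    static double or(double left, double right) {
        if (!matches(left) && !matches(right)) {
            return NO_MATCH;
        }
        return (matches(left) ? left : 0.0) + (matches(right) ? right : 0.0);
    }

    // Applies a scalar rule position by position, as a vector evaluator would.
    static double[] combine(double[] left, double[] right, DoubleBinaryOperator op) {
        double[] out = new double[left.length];
        for (int i = 0; i < left.length; i++) {
            out[i] = op.applyAsDouble(left[i], right[i]);
        }
        return out;
    }

    // Filter step: positions whose combined score marks a match.
    static int[] keptPositions(double[] scores) {
        return IntStream.range(0, scores.length).filter(i -> matches(scores[i])).toArray();
    }
}
```

For example, `combine(scores, fromBooleans(flags), ScoringLogicSketch::and)` mixes a scored full text condition with an ordinary boolean condition, and `keptPositions` then selects the rows a filter would retain.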