ESQL: Add `ExpressionEvaluator` for Lucene `Query` by nik9000 · Pull Request #111157 · elastic/elasticsearch

nik9000 · 2024-07-22T13:31:22Z

I was talking with @ioanatia on Friday about building an `ExpressionEvaluator` that could run a Lucene `Query` during the compute engine's normal runtime. It sounded fun so I took a crack at it. It's not finished or plugged in, but I think something like this would be useful to build on.

The idea here is that, for stuff like "this text field matches this string" AKA `WHERE title MATCH "harry potter"`, we push it to Lucene where possible, but with this handy tool we don't *have* to. That lines up better with the way ESQL works in general. It makes planning simpler if you can fall back on "doing it at runtime".

Now, running a Lucene query at runtime isn't ideal. In the worst case we're running a `MatchAll` query to iterate everything and then running this query, block by block.
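To make the "fall back to runtime" idea concrete, here is a minimal sketch in plain Java. It is not the code from this PR and uses no Lucene or compute engine classes: `RuntimeMatchSketch`, `evalBlock`, and the `IntPredicate` standing in for a query are all hypothetical names chosen for illustration. It only shows the shape of the work: given one block of doc ids, produce one match/no-match value per position.

```java
import java.util.Arrays;
import java.util.function.IntPredicate;

// Hypothetical sketch, not the PR's code: models "run the query at runtime"
// as a per-document predicate applied to one block of doc ids.
class RuntimeMatchSketch {

    /** Returns one boolean per position in the block: true where the doc matches. */
    static boolean[] evalBlock(int[] docIds, IntPredicate matches) {
        boolean[] result = new boolean[docIds.length];
        for (int i = 0; i < docIds.length; i++) {
            result[i] = matches.test(docIds[i]);
        }
        return result;
    }

    public static void main(String[] args) {
        // Stand-in for a real query: pretend odd doc ids match.
        IntPredicate matches = doc -> doc % 2 == 1;
        boolean[] mask = evalBlock(new int[] {3, 7, 8}, matches);
        System.out.println(Arrays.toString(mask)); // prints [true, true, false]
    }
}
```

In this toy model the cost concern from the description is easy to see: every doc id in every block is tested, whereas a pushed-down query would only ever produce the matching docs.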


elasticsearchmachine · 2024-07-22T13:31:52Z

Pinging @elastic/es-analytical-engine (Team:Analytics)

nik9000 · 2024-07-22T16:01:18Z

...l/compute/src/main/java/org/elasticsearch/compute/lucene/LuceneQueryExpressionEvaluator.java

+ * {@link LuceneSourceOperator} or the like, but sometimes this isn't possible. So
+ * this evaluator is here to save the day.
+ */
+public class LuceneQueryExpressionEvaluator implements EvalOperator.ExpressionEvaluator {


I'm having this just check for matching rather than scoring. I think scoring is something we'd want to think about later. Probably similar to this code though.


ChrisHegarty

LGTM

ChrisHegarty · 2024-07-25T08:34:59Z

...l/compute/src/main/java/org/elasticsearch/compute/lucene/LuceneQueryExpressionEvaluator.java

+    /**
+     * Collects matching information for dense range of doc ids. This assumes that
+     * doc ids are sent to {@link LeafCollector#collect(int)} in ascending order
+     * which isn't documented, but @jpountz swears is true.


... then it MUST be true :-)
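For readers wondering why the ascending-order assumption matters, here is a rough sketch of a collector that fills a dense range in a single forward pass. It is plain Java and deliberately does not implement Lucene's `LeafCollector`; the class name `DenseRangeCollectorSketch` and its methods are hypothetical and are not taken from the PR. Because the output is written strictly left to right, a doc id arriving out of order could not be recorded, so the sketch rejects it.

```java
import java.util.Arrays;

// Hypothetical sketch, not the PR's code: records match/no-match for every
// doc id in [min, max], assuming collect(doc) is called in ascending order.
class DenseRangeCollectorSketch {
    private final int min;
    private final int max;
    private final boolean[] out;
    private int next; // lowest doc id whose result has not been written yet

    DenseRangeCollectorSketch(int min, int max) {
        this.min = min;
        this.max = max;
        this.out = new boolean[max - min + 1];
        this.next = min;
    }

    /** Called once per matching doc id, in ascending order. */
    void collect(int doc) {
        if (doc < next || doc > max) {
            throw new IllegalArgumentException("doc " + doc + " is out of order or out of range");
        }
        // Every doc id skipped since the last call did not match.
        while (next < doc) {
            out[next - min] = false;
            next++;
        }
        out[next - min] = true;
        next++;
    }

    /** Marks any remaining doc ids as non-matching and returns the dense result. */
    boolean[] finish() {
        while (next <= max) {
            out[next - min] = false;
            next++;
        }
        return out;
    }

    public static void main(String[] args) {
        DenseRangeCollectorSketch c = new DenseRangeCollectorSketch(10, 15);
        c.collect(11);
        c.collect(14);
        System.out.println(Arrays.toString(c.finish()));
        // prints [false, true, false, false, true, false]
    }
}
```

The explicit `false` writes are redundant for a Java `boolean[]`, but they mirror what an append-only builder would have to do, which is where the ordering requirement comes from.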

ChrisHegarty · 2024-07-25T08:37:06Z

...lugin/esql/compute/src/test/java/org/elasticsearch/compute/operator/ShuffleDocsOperator.java

+
+import static org.apache.lucene.tests.util.LuceneTestCase.random;
+
+public class ShuffleDocsOperator extends AbstractPageMappingOperator {


ChrisHegarty · 2024-07-25T08:46:19Z

...l/compute/src/main/java/org/elasticsearch/compute/lucene/LuceneQueryExpressionEvaluator.java

+ * {@link LuceneSourceOperator} or the like, but sometimes this isn't possible. So
+ * this evaluator is here to save the day.
+ */
+public class LuceneQueryExpressionEvaluator implements EvalOperator.ExpressionEvaluator {


Match or No Match is incredibly useful, as is. Scoring can come later and separately.

ioanatia

Great work - you took a half-baked idea and made it into something concrete.
I'd like to get this in because my goal is to try and use this with the match operator once we have the match operator PR merged.
Nothing at the moment is using `LuceneQueryExpressionEvaluator` - I see no harm in merging this PR, especially since we are planning to use it later on.

nik9000 · 2024-07-25T12:27:47Z

> I'd like to get this in because my goal is to try and use this with the match operator once we have the match operator PR merged.

OK! I'll get CI happy and get this in today.

nik9000 · 2024-07-25T14:01:52Z

> OK! I'll get CI happy and get this in today.

Enjoy!

I don't envy the planning work on this one. I don't know precisely how to make it go, but it's got something to do with making sure we run this before the exchange, and before dropping `_doc`.

nik9000 added >non-issue :Analytics/ES|QL AKA ESQL v8.16.0 labels Jul 22, 2024

nik9000 requested a review from ioanatia July 22, 2024 13:31

elasticsearchmachine added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label Jul 22, 2024

nik9000 added 4 commits July 22, 2024 09:33

Doc

2ae6550

Docs

e8e8903

Merge branch 'main' into esql_lucene_runtime

d52c5d0

Shuffle ok

5d25be9

nik9000 commented Jul 22, 2024


Dos

3abdf35

nik9000 mentioned this pull request Jul 22, 2024

Search in ES|QL: Add MATCH operator #110971

Merged

nik9000 added 2 commits July 24, 2024 08:18

Merge branch 'main' into esql_lucene_runtime

782999f

Docs

99fd266

ChrisHegarty approved these changes Jul 25, 2024


ioanatia approved these changes Jul 25, 2024


Finish nocommit

b3e8c89

nik9000 added the auto-merge-without-approval Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) label Jul 25, 2024

elasticsearchmachine merged commit 4c6f37d into elastic:main Jul 25, 2024

nik9000 deleted the esql_lucene_runtime branch July 25, 2024 13:47

ioanatia mentioned this pull request Oct 2, 2024

Compute engine evaluation for full text functions #113938

Closed

elasticsearchmachine merged 9 commits into elastic:main from nik9000:esql_lucene_runtime
