ESQL: Begin optimizing Block#lookup #108482

Merged
nik9000 merged 3 commits into elastic:main from nik9000:esql_lookup_vector
May 10, 2024

@nik9000 nik9000 commented May 9, 2024

This creates the infrastructure to allow optimizing the lookup method when applied to Vectors and then implements that optimization for constant vectors. Constant vectors now take one of six paths:

  1. An empty `positions` Block yields an empty result set.
  2. If `positions` is a Block, perform the unoptimized lookup.
  3. If the `min` of the `positions` Vector is less than 0 then throw an exception.
  4. If the `min` of the `positions` Vector is greater than the number of positions in the lookup block then return a single `ConstantNullBlock` because you are looking up outside the range.
  5. If the `max` of the `positions` Vector is less than the number of positions in the lookup block then return a `Constant$Type$Block` with the same value as the lookup block. This is a lookup that's entirely within range.
  6. Otherwise perform the unoptimized lookup.

This is fairly simple but demonstrates how we can plug in more complex optimizations later.
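
The six paths above can be sketched in isolation as a decision function. This is a minimal, hypothetical model and not the code in this PR: `ConstantLookupSketch`, `Path`, and `choosePath` are invented names, positions are a plain `int[]`, and the `isVector` flag is an assumed stand-in for whether `positions` can be treated as a Vector. The comparisons follow the list as written.

```java
import java.util.Arrays;

// Hypothetical sketch: models only the choice of path, not the real Block/Vector classes.
final class ConstantLookupSketch {
    /** Illustrative names for the outcomes described in the list above. */
    enum Path { EMPTY, UNOPTIMIZED_BLOCK, ALL_NULL, ALL_CONSTANT, UNOPTIMIZED_RANGE }

    /**
     * @param positionCount number of positions in the constant vector being looked up
     * @param positions     the positions to load
     * @param isVector      assumed flag: true when positions can be treated as a Vector
     */
    static Path choosePath(int positionCount, int[] positions, boolean isVector) {
        if (positions.length == 0) {
            return Path.EMPTY;                 // path 1: nothing to look up
        }
        if (isVector == false) {
            return Path.UNOPTIMIZED_BLOCK;     // path 2: positions is a Block
        }
        int min = Arrays.stream(positions).min().getAsInt();
        int max = Arrays.stream(positions).max().getAsInt();
        if (min < 0) {
            // path 3: negative positions are invalid
            throw new IllegalArgumentException("invalid position [" + min + "]");
        }
        if (min > positionCount) {
            return Path.ALL_NULL;              // path 4: every position is out of range
        }
        if (max < positionCount) {
            return Path.ALL_CONSTANT;          // path 5: every position is in range
        }
        return Path.UNOPTIMIZED_RANGE;         // path 6: mixed, fall back
    }
}
```

In this model, paths 4 and 5 are the only ones that avoid walking the positions one by one; everything that straddles the end of the lookup block still takes the fallback.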

@nik9000 nik9000 requested a review from dnhatn May 9, 2024 18:48
@nik9000 nik9000 requested a review from a team as a code owner May 9, 2024 18:48
@elasticsearchmachine elasticsearchmachine added the Team:Analytics label May 9, 2024
@elasticsearchmachine
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

Member

@rjernst rjernst left a comment

Core lib change looks good to me, one small suggestion.


@Override
public T next() {
return null;
Member

We could `assert false` here? `next` should never be called in this case.

Member Author

Sounds good!
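
For illustration, the suggestion could look roughly like the sketch below. It uses plain `java.util.Iterator` rather than the iterator type touched in this PR, and `EmptyIteratorSketch` and `empty` are hypothetical names, so treat it as an assumption about the shape of the change rather than the merged code.

```java
import java.util.Iterator;

// Hypothetical sketch of an always-empty iterator that asserts if next() is called.
final class EmptyIteratorSketch {
    static <T> Iterator<T> empty() {
        return new Iterator<T>() {
            @Override
            public boolean hasNext() {
                return false; // there is never anything to return
            }

            @Override
            public T next() {
                // Only trips when assertions are enabled (java -ea); otherwise returns null.
                assert false : "hasNext is always false so next should never be called";
                return null;
            }
        };
    }
}
```

With assertions enabled, a caller that ignores `hasNext` fails fast in tests instead of silently receiving `null`.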

Member

@dnhatn dnhatn left a comment

LGTM. Thanks Nik!

@nik9000 nik9000 merged commit 04d3b99 into elastic:main May 10, 2024
elasticsearchmachine pushed a commit that referenced this pull request Jun 7, 2024
This adds support for `LOOKUP`, a command that implements a sort of
inline `ENRICH`, using data that is passed in the request:

```
$ curl -uelastic:password -HContent-Type:application/json -XPOST \
    'localhost:9200/_query?error_trace&pretty&format=txt' \
-d'{
    "query": "ROW a=1::LONG | LOOKUP t ON a",
    "tables": {
        "t": {
            "a:long":     [    1,     4,     2],
            "v1:integer": [   10,    11,    12],
            "v2:keyword": ["cat", "dog", "wow"]
        }
    },
    "version": "2024.04.01"
}'
      v1       |      v2       |       a       
---------------+---------------+---------------
10             |cat            |1
```

This required these PRs:

* #107624
* #107634
* #107701
* #107762
* #107923
* #107894
* #107982
* #108012
* #108020
* #108169
* #108191
* #108334
* #108482
* #108696
* #109040
* #109045

Closes #107306

Labels

:Analytics/ES|QL (AKA ESQL), >non-issue, Team:Analytics (Meta label for analytical engine team (ESQL/Aggs/Geo)), v8.15.0
