ESQL: Block loader for pushing LENGTH #137217
Creates a `BlockLoader` for pushing the `LENGTH` function down into the loader for `keyword` fields. It takes advantage of the terms dictionary so we only need to calculate the code point count once per unique term loaded.

This `BlockLoader` implementation isn't plugged into the infrastructure for emitting it because we're waiting on the infrastructure we've started in elastic#137002. We'll make a follow-up PR to plug this in.

We're doing this mostly to demonstrate another function that we can push into field loading, in addition to the vector similarity functions we're building in elastic#137002. We don't expect `LENGTH` to be a super hot function. If it happens to be, then this'll help.

Before we plug this in we'll have to figure out how to emit warnings from functions that we've fused to field loading, because `LENGTH` can emit a warning, specifically when it hits a multivalued field.
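To illustrate the "once per unique term" idea, here is a minimal, self-contained sketch. It is not the `BlockLoader` from this PR and does not use any Elasticsearch or Lucene API: the class `LengthByOrdinalSketch`, the `byte[][]` stand-in for the terms dictionary, and the `int[]` of per-document ordinals are all hypothetical names chosen for illustration. It also assumes single-valued fields and ignores the warning question raised above.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class LengthByOrdinalSketch {

    /** Counts Unicode code points in UTF-8 bytes by counting non-continuation bytes. */
    static int codePointCount(byte[] utf8) {
        int count = 0;
        for (byte b : utf8) {
            // Continuation bytes look like 10xxxxxx; every other byte starts a code point.
            if ((b & 0xC0) != 0x80) {
                count++;
            }
        }
        return count;
    }

    /**
     * Returns LENGTH for every document, given each document's ordinal into a
     * (hypothetical) terms dictionary. Each unique term is measured at most once.
     */
    static int[] lengths(byte[][] termsDict, int[] docOrdinals) {
        int[] cache = new int[termsDict.length];
        Arrays.fill(cache, -1); // -1 marks "not computed yet"
        int[] result = new int[docOrdinals.length];
        for (int i = 0; i < docOrdinals.length; i++) {
            int ord = docOrdinals[i];
            if (cache[ord] < 0) {
                cache[ord] = codePointCount(termsDict[ord]);
            }
            result[i] = cache[ord];
        }
        return result;
    }

    public static void main(String[] args) {
        // Stand-in terms dictionary: unique terms, stored as UTF-8 bytes.
        byte[][] dict = {
            "a".getBytes(StandardCharsets.UTF_8),
            "h\u00e9llo".getBytes(StandardCharsets.UTF_8),
            "\u65e5\u672c\u8a9e".getBytes(StandardCharsets.UTF_8)
        };
        // Six documents that share three unique terms.
        int[] ords = {1, 0, 1, 2, 2, 1};
        System.out.println(Arrays.toString(lengths(dict, ords))); // prints [5, 1, 5, 3, 3, 5]
    }
}
```

With six documents and three unique terms, `codePointCount` runs three times rather than six. The saving grows with the ratio of documents to unique terms.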
Pinging @elastic/es-analytical-engine (Team:Analytics)
dnhatn left a comment
In cases with large indices and many ordinals - such as an index with 10M documents and 10K ordinals - it might be more efficient to look up ordinals in order. However, this isn't a big concern. This looks great. Thank you, Nik!
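One possible reading of the "look up ordinals in order" suggestion, sketched under assumptions: collect the distinct ordinals seen in a batch of documents, resolve them against the dictionary in ascending order, then fill in the per-document results. The class `SortedOrdinalLookupSketch` and the `String[]` stand-in for the terms dictionary are hypothetical and do not reflect the PR's code or any Lucene API.

```java
import java.util.Arrays;

public class SortedOrdinalLookupSketch {

    /**
     * Computes per-document lengths, visiting the (hypothetical) dictionary
     * in ascending ordinal order instead of document order.
     */
    static int[] lengthsInOrdinalOrder(String[] termsDict, int[] docOrdinals) {
        // Distinct ordinals in ascending order, so dictionary access is sequential.
        int[] distinctSorted = Arrays.stream(docOrdinals).distinct().sorted().toArray();

        // Dense ordinal -> length table; only entries for seen ordinals are filled.
        int[] lengthByOrd = new int[termsDict.length];
        for (int ord : distinctSorted) {
            String term = termsDict[ord];
            lengthByOrd[ord] = term.codePointCount(0, term.length());
        }

        // Second pass: map every document's ordinal to its precomputed length.
        int[] result = new int[docOrdinals.length];
        for (int i = 0; i < docOrdinals.length; i++) {
            result[i] = lengthByOrd[docOrdinals[i]];
        }
        return result;
    }

    public static void main(String[] args) {
        String[] dict = {"apple", "fig", "kiwi", "plum"};
        int[] ords = {3, 0, 3, 1, 0};
        System.out.println(Arrays.toString(lengthsInOrdinalOrder(dict, ords))); // prints [4, 5, 4, 3, 5]
    }
}
```

Whether the ordered pass pays off depends on how expensive out-of-order dictionary lookups are in the real terms dictionary, which this sketch does not model.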
I think what you are saying is "this is fine, but the