Support arithmetic operations for dense_vectors: scalar version#141060
Support arithmetic operations for dense_vectors: scalar version#141060kkharbas merged 13 commits intoelastic:mainfrom
Conversation
carlosdelest
left a comment
There was a problem hiding this comment.
Hey @kkharbas ! I don't think we need to have a
dense_vector + field
so we operate on each dense_vector field over each scalar field on the other column, but:
dense_vector + scalar constant
so we use the same scalar constant to operate on every dense_vector field row.
So, we could do something like:
EVAL my_negated_vectors = my_vector_field * -1
or
EVAL my_added_vectors = my_vector_field + 3.0
Right now I don't see the use case for operating on a row by row basis for scalars - and I think in any case using a single scalar constant would be a more frequent use case, and simpler to implement for now.
We may come back to this implementation, but I think for now we should reduce the scope and look for single numeric constants to be applied.
What do you think?
That makes total sense. I was just focussing on completeness of the operations but usability wise its fair to only support |
685c7a9 to
834ba7d
Compare
ℹ️ Important: Docs version tagging👋 Thanks for updating the docs! Just a friendly reminder that our docs are now cumulative. This means all 9.x versions are documented on the same page and published off of the main branch, instead of creating separate pages for each minor version. We use applies_to tags to mark version-specific features and changes. Expand for a quick overviewWhen to use applies_to tags:✅ At the page level to indicate which products/deployments the content applies to (mandatory) What NOT to do:❌ Don't remove or replace information that applies to an older version 🤔 Need help?
|
834ba7d to
f70c477
Compare
carlosdelest
left a comment
There was a problem hiding this comment.
I have a question on the approach - instead of dealing separately with scalar operands in a separate class, convert them via an ExpressionEvaluator to dense vectors so we can operate on them with the current infra.
...arch/xpack/esql/expression/predicate/operator/arithmetic/DenseVectorArithmeticOperation.java
Outdated
Show resolved
Hide resolved
...arch/xpack/esql/expression/predicate/operator/arithmetic/DenseVectorArithmeticOperation.java
Show resolved
Hide resolved
| * {@link EvalOperator.ExpressionEvaluator} implementation for performing arithmetic operations on lhs=double and rhs=dense_vector arguments. | ||
| * | ||
| */ | ||
| class DoubleDenseVectorOpEvaluator implements EvalOperator.ExpressionEvaluator { |
There was a problem hiding this comment.
Having a specific class for handling operating with doubles looks like something we should try to avoid.
I wonder about creating a ToFloat internal function for dealing with this, similar to other conversion functions like ToDouble. This would allow us to use ToFloat as part of the evaluator chain that the factory creates when we find a numeric type that we want to convert to floats, and then deal with doing float to float operations.
WDYT?
| * {@link EvalOperator.ExpressionEvaluator} implementation for performing arithmetic operations on lhs=dense_vector and rhs=double argument. | ||
| * | ||
| */ | ||
| class DenseVectorDoubleOpEvaluator implements EvalOperator.ExpressionEvaluator { |
There was a problem hiding this comment.
I wonder if we can avoid having a specific evaluator for doubles, and adding a specific DenseVectorBinaryEvaluator.
I was thinking on having a ExpressionEvaluator that would convert a numeric block into a dense_vector. That way, we could wrap the numeric block using this converter before it is processed by the DenseVectorsEvaluator, and thus reducing the problem to an operation on two dense vectors which we already have solved.
WDYT?
I had given this approach some consideration but ran into hiccups. I will take a fresh look again at this approach. |
I tried out this approach of using existing conversion evaluators, e.g. Few thoughts and questions on this,
|
|
Pinging @elastic/es-search-relevance (Team:Search Relevance) |
|
test this please |
|
Hi @kkharbas, I've created a changelog YAML for you. |
|
@carlosdelest the PR is ready for another look. Simplified the implementation based on offline discussion. Thank you |
carlosdelest
left a comment
There was a problem hiding this comment.
LGTM, great work @kkharbas !
Thanks for iterating on this one ❤️
...arch/xpack/esql/expression/predicate/operator/arithmetic/DenseVectorArithmeticOperation.java
Show resolved
Hide resolved
...icsearch/xpack/esql/expression/predicate/operator/arithmetic/DenseVectorScalarEvaluator.java
Outdated
Show resolved
Hide resolved
Adds support for arithmetic operations - '+', '-', '*' and '/' when one operands is a dense_vector and other is scalar. Performs the arithmetic operation on each element of the dense_vector against the scalar operand. Closes elastic#140538
2e234fe to
a6ee999
Compare
…tic#141060) Support arithmetic operations for dense_vectors: scalar version Adds support for arithmetic operations - '+', '-', '*' and '/' when one operands is a dense_vector and other is scalar. Performs the arithmetic operation on each element of the dense_vector against the scalar operand. Closes elastic#140538 --------- Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
…on-sliced-reindex * upstream/main: Update docs for v9.3.1 release (elastic#142887) Update docs for v9.2.6 release (elastic#142888) Improves visibility of vector index options and inference configuration (elastic#141653) Disable CAE in microsoft-graph-authz plugin (elastic#142848) Small improvements to `GetSnapshotsIT#testAllFeatures` (elastic#142825) Fix IndexSettingsTests synthetic ID tests (elastic#142654) [Test] Unmute tests of SnapshotShutdownIT (elastic#142921) Fixing metrics_info.json kibana definition file name (elastic#142813) [Packaging] Disable glibc 2.43 malloc huge pages in Wolfi images (elastic#142894) Mute org.elasticsearch.xpack.searchablesnapshots.SearchableSnapshotsTSDBSyntheticIdIntegTests testSearchableSnapshot elastic#142918 Add shard heap usage to ClusterInfo (elastic#139557) ESQL: Load script fields row-by-row (elastic#142807) ESQL: Consolidate doc values memory tracking (elastic#142816) ES-14124 Create Index Count Limit User documentation Page (elastic#142570) Add a es819 codec test to verify tryRead returns null if may contain duplicates (elastic#142409) Support arithmetic operations for dense_vectors: scalar version (elastic#141060) [Transform] Allow project_routing (elastic#142421) Refactor query rewrite async actions for knn and sparse_vector queries (elastic#142889) Do not mark bulk indexing requests as retried after primary relocations (elastic#142157)
…tic#141060) Support arithmetic operations for dense_vectors: scalar version Adds support for arithmetic operations - '+', '-', '*' and '/' when one operands is a dense_vector and other is scalar. Performs the arithmetic operation on each element of the dense_vector against the scalar operand. Closes elastic#140538 --------- Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
Adds support for arithmetic operations - '+', '-', '*' and '/' when one operand is a dense_vector and other is scalar.
Performs the arithmetic operation on each element of the dense_vector against the scalar operand. The scalar operand should be a constant value, scalar fields are not supported.
✔️
dense_vector + scalar constant❌
dense_vector + scalar fieldCloses #140538