Fix Synthetic Source Handling for bit Type in dense_vector Field#114407
Fix Synthetic Source Handling for bit Type in dense_vector Field#114407benwtrent merged 10 commits intoelastic:mainfrom
bit Type in dense_vector Field#114407Conversation
… `index` is set to `false` (elastic#114402)
|
Could you please take a look at this PR? I would greatly appreciate your review and feedback. Best regards, |
|
Pinging @elastic/es-search-relevance (Team:Search Relevance) |
|
@Rassyan thank you for the PR, a couple of things:
|
|
Thank you for the feedback. I will start working on the changes immediately.
Could you please point me to specific examples of yaml tests that demonstrate this? Thank you again for your guidance. Best regards, Rassyan |
|
@Rassyan search for something like: In |
...-api-spec/src/yamlRestTest/resources/rest-api-spec/test/search.vectors/45_knn_search_bit.yml
Show resolved
Hide resolved
|
buildkite test this please |
benwtrent
left a comment
There was a problem hiding this comment.
I think this is looking good. Could we add some more testing lower down in the mapper testing? The yaml test and the fix looks good otherwise.
...-api-spec/src/yamlRestTest/resources/rest-api-spec/test/search.vectors/45_knn_search_bit.yml
Show resolved
Hide resolved
…cSourceSupport` for element_type bit
server/src/test/java/org/elasticsearch/index/mapper/vectors/DenseVectorFieldMapperTests.java
Outdated
Show resolved
Hide resolved
|
buildkite test this please |
…identify range check issue
server/src/test/java/org/elasticsearch/index/mapper/vectors/DenseVectorFieldMapperTests.java
Show resolved
Hide resolved
|
buildkite test this please |
benwtrent
left a comment
There was a problem hiding this comment.
Thank you for going through the additional testing. This fix is great :)
| List<String> order = Arrays.stream(params).map(p -> p.name).toList(); | ||
| Iterator<Map.Entry<String, Object>> iterator = fieldNode.entrySet().stream().sorted((e1, e2) -> { | ||
| int i1 = order.indexOf(e1.getKey()); | ||
| int i2 = order.indexOf(e2.getKey()); | ||
| return Integer.compare(i1, i2); | ||
| }).iterator(); | ||
| while (iterator.hasNext()) { |
There was a problem hiding this comment.
I am against this, let's remove to try and find a better fix if we still run into weird parsing ordering failures.
It would be better to update the dense mapper validations than to fundamentally change how all mappers iterate values.
There was a problem hiding this comment.
OK, I understand your concerns. I will move the commits related to the mapper bug to a separate PR for further discussion. This PR will remain focused solely on fixing the Synthetic Source issue. Does that sound good to you?
There was a problem hiding this comment.
I will move the commits related to the mapper bug to a separate PR for further discussion.
thank you
0f5ca00 to
2473b13
Compare
|
@Rassyan we should be able to update testing here like before without any of the parsing changes. Please revert back to your testing updates, including my updates to synthetic source so we can figure out the testing failures there. |
|
buildkite test this please |
|
@Rassyan I have updated the test, I verified it all passes ok, even for 1k dimensions and bits. I think this is good :). I will merge when green. |
|
buildkite test this please |
…lastic#114407) **Description:** This PR addresses the issue described in [elastic#114402](elastic#114402), where the `synthetic_source` feature does not correctly handle the `bit` type in `dense_vector` fields when `index` is set to `false`. The root cause of the issue was that the `bit` type was not properly accounted for, leading to an array that is 8 times the size of the actual `dims` value of docvalue. This mismatch will causes an array out-of-bounds exception when reconstructing the document. **Changes:** - Adjusted the `synthetic_source` logic to correctly handle the `bit` type by ensuring the array size accounts for the 8x difference in dimensions. - Added yaml test to cover the `bit` type scenario in `dense_vector` fields with `index` set to `false`. **Related Issues:** - Closes [elastic#114402](elastic#114402) - Introduced in [elastic#110059](elastic#110059)
💔 Backport failed
You can use sqren/backport to manually backport by running |
💚 All backports created successfully
Questions ?Please refer to the Backport tool documentation |
…lastic#114407) **Description:** This PR addresses the issue described in [elastic#114402](elastic#114402), where the `synthetic_source` feature does not correctly handle the `bit` type in `dense_vector` fields when `index` is set to `false`. The root cause of the issue was that the `bit` type was not properly accounted for, leading to an array that is 8 times the size of the actual `dims` value of docvalue. This mismatch will causes an array out-of-bounds exception when reconstructing the document. **Changes:** - Adjusted the `synthetic_source` logic to correctly handle the `bit` type by ensuring the array size accounts for the 8x difference in dimensions. - Added yaml test to cover the `bit` type scenario in `dense_vector` fields with `index` set to `false`. **Related Issues:** - Closes [elastic#114402](elastic#114402) - Introduced in [elastic#110059](elastic#110059) (cherry picked from commit 465c65c)
…114407) (#114756) **Description:** This PR addresses the issue described in [#114402](#114402), where the `synthetic_source` feature does not correctly handle the `bit` type in `dense_vector` fields when `index` is set to `false`. The root cause of the issue was that the `bit` type was not properly accounted for, leading to an array that is 8 times the size of the actual `dims` value of docvalue. This mismatch will causes an array out-of-bounds exception when reconstructing the document. **Changes:** - Adjusted the `synthetic_source` logic to correctly handle the `bit` type by ensuring the array size accounts for the 8x difference in dimensions. - Added yaml test to cover the `bit` type scenario in `dense_vector` fields with `index` set to `false`. **Related Issues:** - Closes [#114402](#114402) - Introduced in [#110059](#110059) Co-authored-by: Rassyan <yjkhngds@gmail.com>
…lastic#114407) **Description:** This PR addresses the issue described in [elastic#114402](elastic#114402), where the `synthetic_source` feature does not correctly handle the `bit` type in `dense_vector` fields when `index` is set to `false`. The root cause of the issue was that the `bit` type was not properly accounted for, leading to an array that is 8 times the size of the actual `dims` value of docvalue. This mismatch will causes an array out-of-bounds exception when reconstructing the document. **Changes:** - Adjusted the `synthetic_source` logic to correctly handle the `bit` type by ensuring the array size accounts for the 8x difference in dimensions. - Added yaml test to cover the `bit` type scenario in `dense_vector` fields with `index` set to `false`. **Related Issues:** - Closes [elastic#114402](elastic#114402) - Introduced in [elastic#110059](elastic#110059)
…114407) **Description:** This PR addresses the issue described in [#114402](#114402), where the `synthetic_source` feature does not correctly handle the `bit` type in `dense_vector` fields when `index` is set to `false`. The root cause of the issue was that the `bit` type was not properly accounted for, leading to an array that is 8 times the size of the actual `dims` value of docvalue. This mismatch will causes an array out-of-bounds exception when reconstructing the document. **Changes:** - Adjusted the `synthetic_source` logic to correctly handle the `bit` type by ensuring the array size accounts for the 8x difference in dimensions. - Added yaml test to cover the `bit` type scenario in `dense_vector` fields with `index` set to `false`. **Related Issues:** - Closes [#114402](#114402) - Introduced in [#110059](#110059)
…ld (#114407) (#114759) * Fix Synthetic Source Handling for `bit` Type in `dense_vector` Field (#114407) **Description:** This PR addresses the issue described in [#114402](#114402), where the `synthetic_source` feature does not correctly handle the `bit` type in `dense_vector` fields when `index` is set to `false`. The root cause of the issue was that the `bit` type was not properly accounted for, leading to an array that is 8 times the size of the actual `dims` value of docvalue. This mismatch will causes an array out-of-bounds exception when reconstructing the document. **Changes:** - Adjusted the `synthetic_source` logic to correctly handle the `bit` type by ensuring the array size accounts for the 8x difference in dimensions. - Added yaml test to cover the `bit` type scenario in `dense_vector` fields with `index` set to `false`. **Related Issues:** - Closes [#114402](#114402) - Introduced in [#110059](#110059) (cherry picked from commit 465c65c) * fixing backport of search capabilities * fixing license header * adding capabilities to RestSearchAction * fixing backport * spotless * muting teset for ccs * adding capabilities to the ccs test runner --------- Co-authored-by: Rassyan <yjkhngds@gmail.com>
…lastic#114407) **Description:** This PR addresses the issue described in [elastic#114402](elastic#114402), where the `synthetic_source` feature does not correctly handle the `bit` type in `dense_vector` fields when `index` is set to `false`. The root cause of the issue was that the `bit` type was not properly accounted for, leading to an array that is 8 times the size of the actual `dims` value of docvalue. This mismatch will causes an array out-of-bounds exception when reconstructing the document. **Changes:** - Adjusted the `synthetic_source` logic to correctly handle the `bit` type by ensuring the array size accounts for the 8x difference in dimensions. - Added yaml test to cover the `bit` type scenario in `dense_vector` fields with `index` set to `false`. **Related Issues:** - Closes [elastic#114402](elastic#114402) - Introduced in [elastic#110059](elastic#110059)
Description:
This PR addresses the issue described in #114402, where the
synthetic_sourcefeature does not correctly handle thebittype indense_vectorfields whenindexis set tofalse. The root cause of the issue was that thebittype was not properly accounted for, leading to an array that is 8 times the size of the actualdimsvalue of docvalue. This mismatch will causes an array out-of-bounds exception when reconstructing the document.Changes:
synthetic_sourcelogic to correctly handle thebittype by ensuring the array size accounts for the 8x difference in dimensions.bittype scenario indense_vectorfields withindexset tofalse.Related Issues: