fix: correct decimal detection in range filtering issue#17247#17372
fix: correct decimal detection in range filtering issue#17247#17372DanielSLDev wants to merge 2 commits intoopensearch-project:mainfrom
Conversation
Previously, hasDecimalPart(parsedLow) and hasDecimalPart(parsedHigh) were used after converting values to Long, causing loss of decimal precision and incorrect range adjustments. Now, hasDecimalPart(rawLow) and hasDecimalPart(rawHigh) are used instead, ensuring proper detection before conversion and preventing off-by-one errors in range filters. Signed-off-by: Daniel Salas <daniel.salas@encora.com>
|
❕ Gradle check result for 3791e74: UNSTABLE Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure. |
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #17372 +/- ##
============================================
+ Coverage 72.34% 72.44% +0.10%
- Complexity 65480 65552 +72
============================================
Files 5291 5291
Lines 304313 304318 +5
Branches 44176 44176
============================================
+ Hits 220142 220453 +311
+ Misses 66133 65767 -366
- Partials 18038 18098 +60 ☔ View full report in Codecov by Sentry. |
|
Thanks for opening the PR @DanielSEncora Please add a UT under |
…ecimals validates the behavior of the StarTree index when it works with decimal values (both double and float types). testStarTreeDocValuesDecimalPrecision validates that the StarTree index maintains decimal precision, particularly with high precision values like 10.9999 for double and 99.999f for float. Signed-off-by: Daniel Salas <daniel.salas@encora.com>
|
Hi @expani, the requested UTs have been added! Let me know if any other changes need to be made. |
|
❌ Gradle check result for 264a58c: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
|
|
||
| public void testStarTreeDocValuesWithDecimals() throws IOException { | ||
| final List<DimensionFieldData> dimensionFieldDatum = List.of( | ||
| new DimensionFieldData("double_field", () -> 10.75, DimensionTypes.DOUBLE), |
There was a problem hiding this comment.
We can generate a random float having precision and change it to int_field and long_field essentially non-decimal numeric fields.
There was a problem hiding this comment.
Hi @expani , my apologies. Could you elaborate on what changes should be made to my contribution?
There was a problem hiding this comment.
@DanielSLDev I should be the one apologizing for not elaborating on the issue before :)
Here it goes :
Lets consider the below aggregation with a filter query
curl -XPOST -H'Content-type: application/json' -kv localhost:9200/logs/_search -d '{
"size": 0,
"query": {
"term": {
"port": 9200
}
},
"aggs": {
"sum_request_size": {
"sum": {
"field": "size"
}
}
}
}'
In this case, port is an integer field and is being passed as an integer by the user. All good here.
However, OpenSearch also allows user to pass floating point number instead of integer in the search query. In such cases, we round off to an integer before executing the query.
curl -XPOST -H'Content-type: application/json' -kv localhost:9200/logs/_search -d '{
"size": 0,
"query": {
"term": {
"port": 9200.65
}
},
"aggs": {
"sum_request_size": {
"sum": {
"field": "size"
}
}
}
}'
We were not aware of this feature which caused the bug of not rounding off the user input when Star Tree gets used for the aggregation.
This is applicable for all non-decimal fields like integer, long, short, byte; etc.
So, I suggested the change of converting
new DimensionFieldData("double_field", () -> 10.75, DimensionTypes.DOUBLE),
new DimensionFieldData("float_field", () -> 5.25f, DimensionTypes.FLOAT)
to
new DimensionFieldData("integer_field", () -> randomFloat(), DimensionTypes.INTEGER),
new DimensionFieldData("long_field", () -> randomFloat(), DimensionTypes.LONG)
Also, randomFloat would make the test cover more cases than hardcoding to a specific value.
|
Hi @DanielSLDev, I see that you have reverted the fix in your latest commit - You can proceed with fix, I have reverted the fix in my PR - link |
|
This PR is stalled because it has been open for 30 days with no activity. |
|
@Shailesh-Kumar-Singh @DanielSLDev - trying to figure out if we are trying to do anything in this PR as I only see unit tests in the code changes. |
Description
Previously, hasDecimalPart(parsedLow) and hasDecimalPart(parsedHigh) were used after converting values to Long, causing loss of decimal precision and incorrect range adjustments.
Now, hasDecimalPart(rawLow) and hasDecimalPart(rawHigh) are used instead, ensuring proper detection before conversion and preventing off-by-one errors in range filters.
Related Issues
Resolves #17247
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.