Fix flaky TPC-H Q1 test due to bugs in `MatcherUtils.closeTo()` by LantaoJin · Pull Request #5283 · opensearch-project/sql

LantaoJin · 2026-03-30T03:09:57Z

Description

Fixes intermittent failures in CalcitePPLTpchIT.testQ1 and CalcitePPLTpchPaginatingIT.testQ1 caused by two bugs in MatcherUtils.closeTo():

Incorrect element pairing via indexOf: The old code used actualValues.indexOf(v) to find each element's position, then looked up the corresponding expected value. indexOf returns the index of the first occurrence matching via equals(), so duplicate values would silently compare against the wrong expected element. This also meant no size mismatch was ever detected.
Made closeTo() ULP-aware to handle floating-point precision differences.

Related Issues

Resolves #5282

Check List

New functionality includes testing.
New functionality has been documented.
New functionality has javadoc added.
New functionality has a user manual doc added.
New PPL command checklist all confirmed.
API changes companion pull request created.
Commits are signed per the DCO using --signoff or -s.
Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Lantao Jin <ltjin@amazon.com>

github-actions · 2026-03-30T03:11:01Z

PR Reviewer Guide 🔍

(Review updated until commit `9357beb`)

Here are some key observations to aid the review process:

🧪 No relevant tests
🔒 No security concerns identified
✅ No TODO sections
🔀 No multiple PR themes
⚡ Recommended focus areas for review Null Safety In the new loop, if either `actual` or `expected` is null, calling `actual.equals(expected)` will throw a NullPointerException. The original code had the same issue, but the new code should handle null values explicitly (e.g., using `Objects.equals(actual, expected)` for the non-numeric branch). for (int i = 0; i < actualValues.size(); i++) { Object actual = actualValues.get(i); Object expected = expectedValues.get(i); if (actual instanceof Number && expected instanceof Number) { if (!valuesAreClose((Number) actual, (Number) expected)) { return false; } } else if (!actual.equals(expected)) { return false; } Missing Approx The PR description mentions adding an `Approx` wrapper with an `approx(value, error)` factory method for per-value custom tolerance, but this feature does not appear in the diff. If this was intended to be part of the fix, it is missing from the implementation. public static TypeSafeMatcher<JSONArray> closeTo(Object... values) { return new TypeSafeMatcher<JSONArray>() { @Override protected boolean matchesSafely(JSONArray item) { List<Object> expectedValues = new ArrayList<>(Arrays.asList(values)); List<Object> actualValues = new ArrayList<>(); item.iterator().forEachRemaining(v -> actualValues.add((Object) v)); if (actualValues.size() != expectedValues.size()) { return false; } for (int i = 0; i < actualValues.size(); i++) { Object actual = actualValues.get(i); Object expected = expectedValues.get(i); if (actual instanceof Number && expected instanceof Number) { if (!valuesAreClose((Number) actual, (Number) expected)) { return false; } } else if (!actual.equals(expected)) { return false; } } return true; } @Override public void describeTo(Description description) { description.appendText(Arrays.toString(values)); } /** * ULP-aware comparison: tolerates up to {@link #ULP_TOLERANCE_FACTOR} ULPs or {@link * #ABSOLUTE_TOLERANCE}, whichever is larger. / private boolean valuesAreClose(Number v1, Number v2) { double d1 = v1.doubleValue(); double d2 = v2.doubleValue(); double diff = Math.abs(d1 - d2); double ulpTolerance = ULP_TOLERANCE_FACTOR Math.max(Math.ulp(d1), Math.ulp(d2)); return diff <= Math.max(ABSOLUTE_TOLERANCE, ulpTolerance); } }; }

github-actions · 2026-03-30T03:11:17Z

PR Code Suggestions ✨

Latest suggestions up to 9357beb
Explore these optional code suggestions:

Category	Suggestion	Impact
Possible issue	Guard against NaN and Infinity inputs When either `d1` or `d2` is `0.0`, `Math.ulp(0.0)` returns the smallest positive double (`~5e-324`), making `ulpTolerance` effectively zero and falling back only to `ABSOLUTE_TOLERANCE`. However, if one value is zero and the other is very small but non-zero, the ULP of the non-zero value could be much larger. Consider using `Math.max` of both ULPs to handle near-zero comparisons, but also guard against `NaN` or `Infinity` inputs which would cause `ulp` to return `NaN` or `Infinity`. integ-test/src/test/java/org/opensearch/sql/util/MatcherUtils.java [347] +if (Double.isNaN(d1) \|\| Double.isNaN(d2) \|\| Double.isInfinite(d1) \|\| Double.isInfinite(d2)) { + return Double.compare(d1, d2) == 0; +} double ulpTolerance = ULP_TOLERANCE_FACTOR * Math.max(Math.ulp(d1), Math.ulp(d2)); Suggestion importance[1-10]: 4 __ Why: While guarding against `NaN` and `Infinity` is a valid concern, this is a test utility class where such edge cases are unlikely to occur in practice. The suggestion adds defensive code that may not be needed for the typical use case of comparing numeric test results.	Low
Possible issue	Handle null elements to prevent NullPointerException If any element in `actualValues` or `expectedValues` is `null`, calling `actual.equals(expected)` or casting to `Number` will throw a `NullPointerException`. Add null checks before performing the comparison to avoid unexpected test failures. integ-test/src/test/java/org/opensearch/sql/util/MatcherUtils.java [316-330] item.iterator().forEachRemaining(v -> actualValues.add((Object) v)); if (actualValues.size() != expectedValues.size()) { return false; } for (int i = 0; i < actualValues.size(); i++) { Object actual = actualValues.get(i); Object expected = expectedValues.get(i); + if (actual == null \|\| expected == null) { + if (actual != expected) return false; + continue; + } if (actual instanceof Number && expected instanceof Number) { if (!valuesAreClose((Number) actual, (Number) expected)) { return false; } } else if (!actual.equals(expected)) { return false; } } Suggestion importance[1-10]: 4 __ Why: Adding null checks is a valid defensive programming practice, but in this test utility context, null values in `JSONArray` comparisons are an edge case. The `improved_code` accurately reflects the suggested change, but the impact is limited for a test helper class.	Low

Previous suggestions

Suggestions up to commit 3743dc5

Category	Suggestion	Impact
Possible issue	Fix boundary condition in approximate comparison The `Approx` comparison uses a strict greater-than (`>`) check, which means a difference exactly equal to the error is considered a failure. For consistency with `valuesAreClose` (which uses `<=`), the check should use `>=` or the condition should be inverted to `<= error` for the passing case. integ-test/src/test/java/org/opensearch/sql/util/MatcherUtils.java [333-336] if (!(actual instanceof Number) \|\| Math.abs(((Number) actual).doubleValue() - ((Approx) expected).value) - > ((Approx) expected).error) { + >= ((Approx) expected).error) { Suggestion importance[1-10]: 4 __ Why: The suggestion points out a real but minor boundary inconsistency: `valuesAreClose` uses `<=` (passes when difference equals error), while `Approx` uses `>` (fails when difference equals error). In practice, exact equality at the boundary is extremely unlikely with floating-point values, making this a low-impact edge case.	Low
General	Explicitly mark captured local variable as final The variable `defaultError` is declared as a local variable inside the `closeTo` method but is only used inside the anonymous inner class `valuesAreClose`. While this works due to Java's effectively-final capture, it would be clearer and safer to ensure it is explicitly declared as `final` to make the intent explicit and prevent accidental reassignment. integ-test/src/test/java/org/opensearch/sql/util/MatcherUtils.java [319] +final double defaultError = 1e-10; - Suggestion importance[1-10]: 1 __ Why: The `existing_code` and `improved_code` are identical, meaning no actual change is proposed. The variable `defaultError` is already effectively final, so this suggestion offers no functional improvement.	Low

songkant-aws · 2026-03-30T07:04:04Z

integ-test/src/test/java/org/opensearch/sql/calcite/tpch/CalcitePPLTpchIT.java

-            isPushdownDisabled() ? 35676192.096999995 : 35676192.097,
+            isPushdownDisabled()
+                ? 35676192.096999995
+                : approx(35676192.097, 1e-6), // workaround for arm64


The approx(35676192.097, 1e-6) change in CalcitePPLTpchIT.testQ1() looks too targeted for #5282. The issue shows at least two observed arm64 failures in IEEE-754 double standard:

35676192.096999995 vs 35676192.097 in CalcitePPLTpchIT.testQ1()

1041301.0700000001 vs 1041301.07 (and also 999060.8979999999 vs 999060.898) in CalcitePPLTpchPaginatingIT.testQ1()

Would it make sense to change the default numeric comparison in closeTo() to something ULP-aware instead of hardcoding a single absolute threshold? For example, something along the lines of:

abs(actual - expected) <= max(1e-10, k * max(Math.ulp(actual), Math.ulp(expected))), we could start with k = 4, aka 4 ULP error tolerance. Though the failed case has around 1 ULP error, we'd better set a conservative threshold.

Signed-off-by: Lantao Jin <ltjin@amazon.com>

github-actions · 2026-03-30T07:53:35Z

Persistent review updated to latest commit 9357beb

* Fix the flaky tpch Q1 Signed-off-by: Lantao Jin <ltjin@amazon.com> * Change to ULP-aware to handle floating-point precision differences Signed-off-by: Lantao Jin <ltjin@amazon.com> --------- Signed-off-by: Lantao Jin <ltjin@amazon.com> (cherry picked from commit 246ed0d) Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

… (#5284) * Fix the flaky tpch Q1 * Change to ULP-aware to handle floating-point precision differences --------- (cherry picked from commit 246ed0d) Signed-off-by: Lantao Jin <ltjin@amazon.com> Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

* Init CLAUDE.md (#5259) Signed-off-by: Heng Qian <qianheng@amazon.com> * Add label to exempt specific PRs from stalled labeling (#5263) * Implement `reverse` performance optimization (#4775) Co-authored-by: Jialiang Liang <jiallian@amazon.com> * Add songkant-aws as maintainer (#5244) * Move some maintainers from active to Emeritus (#5260) * Move inactive current maintainers to Emeritus Signed-off-by: Lantao Jin <ltjin@amazon.com> * Remove affiliation column for emeritus maintainers Signed-off-by: Lantao Jin <ltjin@amazon.com> * formatted Signed-off-by: Lantao Jin <ltjin@amazon.com> * Fix formatting in MAINTAINERS.md Signed-off-by: Simeon Widdis <sawiddis@gmail.com> Signed-off-by: Simeon Widdis <sawiddis@gmail.com> --------- Signed-off-by: Lantao Jin <ltjin@amazon.com> Signed-off-by: Simeon Widdis <sawiddis@gmail.com> Co-authored-by: Simeon Widdis <sawiddis@gmail.com> * Add query cancellation support via _tasks/_cancel API for PPL queries (#5254) * Add query cancellation support via _tasks/_cancel API for PPL queries Signed-off-by: Sunil Ramchandra Pawar <pawar_sr@apple.com> * Refactor PPL query cancellation to cooperative model and other PR suggestions. Signed-off-by: Sunil Ramchandra Pawar <pawar_sr@apple.com> --------- Signed-off-by: Sunil Ramchandra Pawar <pawar_sr@apple.com> * Add Calcite native SQL planning in UnifiedQueryPlanner (#5257) * feat(api): Add Calcite native SQL planning path in UnifiedQueryPlanner Add SQL support to the unified query API using Calcite's native parser pipeline (SqlParser → SqlValidator → SqlToRelConverter → RelNode), bypassing the ANTLR parser used by PPL. Changes: - UnifiedQueryPlanner: use PlanningStrategy to dispatch CalciteNativeStrategy vs CustomVisitorStrategy - CalciteNativeStrategy: Calcite Planner with try-with-resources for ANSI SQL - CustomVisitorStrategy: ANTLR-based path for PPL (and future SQL V2) - UnifiedQueryContext: SqlParser.Config with Casing.UNCHANGED to preserve lowercase OpenSearch index names Signed-off-by: Chen Dai <daichen@amazon.com> * test(api): Add SQL planner tests and refactor test base for multi-language support - Refactor UnifiedQueryTestBase with queryType() hook for subclass override - Add UnifiedSqlQueryPlannerTest covering SELECT, WHERE, GROUP BY, JOIN, ORDER BY, subquery, case sensitivity, namespaces, and error handling - Update UnifiedQueryContextTest to verify SQL context creation Signed-off-by: Chen Dai <daichen@amazon.com> * perf(benchmarks): Add SQL queries to UnifiedQueryBenchmark Add language (PPL/SQL) and queryPattern param dimensions for side-by-side comparison of equivalent queries across both languages. Remove separate UnifiedSqlQueryBenchmark in favor of unified class. Signed-off-by: Chen Dai <daichen@amazon.com> * docs(api): Update README to reflect SQL support in UnifiedQueryPlanner Signed-off-by: Chen Dai <daichen@amazon.com> * fix(api): Normalize trailing whitespace in assertPlan comparison RelOptUtil.toString() appends a trailing newline after the last plan node, which doesn't match Java text block expectations. Also add \r\n normalization for Windows CI compatibility, consistent with the existing pattern in core module tests. Signed-off-by: Chen Dai <daichen@amazon.com> --------- Signed-off-by: Chen Dai <daichen@amazon.com> * [Feature] Support graphLookup with literal value as its start (#5253) * [Feature] Support graphLookup as top-level PPL command (#5243) Add support for graphLookup as the first command in a PPL query with literal start values, instead of requiring piped input from source=. Syntax: graphLookup table start="value" edge=from-->to as output graphLookup table start=("v1", "v2") edge=from-->to as output Signed-off-by: Heng Qian <qianheng@amazon.com> * Spotless check Signed-off-by: Heng Qian <qianheng@amazon.com> * Ignore child pipe if using start value Signed-off-by: Heng Qian <qianheng@amazon.com> * Add graphLookup integration tests per PPL command checklist - Add explain plan tests in CalciteExplainIT with YAML assertions - Add v2-unsupported tests in NewAddedCommandsIT - Add CalcitePPLGraphLookupIT to CalciteNoPushdownIT suite - Skip graphLookup tests when pushdown is disabled (required by impl) - Add expected plan YAML files for piped and top-level graphLookup Signed-off-by: Heng Qian <qianheng@amazon.com> * Remove brace of start value list Signed-off-by: Heng Qian <qianheng@amazon.com> --------- Signed-off-by: Heng Qian <qianheng@amazon.com> * Apply docs website feedback to ppl functions (#5207) * apply doc website feedback to ppl functions Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com> * take out comments Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com> * fix json_append example Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com> * fix json_append example Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com> * fix links Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com> --------- Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com> Signed-off-by: ritvibhatt <53196324+ritvibhatt@users.noreply.github.com> * feat(api): Add profiling support to unified query API (#5268) Add query profiling infrastructure that measures time spent in each query phase (analyze, optimize, execute, format). Profiling is opt-in via UnifiedQueryContext.builder().profiling(true) and uses thread-local context to avoid passing profiling state through every method. Key changes: - QueryProfiling/ProfileContext for thread-local profiling lifecycle - UnifiedQueryContext.measure() API for timing arbitrary phases - Auto-profiling in UnifiedQueryPlanner (analyze) and compiler (optimize) - UnifiedQueryTestBase shared test fixture for unified query tests - Comprehensive profiling tests with non-flaky >= 0 timing assertions Signed-off-by: Chen Dai <daichen@amazon.com> * Add UnifiedQueryParser with language-specific implementations (#5274) Extract parsing logic from UnifiedQueryPlanner into a UnifiedQueryParser interface with language-specific implementations: PPLQueryParser (returns UnresolvedPlan) and CalciteSqlQueryParser (returns SqlNode). UnifiedQueryContext owns the parser instance, created eagerly by the builder which has direct access to query type and future SQL config. Each implementation receives only its required dependencies: PPLQueryParser takes Settings, CalciteSqlQueryParser takes CalcitePlanContext. UnifiedQueryPlanner.CustomVisitorStrategy now obtains the parser from the context via the interface type. Signed-off-by: Chen Dai <daichen@amazon.com> * Fix flaky TPC-H Q1 test due to bugs in `MatcherUtils.closeTo()` (#5283) * Fix the flaky tpch Q1 Signed-off-by: Lantao Jin <ltjin@amazon.com> * Change to ULP-aware to handle floating-point precision differences Signed-off-by: Lantao Jin <ltjin@amazon.com> --------- Signed-off-by: Lantao Jin <ltjin@amazon.com> --------- Signed-off-by: Heng Qian <qianheng@amazon.com> Signed-off-by: Lantao Jin <ltjin@amazon.com> Signed-off-by: Simeon Widdis <sawiddis@gmail.com> Signed-off-by: Sunil Ramchandra Pawar <pawar_sr@apple.com> Signed-off-by: Chen Dai <daichen@amazon.com> Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com> Signed-off-by: ritvibhatt <53196324+ritvibhatt@users.noreply.github.com> Signed-off-by: Kai Huang <ahkcs@amazon.com> Co-authored-by: qianheng <qianheng@amazon.com> Co-authored-by: Simeon Widdis <sawiddis@gmail.com> Co-authored-by: Jialiang Liang <jiallian@amazon.com> Co-authored-by: Lantao Jin <ltjin@amazon.com> Co-authored-by: Sunil Ramchandra Pawar <pawar_sr@apple.com> Co-authored-by: Chen Dai <daichen@amazon.com> Co-authored-by: ritvibhatt <53196324+ritvibhatt@users.noreply.github.com>

Fix the flaky tpch Q1

3743dc5

Signed-off-by: Lantao Jin <ltjin@amazon.com>

LantaoJin added bug Something isn't working flaky-test Flaky build or test issue backport 3.6 bugFix testing Related to improving software testing and removed bug Something isn't working labels Mar 30, 2026

LantaoJin marked this pull request as ready for review March 30, 2026 03:42

LantaoJin requested review from RyanL1997, Swiddis, acarbonetto, ahkcs, anirudha, dai-chen, joshuali925, mengweieric, noCharger, penghuo, ps48, qianheng-aws, songkant-aws, vamsimanohar, ykmr1224 and yuancu as code owners March 30, 2026 03:42

songkant-aws reviewed Mar 30, 2026

View reviewed changes

Change to ULP-aware to handle floating-point precision differences

9357beb

Signed-off-by: Lantao Jin <ltjin@amazon.com>

songkant-aws approved these changes Mar 30, 2026

View reviewed changes

yuancu approved these changes Mar 30, 2026

View reviewed changes

yuancu merged commit 246ed0d into opensearch-project:main Mar 30, 2026
40 checks passed

opensearch-trigger-bot bot mentioned this pull request Mar 30, 2026

[Backport 3.6] Fix flaky TPC-H Q1 test due to bugs in MatcherUtils.closeTo() #5284

Merged

LantaoJin deleted the pr/issues/5282 branch March 30, 2026 09:00

github-actions bot mentioned this pull request Apr 4, 2026

[AUTO] Add release notes for 3.6.0 #5314

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix flaky TPC-H Q1 test due to bugs in `MatcherUtils.closeTo()`#5283

Fix flaky TPC-H Q1 test due to bugs in `MatcherUtils.closeTo()`#5283
yuancu merged 2 commits intoopensearch-project:mainfrom
LantaoJin:pr/issues/5282

LantaoJin commented Mar 30, 2026 •

edited

Loading

Uh oh!

github-actions bot commented Mar 30, 2026 •

edited

Loading

Uh oh!

github-actions bot commented Mar 30, 2026 •

edited

Loading

Uh oh!

songkant-aws Mar 30, 2026 •

edited

Loading

Uh oh!

LantaoJin Mar 30, 2026

Uh oh!

github-actions bot commented Mar 30, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

LantaoJin commented Mar 30, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Related Issues

Check List

Uh oh!

github-actions bot commented Mar 30, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Reviewer Guide 🔍

(Review updated until commit 9357beb)

Uh oh!

github-actions bot commented Mar 30, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Code Suggestions ✨

Previous suggestions

Uh oh!

songkant-aws Mar 30, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

LantaoJin Mar 30, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions bot commented Mar 30, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

LantaoJin commented Mar 30, 2026 •

edited

Loading

github-actions bot commented Mar 30, 2026 •

edited

Loading

(Review updated until commit `9357beb`)

github-actions bot commented Mar 30, 2026 •

edited

Loading

songkant-aws Mar 30, 2026 •

edited

Loading