Add unified query parser API #5274
Merged
dai-chen merged 1 commit into opensearch-project:main on Mar 27, 2026
penghuo previously approved these changes on Mar 27, 2026
Extract parsing logic from UnifiedQueryPlanner into a UnifiedQueryParser interface with language-specific implementations: PPLQueryParser (returns UnresolvedPlan) and CalciteSqlQueryParser (returns SqlNode).

UnifiedQueryContext owns the parser instance, created eagerly by the builder which has direct access to query type and future SQL config. Each implementation receives only its required dependencies: PPLQueryParser takes Settings, CalciteSqlQueryParser takes CalcitePlanContext. UnifiedQueryPlanner.CustomVisitorStrategy now obtains the parser from the context via the interface type.

Signed-off-by: Chen Dai <daichen@amazon.com>
ahkcs approved these changes on Mar 27, 2026
Collaborator, Author: @penghuo Addressed conflicts. Please approve again. Thanks!
mengweieric approved these changes on Mar 27, 2026
This was referenced Mar 30, 2026
ahkcs added a commit that referenced this pull request on Mar 30, 2026
* Init CLAUDE.md (#5259)
  Signed-off-by: Heng Qian <qianheng@amazon.com>
* Add label to exempt specific PRs from stalled labeling (#5263)
* Implement `reverse` performance optimization (#4775)
  Co-authored-by: Jialiang Liang <jiallian@amazon.com>
* Add songkant-aws as maintainer (#5244)
* Move some maintainers from active to Emeritus (#5260)
  * Move inactive current maintainers to Emeritus
    Signed-off-by: Lantao Jin <ltjin@amazon.com>
  * Remove affiliation column for emeritus maintainers
    Signed-off-by: Lantao Jin <ltjin@amazon.com>
  * formatted
    Signed-off-by: Lantao Jin <ltjin@amazon.com>
  * Fix formatting in MAINTAINERS.md
    Signed-off-by: Simeon Widdis <sawiddis@gmail.com>
    Signed-off-by: Simeon Widdis <sawiddis@gmail.com>
  ---------
  Signed-off-by: Lantao Jin <ltjin@amazon.com>
  Signed-off-by: Simeon Widdis <sawiddis@gmail.com>
  Co-authored-by: Simeon Widdis <sawiddis@gmail.com>
* Add query cancellation support via _tasks/_cancel API for PPL queries (#5254)
  * Add query cancellation support via _tasks/_cancel API for PPL queries
    Signed-off-by: Sunil Ramchandra Pawar <pawar_sr@apple.com>
  * Refactor PPL query cancellation to cooperative model and other PR suggestions.
    Signed-off-by: Sunil Ramchandra Pawar <pawar_sr@apple.com>
  ---------
  Signed-off-by: Sunil Ramchandra Pawar <pawar_sr@apple.com>
* Add Calcite native SQL planning in UnifiedQueryPlanner (#5257)
  * feat(api): Add Calcite native SQL planning path in UnifiedQueryPlanner
    Add SQL support to the unified query API using Calcite's native parser pipeline (SqlParser → SqlValidator → SqlToRelConverter → RelNode), bypassing the ANTLR parser used by PPL.
    Changes:
    - UnifiedQueryPlanner: use PlanningStrategy to dispatch CalciteNativeStrategy vs CustomVisitorStrategy
    - CalciteNativeStrategy: Calcite Planner with try-with-resources for ANSI SQL
    - CustomVisitorStrategy: ANTLR-based path for PPL (and future SQL V2)
    - UnifiedQueryContext: SqlParser.Config with Casing.UNCHANGED to preserve lowercase OpenSearch index names
    Signed-off-by: Chen Dai <daichen@amazon.com>
  * test(api): Add SQL planner tests and refactor test base for multi-language support
    - Refactor UnifiedQueryTestBase with queryType() hook for subclass override
    - Add UnifiedSqlQueryPlannerTest covering SELECT, WHERE, GROUP BY, JOIN, ORDER BY, subquery, case sensitivity, namespaces, and error handling
    - Update UnifiedQueryContextTest to verify SQL context creation
    Signed-off-by: Chen Dai <daichen@amazon.com>
  * perf(benchmarks): Add SQL queries to UnifiedQueryBenchmark
    Add language (PPL/SQL) and queryPattern param dimensions for side-by-side comparison of equivalent queries across both languages. Remove separate UnifiedSqlQueryBenchmark in favor of unified class.
    Signed-off-by: Chen Dai <daichen@amazon.com>
  * docs(api): Update README to reflect SQL support in UnifiedQueryPlanner
    Signed-off-by: Chen Dai <daichen@amazon.com>
  * fix(api): Normalize trailing whitespace in assertPlan comparison
    RelOptUtil.toString() appends a trailing newline after the last plan node, which doesn't match Java text block expectations. Also add \r\n normalization for Windows CI compatibility, consistent with the existing pattern in core module tests.
    Signed-off-by: Chen Dai <daichen@amazon.com>
  ---------
  Signed-off-by: Chen Dai <daichen@amazon.com>
* [Feature] Support graphLookup with literal value as its start (#5253)
  * [Feature] Support graphLookup as top-level PPL command (#5243)
    Add support for graphLookup as the first command in a PPL query with literal start values, instead of requiring piped input from source=.
    Syntax:
    graphLookup table start="value" edge=from-->to as output
    graphLookup table start=("v1", "v2") edge=from-->to as output
    Signed-off-by: Heng Qian <qianheng@amazon.com>
  * Spotless check
    Signed-off-by: Heng Qian <qianheng@amazon.com>
  * Ignore child pipe if using start value
    Signed-off-by: Heng Qian <qianheng@amazon.com>
  * Add graphLookup integration tests per PPL command checklist
    - Add explain plan tests in CalciteExplainIT with YAML assertions
    - Add v2-unsupported tests in NewAddedCommandsIT
    - Add CalcitePPLGraphLookupIT to CalciteNoPushdownIT suite
    - Skip graphLookup tests when pushdown is disabled (required by impl)
    - Add expected plan YAML files for piped and top-level graphLookup
    Signed-off-by: Heng Qian <qianheng@amazon.com>
  * Remove brace of start value list
    Signed-off-by: Heng Qian <qianheng@amazon.com>
  ---------
  Signed-off-by: Heng Qian <qianheng@amazon.com>
* Apply docs website feedback to ppl functions (#5207)
  * apply doc website feedback to ppl functions
    Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>
  * take out comments
    Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>
  * fix json_append example
    Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>
  * fix json_append example
    Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>
  * fix links
    Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>
  ---------
  Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>
  Signed-off-by: ritvibhatt <53196324+ritvibhatt@users.noreply.github.com>
* feat(api): Add profiling support to unified query API (#5268)
  Add query profiling infrastructure that measures time spent in each query phase (analyze, optimize, execute, format). Profiling is opt-in via UnifiedQueryContext.builder().profiling(true) and uses thread-local context to avoid passing profiling state through every method.
  Key changes:
  - QueryProfiling/ProfileContext for thread-local profiling lifecycle
  - UnifiedQueryContext.measure() API for timing arbitrary phases
  - Auto-profiling in UnifiedQueryPlanner (analyze) and compiler (optimize)
  - UnifiedQueryTestBase shared test fixture for unified query tests
  - Comprehensive profiling tests with non-flaky >= 0 timing assertions
  Signed-off-by: Chen Dai <daichen@amazon.com>
* Add UnifiedQueryParser with language-specific implementations (#5274)
  Extract parsing logic from UnifiedQueryPlanner into a UnifiedQueryParser interface with language-specific implementations: PPLQueryParser (returns UnresolvedPlan) and CalciteSqlQueryParser (returns SqlNode). UnifiedQueryContext owns the parser instance, created eagerly by the builder which has direct access to query type and future SQL config. Each implementation receives only its required dependencies: PPLQueryParser takes Settings, CalciteSqlQueryParser takes CalcitePlanContext. UnifiedQueryPlanner.CustomVisitorStrategy now obtains the parser from the context via the interface type.
  Signed-off-by: Chen Dai <daichen@amazon.com>
* Fix flaky TPC-H Q1 test due to bugs in `MatcherUtils.closeTo()` (#5283)
  * Fix the flaky tpch Q1
    Signed-off-by: Lantao Jin <ltjin@amazon.com>
  * Change to ULP-aware to handle floating-point precision differences
    Signed-off-by: Lantao Jin <ltjin@amazon.com>
  ---------
  Signed-off-by: Lantao Jin <ltjin@amazon.com>
---------
Signed-off-by: Heng Qian <qianheng@amazon.com>
Signed-off-by: Lantao Jin <ltjin@amazon.com>
Signed-off-by: Simeon Widdis <sawiddis@gmail.com>
Signed-off-by: Sunil Ramchandra Pawar <pawar_sr@apple.com>
Signed-off-by: Chen Dai <daichen@amazon.com>
Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>
Signed-off-by: ritvibhatt <53196324+ritvibhatt@users.noreply.github.com>
Signed-off-by: Kai Huang <ahkcs@amazon.com>
Co-authored-by: qianheng <qianheng@amazon.com>
Co-authored-by: Simeon Widdis <sawiddis@gmail.com>
Co-authored-by: Jialiang Liang <jiallian@amazon.com>
Co-authored-by: Lantao Jin <ltjin@amazon.com>
Co-authored-by: Sunil Ramchandra Pawar <pawar_sr@apple.com>
Co-authored-by: Chen Dai <daichen@amazon.com>
Co-authored-by: ritvibhatt <53196324+ritvibhatt@users.noreply.github.com>
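The profiling entry (#5268) in the commit message above describes opt-in, thread-local phase timing exposed through a measure() call. The sketch below illustrates that general idea only. The class `ProfilingSketch` and its `start`, `measure`, and `finish` methods are hypothetical names chosen for this example; they are not the actual QueryProfiling, ProfileContext, or UnifiedQueryContext.measure() code, whose signatures are not shown on this page.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical sketch of opt-in, thread-local phase timing; not the real profiling API. */
public class ProfilingSketch {

  /** Phase name -> accumulated nanoseconds for the current thread; unset while profiling is off. */
  private static final ThreadLocal<Map<String, Long>> CURRENT = new ThreadLocal<>();

  /** Opt in: start collecting metrics on the calling thread. */
  static void start() {
    CURRENT.set(new LinkedHashMap<String, Long>());
  }

  /** Runs the action and, if profiling is active on this thread, records its elapsed time. */
  static <T> T measure(String phase, Supplier<T> action) {
    Map<String, Long> metrics = CURRENT.get();
    if (metrics == null) {
      return action.get(); // profiling is off: just run the action
    }
    long begin = System.nanoTime();
    try {
      return action.get();
    } finally {
      // The metric is written only after the action has returned or thrown.
      metrics.merge(phase, System.nanoTime() - begin, Long::sum);
    }
  }

  /** Returns the collected metrics and clears the thread-local state. */
  static Map<String, Long> finish() {
    Map<String, Long> metrics = CURRENT.get();
    CURRENT.remove();
    return metrics == null ? new LinkedHashMap<String, Long>() : metrics;
  }

  public static void main(String[] args) {
    start();
    int rows = measure("analyze", () -> 42);
    System.out.println(rows + " " + finish().keySet());
  }
}
```

In this sketch no profiling state is passed through method parameters, which is the motivation the commit message gives for the thread-local approach. One consequence of writing the metric in the `finally` block is that a snapshot taken from inside the measured action would not yet contain that phase.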
ahkcs added a commit to ahkcs/sql that referenced this pull request on Mar 30, 2026
…rch-project#5247)

Task 1: Enable profiling (opensearch-project#5268)
- Add .profiling(pplRequest.profile()) to UnifiedQueryContext.builder() in both doExecute and doExplain

Task 2: Migrate to UnifiedQueryParser for index extraction (opensearch-project#5274)
- Replace StubIndexDetector regex with PPLQueryParser AST-based extraction: parse query, walk AST to find Relation node, extract table name via getTableQualifiedName()
- Delete StubIndexDetector
- isAnalyticsIndex() is now an instance method (needs PPLQueryParser)
- Constructor takes Settings for PPLQueryParser

Signed-off-by: Kai Huang <kaihuang@amazon.com>
Signed-off-by: Kai Huang <ahkcs@amazon.com>
ahkcs added a commit that referenced this pull request on Mar 31, 2026
* [Mustang] Enable profiling and migrate to UnifiedQueryParser (#5247)
  Task 1: Enable profiling (#5268)
  - Add .profiling(pplRequest.profile()) to UnifiedQueryContext.builder() in both doExecute and doExplain
  Task 2: Migrate to UnifiedQueryParser for index extraction (#5274)
  - Replace StubIndexDetector regex with PPLQueryParser AST-based extraction: parse query, walk AST to find Relation node, extract table name via getTableQualifiedName()
  - Delete StubIndexDetector
  - isAnalyticsIndex() is now an instance method (needs PPLQueryParser)
  - Constructor takes Settings for PPLQueryParser
  Signed-off-by: Kai Huang <kaihuang@amazon.com>
  Signed-off-by: Kai Huang <ahkcs@amazon.com>
* Switch to SimpleJsonResponseFormatter for profiling support
  Switch from JdbcResponseFormatter to SimpleJsonResponseFormatter so profiling data is included in the response when profile=true. The SimpleJsonResponseFormatter calls QueryProfiling.current().finish() to populate the profile field. Update test assertions to match SimpleJsonResponseFormatter type names (PPL_SPEC: INTEGER -> "int", STRING -> "string") and remove status field check (not included by SimpleJsonResponseFormatter). Add integration test verifying profile field appears in response.
  Signed-off-by: Kai Huang <kaihuang@amazon.com>
  Signed-off-by: Kai Huang <ahkcs@amazon.com>
* Use context parser for index extraction instead of standalone PPLQueryParser
  Create UnifiedQueryContext upfront in isAnalyticsIndex() and use context.getParser() for index name extraction. This reuses the context-owned parser which supports both PPL and SQL, making it ready for unified SQL support without code changes. Remove standalone PPLQueryParser field and Settings constructor param. isAnalyticsIndex() now takes QueryType to create the right context. extractIndexName() handles UnresolvedPlan (PPL) with a TODO for SqlNode (SQL) when unified SQL is enabled.
  Signed-off-by: Kai Huang <kaihuang@amazon.com>
  Signed-off-by: Kai Huang <ahkcs@amazon.com>
* Use AST visitor for index name extraction
  Replace manual tree walking (findRelation) with IndexNameExtractor visitor extending AbstractNodeVisitor. The visitor's visitRelation() is called automatically by the AST accept/visitChildren pattern, which handles tree traversal.
  Signed-off-by: Kai Huang <kaihuang@amazon.com>
  Signed-off-by: Kai Huang <ahkcs@amazon.com>
* Wrap execute and explain with context.measure() for profiling
  Wrap analyticsEngine.execute() and analyticsEngine.explain() calls with context.measure(MetricName.EXECUTE, ...) so execution time is captured in the profiling metrics. Planning is auto-profiled by UnifiedQueryPlanner.
  Signed-off-by: Kai Huang <kaihuang@amazon.com>
  Signed-off-by: Kai Huang <ahkcs@amazon.com>
* Fix EXECUTE profiling metric by recording inside AnalyticsExecutionEngine
  Move EXECUTE metric recording into AnalyticsExecutionEngine.execute(), between the actual execution (planExecutor + row conversion) and the listener.onResponse() call. This ensures the metric is written before SimpleJsonResponseFormatter calls QueryProfiling.finish() to snapshot. Previously context.measure() was used in RestUnifiedQueryAction, but finish() was called inside the listener callback (synchronously) before measure()'s finally block could write the metric, resulting in 0ms. Add IT assertion that execute phase time_ms > 0 to catch this bug.
  Signed-off-by: Kai Huang <kaihuang@amazon.com>
  Signed-off-by: Kai Huang <ahkcs@amazon.com>
---------
Signed-off-by: Kai Huang <kaihuang@amazon.com>
Signed-off-by: Kai Huang <ahkcs@amazon.com>
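The "Use AST visitor for index name extraction" commit above replaces manual tree walking with a visitor whose visitRelation() is reached through the accept/visitChildren pattern. The following is a minimal, self-contained sketch of that pattern. `Node`, `Relation`, `Filter`, and `Visitor` are simplified stand-ins invented for this example, not the project's AST classes or its AbstractNodeVisitor, and the "first non-null child result wins" rule in `visitChildren` is an assumption of this sketch.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of visitor-based index name extraction over a toy AST. */
public class IndexNameExtractorSketch {

  /** Stand-in AST node: holds children and dispatches to the matching visit method. */
  abstract static class Node {
    final List<Node> children;

    Node(Node... children) {
      this.children = Arrays.asList(children);
    }

    abstract <T> T accept(Visitor<T> visitor);
  }

  /** Leaf node naming the index (table) the query reads from. */
  static final class Relation extends Node {
    final String tableName;

    Relation(String tableName) {
      this.tableName = tableName;
    }

    @Override
    <T> T accept(Visitor<T> visitor) {
      return visitor.visitRelation(this);
    }
  }

  /** Example of a command node that wraps a child plan. */
  static final class Filter extends Node {
    final String condition;

    Filter(String condition, Node child) {
      super(child);
      this.condition = condition;
    }

    @Override
    <T> T accept(Visitor<T> visitor) {
      return visitor.visitFilter(this);
    }
  }

  /** Base visitor: every visit method defaults to traversing the node's children. */
  abstract static class Visitor<T> {
    T visitChildren(Node node) {
      for (Node child : node.children) {
        T result = child.accept(this);
        if (result != null) {
          return result; // assumption of this sketch: first non-null result wins
        }
      }
      return null;
    }

    T visitRelation(Relation node) {
      return visitChildren(node);
    }

    T visitFilter(Filter node) {
      return visitChildren(node);
    }
  }

  /** Overrides only visitRelation; traversal is inherited from the base visitor. */
  static final class IndexNameExtractor extends Visitor<String> {
    @Override
    String visitRelation(Relation node) {
      return node.tableName;
    }
  }

  static String extractIndexName(Node plan) {
    return plan.accept(new IndexNameExtractor());
  }

  public static void main(String[] args) {
    Node plan = new Filter("status = 500", new Filter("age > 30", new Relation("logs")));
    System.out.println(extractIndexName(plan));
  }
}
```

The extractor contains no traversal code of its own, which is the benefit the commit message attributes to the visitor over the earlier findRelation walk.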
Description
Extract parsing logic from UnifiedQueryPlanner into a UnifiedQueryParser<R> interface with language-specific implementations: PPLQueryParser (returns UnresolvedPlan) and CalciteSqlQueryParser (returns SqlNode).
Implementation Notes
UnifiedQueryContext creates and owns the parser, ensuring consistent configuration across all components.
parse(query, visitor) was explored but deferred. PPL's custom AST and Calcite's SqlNode tree have fundamentally different node structures. Until this tension resolves, callers use native visitors on the typed parse(query) result.
Related Issues
Part of #5248
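The design in the Description above (a generic parser interface, one implementation per language, and a context that creates and owns the parser) can be sketched roughly as follows. This is an illustration under stated assumptions, not the merged code: `QueryParser`, `PplPlan`, `SqlTree`, `PplParser`, `SqlParser`, and `Context` are hypothetical stand-ins for UnifiedQueryParser<R>, UnresolvedPlan, SqlNode, PPLQueryParser, CalciteSqlQueryParser, and UnifiedQueryContext, and the toy "parsing" only splits or trims strings.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical sketch of a generic parser interface with a context-owned instance. */
public class ParserSketch {

  /** Stand-in for the generic parser interface: R is the language-specific parse result. */
  interface QueryParser<R> {
    R parse(String query);
  }

  /** Stand-in for the PPL result type: a pipeline of commands. */
  static final class PplPlan {
    final List<String> commands;

    PplPlan(List<String> commands) {
      this.commands = Collections.unmodifiableList(commands);
    }
  }

  /** Stand-in for the SQL result type: the cleaned-up statement text. */
  static final class SqlTree {
    final String statement;

    SqlTree(String statement) {
      this.statement = statement;
    }
  }

  /** Receives only the dependency it needs (a made-up command limit). */
  static final class PplParser implements QueryParser<PplPlan> {
    private final int maxCommands;

    PplParser(int maxCommands) {
      this.maxCommands = maxCommands;
    }

    @Override
    public PplPlan parse(String query) {
      List<String> commands = new ArrayList<>();
      for (String part : query.split("\\|")) {
        commands.add(part.trim());
      }
      if (commands.size() > maxCommands) {
        throw new IllegalArgumentException("too many commands: " + commands.size());
      }
      return new PplPlan(commands);
    }
  }

  /** Toy SQL parser: trims whitespace and drops one trailing semicolon. */
  static final class SqlParser implements QueryParser<SqlTree> {
    @Override
    public SqlTree parse(String query) {
      String text = query.trim();
      if (text.endsWith(";")) {
        text = text.substring(0, text.length() - 1).trim();
      }
      return new SqlTree(text);
    }
  }

  enum QueryType { PPL, SQL }

  /** Creates the parser eagerly from the query type and owns it afterwards. */
  static final class Context {
    private final QueryParser<?> parser;

    private Context(QueryParser<?> parser) {
      this.parser = parser;
    }

    static Context of(QueryType type) {
      QueryParser<?> parser;
      if (type == QueryType.PPL) {
        parser = new PplParser(10);
      } else {
        parser = new SqlParser();
      }
      return new Context(parser);
    }

    /** Callers obtain the parser through the interface type. */
    QueryParser<?> getParser() {
      return parser;
    }
  }

  public static void main(String[] args) {
    Object ppl = Context.of(QueryType.PPL).getParser().parse("source=logs | where status=500");
    Object sql = Context.of(QueryType.SQL).getParser().parse("SELECT * FROM logs;");
    System.out.println(((PplPlan) ppl).commands);
    System.out.println(((SqlTree) sql).statement);
  }
}
```

In this sketch the two result types share no common supertype beyond Object, so a caller that wants to inspect the tree has to work with the concrete type, which mirrors the note that callers use native visitors on the typed parse(query) result.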
Check List
Commits are signed per the DCO using --signoff or -s.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.