Add unified query API for external integration #3783
Signed-off-by: Chen Dai <daichen@amazon.com>
        .cacheSchema(true)
        .build();

    RelNode plan = planner.plan("source = opensearch.test");
How does one execute the plan after they receive it?
Good question. Currently, the plan isn’t directly executable. As noted in the README, the planner is designed to eventually return an executable plan—either a Calcite physical plan for immediate execution in the current JVM (useful for the OpenSearch plugin and CLI), or a SparkSQL plan for distributed execution by Spark (useful for PPL in Spark).
I initially considered designing the API this way, but haven’t yet found a clean way to model everything within Calcite’s optimizer. I plan to work on this later, especially since PPL in Spark Phase 2 may require it.
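To make the two execution targets mentioned above concrete, here is a minimal, self-contained sketch of what "returning an executable plan" could look like. Every type here (`LogicalPlan`, `Target`, `ExecutablePlan`, `toExecutable`) is hypothetical and exists only for illustration. None of it is part of this PR or of Calcite, and the real design may differ once the optimizer modeling is worked out.

```java
public class ExecutablePlanSketch {

    /** Hypothetical stand-in for the logical plan (a Calcite RelNode in this PR). */
    static final class LogicalPlan {
        final String query;

        LogicalPlan(String query) {
            this.query = query;
        }
    }

    /** Hypothetical execution targets, taken from the two cases named in the reply. */
    enum Target { LOCAL_JVM, SPARK }

    /** Hypothetical executable plan: a logical plan bound to exactly one target. */
    static final class ExecutablePlan {
        final LogicalPlan logical;
        final Target target;

        ExecutablePlan(LogicalPlan logical, Target target) {
            this.logical = logical;
            this.target = target;
        }

        /** Describes where the plan would run; a real version would execute it. */
        String describe() {
            switch (target) {
                case LOCAL_JVM:
                    return "run in current JVM: " + logical.query;
                case SPARK:
                    return "submit to Spark: " + logical.query;
                default:
                    throw new IllegalStateException("unknown target " + target);
            }
        }
    }

    /** Assumed conversion step that the planner does not provide yet. */
    static ExecutablePlan toExecutable(LogicalPlan plan, Target target) {
        return new ExecutablePlan(plan, target);
    }

    public static void main(String[] args) {
        LogicalPlan plan = new LogicalPlan("source = opensearch.test");
        System.out.println(toExecutable(plan, Target.LOCAL_JVM).describe());
        System.out.println(toExecutable(plan, Target.SPARK).describe());
    }
}
```

The point of the sketch is only that the caller picks a target once and then holds something it can run, rather than a logical plan it has to translate itself.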
@LantaoJin @penghuo Please have a look when you have a moment. This is currently only for the initial phase of opensearch-project/opensearch-spark#1136, so that we can begin publishing PRs on the Spark side. Thanks!
There seems to be a flaky test.
c0858b5 into opensearch-project:feature/unified-ppl
* Add api module with API and UT
* Refactor catalog API and clean up build.gradle
* Add cache schema API and refactor UT
* Add readme
* Add comment for hardcoding query size limit
* Add default namespace API with more UTs

Signed-off-by: Chen Dai <daichen@amazon.com>
Description
This PR introduces a new api module containing the UnifiedQueryPlanner class, which provides a high-level interface for parsing and planning PPL queries. This module is designed to support external consumers such as Spark and CLI without exposing Calcite or OpenSearch internals. README and unit tests are included to document usage and verify correctness.
Related Issues
Resolves #3734
Check List
--signoff.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.