Enable profiling and migrate to UnifiedQueryParser#5285
Conversation
PR Code Analyzer ❗AI-powered 'Code-Diff-Analyzer' found issues on commit 1e683b6. 'Diff too large, requires skip by maintainers after manual review' Pull Requests Author(s): Please update your Pull Request according to the report above. Repository Maintainer(s): You can Thanks. |
…rch-project#5247) Task 1: Enable profiling (opensearch-project#5268) - Add .profiling(pplRequest.profile()) to UnifiedQueryContext.builder() in both doExecute and doExplain Task 2: Migrate to UnifiedQueryParser for index extraction (opensearch-project#5274) - Replace StubIndexDetector regex with PPLQueryParser AST-based extraction: parse query, walk AST to find Relation node, extract table name via getTableQualifiedName() - Delete StubIndexDetector - isAnalyticsIndex() is now an instance method (needs PPLQueryParser) - Constructor takes Settings for PPLQueryParser Signed-off-by: Kai Huang <kaihuang@amazon.com> Signed-off-by: Kai Huang <ahkcs@amazon.com>
1e683b6 to
5addc89
Compare
|
Failed to generate code suggestions for PR |
dai-chen
left a comment
There was a problem hiding this comment.
For the profiling, is the result automatically populated into REST response? Is it possible to add integration test or verify manually?
plugin/src/main/java/org/opensearch/sql/plugin/rest/RestUnifiedQueryAction.java
Outdated
Show resolved
Hide resolved
plugin/src/main/java/org/opensearch/sql/plugin/rest/RestUnifiedQueryAction.java
Outdated
Show resolved
Hide resolved
|
|
||
| analyticsEngine.execute( | ||
| plan, planContext, createQueryListener(queryType, planTime, listener)); | ||
| analyticsEngine.execute(plan, planContext, createQueryListener(queryType, listener)); |
There was a problem hiding this comment.
wrap by context.measure?
There was a problem hiding this comment.
Updated. Wrapped both analyticsEngine.execute() and analyticsEngine.explain() with context.measure(MetricName.EXECUTE, ...). Planning is auto-profiled by UnifiedQueryPlanner internally.
There was a problem hiding this comment.
Updated reply: Moved EXECUTE metric recording into AnalyticsExecutionEngine.execute() instead of wrapping with context.measure(). The issue with context.measure() is that SimpleJsonResponseFormatter calls QueryProfiling.finish() inside the listener callback (synchronously), which snapshots all metrics before context.measure()'s finally block writes the EXECUTE metric — resulting in 0ms.
The fix records the metric between the actual execution (planExecutor + row conversion) and the listener.onResponse() call, so it's captured before finish() snapshots. Added IT that asserts execute phase time_ms > 0 to catch this
There was a problem hiding this comment.
I see. I feel it shouldn't do that. For now, is format metric missing?
There was a problem hiding this comment.
FORMAT metric is not missing — it's recorded by SimpleJsonResponseFormatter.buildJsonObject() itself, before calling finish(). The IT output confirms all phases:
analyze: 13.69ms, execute: 126.32ms, format: 1.29ms, optimize: 0ms
There was a problem hiding this comment.
SimpleJsonResponseFormatter calls finish() inside buildJsonObject() to snapshot metrics. Any metric written after that (like context.measure()'s finally block) is missed — that's why we record EXECUTE manually before the listener fires.
The main PPL path uses the same SimpleJsonResponseFormatter with the same finish() pattern, so this isn't specific to our analytics path.
To fully adopt context.measure(), I think the formatter would need to stop calling finish() internally. Should we make this a follow-up PR?
PR Reviewer Guide 🔍(Review updated until commit 7347fef)Here are some key observations to aid the review process:
|
PR Code Suggestions ✨Latest suggestions up to 7347fef Explore these optional code suggestions:
Previous suggestionsSuggestions up to commit 1e4fba8
Suggestions up to commit ba3efd7
Suggestions up to commit ce4f6c5
Suggestions up to commit 2545d34
Suggestions up to commit 2cf5183
|
Switch from JdbcResponseFormatter to SimpleJsonResponseFormatter so profiling data is included in the response when profile=true. The SimpleJsonResponseFormatter calls QueryProfiling.current().finish() to populate the profile field. Update test assertions to match SimpleJsonResponseFormatter type names (PPL_SPEC: INTEGER -> "int", STRING -> "string") and remove status field check (not included by SimpleJsonResponseFormatter). Add integration test verifying profile field appears in response. Signed-off-by: Kai Huang <kaihuang@amazon.com> Signed-off-by: Kai Huang <ahkcs@amazon.com>
ce2589b to
b0ecfc2
Compare
|
Persistent review updated to latest commit b0ecfc2 |
The profiling result is now populated into the REST response. Switched from The EXECUTE metric is recorded inside Added integration test that verifies:
|
…yParser Create UnifiedQueryContext upfront in isAnalyticsIndex() and use context.getParser() for index name extraction. This reuses the context-owned parser which supports both PPL and SQL, making it ready for unified SQL support without code changes. Remove standalone PPLQueryParser field and Settings constructor param. isAnalyticsIndex() now takes QueryType to create the right context. extractIndexName() handles UnresolvedPlan (PPL) with a TODO for SqlNode (SQL) when unified SQL is enabled. Signed-off-by: Kai Huang <kaihuang@amazon.com> Signed-off-by: Kai Huang <ahkcs@amazon.com>
|
Persistent review updated to latest commit 022461a |
Replace manual tree walking (findRelation) with IndexNameExtractor visitor extending AbstractNodeVisitor. The visitor's visitRelation() is called automatically by the AST accept/visitChildren pattern, which handles tree traversal. Signed-off-by: Kai Huang <kaihuang@amazon.com> Signed-off-by: Kai Huang <ahkcs@amazon.com>
|
Persistent review updated to latest commit 2cf5183 |
|
Persistent review updated to latest commit 2545d34 |
Wrap analyticsEngine.execute() and analyticsEngine.explain() calls with context.measure(MetricName.EXECUTE, ...) so execution time is captured in the profiling metrics. Planning is auto-profiled by UnifiedQueryPlanner. Signed-off-by: Kai Huang <kaihuang@amazon.com> Signed-off-by: Kai Huang <ahkcs@amazon.com>
2545d34 to
ce4f6c5
Compare
|
Persistent review updated to latest commit ce4f6c5 |
|
Persistent review updated to latest commit ba3efd7 |
ba3efd7 to
1e4fba8
Compare
|
Persistent review updated to latest commit 1e4fba8 |
…gine Move EXECUTE metric recording into AnalyticsExecutionEngine.execute(), between the actual execution (planExecutor + row conversion) and the listener.onResponse() call. This ensures the metric is written before SimpleJsonResponseFormatter calls QueryProfiling.finish() to snapshot. Previously context.measure() was used in RestUnifiedQueryAction, but finish() was called inside the listener callback (synchronously) before measure()'s finally block could write the metric, resulting in 0ms. Add IT assertion that execute phase time_ms > 0 to catch this bug. Signed-off-by: Kai Huang <kaihuang@amazon.com> Signed-off-by: Kai Huang <ahkcs@amazon.com>
1e4fba8 to
7347fef
Compare
|
Persistent review updated to latest commit 7347fef |
|
Did verification on commands like |
151cffc
into
opensearch-project:feature/mustang-ppl-integration
Summary
Implements two tasks from #5247 comment:
Task 1: Enable profiling (#5268)
Add
.profiling(pplRequest.profile())toUnifiedQueryContext.builder()in bothdoExecuteanddoExplain. Whenprofile=trueis requested, profiling metrics are collected during query planning and execution via the unified query API's auto-profiling support.Task 2: Migrate to UnifiedQueryParser (#5274)
Replace
StubIndexDetector(regex-based index extraction) withPPLQueryParser(AST-based). Parse the query to get theUnresolvedPlan, walk the AST tree to find theRelationnode, and extract the table name viagetTableQualifiedName(). This is more robust than regex and uses the same parser infrastructure as the rest of the pipeline.Changed Files
plugin/.../rest/RestUnifiedQueryAction.javaplugin/.../transport/TransportPPLQueryAction.javaplugin/.../rest/analytics/stub/StubIndexDetector.javaplugin/.../rest/RestUnifiedQueryActionTest.javaTest plan