[feat](profile) support getting query progress #51400
Pull Request Overview
This PR introduces support for retrieving real‐time query progress using a new FE HTTP API while addressing a few related issues and enhancements. Key changes include:
- Adding a new API endpoint (/rest/v2/manager/query/statistics/{trace_id}) for query runtime statistics.
- Fixing a bug in CoordinatorContext to use the proper host value.
- Updating schema definitions and test scripts to align column names (e.g., adding trace id) and modifying backend interfaces to support query statistics.
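As a rough illustration of how a client might call the new endpoint listed above, the following Python sketch builds the request URL from a trace id. The host, port, user, and the use of HTTP Basic authentication are assumptions for illustration, and the shape of the JSON response is not specified in this PR page, so the body is returned unmodified. The helper names `build_statistics_request` and `fetch_statistics` are hypothetical.

```python
import base64
import json
import urllib.parse
import urllib.request


def build_statistics_request(fe_host, fe_port, trace_id, user, password=""):
    """Build a GET request for the statistics endpoint named in this PR."""
    # Quote the trace id so characters such as '/' or spaces are safe in the path.
    path = "/rest/v2/manager/query/statistics/" + urllib.parse.quote(trace_id, safe="")
    url = f"http://{fe_host}:{fe_port}{path}"
    # Assumption: the FE accepts HTTP Basic authentication.
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})


def fetch_statistics(fe_host, fe_port, trace_id, user, password=""):
    """Send the request and return the decoded JSON body as-is."""
    req = build_statistics_request(fe_host, fe_port, trace_id, user, password)
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

A caller would poll `fetch_statistics(...)` with the same trace id that was attached to the running query's session.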
Reviewed Changes
Copilot reviewed 22 out of 22 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| regression-test/suites/compaction/test_single_compaction_with_variant_inverted_index.groovy | Updates in logging and variable naming for process management. |
| regression-test/plugins/plugin_curl_requester.groovy | Added authentication parameters to curl command extension. |
| gensrc/thrift/BackendService.thrift | Added new fields for req_type and query_stats for supporting runtime query statistics. |
| fe/fe-core/src/main/java/org/apache/doris/qe/CoordinatorContext.java | Fixed bug to use worker.host() instead of worker.address() for proper host resolution. |
| fe/fe-core/src/main/java/org/apache/doris/qe/ConnectScheduler.java | Added method to remove old trace IDs from scheduler state. |
| fe/fe-core/src/main/java/org/apache/doris/qe/ConnectPoolMgr.java | Updated connection unregister logic to remove trace IDs from mapping. |
| fe/fe-core/src/main/java/org/apache/doris/qe/ConnectContext.java | Cleaned up trace id management and ensured updated session rows include trace id. |
| fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/commands/ShowProcessListCommand.java | Modified process list metadata construction to use SchemaTable definitions. |
| fe/fe-core/src/main/java/org/apache/doris/httpv2/rest/manager/QueryProfileAction.java | Introduced new API endpoints for retrieving query statistics and updated JSON parsing logic. |
| fe/fe-core/src/main/java/org/apache/doris/httpv2/rest/manager/HttpUtils.java | Added helper method to check if an FE is the current instance. |
| fe/fe-core/src/main/java/org/apache/doris/httpv2/controller/SessionController.java | Adjusted session header construction utilizing SchemaTable metadata. |
| fe/fe-core/src/main/java/org/apache/doris/common/profile/ProfileManager.java | Added support for fetching and summarizing query statistics updated from backend responses. |
| fe/fe-core/src/main/java/org/apache/doris/catalog/SchemaTable.java | Updated the processlist schema to include a trace id column for better MySQL compatibility. |
| fe/fe-core/src/main/java/org/apache/doris/analysis/ShowProcesslistStmt.java | Revised metadata construction for SHOW PROCESSLIST to match the updated schema. |
| be/src/service/backend_service.cpp | Added support for handling req_type "stats" to return query statistics. |
| be/src/runtime/runtime_query_statistics_mgr.h & .cpp | Introduced a method to gather query statistics from runtime resource contexts. |
| be/src/runtime/fragment_mgr.h & .cpp | Added interface and implementation for retrieving query statistics via fragment manager. |
| be/src/exec/schema_scanner/schema_processlist_scanner.cpp | Adjusted schema scanner to insert an empty value for the new trace id column when missing. |
Comments suppressed due to low confidence (1)
fe/fe-core/src/main/java/org/apache/doris/httpv2/rest/manager/QueryProfileAction.java:352
- The variable 'responseJson' is used without being defined. Please assign its value using HttpUtils.doGet(url, header) before parsing.
JsonObject jObj = JsonParser.parseString(responseJson).getAsJsonObject();
PR approved by at least one committer and no changes requested.
What problem does this PR solve?
Follow-up to #50791.
Add a new FE HTTP API: /rest/v2/manager/query/statistics/trace_id
This API returns the query runtime statistics for a given trace id.
The query statistics include information such as real-time scan rows and bytes.
Internally, Doris looks up the query id for the trace id across all Frontends, and then fetches the query statistics from the BEs.
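The two-step lookup described above can be sketched in Python as follows. This is a simplified model under stated assumptions, not the Doris implementation: each Frontend is modelled as a plain trace-id-to-query-id mapping, each Backend as a query-id-to-counters mapping, and the per-backend counters are merged by summation. The function names and the `scan_rows` / `scan_bytes` keys are hypothetical.

```python
from typing import Dict, List, Optional


def find_query_id(frontends: List[Dict[str, str]], trace_id: str) -> Optional[str]:
    """Return the query id from the first frontend that knows the trace id."""
    for trace_map in frontends:
        if trace_id in trace_map:
            return trace_map[trace_id]
    return None


def merge_backend_stats(
    backends: List[Dict[str, Dict[str, int]]], query_id: str
) -> Dict[str, int]:
    """Sum the counters that each backend reports for one query."""
    total = {"scan_rows": 0, "scan_bytes": 0}
    for stats_by_query in backends:
        stats = stats_by_query.get(query_id)
        if stats is None:
            continue  # this backend is not running the query
        for key in total:
            total[key] += stats.get(key, 0)
    return total


def query_statistics(frontends, backends, trace_id):
    """Resolve the trace id to a query id, then aggregate backend counters."""
    query_id = find_query_id(frontends, trace_id)
    if query_id is None:
        return None  # no frontend has a running query with this trace id
    return merge_backend_stats(backends, query_id)
```

Returning `None` for an unknown trace id keeps "no such query" distinct from "query found but nothing scanned yet", which would yield zeroed counters.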
Usage pattern:
set session_context="trace_id:my_trace_id"
Also fix a bug in CoordinatorContext.java, introduced in #41730, so that it gets the real host.
This PR also changes the column names of the information_schema.processlist table to match the column names in show processlist.
Release note
None