-
Notifications
You must be signed in to change notification settings - Fork 3.7k
[enhancement](be_metrics) update scan bytes metric in file_scanner. #53729
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
|
|
run buildall |
|
run buildall |
TPC-H: Total hot run time: 34151 ms |
TPC-DS: Total hot run time: 187721 ms |
ClickBench: Total hot run time: 32.27 s |
|
run buildall |
TPC-H: Total hot run time: 33935 ms |
TPC-DS: Total hot run time: 186684 ms |
ClickBench: Total hot run time: 32.6 s |
|
run buildall |
TPC-H: Total hot run time: 33933 ms |
TPC-DS: Total hot run time: 187717 ms |
ClickBench: Total hot run time: 32.23 s |
|
run buildall |
TPC-H: Total hot run time: 34243 ms |
TPC-DS: Total hot run time: 188116 ms |
ClickBench: Total hot run time: 32.96 s |
|
run buildall |
TPC-H: Total hot run time: 33661 ms |
TPC-DS: Total hot run time: 187681 ms |
ClickBench: Total hot run time: 32.58 s |
BE UT Coverage ReportIncrement line coverage Increment coverage report
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR enhances the scan bytes metric tracking in file scanners by implementing a unified framework for metrics collection across different file formats (Parquet, ORC, JSON, CSV, Arrow). The enhancement focuses on providing more accurate metrics for bytes read and rows scanned at the file reader level.
- Introduces a new
TracingFileReaderwrapper that tracks file read operations - Adds
FileReaderStatsstructure to capture read metrics consistently across file formats - Updates file scanners to use the new metrics tracking through the
update_realtime_counters()method
Reviewed Changes
Copilot reviewed 23 out of 23 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| contrib/apache-orc | Updates ORC submodule commit reference |
| be/src/io/io_common.h | Adds FileReaderStats structure and integrates it into IOContext |
| be/src/io/fs/tracing_file_reader.h | Implements new TracingFileReader wrapper for metrics collection |
| be/src/vec/exec/scan/file_scanner.h | Adds metrics counters and update_realtime_counters() method |
| be/src/vec/exec/scan/file_scanner.cpp | Implements metrics tracking logic and counter updates |
| be/src/vec/exec/format/parquet/vparquet_reader.h | Updates Parquet reader to support new metrics framework |
| be/src/vec/exec/format/parquet/vparquet_reader.cpp | Integrates TracingFileReader for Parquet format |
| be/src/vec/exec/format/orc/vorc_reader.h | Updates ORC reader structure for new metrics |
| be/src/vec/exec/format/orc/vorc_reader.cpp | Integrates TracingFileReader for ORC format |
| be/src/vec/exec/format/json/new_json_reader.cpp | Adds TracingFileReader wrapper for JSON format |
| be/src/vec/exec/format/csv/csv_reader.cpp | Integrates metrics tracking for CSV format |
| be/src/vec/exec/format/arrow/arrow_stream_reader.cpp | Adds metrics support for Arrow format |
| be/src/vec/exec/scan/olap_scanner.cpp | Updates metrics collection logic for OLAP scanner |
| _state->get_query_ctx() | ||
| ->resource_ctx() | ||
| ->io_context() | ||
| ->update_scan_bytes_from_remote_storage(_file_reader_stats->read_bytes); |
Copilot
AI
Jul 25, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When file cache statistics show no local or remote reads, the code incorrectly updates remote storage bytes with total read bytes. This should update local storage bytes since the comment indicates 'In case of no cache, we still need to update the IO stats'.
| ->update_scan_bytes_from_remote_storage(_file_reader_stats->read_bytes); | |
| ->update_scan_bytes_from_local_storage(_file_reader_stats->read_bytes); |
| ->resource_ctx() | ||
| ->io_context() | ||
| ->update_scan_bytes_from_remote_storage(_file_reader_stats->read_bytes); | ||
| DorisMetrics::instance()->query_scan_bytes_from_local->increment( |
Copilot
AI
Jul 25, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Inconsistent metrics reporting: the code updates remote storage bytes in the IOContext but increments local bytes in DorisMetrics when there's no cache. Both should be consistent - either local or remote.
| DorisMetrics::instance()->query_scan_bytes_from_local->increment( | |
| DorisMetrics::instance()->query_scan_bytes_from_remote->increment( |
| getName(), | ||
| std::make_shared<io::TracingFileReader>(_file_reader, _io_ctx->file_reader_stats), |
Copilot
AI
Jul 25, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code creates a StripeStreamInputStream with the wrong number of parameters. The constructor expects (file_name, file_reader, io_ctx, profile) but is being called with (getName(), tracing_file_reader, _io_ctx, _profile) where getName() should be the first parameter, not a separate line.
| getName(), | |
| std::make_shared<io::TracingFileReader>(_file_reader, _io_ctx->file_reader_stats), | |
| getName(), std::make_shared<io::TracingFileReader>(_file_reader, _io_ctx->file_reader_stats), |
|
run buildall |
|
run buildall |
|
run buildall |
TPC-H: Total hot run time: 34288 ms |
TPC-DS: Total hot run time: 188020 ms |
ClickBench: Total hot run time: 32.62 s |
BE UT Coverage ReportIncrement line coverage Increment coverage report
|
BE Regression && UT Coverage ReportIncrement line coverage Increment coverage report
|
|
PR approved by at least one committer and no changes requested. |
|
PR approved by anyone and no changes requested. |
dataroaring
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
…pache#53729) Problem Summary: The external part is implemented according to the framework defined by the unified audit log, Doris metrics, and scanbytes and scan rows in Profile in apache#52232. However, - Scan bytes in the external table currently represents the bytes counted by **the top-level File Reader** called by the scan reader layer. - Scan rows represents the number of scanned rows of the underlying storage. The number of scanned rows of parquet/orc does not include the number of rows of skipped page/rowgroup. **Note: However, there is still a problem that `jni_reader` has not yet implemented the number of rows that only contain the storage to be read.**
…pache#53729) Problem Summary: The external part is implemented according to the framework defined by the unified audit log, Doris metrics, and scanbytes and scan rows in Profile in apache#52232. However, - Scan bytes in the external table currently represents the bytes counted by **the top-level File Reader** called by the scan reader layer. - Scan rows represents the number of scanned rows of the underlying storage. The number of scanned rows of parquet/orc does not include the number of rows of skipped page/rowgroup. **Note: However, there is still a problem that `jni_reader` has not yet implemented the number of rows that only contain the storage to be read.**
…pache#53729) ### What problem does this PR solve? Problem Summary: ### Release note The external part is implemented according to the framework defined by the unified audit log, Doris metrics, and scanbytes and scan rows in Profile in apache#52232. However, - Scan bytes in the external table currently represents the bytes counted by **the top-level File Reader** called by the scan reader layer. - Scan rows represents the number of scanned rows of the underlying storage. The number of scanned rows of parquet/orc does not include the number of rows of skipped page/rowgroup. **Note: However, there is still a problem that `jni_reader` has not yet implemented the number of rows that only contain the storage to be read.**
…55346) ### What problem does this PR solve? Related PR: #53729 Problem Summary: runtime error: member access within null pointer of type 'FileReaderStats' in `TracingFileReader` ``` /home/zcp/repo_center/doris_branch-3.1/doris/be/src/io/fs/tracing_file_reader.h:34:9: runtime error: member access within null pointer of type 'FileReaderStats' #0 0x55c028481ace in doris::io::TracingFileReader::read_at_impl(unsigned long, doris::Slice, unsigned long*, doris::io::IOContext const*) /home/zcp/repo_center/doris_branch-3.1/doris/be/src/io/fs/tracing_file_reader.h:34:9 #1 0x55c00922a1f7 in doris::io::FileReader::read_at(unsigned long, doris::Slice, unsigned long*, doris::io::IOContext const*) /home/zcp/repo_center/doris_branch-3.1/doris/be/src/io/fs/file_reader.cpp:34:17 #2 0x55c028ae8a18 in doris::vectorized::parse_thrift_footer(std::shared_ptr<doris::io::FileReader>, doris::vectorized::FileMetaData**, unsigned long*, doris::io::IOContext*) /home/zcp/repo_center/doris_branch-3.1/doris/be/src/vec/exec/format/parquet/parquet_thrift_util.h:44:5 #3 0x55c028ae8a18 in doris::vectorized::ParquetReader::_open_file() /home/zcp/repo_center/doris_branch-3.1/doris/be/src/vec/exec/format/parquet/vparquet_reader.cpp:234:21 #4 0x55c028aeb69c in doris::vectorized::ParquetReader::init_reader(std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>>> const&, std::unordered_map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>, std::variant<doris::ColumnValueRange<(doris::PrimitiveType)3>, doris::ColumnValueRange<(doris::PrimitiveType)4>, doris::ColumnValueRange<(doris::PrimitiveType)5>, doris::ColumnValueRange<(doris::PrimitiveType)6>, doris::ColumnValueRange<(doris::PrimitiveType)7>, doris::ColumnValueRange<(doris::PrimitiveType)36>, doris::ColumnValueRange<(doris::PrimitiveType)37>, doris::ColumnValueRange<(doris::PrimitiveType)15>, doris::ColumnValueRange<(doris::PrimitiveType)10>, doris::ColumnValueRange<(doris::PrimitiveType)23>, doris::ColumnValueRange<(doris::PrimitiveType)11>, doris::ColumnValueRange<(doris::PrimitiveType)25>, doris::ColumnValueRange<(doris::PrimitiveType)12>, doris::ColumnValueRange<(doris::PrimitiveType)26>, doris::ColumnValueRange<(doris::PrimitiveType)20>, doris::ColumnValueRange<(doris::PrimitiveType)2>, doris::ColumnValueRange<(doris::PrimitiveType)19>, doris::ColumnValueRange<(doris::PrimitiveType)28>, doris::ColumnValueRange<(doris::PrimitiveType)29>, doris::ColumnValueRange<(doris::PrimitiveType)30>, doris::ColumnValueRange<(doris::PrimitiveType)35>>, std::hash<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>>, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>>, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const, std::variant<doris::ColumnValueRange<(doris::PrimitiveType)3>, doris::ColumnValueRange<(doris::PrimitiveType)4>, doris::ColumnValueRange<(doris::PrimitiveType)5>, doris::ColumnValueRange<(doris::PrimitiveType)6>, doris::ColumnValueRange<(doris::PrimitiveType)7>, doris::ColumnValueRange<(doris::PrimitiveType)36>, doris::ColumnValueRange<(doris::PrimitiveType)37>, doris::ColumnValueRange<(doris::PrimitiveType)15>, doris::ColumnValueRange<(doris::PrimitiveType)10>, doris::ColumnValueRange<(doris::PrimitiveType)23>, doris::ColumnValueRange<(doris::PrimitiveType)11>, doris::ColumnValueRange<(doris::PrimitiveType)25>, doris::ColumnValueRange<(doris::PrimitiveType)12>, doris::ColumnValueRange<(doris::PrimitiveType)26>, doris::ColumnValueRange<(doris::PrimitiveType)20>, doris::ColumnValueRange<(doris::PrimitiveType)2>, doris::ColumnValueRange<(doris::PrimitiveType)19>, doris::ColumnValueRange<(doris::PrimitiveType)28>, doris::ColumnValueRange<(doris::PrimitiveType)29>, doris::ColumnValueRange<(doris::PrimitiveType)30>, doris::ColumnValueRange<(doris::PrimitiveType)35>>>>> const*, std::vector<std::shared_ptr<doris::vectorized::VExprContext>, std::allocator<std::shared_ptr<doris::vectorized::VExprContext>>> const&, doris::TupleDescriptor const*, doris::RowDescriptor const*, std::unordered_map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>, int, std::hash<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>>, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>>, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const, int>>> const*, std::vector<std::shared_ptr<doris::vectorized::VExprContext>, std::allocator<std::shared_ptr<doris::vectorized::VExprContext>>> const*, std::unordered_map<int, std::vector<std::shared_ptr<doris::vectorized::VExprContext>, std::allocator<std::shared_ptr<doris::vectorized::VExprContext>>>, std::hash<int>, std::equal_to<int>, std::allocator<std::pair<int const, std::vector<std::shared_ptr<doris::vectorized::VExprContext>, std::allocator<std::shared_ptr<doris::vectorized::VExprContext>>>>>> const*, std::shared_ptr<doris::vectorized::TableSchemaChangeHelper::Node>, bool) /home/zcp/repo_center/doris_branch-3.1/doris/be/src/vec/exec/format/parquet/vparquet_reader.cpp:318:5 #5 0x55c00cceda85 in doris::PushBrokerReader::_get_next_reader() /home/zcp/repo_center/doris_branch-3.1/doris/be/src/olap/push_handler.cpp:671:39 #6 0x55c00cce72df in doris::PushBrokerReader::next(doris::vectorized::Block*) /home/zcp/repo_center/doris_branch-3.1/doris/be/src/olap/push_handler.cpp:434:9 #7 0x55c00cce27b3 in doris::PushHandler::_convert_v2(std::shared_ptr<doris::Tablet>, std::shared_ptr<doris::Rowset>*, std::shared_ptr<doris::TabletSchema>, doris::PushType) /home/zcp/repo_center/doris_branch-3.1/doris/be/src/olap/push_handler.cpp:293:31 #8 0x55c00ccdd92d in doris::PushHandler::_do_streaming_ingestion(std::shared_ptr<doris::Tablet>, doris::TPushReq const&, doris::PushType, std::vector<doris::TTabletInfo, std::allocator<doris::TTabletInfo>>*) /home/zcp/repo_center/doris_branch-3.1/doris/be/src/olap/push_handler.cpp:199:11 #9 0x55c00ccd96f3 in doris::PushHandler::process_streaming_ingestion(std::shared_ptr<doris::Tablet>, doris::TPushReq const&, doris::PushType, std::vector<doris::TTabletInfo, std::allocator<doris::TTabletInfo>>*) /home/zcp/repo_center/doris_branch-3.1/doris/be/src/olap/push_handler.cpp:94:11 #10 0x55c00cccff9c in doris::EngineBatchLoadTask::_push(doris::TPushReq const&, std::vector<doris::TTabletInfo, std::allocator<doris::TTabletInfo>>*) /home/zcp/repo_center/doris_branch-3.1/doris/be/src/olap/task/engine_batch_load_task.cpp:293:28 #11 0x55c00cccbbc3 in doris::EngineBatchLoadTask::_process() /home/zcp/repo_center/doris_branch-3.1/doris/be/src/olap/task/engine_batch_load_task.cpp:249:18 #12 0x55c00ccc6851 in doris::EngineBatchLoadTask::execute() /home/zcp/repo_center/doris_branch-3.1/doris/be/src/olap/task/engine_batch_load_task.cpp:83:22 #13 0x55c0090293eb in doris::push_callback(doris::StorageEngine&, doris::TAgentTaskRequest const&) /home/zcp/repo_center/doris_branch-3.1/doris/be/src/agent/task_worker_pool.cpp:1880:31 #14 0x55c008ff6554 in doris::PriorTaskWorkerPool::normal_loop() /home/zcp/repo_center/doris_branch-3.1/doris/be/src/agent/task_worker_pool.cpp:687:9 #15 0x55c00db29237 in doris::Thread::supervise_thread(void*) /home/zcp/repo_center/doris_branch-3.1/doris/be/src/util/thread.cpp:498:5 #16 0x7fa0651f2ac2 in start_thread nptl/pthread_create.c:442:8 #17 0x7fa06528484f misc/../sysdeps/unix/sysv/linux/x86_64/clone3.S:81 ```
What problem does this PR solve?
Problem Summary:
Release note
The external part is implemented according to the framework defined by the unified audit log, Doris metrics, and scanbytes and scan rows in Profile in #52232.
However,
jni_readerhas not yet implemented the number of rows that only contain the storage to be read.Check List (For Author)
Test
Behavior changed:
Does this need documentation?
Check List (For Reviewer who merge this PR)