[bugfix](be_metrics) update scan bytes metric correctly #52232
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
run buildall
BE UT Coverage Report: Increment line coverage, Increment coverage report
BE Regression && UT Coverage Report: Increment line coverage, Increment coverage report
run buildall
TPC-H: Total hot run time: 33866 ms
TPC-DS: Total hot run time: 185065 ms
ClickBench: Total hot run time: 29.06 s
PR approved by at least one committer and no changes requested.
PR approved by anyone and no changes requested.
1. Unify the definitions of scan bytes and scan rows across the audit log, Doris metrics, and the profile. Scan bytes is the uncompressed data read from the file plus the data read from the page cache. Scan rows is the number of raw rows read from the file.
2. Add an interface to the scanner for updating realtime counters, so that other scanner types can implement it and report these counters in real time.
3. Fix the uncompressed data size recorded in page IO, which was wrong.
4. Fix the CPU timer counter, which was not updated correctly.

DO NOT merge this PR to 3.0 or 3.1, because it changes the behavior of scan bytes. If a user has configured a workload group policy or a monitor on the scan bytes metric, it may no longer behave as expected.
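The definitions and the realtime counter interface described above can be illustrated with a minimal sketch. Every type and method name here (`ScanCounters`, `RealtimeSink`, `ExampleScanner`, `on_page_read`, `on_rows_read`) is hypothetical and chosen for illustration; only the name `update_realtime_counters` appears in the PR discussion, and its real signature may differ. The sketch assumes the scanner reports deltas since its last report, so that calling the update repeatedly does not double count.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical names throughout; this is not the actual Doris BE code.
struct ScanCounters {
    int64_t uncompressed_bytes_from_file = 0;
    int64_t uncompressed_bytes_from_page_cache = 0;
    int64_t raw_rows_read = 0;

    // Scan bytes: uncompressed data read from the file plus the page cache.
    int64_t scan_bytes() const {
        return uncompressed_bytes_from_file + uncompressed_bytes_from_page_cache;
    }
    // Scan rows: raw rows read from the file.
    int64_t scan_rows() const { return raw_rows_read; }
};

// Stand-in for whatever consumes the counters (metrics, audit log, profile).
struct RealtimeSink {
    int64_t scan_bytes = 0;
    int64_t scan_rows = 0;
};

// Interface that each scanner type could implement.
class Scanner {
public:
    virtual ~Scanner() = default;
    virtual void update_realtime_counters() = 0;
};

class ExampleScanner : public Scanner {
public:
    explicit ExampleScanner(RealtimeSink* sink) : _sink(sink) { assert(sink != nullptr); }

    void on_page_read(int64_t uncompressed_bytes, bool from_page_cache) {
        assert(uncompressed_bytes >= 0);
        if (from_page_cache) {
            _counters.uncompressed_bytes_from_page_cache += uncompressed_bytes;
        } else {
            _counters.uncompressed_bytes_from_file += uncompressed_bytes;
        }
    }

    void on_rows_read(int64_t rows) {
        assert(rows >= 0);
        _counters.raw_rows_read += rows;
    }

    // Push only the change since the previous call to the sink.
    void update_realtime_counters() override {
        const int64_t bytes = _counters.scan_bytes();
        const int64_t rows = _counters.scan_rows();
        _sink->scan_bytes += bytes - _reported_bytes;
        _sink->scan_rows += rows - _reported_rows;
        _reported_bytes = bytes;
        _reported_rows = rows;
    }

private:
    RealtimeSink* _sink;
    ScanCounters _counters;
    int64_t _reported_bytes = 0;
    int64_t _reported_rows = 0;
};
```

Under these assumptions, a page served from the page cache contributes to scan bytes in the same way as a page read from the file, which is the behavior change that affects existing monitors and workload group policies.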
…53729)

### What problem does this PR solve?

Problem Summary:

The external-table part follows the framework defined in #52232, which unified the definitions of scan bytes and scan rows across the audit log, Doris metrics, and the profile. However:

- Scan bytes for external tables currently represents the bytes counted by **the top-level File Reader** called by the scan reader layer.
- Scan rows represents the number of rows scanned from the underlying storage. For parquet/orc, this does not include the rows of skipped pages or row groups.

**Note: `jni_reader` does not yet count only the rows that are actually read from storage.**
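The scan rows rule for parquet/orc described above can be sketched as follows. The `RowGroup` struct and `count_scan_rows` function are hypothetical and only illustrate the counting rule; they are not the reader's actual types, and the real readers may also skip at page granularity.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical description of one row group and whether the reader skipped it
// (for example, because min/max statistics ruled it out).
struct RowGroup {
    int64_t num_rows;
    bool skipped;
};

// Scan rows counts only the rows of row groups that were actually read.
inline int64_t count_scan_rows(const std::vector<RowGroup>& groups) {
    int64_t rows = 0;
    for (const RowGroup& g : groups) {
        assert(g.num_rows >= 0);
        if (!g.skipped) {
            rows += g.num_rows;
        }
    }
    return rows;
}
```

For instance, with three row groups of 1000, 1000, and 500 rows where the second is skipped, this rule yields 1500 scan rows.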
Forgot to update the profile counters in `update_realtime_counters`. Bug introduced in apache#52232.
Signed-off-by: freemandealer <freeman.zhang1992@gmail.com>
…55929) Forgot to update the profile counters in `update_realtime_counters`. Bug introduced in apache#52232.
What problem does this PR solve?
DO NOT merge this PR to 3.0, because it changes the behavior of scan bytes. If a user has configured a workload group policy or a monitor on the scan bytes metric, it may no longer behave as expected.
Release note
None
Check List (For Author)
Test
Behavior changed:
Does this need documentation?
Check List (For Reviewer who merge this PR)