@github-actions

Cherry-picked from #55795

…roupReader (#55795)

### What problem does this PR solve?

Currently, the Parquet reader does not pass the offset index of
non-predicate columns to `RowGroupReader`.
As a result, `create_page_reader` creates a `PageReader` rather than a
`PageReaderWithOffsetIndex`.
The `PageReader` parses every page header during `skip_page` and
`next_page_header`. With merged IO, this causes severe read
amplification.

This PR passes the offset index of these columns to `RowGroupReader`.
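
To illustrate why the offset index matters, here is a minimal sketch of the two ways of reaching a page. It is a simplified model, not Doris code: `Page`, `PageLocation`, `SeekResult`, `build_offset_index`, `seek_without_index`, and `seek_with_index` are hypothetical names, and the page sizes are made up. `PageLocation` is only loosely modeled on the entries of Parquet's offset index.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model of one data page inside a column chunk.
struct Page {
    std::size_t header_size;   // bytes occupied by the serialized page header
    std::size_t payload_size;  // bytes occupied by the page data
};

// Hypothetical offset index entry: the byte offset at which a page starts.
struct PageLocation {
    std::size_t offset;
};

// Result of positioning a reader at the start of a page.
struct SeekResult {
    std::size_t position;        // byte offset reached
    std::size_t headers_parsed;  // page headers decoded on the way
};

// Computes page start offsets from the page sizes. In a real Parquet file
// the index is read from the file rather than computed like this.
std::vector<PageLocation> build_offset_index(const std::vector<Page>& pages) {
    std::vector<PageLocation> index;
    index.reserve(pages.size());
    std::size_t offset = 0;
    for (const Page& page : pages) {
        index.push_back({offset});
        offset += page.header_size + page.payload_size;
    }
    return index;
}

// Without an index, the size of each skipped page is only known after its
// header has been read and parsed, so every preceding header is touched.
SeekResult seek_without_index(const std::vector<Page>& pages, std::size_t target) {
    assert(target < pages.size());
    SeekResult result{0, 0};
    for (std::size_t i = 0; i < target; ++i) {
        ++result.headers_parsed;
        result.position += pages[i].header_size + pages[i].payload_size;
    }
    return result;
}

// With an index, the reader jumps straight to the recorded offset and
// parses no header of the skipped pages.
SeekResult seek_with_index(const std::vector<PageLocation>& index, std::size_t target) {
    assert(target < index.size());
    return SeekResult{index[target].offset, 0};
}
```

Both functions land on the same byte offset; the difference is that the index-free path has to read and decode one header per skipped page, which is the extra work described above.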

Co-authored-by: liutang123 <liulijia@gmail.com>
@github-actions github-actions bot requested a review from yiguolei as a code owner October 23, 2025 07:35
@Thearas

Thearas commented Oct 23, 2025

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@dataroaring dataroaring reopened this Oct 23, 2025
@Thearas

Thearas commented Oct 23, 2025

run buildall

@doris-robot

ClickBench: Total hot run time: 30.72 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 6ef3e992c1439b7ed7d8da4290a53c6cda496075, data reload: true

query	cold (s)	hot 1 (s)	hot 2 (s)
query1	0.04	0.03	0.03
query2	0.12	0.05	0.04
query3	0.25	0.07	0.07
query4	1.64	0.11	0.11
query5	0.28	0.26	0.26
query6	1.20	0.65	0.64
query7	0.03	0.02	0.02
query8	0.06	0.04	0.04
query9	0.62	0.52	0.53
query10	0.59	0.59	0.59
query11	0.16	0.11	0.12
query12	0.15	0.12	0.12
query13	0.64	0.61	0.61
query14	0.81	0.86	0.84
query15	0.91	0.87	0.85
query16	0.41	0.39	0.38
query17	1.07	1.09	1.09
query18	0.19	0.19	0.19
query19	1.93	1.87	1.86
query20	0.02	0.01	0.02
query21	15.42	0.95	0.60
query22	0.76	1.26	0.78
query23	14.94	1.40	0.64
query24	6.95	1.41	0.62
query25	0.34	0.09	0.09
query26	0.69	0.17	0.13
query27	0.05	0.05	0.04
query28	9.87	1.38	0.95
query29	12.62	3.93	3.31
query30	0.28	0.12	0.11
query31	2.90	0.65	0.40
query32	3.45	0.58	0.49
query33	3.23	3.23	3.24
query34	17.04	5.82	5.06
query35	5.21	5.12	5.04
query36	0.69	0.51	0.49
query37	0.10	0.06	0.07
query38	0.09	0.04	0.04
query39	0.03	0.03	0.03
query40	0.17	0.14	0.14
query41	0.09	0.03	0.03
query42	0.04	0.03	0.03
query43	0.04	0.03	0.04
Total cold run time: 106.12 s
Total hot run time: 30.72 s

@yiguolei yiguolei merged commit 21a9e98 into branch-4.0 Oct 24, 2025
24 of 27 checks passed
@github-actions github-actions bot deleted the auto-pick-55795-branch-4.0 branch October 24, 2025 04:01

6 participants