Skip to content

feat(parquet): Add support for Page Indexes#223

Merged
zeroshade merged 6 commits intoapache:mainfrom
zeroshade:page-index-parquet
Jan 9, 2025
Merged

feat(parquet): Add support for Page Indexes#223
zeroshade merged 6 commits intoapache:mainfrom
zeroshade:page-index-parquet

Conversation

@zeroshade
Copy link
Copy Markdown
Member

Rationale for this change

Adds support for writing and retrieving Page Indexes from parquet files

What changes are included in this PR?

Added new writer properties to enable writing PageIndexes either as default or for specific columns.
Added RowGroupPageIndexReader, ColumnIndexReader, OffsetIndexReader to access page index information.

Are these changes tested?

Yes, extensive unit tests have been added including round trip tests and encryption/decryption tests.

Are there any user-facing changes?

no. Just new functions available.

closes #33

Copy link
Copy Markdown
Member

@kou kou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Sorry, I couldn't review this carefully... I took a look at all changes but I don't have enough knowledge of the Parquet specification and Go. And this change is too large for me... I hope that others can review this.)

@zeroshade zeroshade requested a review from pitrou December 23, 2024 22:19
@zeroshade
Copy link
Copy Markdown
Member Author

Thanks for taking a look at all @kou!

@wgtmac and @pitrou I specifically tagged you both for review here given your Parquet experience hoping that one or both of you would be able to take a look at this from that perspective. Much of this is based on the way that Parquet C++ handles Page Indexes

@zeroshade
Copy link
Copy Markdown
Member Author

I'd like to get this in before I kick off the next release in the next week or so. Would anyone be free to do a review soon?

@zeroshade
Copy link
Copy Markdown
Member Author

Thanks for the review @lidavidm!!

@zeroshade zeroshade merged commit 365a84a into apache:main Jan 9, 2025
@zeroshade zeroshade deleted the page-index-parquet branch January 9, 2025 19:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Go][Parquet] Does this implementation support page indexes?

3 participants