[FEA] Add DELTA_BINARY_PACKED decoding support to Parquet reader #13637
Merged
rapids-bot[bot] merged 85 commits into rapidsai:branch-23.10 on Aug 23, 2023
…eature/delta_binary
Contributor
Pull requests from external contributors require approval from a
Contributor
/ok to test
nvdbaranec suggested changes on Aug 17, 2023
Contributor nvdbaranec left a comment:
Quick first pass. More to come.
Co-authored-by: nvdbaranec <56695930+nvdbaranec@users.noreply.github.com>
nvdbaranec suggested changes on Aug 22, 2023
Co-authored-by: nvdbaranec <56695930+nvdbaranec@users.noreply.github.com>
zhuoxunyi referenced this pull request on Aug 23, 2023
Fixes: #13864
This PR fixes an issue with `loc` indexer where some special handling needs to be done when `columns` is of type `MultiIndex`.
Authors:
- GALI PREM SAGAR (https://github.com/galipremsagar)
Approvers:
- Lawrence Mitchell (https://github.com/wence-)
URL: #13929
nvdbaranec approved these changes on Aug 23, 2023
galipremsagar approved these changes on Aug 23, 2023
Contributor galipremsagar left a comment:
Approving with a suggestion.
Co-authored-by: GALI PREM SAGAR <sagarprem75@gmail.com>
Contributor
/ok to test
Contributor
/merge
rapids-bot bot pushed a commit that referenced this pull request on Sep 13, 2023
#13637 added a static stream pool object for use by the Parquet reader. This PR expands upon that by:
- Moving the stream pool to the `cudf::detail` namespace.
- Adding a debugging implementation that always returns the default stream.
- Hiding implementation details behind a more streamlined interface.
- Using cuda events for synchronization.
Authors:
- Ed Seidl (https://github.com/etseidl)
- Vukasin Milovanovic (https://github.com/vuule)
- Mark Harris (https://github.com/harrism)
Approvers:
- Bradley Dice (https://github.com/bdice)
- Vukasin Milovanovic (https://github.com/vuule)
- Mark Harris (https://github.com/harrism)
URL: #13922
Description
Part of #13501. This adds support for decoding Parquet pages that are DELTA_BINARY_PACKED.
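For readers unfamiliar with the encoding, the following is a minimal CPU-side sketch of DELTA_BINARY_PACKED decoding, based on my reading of the Parquet format specification. It is not the GPU kernel added in this PR. The function names (`read_uleb128`, `zigzag_decode`, `decode_delta_binary_packed`) are hypothetical, and the sketch ignores the fixed-width integer wraparound that a real INT32/INT64 reader would have to handle.

```python
def read_uleb128(buf, pos):
    """Read an unsigned LEB128 varint from buf at pos; return (value, new_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not (byte & 0x80):
            return result, pos
        shift += 7


def zigzag_decode(n):
    """Map a zigzag-encoded unsigned integer back to a signed integer."""
    return (n >> 1) ^ -(n & 1)


def decode_delta_binary_packed(buf):
    """Decode one DELTA_BINARY_PACKED stream into a list of Python ints.

    Assumed layout (from the Parquet spec):
      header: block size, miniblocks per block, total value count, first value
      block:  min delta, one bit-width byte per miniblock, bit-packed miniblocks
    """
    pos = 0
    block_size, pos = read_uleb128(buf, pos)
    miniblocks_per_block, pos = read_uleb128(buf, pos)
    total_count, pos = read_uleb128(buf, pos)
    first, pos = read_uleb128(buf, pos)
    if total_count == 0:
        return []

    values_per_miniblock = block_size // miniblocks_per_block
    values = [zigzag_decode(first)]

    while len(values) < total_count:
        min_delta, pos = read_uleb128(buf, pos)
        min_delta = zigzag_decode(min_delta)
        # Bit-width bytes are present for every miniblock, even unused ones.
        bit_widths = buf[pos:pos + miniblocks_per_block]
        pos += miniblocks_per_block

        for width in bit_widths:
            if len(values) >= total_count:
                break  # trailing miniblocks of the last block carry no data
            nbytes = (values_per_miniblock * width + 7) // 8
            # Values are packed least-significant-bit first.
            packed = int.from_bytes(buf[pos:pos + nbytes], "little")
            pos += nbytes
            mask = (1 << width) - 1
            for i in range(values_per_miniblock):
                if len(values) >= total_count:
                    break  # remaining slots in the miniblock are padding
                delta = ((packed >> (i * width)) & mask) + min_delta
                values.append(values[-1] + delta)
    return values


# Hand-built stream for 7, 5, 3, 1, 2, 3, 4, 5 (block size 8, 1 miniblock),
# following the worked example in the Parquet encoding documentation.
example = bytes([0x08, 0x01, 0x08, 0x0E, 0x03, 0x02, 0xC0, 0x3F])
print(decode_delta_binary_packed(example))
```

Each value depends on the one before it, so a sequential loop like this is the simplest way to express the decode. The sketch is only meant to show the data layout, not how the decode is organized on the GPU.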
In addition to adding delta support, this PR incorporates changes introduced in #13622, such as using a mask to determine which decoding kernels to run, and adding parameters to the `page_state_buffers_s` struct to reduce the amount of shared memory used.
Checklist