ORC-1525: Fix bad read in RleDecoderV2::readByte
#1645
Closed
This PR aims to fix apache#1640 by resetting `BooleanRleEncoderImpl::current` and `BooleanRleEncoderImpl::bitsRemained` when the present stream is suppressed.

As described in apache#1640, suppressing a present stream that contains no nulls leaves stale data in `BooleanRleEncoderImpl::current` and `BooleanRleEncoderImpl::bitsRemained`. That stale data is then flushed into the next stripe's present stream if that stripe has null values.

I have added a test, testSuppressPresentStreamInPreStripe, which constructs an ORC file with two stripes: the first stripe has no null values and the second stripe has some null values. While writing this file, the writer holds stale data in BooleanRleEncoderImpl for the present stream. The test checks that the file reads back successfully and that the values read match the values written.

Closes apache#1640.
Closes apache#1641 from hoffermei/present_supress_bugfix.
Lead-authored-by: hoffermei <meihaifeng.hust@gmail.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
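The stale-state problem described in this PR can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions: `ToyBooleanEncoder` and `secondStripePresentBytes` are hypothetical names, the class is not the real `BooleanRleEncoderImpl`, and it packs bits MSB-first without any run-length encoding. It keeps a partial byte (`current_`) and a free-bit counter (`bitsRemained_`) across stripes, and `suppress` takes a flag so the behavior with and without the reset can be compared.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy stand-in for a boolean bit packer. NOT the real ORC class; the names
// and the suppress(bool) signature are assumptions made for illustration.
class ToyBooleanEncoder {
 public:
  // Pack one "present" bit (true = value is not null), MSB first.
  void add(bool present) {
    --bitsRemained_;
    if (present) {
      current_ = static_cast<uint8_t>(current_ | (1u << bitsRemained_));
    }
    if (bitsRemained_ == 0) {
      emitCurrent();
    }
  }

  // End of stripe, stream kept: write any partial byte and hand back the bytes.
  std::vector<uint8_t> flush() {
    if (bitsRemained_ != 8) {
      emitCurrent();
    }
    std::vector<uint8_t> out;
    out.swap(buffer_);
    return out;
  }

  // End of stripe, stream dropped because the stripe had no nulls.
  // With resetBitState == false the partial byte survives into the next stripe.
  void suppress(bool resetBitState) {
    buffer_.clear();
    if (resetBitState) {
      current_ = 0;
      bitsRemained_ = 8;
    }
  }

 private:
  void emitCurrent() {
    buffer_.push_back(current_);
    current_ = 0;
    bitsRemained_ = 8;
  }

  std::vector<uint8_t> buffer_;
  uint8_t current_ = 0;   // partially filled byte
  int bitsRemained_ = 8;  // free bits left in current_
};

// Stripe 1: three rows, no nulls (stream suppressed).
// Stripe 2: present, null, present (stream flushed).
std::vector<uint8_t> secondStripePresentBytes(bool resetOnSuppress) {
  ToyBooleanEncoder enc;
  for (int i = 0; i < 3; ++i) {
    enc.add(true);
  }
  enc.suppress(resetOnSuppress);
  enc.add(true);
  enc.add(false);
  enc.add(true);
  return enc.flush();
}
```

In this model, `secondStripePresentBytes(true)` yields the single byte `0xA0` (bits `101` followed by padding), while `secondStripePresentBytes(false)` yields `0xF4`, whose leading bits `111` come from the first stripe, so a reader would see the wrong null pattern for the second stripe.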
@hoffermei @dongjoon-hyun PTAL
guiyanakuang
approved these changes
Nov 2, 2023
Member
guiyanakuang
left a comment
LGTM
dongjoon-hyun
approved these changes
Nov 2, 2023
Member
dongjoon-hyun
left a comment
+1, LGTM. Thank you so much!
dongjoon-hyun pushed a commit that referenced this pull request Nov 2, 2023
### What changes were proposed in this pull request?
This PR aims to fix #1640 by resetting `BooleanRleEncoderImpl::current` and `BooleanRleEncoderImpl::bitsRemained` when the present stream is suppressed.
### Why are the changes needed?
As described in #1640, suppressing a present stream that contains no nulls leaves stale data in `BooleanRleEncoderImpl::current` and `BooleanRleEncoderImpl::bitsRemained`. That stale data is then flushed into the next stripe's present stream if that stripe has null values.
### How was this patch tested?
I have added a test, testSuppressPresentStreamInPreStripe, which constructs an ORC file with two stripes: the first stripe has no null values and the second stripe has some null values. While writing this file, the writer holds stale data in BooleanRleEncoderImpl for the present stream. The test checks that the file reads back successfully and that the values read match the values written.

Closes #1640.
Closes #1645 from wgtmac/branch-1.8.
Authored-by: hoffermei <meihaifeng.hust@gmail.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
Merged to branch-1.8.
morningman pushed a commit to apache/doris that referenced this pull request Jun 18, 2025
… Decompress zlib by libdeflate. (#51775)
### What problem does this PR solve?
Problem Summary:
### Release note
1. Cherry-pick ORC-1525 to fix bad read in RleDecoderV2::readByte.
   - third-party PR: apache/doris-thirdparty#322
   - orc issue: https://issues.apache.org/jira/browse/ORC-1525
   - orc PR: apache/orc#1645
2. Decompress zlib by libdeflate.