Skip to content

GH-15173: [C++][Parquet] Fixing ByteStreamSplit Standard broken#34140

Merged
wjones127 merged 3 commits intoapache:mainfrom
mapleFU:parquet/fix-byte-stream-split
Feb 17, 2023
Merged

GH-15173: [C++][Parquet] Fixing ByteStreamSplit Standard broken#34140
wjones127 merged 3 commits intoapache:mainfrom
mapleFU:parquet/fix-byte-stream-split

Conversation

@mapleFU
Copy link
Copy Markdown
Member

@mapleFU mapleFU commented Feb 11, 2023

Rationale for this change

This patch fixes bugs in BYTE_STREAM_SPLIT. It enforce the parquet ByteStreamSplit not to use padding, and force using sizeof(data) / sizeof(T) as num-of-values.

What changes are included in this PR?

  1. Update num-of-values in ByteStreamSplit
  2. Add testing

Are these changes tested?

Yes

Are there any user-facing changes?

No

@mapleFU mapleFU requested a review from wjones127 as a code owner February 11, 2023 08:15
@github-actions
Copy link
Copy Markdown

@mapleFU
Copy link
Copy Markdown
Member Author

mapleFU commented Feb 11, 2023

@pitrou @wgtmac @wjones127 PTAL

@mapleFU mapleFU force-pushed the parquet/fix-byte-stream-split branch from 7fff3fb to 3e6550c Compare February 11, 2023 08:34
Copy link
Copy Markdown
Member

@wjones127 wjones127 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Just one suggestion on the exception description.

@mapleFU
Copy link
Copy Markdown
Member Author

mapleFU commented Feb 17, 2023

Comment fixed

@wjones127 wjones127 merged commit c31fb46 into apache:main Feb 17, 2023
@mapleFU mapleFU deleted the parquet/fix-byte-stream-split branch February 17, 2023 16:12
@ursabot
Copy link
Copy Markdown

ursabot commented Feb 18, 2023

Benchmark runs are scheduled for baseline = 8e5e438 and contender = c31fb46. c31fb46 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Finished ⬇️0.0% ⬆️0.0%] ec2-t3-xlarge-us-east-2
[Failed ⬇️0.46% ⬆️0.0%] test-mac-arm
[Finished ⬇️0.26% ⬆️0.0%] ursa-i9-9960x
[Finished ⬇️0.22% ⬆️0.03%] ursa-thinkcentre-m75q
Buildkite builds:
[Finished] c31fb465 ec2-t3-xlarge-us-east-2
[Failed] c31fb465 test-mac-arm
[Finished] c31fb465 ursa-i9-9960x
[Finished] c31fb465 ursa-thinkcentre-m75q
[Finished] 8e5e438d ec2-t3-xlarge-us-east-2
[Failed] 8e5e438d test-mac-arm
[Finished] 8e5e438d ursa-i9-9960x
[Finished] 8e5e438d ursa-thinkcentre-m75q
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

pitrou pushed a commit that referenced this pull request May 7, 2024
…amSplitDecoder (#41565)

### Rationale for this change

This problem is raised from  #40094 . Original bug fixed here: #34140 , but this is corrupt in #40094 .

### What changes are included in this PR?

Refine checking

### Are these changes tested?

* [x] Will add

### Are there any user-facing changes?

Bugfix

* GitHub Issue: #41562

Authored-by: mwish <maplewish117@gmail.com>
Signed-off-by: Antoine Pitrou <antoine@python.org>
raulcd pushed a commit that referenced this pull request May 8, 2024
…amSplitDecoder (#41565)

### Rationale for this change

This problem is raised from  #40094 . Original bug fixed here: #34140 , but this is corrupt in #40094 .

### What changes are included in this PR?

Refine checking

### Are these changes tested?

* [x] Will add

### Are there any user-facing changes?

Bugfix

* GitHub Issue: #41562

Authored-by: mwish <maplewish117@gmail.com>
Signed-off-by: Antoine Pitrou <antoine@python.org>
vibhatha pushed a commit to vibhatha/arrow that referenced this pull request May 25, 2024
…teStreamSplitDecoder (apache#41565)

### Rationale for this change

This problem is raised from  apache#40094 . Original bug fixed here: apache#34140 , but this is corrupt in apache#40094 .

### What changes are included in this PR?

Refine checking

### Are these changes tested?

* [x] Will add

### Are there any user-facing changes?

Bugfix

* GitHub Issue: apache#41562

Authored-by: mwish <maplewish117@gmail.com>
Signed-off-by: Antoine Pitrou <antoine@python.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Parquet][C++] ByteStreamSplitDecoder broken in presence of nulls

3 participants