[C++] Regression in PlainBooleanDecoder::DecodeArrow #41032
Closed
Labels
Component: C++, Component: Parquet, Priority: Blocker (marks a blocker for the release), Type: bug
Description
Describe the bug, including details regarding any error messages, version, and platform.
While looking through another PR, I noticed that we recently introduced a bug in PlainBooleanDecoder::DecodeArrow.
Apparently the tests are not thorough enough to detect the issue.
A possible fix is the following patch:
diff --git a/cpp/src/parquet/encoding.cc b/cpp/src/parquet/encoding.cc
index 6e93b49339..a6e60aa012 100644
--- a/cpp/src/parquet/encoding.cc
+++ b/cpp/src/parquet/encoding.cc
@@ -1208,7 +1208,7 @@ int PlainBooleanDecoder::DecodeArrow(
BitBlockCounter bit_counter(valid_bits, valid_bits_offset, num_values);
int64_t value_position = 0;
int64_t valid_bits_offset_position = valid_bits_offset;
- int64_t previous_value_offset = 0;
+ int64_t previous_value_offset = total_num_values_ - num_values_;
while (value_position < num_values) {
auto block = bit_counter.NextWord();
if (block.AllSet()) {
@@ -1224,8 +1224,7 @@ int PlainBooleanDecoder::DecodeArrow(
} else {
for (int64_t i = 0; i < block.length; ++i) {
if (bit_util::GetBit(valid_bits, valid_bits_offset_position + i)) {
- bool value = bit_util::GetBit(
- data_, total_num_values_ - num_values_ + previous_value_offset);
+ bool value = bit_util::GetBit(data_, previous_value_offset);
builder->UnsafeAppend(value);
previous_value_offset += 1;
} else {
Component(s)
C++, Parquet