-
Notifications
You must be signed in to change notification settings - Fork 4k
Description
Describe the bug, including details regarding any error messages, version, and platform.
After this resize
arrow/cpp/src/arrow/io/buffered.cc
Lines 328 to 332 in 57cb172
| // Increase the buffer size if needed. | |
| if (nbytes > buffer_->size() - buffer_pos_) { | |
| RETURN_NOT_OK(SetBufferSize(nbytes + buffer_pos_)); | |
| DCHECK(buffer_->size() - buffer_pos_ >= nbytes); | |
| } |
It's not guaranteed that buffer_->size() - buffer_pos_ >= nbytes because of the special case here:
arrow/cpp/src/arrow/io/buffered.cc
Lines 302 to 306 in 57cb172
| if (bytes_buffered_ == 0) { | |
| // Special case: we can not keep the current buffer because it does not | |
| // contain any required data. | |
| new_buffer_size = std::min(new_buffer_size, raw_read_bound_ - raw_read_total_); | |
| } else { |
That code path assumes that buffer_pos_ will be reset to zero after the resize, but it is not reset. As a result, in this code:
arrow/cpp/src/arrow/io/buffered.cc
Lines 340 to 343 in 57cb172
| ARROW_ASSIGN_OR_RAISE( | |
| int64_t bytes_read, | |
| raw_->Read(additional_bytes_to_read, | |
| buffer_->mutable_data() + buffer_pos_ + bytes_buffered_)); |
we write to buffer_->mutable_data() + buffer_pos_ + bytes_buffered_, which can access memory beyond the end of the buffer in some cases
I have a repro, but it requires a very large file and a significant amount of internal code, so I cannot post it.
Such diff fixed the issue for me main...chegoryu:arrow:fix-buffered-io
Component(s)
C++