[C++][Parquet] Parquet binary length overflow exception should contain the length of binary #39843

@mapleFU

Describe the enhancement requested

  template <typename ArrayType>
  void PutBinaryArray(const ArrayType& array) {
    PARQUET_THROW_NOT_OK(::arrow::VisitArraySpanInline<typename ArrayType::TypeClass>(
        *array.data(),
        [&](::std::string_view view) {
          if (ARROW_PREDICT_FALSE(view.size() > kMaxByteArraySize)) {
            return Status::Invalid("Parquet cannot store strings with size 2GB or more");
          }
          PutByteArray(view.data(), static_cast<uint32_t>(view.size()));
          return Status::OK();
        },
        []() { return Status::OK(); }));
  }

The code above works well. However, I think the error message should also include the length of the view, which would help with debugging. This matters most when chasing memory problems, where a bad string might point at uninitialized memory and report a nonsensical length.
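A minimal, standalone sketch of the requested behavior is shown below. It does not use Arrow: `Status` here is a hypothetical stand-in for `arrow::Status`, `CheckByteArrayLength` is an illustrative helper that does not exist in the codebase, and the value of `kMaxByteArraySize` is assumed to be the largest signed 32-bit integer. The only point is that the offending length ends up in the message.

```cpp
#include <cstdint>
#include <limits>
#include <string>
#include <utility>

// Hypothetical stand-in for arrow::Status so the sketch compiles on its own.
struct Status {
  bool ok;
  std::string message;

  static Status OK() { return {true, ""}; }
  static Status Invalid(std::string msg) { return {false, std::move(msg)}; }
};

// Assumed limit: the largest length that fits in a signed 32-bit integer.
constexpr uint64_t kMaxByteArraySize =
    static_cast<uint64_t>(std::numeric_limits<int32_t>::max());

// Illustrative helper (not part of Arrow): validates a binary value's length
// and reports the actual length when it exceeds the limit.
Status CheckByteArrayLength(uint64_t length) {
  if (length > kMaxByteArraySize) {
    return Status::Invalid(
        "Parquet cannot store strings with size 2GB or more, got: " +
        std::to_string(length));
  }
  return Status::OK();
}
```

With this helper, `CheckByteArrayLength(view.size())` on an oversized view would return a message ending in the real length, so a value such as 18446744073709551615 immediately suggests corrupted or uninitialized memory rather than a genuinely large string. In the actual `PutBinaryArray` lambda, the equivalent change would presumably be to pass `view.size()` into the existing `Status::Invalid` call.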

Component(s)

C++, Parquet
