[Data] Failure when converting to Arrow from TensorArray #59445

@alexeykudinkin

Description

What happened + What you expected to happen

Context

  • Pandas uses custom extension types (such as Int64Dtype) to support nullable values.
  • Ray Data's TensorArray represents tensors as NumPy ndarrays and therefore can't represent nullable numeric values.
  • When a Pandas Series with such a nullable dtype is converted into a NumPy tensor, the result is a tensor with dtype="object".
  • When we then try to convert it into an Arrow tensor, the call to pyarrow.from_numpy_dtype fails with "Unsupported numpy type 17".

Versions / Dependencies

2.52

Reproduction script

NA

Issue Severity

None

Metadata

Labels

  • P0: Issues that should be fixed in short order
  • data: Ray Data-related issues

