[Data] Support general Arrow ExtensionTypes #51959
Description
Currently, (Py)Arrow extension types are not generally supported.
For example, the built-in FixedShapeTensorType is not supported, because it extends BaseExtensionType rather than ExtensionType. Moreover, the deserialization logic does not work for all arrays.
For example, if an ExtensionArray has a FixedSizeListType as storage (as is the case with the built-in FixedShapeTensorType), the payload is deserialized so that its child has a scalar Array type, and neither pa.Array.from_buffers nor pa.ExtensionArray.from_storage works on the result.
One could imagine a plugin system in which users can register their own ExtensionTypes with custom (de)serialization logic.
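One possible shape for such a plugin system, as a rough sketch only. Every name here (`ExtensionCodec`, `register_extension_codec`, `get_extension_codec`, and the `my_project.tensor` extension name) is hypothetical and not part of Ray or PyArrow, and JSON stands in for whatever payload format a real implementation would use.

```python
import json
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass(frozen=True)
class ExtensionCodec:
    """Hypothetical pair of user-supplied (de)serialization callables."""
    serialize: Callable[[Any], bytes]
    deserialize: Callable[[bytes], Any]


# Hypothetical registry, keyed by the extension type's name.
_CODECS: Dict[str, ExtensionCodec] = {}


def register_extension_codec(extension_name: str, codec: ExtensionCodec) -> None:
    """Register a codec; refuse to silently overwrite an existing one."""
    if extension_name in _CODECS:
        raise ValueError(f"codec already registered for {extension_name!r}")
    _CODECS[extension_name] = codec


def get_extension_codec(extension_name: str) -> ExtensionCodec:
    """Look up the codec for an extension type, failing loudly if absent."""
    try:
        return _CODECS[extension_name]
    except KeyError:
        raise LookupError(f"no codec registered for {extension_name!r}") from None


# Usage: a user registers a codec for their own tensor type.
# Nested lists stand in for the real array payload.
register_extension_codec(
    "my_project.tensor",
    ExtensionCodec(
        serialize=lambda value: json.dumps(value).encode("utf-8"),
        deserialize=lambda payload: json.loads(payload.decode("utf-8")),
    ),
)

codec = get_extension_codec("my_project.tensor")
payload = codec.serialize([[1, 2], [3, 4]])
restored = codec.deserialize(payload)
```

Keying the registry by extension name would let the framework dispatch on the name stored with the serialized data, without importing user code until a matching type is encountered.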
Use case
I have a pre-existing code base that heavily uses tensor types that differ from the types in ray/air/util/tensor_extensions/arrow.py. This makes adopting Ray in that code base difficult.