Feature Description
The operator is stateless, so there is no reason to prevent its use in Python's multiprocessing environment.
Problem and Solution
Python uses the pickle format to serialize and deserialize objects when passing them between processes. Native Python objects are picklable out of the box. However, for a type defined in an extension, we must implement pickle support manually.
There are two hooks for this purpose: __getstate__ and __setstate__.
However, implementing these two methods means that an Operator must expose all the information needed to reconstruct it. Currently, the public interface does not expose that information.
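To make the requirement concrete, here is a minimal pure-Python sketch of the two hooks. It is not the actual binding code: the Operator class below is a hypothetical stand-in, and the scheme and options fields, as well as the _connect helper, are assumptions about what would be needed to rebuild the native handle.

```python
import pickle


class Operator:
    """Hypothetical pure-Python stand-in for the extension class."""

    def __init__(self, scheme, **options):
        # Assumed construction info: a scheme name plus keyword options.
        self._scheme = scheme
        self._options = dict(options)
        self._handle = self._connect()

    def _connect(self):
        # Placeholder for the native handle, which cannot be pickled itself.
        return object()

    def __getstate__(self):
        # Expose only what is needed to rebuild the operator.
        return {"scheme": self._scheme, "options": self._options}

    def __setstate__(self, state):
        # Rebuild the handle in the receiving process from the saved state.
        self._scheme = state["scheme"]
        self._options = dict(state["options"])
        self._handle = self._connect()


original = Operator("fs", root="/tmp")
restored = pickle.loads(pickle.dumps(original))
```

The point of the sketch is that the pickled state contains only construction parameters, never the handle, which is why the operator has to retain (or be able to report) whatever it was built from.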
@Zheaoli Do you have any ideas on how to achieve this?
Additional Context
Here are examples in polars:
https://github.com/pola-rs/polars/blob/18786acd8d1eb68fc87982b07ce29ecbae0923f0/crates/polars-python/src/lazyframe/serde.rs#L16-L36
Are you willing to contribute to the development of this feature?