Description
Hi! Not really an issue, just a query aimed at avoiding duplicated work. Our use case for distributed involves keys whose values are often arbitrarily nested lists or dicts of numpy arrays (lists of dicts of lists of arrays, and so on). Unfortunately the current serialization scheme, while fantastic for numpy arrays not embedded in other objects, does not handle these nested structures particularly well.
Now that numpy 1.16 is out, pickle protocol 5 and its backport (https://github.com/pitrou/pickle5-backport) would seem to provide a very elegant way to communicate these kinds of structures efficiently: the large data arrays are passed as out-of-band data that doesn't need to be embedded in the pickle bytestream.
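To make the idea concrete, here is a minimal sketch of out-of-band pickling for a nested structure. It assumes Python 3.8+ (where protocol 5 is in the standard library `pickle`; on older versions the backport would be imported instead) and numpy >= 1.16. The names `nested`, `header` and `buffers` are just illustrative, and this is not a claim about how distributed would wire it in.

```python
import pickle

import numpy as np

# A nested structure of the kind described above: dicts and lists of arrays.
nested = {
    "a": [np.arange(1000, dtype=np.float64), np.ones((10, 10))],
    "b": {"c": np.zeros(500, dtype=np.int32)},
}

# With protocol 5, each contiguous array hands its memory to buffer_callback
# as a PickleBuffer instead of copying it into the pickle bytestream.
buffers = []
header = pickle.dumps(nested, protocol=5, buffer_callback=buffers.append)

# `header` now holds only the structure and array metadata; the array data
# lives in `buffers` and could be sent as separate frames.
restored = pickle.loads(header, buffers=buffers)
```

The receiving side has to supply the buffers in the same order they were produced, so a transport built on this would need to keep the frames ordered.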
Is work already underway on some remote branch to integrate this into distributed? If not, would such a PR be welcome, or should we wait for more experienced hands to tackle it?
Thanks in advance!