[DISCUSSION] Can the scheduler use pickle.dumps? #4890

@rjzamora

Description
This question/discussion is related to the ongoing HighLevelGraph (HLG) work in dask/dask and dask/distributed (and the clear need for documentation).

In order to establish a formal convention for implementing Layer objects, it is essential that we also establish a clear specification for task serialization in Layer.__dask_distributed_pack__ and __dask_distributed_unpack__. That is, we need a clear guideline on what must be serialized on the client, and what may be serialized on the scheduler (when the full graph is being materialized). As pointed out in dask/dask#7650, the existing Layer implementations approach task serialization in a variety of ways, which may be confusing to would-be HLG-Layer contributors.

In order to design and apply a consistent HLG-serialization approach, we need to decisively answer the question: can the scheduler use any serialization logic that leverages pickle? (Note that it is already clear that the scheduler may not un-pickle anything, but it is not clear whether the scheduler can rely on pickle at all.)
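To make the asymmetry in the question concrete, here is a minimal sketch (not actual Dask internals; `add` and the tuple task form are illustrative) of why pickling and unpickling are treated differently: `pickle.loads` can execute arbitrary code via an object's `__reduce__` hook, which is why the scheduler must never unpickle untrusted payloads, while `pickle.dumps` only produces bytes — though serializing objects with custom `__reduce__` methods still runs user code, which is part of what this discussion needs to settle.

```python
import pickle

def add(x, y):
    return x + y

# A Dask-style task: a tuple of (callable, *args).
task = (add, 1, 2)

# Packing: pickle.dumps serializes the task to bytes. It does not
# call the task's function, which is the basis for the argument that
# the scheduler *might* be allowed to do this step.
packed = pickle.dumps(task)

# Unpacking: pickle.loads can trigger arbitrary code execution for
# crafted payloads, so it must only happen on a trusted process
# (here, the same process, purely for illustration).
func, *args = pickle.loads(packed)
result = func(*args)
```

Whether the `dumps` side is acceptable on the scheduler is exactly the open question, since graph values with custom reducers blur the "serialize only" guarantee.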

cc @madsbk @mrocklin @gjoseph92
