Key and Task classes

I think it would be worth adding optional lightweight classes to represent keys and tasks in a dask graph. These would complement the existing `dask.core.quote` for literals.

This would make intent much clearer when creating dask graphs, and allow better error messages when things go wrong (e.g., for #2298), because dask would know unambiguously what an object is intended to represent instead of having to guess. For example, if a key is not found, dask could raise an error instead of silently treating it as a literal.

These could be simple `tuple` subclasses, e.g.,
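```python
class Key(tuple):
    __slots__ = ()

    def __new__(cls, *args):
        # Use cls rather than Key so that subclasses construct correctly
        return tuple.__new__(cls, args)

    def __repr__(self):
        contents = repr(tuple(self))
        if len(self) == 1:
            # Drop the trailing comma of a 1-tuple: ('a',) -> ('a')
            contents = contents[:-len(',)')] + ')'
        return 'Key{}'.format(contents)
```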

The `Task` class could automatically handle `**kwargs` in the proper fashion, e.g., `Task(pd.read_csv, filename, sep='\t')`.
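
To make the `**kwargs` idea concrete, here is a rough sketch of what such a `Task` class might look like. Everything in it is an assumption for illustration, not an existing dask API: the class itself is hypothetical, and the local `apply` helper is a stand-in for something like `dask.utils.apply`. The sketch also does not handle keys nested inside keyword arguments.

```python
def apply(func, args, kwargs=None):
    """Local stand-in (assumption) for a helper like dask.utils.apply."""
    if kwargs:
        return func(*args, **kwargs)
    return func(*args)


class Task(tuple):
    """Hypothetical task wrapper that is still a plain tuple underneath."""
    __slots__ = ()

    def __new__(cls, func, *args, **kwargs):
        if not callable(func):
            raise TypeError('first argument to Task must be callable')
        if kwargs:
            # Encode keyword arguments as (apply, func, args, kwargs)
            return tuple.__new__(cls, (apply, func, list(args), kwargs))
        # Without keyword arguments this is the usual (func, arg1, arg2, ...)
        return tuple.__new__(cls, (func,) + args)


def join(a, b, sep=' '):
    """Toy function used only to exercise the sketch."""
    return sep.join([a, b])


plain = Task(join, 'x', 'y')
with_kwargs = Task(join, 'x', 'y', sep='-')

# Both forms can be executed the same way a tuple task would be
plain_result = plain[0](*plain[1:])
kwargs_result = with_kwargs[0](*with_kwargs[1:])
```

Because the result is still a tuple whose first element is callable, existing code that recognizes tasks by that shape should in principle keep working, while `isinstance(obj, Task)` gives an unambiguous check.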

This is more verbose than using Python builtins, but not onerously so. E.g., adapting the ["Custom Graphs" example](http://dask.pydata.org/en/latest/custom-graphs.html) from the docs:
```python
from dask import Task, Key

...
dsk = {'load-1': Task(load, 'myfile.a.data'),
       'load-2': Task(load, 'myfile.b.data'),
       'load-3': Task(load, 'myfile.c.data'),
       'clean-1': Task(clean, Key('load-1')),
       'clean-2': Task(clean, Key('load-2')),
       'clean-3': Task(clean, Key('load-3')),
       'analyze': Task(analyze, [Key('clean-%d' % i) for i in [1, 2, 3]]),
       'store': Task(store, Key('analyze'))}
```

Possibly, we would want a "strict evaluation" mode that requires all tasks and keys to be wrapped in the appropriate classes, and switches the default interpretation for everything else to be a literal. Think of this as "strong typing" for dask.
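
A toy evaluator can illustrate what strict mode might mean. This is only a sketch under stated assumptions: `strict_get` is a hypothetical name, the `Key` and `Task` classes are minimal stand-ins for the ones proposed above, and the evaluator is naive (recursive, no caching, no scheduling). It assumes a single-element `Key('a')` refers to the graph key `'a'`, and a multi-element `Key('x', 0)` refers to `('x', 0)`.

```python
from operator import add


class Key(tuple):
    """Minimal stand-in for the proposed Key class."""
    __slots__ = ()

    def __new__(cls, *args):
        return tuple.__new__(cls, args)


class Task(tuple):
    """Minimal stand-in for the proposed Task class (positional args only)."""
    __slots__ = ()

    def __new__(cls, func, *args):
        return tuple.__new__(cls, (func,) + args)


def strict_get(dsk, key):
    """Evaluate `key` in `dsk`; only Key and Task instances are special."""
    def evaluate(obj):
        if isinstance(obj, Task):
            func, args = obj[0], obj[1:]
            return func(*[evaluate(arg) for arg in args])
        if isinstance(obj, Key):
            name = obj[0] if len(obj) == 1 else tuple(obj)
            if name not in dsk:
                # A missing key is an error, never a silent literal
                raise KeyError('{!r} is not a key in the graph'.format(name))
            return evaluate(dsk[name])
        if type(obj) is list:
            return [evaluate(item) for item in obj]
        # Everything else, including bare strings and tuples, is a literal
        return obj

    return evaluate(dsk[key])


dsk = {'a': 1,
       'b': Task(add, Key('a'), 10),
       'c': Task(len, 'a'),  # 'a' is a literal string here, not a key
       'd': Task(sum, [Key('a'), Key('b')]),
       'e': Task(add, Key('missing'), 1),
       'f': (add, 1, 2)}  # a bare tuple stays a literal in strict mode
```

In this sketch `strict_get(dsk, 'c')` computes the length of the string `'a'` rather than looking up the key `'a'`, and `strict_get(dsk, 'e')` raises `KeyError` instead of passing `'missing'` through as a literal.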

I think this would be really valuable for library code, such as the existing dask collections.

Key and Task classes #2299
