Extend spec to include dictionaries by mrocklin · Pull Request #1748 · dask/dask

mrocklin · 2016-11-02T18:02:07Z

cc @jcrist @eriknw: I thought we might want to have this discussion outside of #1731.

eriknw · 2016-11-14T18:20:16Z

This seems like a reasonable extension to the protocol since it's good practice to use a hash/salt when generating keys anyway, which ensures literals won't be clobbered.

It also lets us apply keyword arguments via apply as shown here:

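In [1]: import dask
   ...: import distributed
   ...: from dask.utils import apply  # dask.compatibility.apply in older releases
   ...: 
   ...: def f(x, y):
   ...:     return x + y
   ...: 
   ...: d = {
   ...:     'x': 1,
   ...:     'y': 2,
   ...:     # keyword arguments are passed as a dict whose values name other keys
   ...:     'f': (apply, f, (), {'x': 'x', 'y': 'y'}),
   ...: }
   ...: 

In [2]: c = distributed.Client()
   ...: c.get(d, 'f')  # new spec: dict values are replaced by the results for 'x' and 'y'
   ...: 
Out[2]: 3

In [3]: dask.get(d, 'f')  # old spec: the dict is passed through untouched
Out[3]: 'xy'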
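To make the difference between the two results concrete without a running scheduler, here is a minimal pure-Python sketch. It is not dask's or distributed's implementation: `toy_get`, `is_task`, the local `apply`, and the `traverse_dicts` flag are all hypothetical names made up for illustration. The only thing that changes between the two calls is whether dictionary values are resolved against the graph.

```python
def apply(func, args, kwargs=None):
    """Hypothetical stand-in for an apply helper: call func with args and kwargs."""
    return func(*args, **(kwargs or {}))


def is_task(obj):
    """Treat a non-empty tuple whose first element is callable as a task."""
    return type(obj) is tuple and len(obj) > 0 and callable(obj[0])


def toy_get(graph, arg, traverse_dicts):
    """Evaluate arg against graph (toy evaluator, no caching or cycle checks)."""
    if is_task(arg):
        func, args = arg[0], arg[1:]
        return func(*[toy_get(graph, a, traverse_dicts) for a in args])
    if isinstance(arg, list):
        return [toy_get(graph, a, traverse_dicts) for a in arg]
    if isinstance(arg, dict):
        if traverse_dicts:
            # "New spec" behavior: values are resolved, keys are left alone.
            return {k: toy_get(graph, v, traverse_dicts) for k, v in arg.items()}
        # "Old spec" behavior: the dict is an opaque literal.
        return arg
    try:
        is_key = arg in graph
    except TypeError:  # unhashable literal, cannot be a key
        is_key = False
    if is_key:
        return toy_get(graph, graph[arg], traverse_dicts)
    return arg


def f(x, y):
    return x + y


d = {
    'x': 1,
    'y': 2,
    'f': (apply, f, (), {'x': 'x', 'y': 'y'}),
}

new_spec_result = toy_get(d, 'f', traverse_dicts=True)
old_spec_result = toy_get(d, 'f', traverse_dicts=False)
print(new_spec_result, old_spec_result)
```

With traversal on, `f` receives `x=1, y=2`; with traversal off it receives the strings `'x'` and `'y'` and concatenates them, which mirrors the two outputs in the session above.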

mrocklin · 2017-04-26T23:20:53Z

I plan to merge this soon if there are no further comments

jcrist · 2017-04-26T23:39:48Z

My gripe in #1731 (comment) still holds. With this change there is no longer a zero-overhead way to pass a dictionary, where before there was. Every dictionary in a task now needs to be scanned for anything task-like. This might be an imagined fear, though, and the overhead for real-world problems may be negligible.

This is also currently implemented only in distributed and in some parts of dask (e.g. get_dependencies).
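The scanning cost described above can be pictured with a toy dependency finder. This is a rough sketch only, not dask's `get_dependencies`: the function name `find_dependencies`, its `traverse_dicts` flag, and the visit counter are assumptions made for illustration, and tuple keys are deliberately not handled. It counts how many objects are inspected when a large literal dictionary is passed as a task argument.

```python
def find_dependencies(task, graph_keys, traverse_dicts=True):
    """Return (keys referenced by task, number of objects inspected).

    Toy version: tuples and lists are always walked; dict values are walked
    only when traverse_dicts is True. Tuple-valued keys are not supported.
    """
    found = set()
    visited = 0
    stack = [task]
    while stack:
        item = stack.pop()
        visited += 1
        if isinstance(item, (tuple, list)):
            stack.extend(item)
        elif isinstance(item, dict):
            if traverse_dicts:
                stack.extend(item.values())
        else:
            try:
                if item in graph_keys:
                    found.add(item)
            except TypeError:  # unhashable leaf, cannot be a key
                pass
    return found, visited


graph_keys = {'x'}

# A large literal dict that references no keys at all.
payload = {i: i * i for i in range(1000)}
big_task = (len, payload)

deps_scan, visited_scan = find_dependencies(big_task, graph_keys, traverse_dicts=True)
deps_skip, visited_skip = find_dependencies(big_task, graph_keys, traverse_dicts=False)

# A small dict whose value does name a key.
small_task = (len, {'a': 'x'})
deps_small, _ = find_dependencies(small_task, graph_keys, traverse_dicts=True)
deps_small_skip, _ = find_dependencies(small_task, graph_keys, traverse_dicts=False)

print(visited_scan, visited_skip, deps_small, deps_small_skip)
```

In this sketch the work grows with the size of the dictionary when traversal is on and stays constant when it is off. Whether that matters in practice depends on the workload, as noted above.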

mrocklin · 2017-04-26T23:43:20Z

I think that that is a valid concern. We've had similar issues with lists. For what it's worth I have not experienced performance pain due to traversing dicts when using the distributed scheduler, where this is already enacted.

mrocklin · 2018-05-30T14:03:25Z

This issue came up. @jcrist have your thoughts on this changed at all?

jcrist · 2018-05-30T16:25:22Z

I still think it overcomplicates the spec, but it's worse that the local and distributed schedulers don't have matching behavior (and some users have come to rely on it). Since we can't remove the behavior from distributed, we should update the local schedulers to match. My (possibly not valid) worries are:

- This opens the spec up to include other Python builtin collections (which I think would be a bad idea).
- Since only tasks as dictionary values are supported (which I think is good), users might be confused by the lack of support for tasks as dictionary keys. This is probably a documentation issue.
- quote should probably be documented somewhere as the way to prevent this behavior, but I don't think it should be in the top-level namespace. Its current location in dask.core seems fine to me.
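The last two points can be sketched with a toy evaluator. None of this is dask code: `Literal`, `quote`, and `evaluate` below are hypothetical stand-ins written for illustration, and the real `dask.core.quote` may behave differently. The sketch assumes that quoting means wrapping a value in a zero-argument task that returns it unchanged, so the evaluator never looks inside it.

```python
class Literal:
    """Hypothetical wrapper: calling it returns the wrapped value unchanged."""

    def __init__(self, data):
        self.data = data

    def __call__(self):
        return self.data


def quote(value):
    """Toy quote: hide value inside a zero-argument task."""
    return (Literal(value),)


def evaluate(arg, graph):
    """Toy evaluator that resolves dict values (but not dict keys)."""
    if type(arg) is tuple and len(arg) > 0 and callable(arg[0]):
        return arg[0](*[evaluate(a, graph) for a in arg[1:]])
    if isinstance(arg, dict):
        return {k: evaluate(v, graph) for k, v in arg.items()}
    try:
        is_key = arg in graph
    except TypeError:  # unhashable literal, cannot be a key
        is_key = False
    if is_key:
        return evaluate(graph[arg], graph)
    return arg


def identity(obj):
    return obj


graph = {
    'x': 1,
    # Dict value 'x' names a key, so it is replaced by 1.
    'traversed': (identity, {'label': 'x'}),
    # The quoted dict is returned as written.
    'quoted': (identity, quote({'label': 'x'})),
    # Dict key 'x' also names a graph key, but keys are never substituted.
    'key_untouched': (identity, {'x': 'x'}),
}

traversed = evaluate('traversed', graph)
quoted = evaluate('quoted', graph)
key_untouched = evaluate('key_untouched', graph)
print(traversed, quoted, key_untouched)
```

In the third entry only the value side of the dictionary is substituted, which is the asymmetry between keys and values that might surprise users.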

extend spec to include dictionaries

6d367b0

mrocklin mentioned this pull request Nov 2, 2016

Faster array slicing #1731

Merged

sinhrks added the documentation label Dec 6, 2016

mrocklin mentioned this pull request May 30, 2018

dask.get apparent inconsistencies and possible bug #3523

Open

mrocklin closed this Jan 3, 2019

mrocklin deleted the spec-dicts branch January 3, 2019 17:45

bmerry mentioned this pull request Feb 17, 2020

dask.core.quote does not quote dicts #5904

Closed

mrocklin wants to merge 1 commit into dask:master from mrocklin:spec-dicts

4 participants
