Consistent use of ordering by task name #3271

Merged
mrocklin merged 6 commits into dask:master from mrocklin:nearest-neighbor
Mar 19, 2018
@mrocklin
Member

@mrocklin mrocklin commented Mar 11, 2018

dask

We break ties in task ordering by using the task key itself. Previously we did this inconsistently because we used sorted(reverse=True/False) in different situations. This caused inconsistent ordering in general and pathological ordering in some situations (like nearest-neighbor computations).

We now use two different comparators for string comparison.
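
To illustrate the tie-breaking issue described above, here is a minimal sketch. The names StrKey, ReverseStrKey, order_by, and the priority table are hypothetical and chosen for illustration; they are not taken from this PR's diff. The idea is that sorted(reverse=True) flips the tie-break by name along with the primary ordering, while a pair of comparator wrappers makes the tie-break direction an explicit, independent choice.

```python
class StrKey:
    """Hypothetical wrapper: compare keys directly, fall back to str()."""
    __slots__ = ("obj",)

    def __init__(self, obj):
        self.obj = obj

    def __lt__(self, other):
        try:
            return self.obj < other.obj
        except TypeError:
            # Mixed key types (e.g. str vs tuple) are compared as strings.
            return str(self.obj) < str(other.obj)


class ReverseStrKey(StrKey):
    """Hypothetical wrapper: same as StrKey but with the order inverted."""
    __slots__ = ()

    def __lt__(self, other):
        try:
            return self.obj > other.obj
        except TypeError:
            return str(self.obj) > str(other.obj)


# Toy data (an assumption for this sketch): two tasks tie on priority.
priority = {"a-0": 1, "a-1": 1, "b-0": 2}
keys = ["b-0", "a-1", "a-0"]


def order_by(keys, wrapper):
    """Sort by ascending priority, breaking ties with the given wrapper."""
    return sorted(keys, key=lambda k: (priority[k], wrapper(k)))


# reverse=True inverts the priority AND the tie-break by name.
flipped = sorted(keys, key=lambda k: (priority[k], k), reverse=True)

print(order_by(keys, StrKey))         # ['a-0', 'a-1', 'b-0']
print(order_by(keys, ReverseStrKey))  # ['a-1', 'a-0', 'b-0']
print(flipped)                        # ['b-0', 'a-1', 'a-0']
```

With the wrappers, the primary sort direction stays fixed and only the tie-break changes, so each call site states which name ordering it wants instead of inheriting it from reverse=True/False.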

  • Tests added / passed
  • Passes flake8 dask
  • Fully documented, including docs/source/changelog.rst for all changes
    and one of the docs/source/*-api.rst files for new API

@mrocklin
Member Author

It also looks like we need to drop the initial call to sorted on root nodes. pangeo-data/pangeo#150 (comment)

@mrocklin mrocklin changed the title from "Add failing test for nearest-neighbor scheduling" to "Consistent use of ordering by task name" Mar 18, 2018
@mrocklin
Member Author

Merging this soon if there are no comments

Member

@jcrist jcrist left a comment

LGTM

dsk = dict(w.__dask_graph__())
o = order(dsk)
# from dask import visualize
# visualize(dsk, color='order', filename='dask.pdf', node_attr={'penwidth': '6'})
Member

Commented out code

2 participants