Faster array slicing #1731

Merged
mrocklin merged 17 commits into dask:master from pitrou:faster_slicing on Nov 3, 2016

Conversation

@pitrou (Member) commented Oct 31, 2016

Despite the PR title, this also brings more generic optimizations to dependency computations, along with various other improvements.

dask/core.py Outdated
set(['x'])

>>> get_dependencies(dsk, 'z') # doctest: +SKIP
>>> sorted(get_dependencies(dsk, 'z')) # doctest: +SKIP
Member:
Can we lose the doctest: +SKIP?

Member Author:

I opted to remove the sorted() call instead.

return ns['expand']


def expander(where):
Member:

Why separate this into two functions?

Member Author:

Because the memoize decorator needs hashable arguments (which lists are not).

Member:

Try @memoize(key=tuple)

Also, should we be concerned about the memoized function holding on to too many inputs?

Member Author:

I don't think so, unless someone has a need for a large number of locations for newaxis in their indices.

Member Author:

> Try @memoize(key=tuple)

The signature for the key argument isn't very pretty (it takes an (args, kwargs) tuple). I wonder if it's not more readable to keep the two-argument variant.
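The two-function split and the key=tuple idea can be sketched with functools.lru_cache standing in for dask's memoize decorator; the helper names here are hypothetical illustrations, not dask's actual functions. The outer wrapper converts the unhashable list to a tuple so the cached inner function can hash it:

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def _make_inserter(where):
    # `where` is a hashable tuple of positions, so lru_cache can key on it.
    def expand(seq):
        out = list(seq)
        for i in sorted(where):
            out.insert(i, None)  # None stands in for np.newaxis
        return tuple(out)
    return expand


def make_inserter(where):
    # Public wrapper: accepts any iterable, including an unhashable list.
    return _make_inserter(tuple(where))


make_inserter([1])(('a', 'b'))  # → ('a', None, 'b')
```

Because the cache keys on the normalized tuple, repeated calls with equal position lists return the very same cached function object.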

@pitrou pitrou changed the title [WIP] Faster array slicing Faster array slicing Oct 31, 2016
@pitrou (Member Author) commented Oct 31, 2016

A bit embarrassing: this PR makes https://nbviewer.jupyter.org/gist/shoyer/425cb3dc9101c235fe86 slower.
I'll have to take a deeper look.

@pitrou (Member Author) commented Nov 1, 2016

I think I've fixed the regression. Also optimized inline_functions by using a recursive solution.

@pitrou (Member Author) commented Nov 1, 2016

I've now found a way to make _deps() faster using an iterative approach.

@mrocklin (Member) commented Nov 1, 2016

Why does creating the new list cause such a performance difference? What differences are you seeing?

@pitrou (Member Author) commented Nov 1, 2016

The work list avoids both the recursion and the manual stack handling. The number of outer loop iterations is equal to the maximum task nesting level, which is usually very small.
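A minimal sketch of that work-list shape (an illustration of the idea, not dask's actual _deps code): each pass over the work list flattens one nesting level, so the outer loop runs only as many times as the deepest task nesting.

```python
def walk_deps(dsk, task):
    # Collect keys referenced anywhere inside `task`, flattening one
    # nesting level per outer-loop pass.
    deps = set()
    work = [task]
    while work:
        new_work = []
        for w in work:
            typ = type(w)
            if typ is tuple and w and callable(w[0]):
                new_work += list(w[1:])   # task tuple: skip the callable
            elif typ is list:
                new_work += w
            elif typ is dict:
                new_work += list(w.values())
            else:
                try:
                    if w in dsk:
                        deps.add(w)       # a reference to another key
                except TypeError:
                    pass                  # unhashable literal
        work = new_work
    return deps


dsk = {'a': 1, 'b': 2, 'c': (sum, [['a', 'b'], 3])}
walk_deps(dsk, dsk['c'])  # → {'a', 'b'}
```

For dsk['c'] the nesting is tuple → list → list → keys, so the loop runs four passes, independent of how many keys each level contains.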

@pitrou (Member Author) commented Nov 1, 2016

The difference I'm seeing: on

N = 1000
x = da.ones((N, N), chunks=(10, 10), name='x')
z = [x[i] for i in range(N)]

_deps() used to take 700 ms; it now takes only 500 ms.

As a whole, the cost of dependency computation is divided by 3 with this PR on the micro-benchmark above.

@pitrou (Member Author) commented Nov 1, 2016

Sorry for the churn. I think this PR is now ready for review or merging.

dask/core.py Outdated
"""
if not isinstance(tasks, list):
raise TypeError("Please provide a list of tasks")
return set(_deps(dsk, tasks))
Member:

Given the new simplified structure of get_dependencies, is this function still useful?

Member Author:

Yes, because passing a bunch of tasks at once is more efficient than calling _deps() for each one.
(Note that _deps() can be very fast on trivial entries.)

dask/core.py Outdated
if as_list:
return {k: _deps(dsk, dsk[k]) for k in keys}
else:
return {k: set(_deps(dsk, dsk[k])) for k in keys}
Member:

Same question as above. Do we still need this function?

Member Author:

Ditto.

Member:

In the case above I now see that we're sending a list of tasks. In this case though we're just using a dict comprehension. That dict-comprehension could presumably be inlined. I suppose the gains here are based on the type checks and avoiding the extra function call?

Member Author:

Yes, they are. We could further inline the dict comprehension, but at the price of code duplication, so I'm not sure that it's a good idea.

Member:

If it's a single line dict comprehension then I would prefer the code duplication. My opinion is that code indirection (having to follow many different functions to find the logic you want) can be as bad as code duplication in some cases.

Member Author:

Ah... I'll take a look.

Member Author:

_deps() is a private API, should we make it public? Use get_dependencies instead (at the price of a small slowdown - cull() seems to become 15% slower)?

Member:

Thoughts on removing _deps entirely and folding it into get_dependencies? We could place the burden of arg/task inputs and list/set outputs on the caller.

Member:

This can also happen later. Mostly I want to push on the principle of "there is a maintenance cost to having lots of functions that do similar things". If we can have a few of these that compose well with standard language constructs like dictionary comprehensions then I think that dask.core will remain an easy place to get started for newish developers.

Member Author:

Definitely agreed. I'll try to fold _deps into get_dependencies.

@mrocklin (Member) commented Nov 1, 2016

@eriknw some of the performance optimizations to dask.core and dask.optimize might interest you.

@pitrou (Member Author) commented Nov 1, 2016

Hmm, I'm not sure why 831be05 produced the following failure on AppVeyor:
https://ci.appveyor.com/project/dask-ci/dask/build/1.0.222#L490

Could it be an unrelated intermittent failure? One possible explanation is that empty() returned some NaNs or Infs and allclose() failed.

@mrocklin (Member) commented Nov 1, 2016

Yes, I suspect that this is caused by empty. We should probably replace it with random.random.
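The suspected failure mode is easy to reproduce in plain Python: NaN never compares equal, so data that happens to contain NaN (as uninitialized empty() memory can) fails any allclose-style comparison, even against itself.

```python
import math

x = float('nan')
print(x == x)              # False: NaN is not equal to anything, itself included
print(math.isclose(x, x))  # False: closeness checks fail on NaN too
```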

@eriknw (Member) commented Nov 1, 2016

Thanks for the ping, I'll take a look when convenient.

@mrocklin (Member) commented Nov 1, 2016

Two thoughts:

  1. _deps is also used a couple of times within dask/distributed
  2. Thoughts on removing the as_list= keyword parameter and instead having the caller do this explicitly, as in set(get_dependencies(dsk, task))?

@pitrou (Member Author) commented Nov 1, 2016

For distributed, see dask/distributed#616.
as_list feels ok to me right now, but we can revisit it later. Returning a set by default makes more sense because the result is naturally unordered, and in most cases you don't care about duplicates.
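The duplicates point can be made concrete with a toy helper (a hypothetical name, not dask's API): a task that references the same key twice yields duplicates as a list, while a set deduplicates them.

```python
def arg_keys(dsk, task):
    # Keys referenced directly by a task tuple's arguments.
    return [a for a in task[1:] if isinstance(a, str) and a in dsk]


dsk = {'x': 1, 'y': (lambda a, b: a + b, 'x', 'x')}
print(arg_keys(dsk, dsk['y']))       # ['x', 'x']
print(set(arg_keys(dsk, dsk['y'])))  # {'x'}
```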

@mrocklin (Member) commented Nov 1, 2016

OK

@mrocklin (Member) commented Nov 1, 2016

This seems fine to me. I recommend that we wait a bit for dask/distributed#616 to get in and in case @eriknw has some time for review. Otherwise I plan to merge tomorrow morning (around 18 hours from now)

@pitrou pitrou added the array label Nov 1, 2016
.travis.yml Outdated
- pip install git+https://github.com/mrocklin/partd --upgrade
- pip install git+https://github.com/mrocklin/cachey --upgrade
- pip install git+https://github.com/dask/zict --upgrade
- pip install git+https://github.com/dask/distributed --upgrade
Member:

Why are these necessary?

Member Author:

I'm trying to fix the following failure:
https://travis-ci.org/dask/dask/jobs/172618031#L731
I'm not sure why the build worked previously.

Member Author:

Now I'm getting another similar issue with s3fs: https://travis-ci.org/dask/dask/jobs/172623691#L751
Do you have an idea what may be happening?

Member:

s3fs is included as a dependency in the distributed conda package but not in the PyPI package. We keep the PyPI packages lightweight.

new_work += w[1:]
elif typ is list:
new_work += w
elif typ is dict:
Member:

Why was this added?

Member Author:

It's not added; it's part of the _deps() algorithm, rewritten iteratively and folded in here.

Member:

I see, thanks. I'm confused why we want to do this with dicts. This was added in 05ddd68.

In [20]: d = {'x': 1, 'y': {'a': 'x'}}

In [21]: dask.core.get_dependencies(d, 'y')
Out[21]: {'x'}

In [22]: dask.core.get(d, ['y'])
Out[22]: ({'a': 'x'},)

In [23]: dask.get(d, ['y'])
---------------------------------------------------------------------------
KeyError: 'y'

Member Author:

No idea from me :-)

Member:

@eriknw we do this for the distributed scheduler. It ended up being useful, though at the moment I don't particularly recall the original use case.

In [1]: from distributed import Client
In [2]: c = Client()
In [3]: d = {'x': 1, 'y': {'a': 'x'}}
In [4]: c.get(d, 'y')
Out[4]: {'a': 1}

Member Author:

I agree the dict solution doesn't look pretty.

Member:

Here's a simple, non-recursive helper function that makes the intent explicit and doesn't require changing the spec:

In [35]: def subdask(d):
    ...:     return (dict, (zip, tuple(d.keys()), list(d.values())))
    ...: 
    ...: d = {
    ...:     'a': 1,
    ...:     'b': 2,
    ...:     'c': subdask({'a': 'a', 'b': 'b'}),
    ...: }
    ...: 

In [36]: d
Out[36]: {'a': 1, 'b': 2, 'c': (dict, (<function zip>, ('a', 'b'), ['a', 'b']))}

In [37]: dask.core.get(d, 'c')
Out[37]: {'a': 1, 'b': 2}

Member Author:

By the way, I'm assuming this discussion isn't an issue for the current PR, right? :-)

Member:

Yeah. The PR lgtm. It keeps the existing behavior with dicts, which we can revisit elsewhere.

@mrocklin (Member) commented Nov 2, 2016

Merging in 12 hours if no comment

@mrocklin mrocklin merged commit 1b18829 into dask:master Nov 3, 2016
@sinhrks sinhrks added this to the 0.12.0 milestone Nov 7, 2016