Remove bad graph logic in BlockwiseIO#6933
Conversation
jrbourbeau
left a comment
There was a problem hiding this comment.
Thanks @rjzamora! Just for my own understanding this, we only need blockwise_token(0) here because PackedFunctionCall gets a tuple of arguments at run time?
Right - This is my understanding. The number of dummy arguments depends on the number of inputs to the underlying function, and BlockwiseIO should always have a single input (the tuple that you mentioned). |
jrbourbeau
left a comment
There was a problem hiding this comment.
Gotcha, thanks. And, so far, ninds has never been >1 since BlockwiseIO was only utilized for DataFrames.
Also, @ian-r-rose is trying out this fix over in #6931
Sounds good! Note that I locally added this change to that PR and seem to get correct behavior with a series of subsequent blockwise operations: |
|
toolz.curry unfortunately hides errors in some situations. We might
consider not using it, and instead just defining a few extra functions.
I'm guessing that this is coming from the wrap.py module. The curried
functions are only used a couple of times, so the magic here probably isn't
necessary.
I suspect that after a brief removal of toolz.curry a more informative
error message will arise.
…On Tue, Dec 8, 2020 at 10:21 AM Ian Rose ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In dask/blockwise.py
<#6933 (comment)>:
> + # Leave out `new_axes` in key
+ # TODO: Check that this makes sense
+ sz = len([i for i in self.output_indices if i not in self.new_axes])
+ io_key = (self.io_name,) + tuple([k[i + 1] for i in range(sz)])
Thanks for looking into it @rjzamora <https://github.com/rjzamora>. This
seems to help with the original new-axes issue:
def f(x):
return x[:, None] * np.ones((1, 7))
x = da.ones(5, chunks=2)y = da.blockwise(
f, "aq", x, "a", new_axes={"q": 7}, concatenate=True, dtype=x.dtype
)y.compute()
However, it fails again with the same operation transposed (again derived
from the test suite):
def f(x):
return x[None, :] * np.ones((7, 1))
x = da.ones(5, chunks=2)y = da.blockwise(
f, "qa", x, "a", new_axes={"q": 7}, concatenate=True, dtype=x.dtype
)y.compute()
<ipython-input-3-988e4cf91f2a> in f(x)
1 def f(x):
----> 2 return x[None, :] * np.ones((7, 1))
3
4 x = da.ones(5, chunks=2)
5 y = da.blockwise(
TypeError: 'cytoolz.functoolz.curry' object is not subscriptable
I'm puzzled as to what is going on such that the blockwise f(x) is
getting fed a curry object rather than an array object, and how it
relates to your logic in 4a90128
<4a90128>.
Could you elaborate a bit?
For what it's worth, these issues seem to disappear when skipping the blockwise
fusion
<https://github.com/dask/dask/blob/3a08bb32e736d51a758281a3abc0fc0dbec04efc/dask/array/optimization.py#L46>
step while optimization.
—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
<#6933 (comment)>, or
unsubscribe
<https://github.com/notifications/unsubscribe-auth/AACKZTGFW2AQQJMJDDNKOXLSTZVAXANCNFSM4UM7DGDQ>
.
|
dask/blockwise.py
Outdated
| # TODO: Check that this makes sense | ||
| sz = len([i for i in self.output_indices if i not in self.new_axes]) | ||
| io_key = (self.io_name,) + tuple([k[i + 1] for i in range(sz)]) | ||
| io_key = (self.io_name,) + tuple( |
There was a problem hiding this comment.
👍 Awesome, thanks, making progress. This fixes the above issue.
|
@ian-r-rose's latest find demonstrates that two or more |
|
@ian-r-rose - The latest change here revises the |
|
@jrbourbeau - It may be best to avoid adding any more "fixes" to this PR. Do you have thoughts on getting this in and opening follow-up PRs if working through #6791 and #6931 exposes other problems? |
jrbourbeau
left a comment
There was a problem hiding this comment.
Happy to merge this in and continue updating as needed
Co-authored-by: James Bourbeau <jrbourbeau@users.noreply.github.com>
ian-r-rose
left a comment
There was a problem hiding this comment.
@jrbourbeau - It may be best to avoid adding any more "fixes" to this PR. Do you have thoughts on getting this in and opening follow-up PRs if working through #6791 and #6931 exposes other problems?
I'm happy to move future work on this into #6931 (there are still some cases we will need to track down, I'll post some minimal examples to that PR). @rjzamora feel free to push commits to that branch, though be aware that I've been rebasing regularly.
| # If we are not getting a `dsk` input, there must be a single | ||
| # element in `io_deps` | ||
| if len(io_deps) != 1: | ||
| raise ValueError( |
There was a problem hiding this comment.
Right - The case where we use the BlockwiseIO constructor without an intial dsk definition could probably handle the creation of multiple subgraphs. However, I couldn't think of any reason why we would need to support.
jrbourbeau
left a comment
There was a problem hiding this comment.
Thanks @rjzamora! I pushed a small commit to make _inject_io_tasks return the updated graph which I think makes things a bit more clear to read -- hope that's okay. Will merge after CI passes
Sounds good @jrbourbeau - Thanks! It seems that the |

Fixes incorrect logic being used to build
dskin theBlockwiseIOconstructor.Currently, the number of dimensions is used to determine the number of dummy arguments. However, the number of dummy argument should always be one dummy argument for IO.
Thanks to @ian-r-rose for pointing out this problem in #6931
black dask/flake8 dask