Skip to content

Add shape=None to *_like() array creation functions#6064

Merged
martindurant merged 19 commits intodask:masterfrom
andersy005:array/shape-arg-in-like-functions
Jun 10, 2020
Merged

Add shape=None to *_like() array creation functions#6064
martindurant merged 19 commits intodask:masterfrom
andersy005:array/shape-arg-in-like-functions

Conversation

@andersy005
Copy link
Copy Markdown
Member

@andersy005 andersy005 commented Apr 3, 2020

  • Tests added / passed
  • Passes black dask / flake8 dask

Towards fixing #4875

@andersy005
Copy link
Copy Markdown
Member Author

Cc @mrocklin, @pentschev

Copy link
Copy Markdown
Member

@mrocklin mrocklin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for doing this @andersy005 . I left some minor comments, mostly stylistic.

@andersy005
Copy link
Copy Markdown
Member Author

It looks like the test failures here are unrelated to the changes I made.

Copy link
Copy Markdown
Member

@jrbourbeau jrbourbeau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @andersy005! I've restarted the failed CI builds here to see if any test failures persist

@TomAugspurger
Copy link
Copy Markdown
Member

Merged master to hopefully fix the CI stuff.

Copy link
Copy Markdown
Member

@jsignell jsignell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you reproduce the failures locally?

@andersy005
Copy link
Copy Markdown
Member Author

Can you reproduce the failures locally?

Yes. All tests in dask/array/tests are passing, and all failing tests are coming from dask/dataframe/tests. I am going to look into whether the changes I made are affecting existing functionality in dask/dataframe

@jsignell
Copy link
Copy Markdown
Member

jsignell commented Jun 9, 2020

I just ran the tests locally and there are genuine failures coming from the dask.dataframe tests. In particular, this test is faililng on most parameter values:

def test_2args_with_array(ufunc, pandas, darray):

I can't quite figure out why yet...

@andersy005
Copy link
Copy Markdown
Member Author

I can't quite figure out why yet...

Me either. I spent some time this morning looking into it, but I still don't know yet the connection between the changes I made and the dask.dataframe. I could use some help in figuring it out :)

@jsignell
Copy link
Copy Markdown
Member

jsignell commented Jun 9, 2020

Oh! I think I have something. I don't think the logic in the helper function is quite right. If shape is None it gets overridden and then chunks ends up being auto. I'll add a suggestion in a sec.

@jsignell
Copy link
Copy Markdown
Member

jsignell commented Jun 9, 2020

The test failures are related to the empty_like function accepting a shape kwarg. Maybe @jcrist has some ideas about what's going on here since he's been working on this area in #6276

@jsignell
Copy link
Copy Markdown
Member

jsignell commented Jun 9, 2020

Ok. So I think what is happening is in ufunc we end up trying to generate a meta and when we do that we call empy_like_safe:

dask/dask/dataframe/core.py

Lines 4756 to 4763 in dd682cc

parts = [
d._meta
if _is_broadcastable(d)
else empty_like_safe(d, (), dtype=d.dtype)
if isinstance(d, Array)
else d._meta_nonempty
for d in dasks
]

which in turn goes and calls np.empty_like(a, .. so now that this PR makes things NEP18, the result of that is a dask array rather than a numpy array. And that causes problems because metas are supposed to be non-dask objects.

So I think this is the only change that is needed:

diff --git a/dask/dataframe/core.py b/dask/dataframe/core.py
index 72851b66..f059bb6c 100644
--- a/dask/dataframe/core.py
+++ b/dask/dataframe/core.py
@@ -4756,7 +4756,7 @@ def elemwise(op, *args, **kwargs):
         parts = [
             d._meta
             if _is_broadcastable(d)
-            else empty_like_safe(d, (), dtype=d.dtype)
+            else np.empty((), dtype=d.dtype)
             if isinstance(d, Array)
             else d._meta_nonempty
             for d in dasks

@jsignell
Copy link
Copy Markdown
Member

jsignell commented Jun 9, 2020

With that diff the tests pass for me locally :)

@andersy005
Copy link
Copy Markdown
Member Author

With that diff the tests pass for me locally :)

Many thanks for hunting this down!

@jsignell
Copy link
Copy Markdown
Member

Hooray!! @dask/maintenance I think this one is ready!

@martindurant martindurant merged commit 4eef8e2 into dask:master Jun 10, 2020
@andersy005 andersy005 deleted the array/shape-arg-in-like-functions branch June 10, 2020 13:36
@jakirkham
Copy link
Copy Markdown
Member

Thanks @andersy005 for the PR and everyone for the reviews! 😄

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants