Skip to content

full & full_like: error on non-scalar fill_value#6129

Merged
jcrist merged 7 commits intodask:masterfrom
Huite:master
Jun 2, 2020
Merged

full & full_like: error on non-scalar fill_value#6129
jcrist merged 7 commits intodask:masterfrom
Huite:master

Conversation

@Huite
Copy link
Copy Markdown
Contributor

@Huite Huite commented Apr 23, 2020

Fixes #6107

  • Tests added / passed
  • Passes black dask / flake8 dask

Because it touches the docstrings, I also ran make html, I didn't come across any issues.

I also found two nits to pick -- I think -- in the docstring:

First, the docstring template of wrap mentions:

Follows the signature of %(name)s exactly except that it also requires a keyword argument chunks=(...)

The chunks argument is optional, not required. Maybe reword it to:

"... it also features an optional keyword argument chunks=(...)"

Second: technically another keyword argument is supported, name.

For this function:

def _parse_wrap_args(func, args, kwargs, shape):

name is explicitly extracted, if present, generated otherwise:

dask/dask/array/wrap.py

Lines 29 to 33 in f58b955

name = kwargs.pop("name", None)
name = name or funcname(func) + "-" + tokenize(
func, shape, chunks, dtype, args, kwargs
)

So I think I'd have to add it to the function arguments.

Or is this form preferable?

def full(shape, fill_value, *args, **kwargs):

@TomAugspurger
Copy link
Copy Markdown
Member

TomAugspurger commented Apr 23, 2020

Thanks. Noting that chunks and name are both optional arguments would be welcome. Do you want to do that it this pull request?

Huite and others added 2 commits April 25, 2020 12:19
Co-Authored-By: Tom Augspurger <TomAugspurger@users.noreply.github.com>
@Huite
Copy link
Copy Markdown
Contributor Author

Huite commented Apr 25, 2020

I just pushed another commit, it turns out the name argument wasn't supported by the the *like functions, the reason being that these are explicitly defined here:

def empty_like(a, dtype=None, chunks=None):

The wrapped forms of *like are not public, as far as I can tell:

grep -Hrn "from .wrap"
core.py:4627:    from .wrap import empty
creation.py:25:from .wrap import empty, ones, zeros, full
rechunk.py:24:from .wrap import empty
reductions.py:20:from .wrap import zeros, ones
routines.py:20:from .wrap import ones
__init__.py:220:    from .wrap import ones, zeros, empty, full

They are used internally:

grep -Hrn "wrap\."
array/core.py:3490:            return wrap.empty_like(meta, shape=shape, chunks=shape, dtype=meta.dtype)
array/core.py:4277:            return wrap.empty_like(meta, shape=shape, chunks=shape, dtype=meta.dtype)
array/overlap.py:354:        c = wrap.full_like(
array/overlap.py:509:                empty = wrap.empty_like(

But appear to take a meta numpy array as their first argument.

Additionally, the *like functions in creation all follow this pattern:

    a = asarray(a, name=False)
    return ones(
        a.shape,
        dtype=(dtype or a.dtype),
        chunks=(chunks if chunks is not None else a.chunks),
   )

It seems straightforward to include the a = asarray(a, name=False) line in the wrapping functions.

Currently, the *like functions in creation also do not support the memory order keyword, while the non-like functions do:

In [8]: a = da.empty((10, 10), order="F")

In [9]: a.compute().flags
Out[9]:
  C_CONTIGUOUS : False
  F_CONTIGUOUS : True
  OWNDATA : True
  WRITEABLE : True
  ALIGNED : True
  WRITEBACKIFCOPY : False

So I'm a little confused:

  • Maybe rename the wrap.*like functions that are only used internally? (e.g. like_meta?)
  • Generate public *like functions in wrap, and remove the repetition in creation? Due to *args, **args they would also support all numpy keywords like order, although not "K" and "A" options if a is a dask.array? https://docs.scipy.org/doc/numpy/reference/generated/numpy.full_like.html
  • Alternatively, just make everything explicit in creation for all the wrapped functions, copy and edit the numpy docstrings. Taking the argument check of full into accoumt, 5/8 wrapped functions already have explicit definitions (full, full_like, zeros_like, ones_like, empty_like).
  • Adding order would be a breaking change for *like functions, as I think you'd want the numpy argument first, dask argument (name, chunks) last.

@TomAugspurger
Copy link
Copy Markdown
Member

I think we'd be OK with adding an order argument to the public da.*_like methods.

It seems straightforward to include the a = asarray(a, name=False) line in the wrapping functions.

That's probably fine, assuming the tests pass.

I'm less sure about your other suggestions / questions. There's a bit of repetition but it seems to be working ok.

@Huite
Copy link
Copy Markdown
Contributor Author

Huite commented Apr 28, 2020

Yeah, you're right. The latest commits are just the minimal changes necessary, as well as some minor polishes.

@jsignell
Copy link
Copy Markdown
Member

@Huite is this good to go? Maybe with just one more look over from @TomAugspurger?

Copy link
Copy Markdown
Member

@TomAugspurger TomAugspurger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this looks OK. @jcrist do you have a preference on order between this and #6276?

@jcrist
Copy link
Copy Markdown
Member

jcrist commented Jun 2, 2020

Lets merge this, and I'll cleanup whatever else in #6276. Easier for me to fix merge conflicts than ask others to.

@jcrist jcrist merged commit 5100e8d into dask:master Jun 2, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Expected behavior of dask.array.full() for non-scalar fill values?

4 participants