Speed up normalize_chunks for common case #10579
Conversation
bump @dask/dask-dev
@jrbourbeau this would really help speed up opening the 15TB coastal ocean dataset USGS has been working on. From 20s to 6s! It's headed for the AWS public dataset program!
Nice, thanks @martindurant -- taking a look now
dask/array/core.py
Outdated
    if nonans or isinstance(sum(sum(_) for _ in chunks), int):
        return tuple(tuple(_) for _ in chunks)
Non-blocking nit: The _ usage here is correct, but looks a little unusual. Any reason we can't do this, like we do elsewhere?
Suggested change:
-    if nonans or isinstance(sum(sum(_) for _ in chunks), int):
-        return tuple(tuple(_) for _ in chunks)
+    if nonans or isinstance(sum(sum(c) for c in chunks), int):
+        return tuple(tuple(c) for c in chunks)
dask/array/core.py
Outdated
    nonans = None
    if chunks and shape is not None:
        nonans = all(isinstance(c, int) for c in chunks)
Is there a reason to use an int check instead of not math.isnan(c)? The nonans name seems slightly different from what the code on this line actually checks.
The c in this case can be tuples - we are checking specifically for individual ints.
Also, it's a bit faster even:
In [9]: l = [1] * 100
In [10]: %timeit all(isinstance(x, int) for x in l)
3.41 µs ± 14.2 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
In [11]: %timeit all(not math.isnan(x) for x in l)
5.55 µs ± 8.38 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
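To illustrate the point above with a small, hypothetical example: at this stage an element of chunks may itself be a tuple, which the isinstance check handles by simply returning False, whereas math.isnan requires a real number and raises:

```python
import math

chunks = (2, (3, 3))  # elements may be plain ints or tuples at this point

# The int check just evaluates to False for the tuple element:
print(all(isinstance(c, int) for c in chunks))  # False

# math.isnan requires a real number and raises on a tuple:
try:
    all(not math.isnan(c) for c in chunks)
except TypeError:
    print("math.isnan raised TypeError on the tuple element")
```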
dask/array/core.py
Outdated
        "Got chunks=%s, shape=%s" % (chunks, shape)
    )

    if nonans or isinstance(sum(sum(_) for _ in chunks), int):
What case is the or isinstance(sum(sum(_) for _ in chunks), int) part covering that's not already covered by the if nonans portion of this condition?
I'm surprised this double summation isn't slow in the many data variables case
It's the nonans check (now renamed allints) which is the very fast path.
However, the sum check is still much faster than the full conversion. In the case where there are NaNs, it would be slightly slower.
In [16]: chunks = ((1, 2, 3, 4), ) * 10
In [18]: %timeit tuple(tuple(int(x) if not math.isnan(x) else np.nan for x in c) for c in chunks)
5.15 µs ± 14.7 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
In [19]: %timeit isinstance(sum(sum(_) for _ in chunks), int);tuple(tuple(ch) for ch in chunks)
1.34 µs ± 1.1 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
What case is the or isinstance(sum(sum(_) for _ in chunks), int) part covering
This is for chunks like ((2, 2, 2), (3, 3, 3)), i.e., tuple of tuple of only ints.
As you see in the original snakeviz, this function is called three times under from_array; we can speed up inputs like (2, 3) (original zarr-like chunksize) and ((2, 2, 2), (3, 3, 3)) (already normalised). I haven't dug to see if there's a way to avoid calling normalize at all, since the code here is enough to have it drop out of my profiling.
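The sum trick above can be demonstrated in isolation (illustrative example, not dask code): summing only ints yields an int, while any NaN turns the total into a float, so the isinstance check cleanly separates the two cases:

```python
all_ints = ((2, 2, 2), (3, 3, 3))            # fully specified chunks, ints only
with_nan = ((2, 2, float("nan")), (3, 3, 3))  # chunks with an unknown size

# Summing ints yields an int, so the fast path applies:
print(isinstance(sum(sum(c) for c in all_ints), int))  # True
# Any NaN poisons the total into float('nan'), so the check fails:
print(isinstance(sum(sum(c) for c in with_nan), int))  # False
```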
@jrbourbeau, thanks for the review. I implemented your suggestions and answered your questions.
Removing my stuff and simply using on the last line would be much better, but there are two tests specifically passing float("nan") to check the current behaviour.
@jrbourbeau are you satisfied with the approach and its justification, or do you think it needs more work?
jrbourbeau left a comment
Thanks @martindurant! This is in.
In certain xarray zarr/kerchunk reads with many variables, a considerable amount of time is spent in dask.array's normalize_chunks (see snakeviz output).
For the case where a zarr-like chunksize was given, or a full chunks spec was already provided, this is all unnecessary work. This PR skips it for such cases. The same part of the profile after the change looks like:
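The two fast paths described above can be sketched as follows. This is a hypothetical, simplified reconstruction for illustration (the helper name fast_path is mine; the real logic lives in dask.array.core.normalize_chunks and handles many more cases):

```python
def fast_path(chunks, shape):
    # Hypothetical sketch of the two fast paths discussed in this PR,
    # not dask's actual implementation.
    allints = all(isinstance(c, int) for c in chunks)
    if allints:
        # zarr-like chunksize, e.g. chunks=(2, 3) for shape=(6, 9):
        # expand each dimension into uniform chunks plus any remainder
        chunks = tuple(
            (c,) * (s // c) + ((s % c,) if s % c else ())
            for c, s in zip(chunks, shape)
        )
    if allints or isinstance(sum(sum(c) for c in chunks), int):
        # an integer total means no NaN is present; return directly
        return tuple(tuple(c) for c in chunks)
    return None  # fall through to the general, slower normalization


print(fast_path((2, 3), (6, 9)))                  # ((2, 2, 2), (3, 3, 3))
print(fast_path(((2, 2, 2), (3, 3, 3)), (6, 9)))  # ((2, 2, 2), (3, 3, 3))
```

Both the original zarr-like chunksize and the already-normalised tuple-of-tuples form drop out after the cheap checks, which is why the function no longer shows up in the profile.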