
[WIP] ENH Adds tree reduction to bincount #7183

Merged
jsignell merged 5 commits into dask:master from thomasjpfan:bincount_tree_reduction_rb
Feb 24, 2021

Conversation

@thomasjpfan
Contributor

This implementation is not faster with this benchmark:

import dask.array as da
import numpy as np

rng = np.random.RandomState(42)
x = rng.randint(5000, size=10_000_000)
x_da = da.from_array(x, chunks=300_000)
out = da.bincount(x_da)
%%timeit
_ = out.compute()

This PR

26.1 ms ± 2.06 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

Master

21.2 ms ± 272 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

Maybe there is not a benefit to tree reduction here?
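For readers following along: the tree reduction under discussion computes one bincount per chunk and then sums the partial counts a few at a time, rather than summing all of them in a single task. A minimal NumPy sketch of the idea (the function name and `split_every` default here are illustrative, not dask's actual implementation):

```python
import numpy as np

def bincount_tree(chunks, minlength=0, split_every=2):
    """Illustrative sketch of a tree-reduced bincount over a list of chunks."""
    # Leaf step: one bincount per chunk, padded to a common length.
    n = max([minlength] + [int(c.max()) + 1 for c in chunks])
    parts = [np.bincount(c, minlength=n) for c in chunks]
    # Combine step: sum `split_every` partial counts at a time,
    # repeating until a single array of counts remains.
    while len(parts) > 1:
        parts = [
            np.sum(parts[i : i + split_every], axis=0)
            for i in range(0, len(parts), split_every)
        ]
    return parts[0]
```

Compared with summing all partial counts in one task, this bounds how many intermediate count arrays are held at once, which is where a memory benefit (if any) would come from.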

@jsignell
Member

Huh. If there is no improvement do you think it's worth including this? Or do you think there'd be an improvement with a different benchmark?

@thomasjpfan
Contributor Author

thomasjpfan commented Feb 23, 2021

I think the benefit of this PR is reduced memory usage for larger arrays:

import dask.array as da
import dask
from dask.diagnostics import ResourceProfiler
import numpy as np

rng = np.random.RandomState(42)
x = rng.randint(50000, size=500_000_000)
x_da = da.from_array(x, chunks=100_000)
out = da.bincount(x_da)

with ResourceProfiler(dt=0.01) as rprof:
    out.compute()
    
rprof.visualize()

On master: goes to ~7.4 GB

(ResourceProfiler plot: master)

This PR: stays at ~4.7 GB

(ResourceProfiler plot: this PR)

master is always a little faster, but it uses more memory. On my machine, CPU utilization is not as good with my implementation; I have a feeling it is how I am using blockwise. I think going back to constructing the HighLevelGraph by hand may be better.

Edit: The dask profilers are awesome!

@thomasjpfan thomasjpfan changed the title ENH Adds tree reduction to bincount [WIP] ENH Adds tree reduction to bincount Feb 23, 2021
@jsignell
Member

Oh that is a huge memory improvement! Thanks for taking the time to write that up :)

for i, _ in enumerate(x.__dask_keys__())
}
dtype = np.bincount([1], weights=[1]).dtype
meta = meta_from_array(weights)
Member


I think the meta should be taken from some small version of the bincount like how dtype was before:

Suggested change
meta = meta_from_array(weights)
meta = np.bincount([1], weights=[1])

Here's a test for this case:

def test_bincount_with_int_weights():
    x = np.array([2, 1, 5, 2, 1])
    d = da.from_array(x, chunks=2)
    weights = np.array([1, 2, 1, 0, 1])

    dweights = da.from_array(weights, chunks=2)
    e = da.bincount(d, weights=dweights, minlength=6)
    assert_eq(e, np.bincount(x, weights=dweights.compute(), minlength=6))
    assert same_keys(da.bincount(d, weights=dweights, minlength=6), e)
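The suggested change above matters because `np.bincount` always returns float64 whenever `weights` are given, even when the weights are integers, so a `meta` derived from the weights array would advertise the wrong output dtype. A quick check of that NumPy behavior:

```python
import numpy as np

# np.bincount with any weights returns float64, regardless of the weights' dtype,
# so taking meta from the weights array (int32 here) would be wrong.
int_weights = np.array([1, 2, 1, 0, 1], dtype=np.int32)
counts = np.bincount(np.array([2, 1, 5, 2, 1]), weights=int_weights, minlength=6)
print(counts.dtype)  # float64, not int32
```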

np.array([1, 2, 1, 0.5, 1], dtype=np.float32),
np.array([1, 2, 1, 0, 1], dtype=np.int32),
],
)
Member


👏

@thomasjpfan
Contributor Author

Same benchmark with Profiler included:

On master

(Profiler plot: master)

This PR

(Profiler plot: this PR)

My guess is that there is more GIL contention when running `_bincount_agg` and `np.bincount` together.

@jsignell
Member

This seems reasonable to me. Thanks for doing this work!

@jsignell jsignell merged commit bf1c65c into dask:master Feb 24, 2021
@jakirkham
Member

Thanks Thomas! 😀



Development

Successfully merging this pull request may close these issues.

Use tree reduction in da.bincount

3 participants