Partially fix cumprod #7089

Merged
jsignell merged 13 commits into dask:master from jsignell:fix-cumprod
Jan 22, 2021

Conversation

@jsignell
Member

@jsignell jsignell commented Jan 20, 2021

  • Tests passed
  • Passes black dask / flake8 dask

This is on top of #7084, but only the last commit contains a change. It's not a great fix: it just makes sure that the npartitions of the flattened array matches the input, and then expects certain combinations to raise warnings. I could add a custom error in _cumprod_merge, but this case feels really niche, so I didn't think that was necessarily appropriate.
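For readers unfamiliar with how a cumulative product is computed over chunks, here is a minimal pure-Python sketch of the general idea. It assumes the merge step simply multiplies each chunk's local cumulative product by the running product of the preceding chunks; `blocked_cumprod` is a hypothetical helper for illustration and is not dask's implementation of `_cumprod_merge`.

```python
from itertools import accumulate
from operator import mul


def blocked_cumprod(chunks):
    """Cumulative product over a sequence of chunks (lists of numbers).

    Hypothetical helper for illustration only; not dask's implementation.
    """
    result = []
    carry = 1  # product of all elements in the preceding chunks
    for chunk in chunks:
        # Cumulative product computed independently within this chunk.
        local = list(accumulate(chunk, mul))
        # Merge step (assumed): scale the local result by the carry.
        result.extend(carry * value for value in local)
        if local:
            carry *= local[-1]
    return result


values = [1, 2, 3, 4, 5, 6]
sequential = list(accumulate(values, mul))
two_chunks = blocked_cumprod([[1, 2, 3], [4, 5, 6]])
three_chunks = blocked_cumprod([[1, 2], [3, 4], [5, 6]])
print(sequential, two_chunks, three_chunks)
```

With integer inputs the result is the same for every chunk layout. With floating-point inputs the intermediate products formed in the merge step depend on where the chunk boundaries fall, which may be why the chunk layout matters for the warnings mentioned above; that link is an assumption, not something stated in this thread.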

@jsignell jsignell mentioned this pull request Jan 20, 2021
1 task
@jsignell
Member Author

Failure seems unrelated

@jsignell jsignell marked this pull request as draft January 21, 2021 16:16
Member Author

@jsignell jsignell left a comment


@jrbourbeau are you OK with the change in commit 9831f4e? The goal is to give the user at least some control over the chunksize, to avoid the cumprod issues discussed in #7084.

  runs-on: ${{ matrix.os }}
  strategy:
-   fail-fast: true
+   fail-fast: false
Member Author


Change these back before merge

@jsignell
Member Author

jsignell commented Jan 21, 2021

Ok so the last failure is a pickle error:

______________________ test_futures_to_delayed_dataframe _______________________

c = <Client: 'tcp://127.0.0.1:38987' processes=2 threads=2, memory=14.58 GB>

    def test_futures_to_delayed_dataframe(c):
        pd = pytest.importorskip("pandas")
        dd = pytest.importorskip("dask.dataframe")
        df = pd.DataFrame({"x": [1, 2, 3]})
    
        futures = c.scatter([df, df])
>       ddf = dd.from_delayed(futures)

dask/tests/test_distributed.py:76: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
dask/dataframe/io/io.py:593: in from_delayed
    meta = delayed(make_meta)(dfs[0]).compute()
dask/base.py:279: in compute
    (result,) = compute(self, traverse=False, **kwargs)
dask/base.py:561: in compute
    results = schedule(dsk, keys, **kwargs)
/usr/share/miniconda3/envs/test-environment/lib/python3.8/site-packages/distributed/client.py:2684: in get
    results = self.gather(packed, asynchronous=asynchronous, direct=direct)
/usr/share/miniconda3/envs/test-environment/lib/python3.8/site-packages/distributed/client.py:1993: in gather
    return self.sync(
/usr/share/miniconda3/envs/test-environment/lib/python3.8/site-packages/distributed/client.py:839: in sync
    return sync(
/usr/share/miniconda3/envs/test-environment/lib/python3.8/site-packages/distributed/utils.py:340: in sync
    raise exc.with_traceback(tb)
/usr/share/miniconda3/envs/test-environment/lib/python3.8/site-packages/distributed/utils.py:324: in f
    result[0] = yield future
/usr/share/miniconda3/envs/test-environment/lib/python3.8/site-packages/tornado/gen.py:762: in run
    value = future.result()
/usr/share/miniconda3/envs/test-environment/lib/python3.8/site-packages/distributed/client.py:1858: in _gather
    raise exception.with_traceback(traceback)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

>   return pickle.loads(x)
E   AttributeError: type object '_DTypeMeta' has no attribute '_abstract'

from https://github.com/dask/dask/pull/7089/checks?check_run_id=1744395339

@QuLogic do you have ideas about this one?

@jsignell jsignell marked this pull request as ready for review January 22, 2021 16:08
@jsignell
Member Author

@jrbourbeau I think this one is good to merge if it passes CI. It doesn't quite finish the upstream fixes, but it's a few more steps in the right direction.

@jsignell jsignell merged commit 450e9fd into dask:master Jan 22, 2021
@jsignell jsignell deleted the fix-cumprod branch January 22, 2021 17:29

3 participants