Skip to content

Closes #436#437

Merged
TomAugspurger merged 7 commits intodask:masterfrom
ZEFR-INC:master
Dec 19, 2018
Merged

Closes #436#437
TomAugspurger merged 7 commits intodask:masterfrom
ZEFR-INC:master

Conversation

@ryan-deak-zefr
Copy link
Copy Markdown
Contributor

Closes #436.

To fix, set allow_unknown_chunksizes=True by default in dask_ml.compose.ColumnTransformer._hstack.

elif da.Array in types:
return da.hstack(Xs)
# To avoid changing dask core, this is the definition of the
# dask.array.hstack inlined:
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're releasing fairly frequently these days. If you're willing I encourage you to submit a small PR that passes allow_unknown_chunksizes through the hstack vstack, ... functions in dask.array. That seems like it would benefit other applications as well.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I created this PR to push this forward: dask/dask#4287

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @vecchp.

I think we'll keep this compatibility code to allow us to pin our minimum bound at dask 1.0 for a while.

@ryan-deak-zefr could you update the comment to reflect that?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@TomAugspurger: Please let me know if the current comment is sufficient.

Copy link
Copy Markdown
Member

@TomAugspurger TomAugspurger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm hoping to have all the CI issues sorted out tomorrow.

elif da.Array in types:
return da.hstack(Xs)
# To avoid changing dask core, this is the definition of the
# dask.array.hstack inlined:
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @vecchp.

I think we'll keep this compatibility code to allow us to pin our minimum bound at dask 1.0 for a while.

@ryan-deak-zefr could you update the comment to reflect that?

@TomAugspurger
Copy link
Copy Markdown
Member

Looks good.

I've pushed a commit merging master to your branch. CI should pass now.

@TomAugspurger TomAugspurger merged commit cfec774 into dask:master Dec 19, 2018
@TomAugspurger
Copy link
Copy Markdown
Member

Thanks!

@ryan-deak-zefr
Copy link
Copy Markdown
Contributor Author

Thank @TomAugspurger!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Allow dask arrays to be hstacked in ColumnTransformer

4 participants