ColumnTransformer should always apply sparse_threshold #12150

@amueller

I hate to do this, but I think we should change the behavior of sparse_threshold for the final release.
I think it's unnatural not to apply it when all the matrices are sparse. In the (not uncommon) case that all columns are categorical, it's otherwise easy to end up with a sparse array regardless of density, because OneHotEncoder produces sparse output by default.

This is an issue (similar to #12071) that pops up when building a general pipeline to be applied to several datasets. Right now the presence of a single continuous feature changes whether the output will be sparse or not, and I don't think that should really be relevant.
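
To make the difference concrete, here is a minimal sketch of the two decision rules in plain Python. It is not scikit-learn's code. `Block`, `overall_density`, `sparse_output_reported` and `sparse_output_proposed` are hypothetical names made up for illustration. The sketch assumes that dense blocks count as fully populated when computing density, and it uses a threshold of 0.3 as an example value.

```python
from dataclasses import dataclass


@dataclass
class Block:
    """Shape summary of one transformer's output (hypothetical helper)."""
    is_sparse: bool
    n_rows: int
    n_cols: int
    nnz: int  # stored non-zeros; ignored for dense blocks

    @property
    def size(self) -> int:
        return self.n_rows * self.n_cols


def overall_density(blocks):
    # Assumption: a dense block counts as fully populated.
    nnz = sum(b.nnz if b.is_sparse else b.size for b in blocks)
    total = sum(b.size for b in blocks)
    return nnz / total


def sparse_output_reported(blocks, sparse_threshold=0.3):
    """Rule as described in this issue: the threshold is skipped
    when every block is sparse."""
    if all(b.is_sparse for b in blocks):
        return True
    if not any(b.is_sparse for b in blocks):
        return False
    return overall_density(blocks) < sparse_threshold


def sparse_output_proposed(blocks, sparse_threshold=0.3):
    """Proposed rule: apply the threshold whenever any block is sparse."""
    if not any(b.is_sparse for b in blocks):
        return False
    return overall_density(blocks) < sparse_threshold


# Three one-hot encoded binary categoricals: 100 rows, 2 columns each,
# one non-zero per row, so the stacked density is 0.5.
categorical = [Block(True, 100, 2, 100) for _ in range(3)]
# The same data plus a single dense continuous column.
with_numeric = categorical + [Block(False, 100, 1, 0)]

print(sparse_output_reported(categorical))   # True
print(sparse_output_reported(with_numeric))  # False
print(sparse_output_proposed(categorical))   # False
print(sparse_output_proposed(with_numeric))  # False
```

Under the reported rule, adding one continuous column flips the output from sparse to dense. Under the proposed rule, both cases are decided by density alone.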
