RFC Behavior of OneHotEncoder on binary features #12502

@amueller

Right now OneHotEncoder expands every binary feature into two features, afaik.
I'm not sure this is great or convenient behavior. I think it might be nicer if it (optionally?) used a single column; that's particularly natural if the feature was already 0 and 1.
Otherwise people either have to build a more complicated ColumnTransformer, or they end up with redundant features.
One might argue that one possible cure for that is the option to drop one of the indicator variables. I'm not really sure if that's what I want, though. In my mind having a base category is more interpretable in the binary case than in the multinomial case.
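
To make the behavior concrete, here is a minimal sketch on toy data (the array `X` is made up for illustration). It assumes a scikit-learn release that has the `drop` parameter on OneHotEncoder, which is newer than this issue; `drop="first"` corresponds to the "drop one indicator" option mentioned above, and `drop="if_binary"` (available in later releases) to the single-column behavior for binary features only.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

# Toy data (illustrative only): column 0 is binary (0/1),
# column 1 has three categories (0, 1, 2).
X = np.array([[0, 0],
              [1, 1],
              [0, 2],
              [1, 0]])

# Default: every category gets its own indicator column,
# so the binary feature is expanded into two columns (2 + 3 = 5).
default = OneHotEncoder().fit_transform(X).toarray()

# Drop the first category of every feature (1 + 2 = 3 columns).
# Assumes a scikit-learn version that supports `drop`.
drop_first = OneHotEncoder(drop="first").fit_transform(X).toarray()

# Drop one category only for binary features (1 + 3 = 4 columns).
# Assumes a scikit-learn version that supports `drop="if_binary"`.
if_binary = OneHotEncoder(drop="if_binary").fit_transform(X).toarray()

print(default.shape, drop_first.shape, if_binary.shape)
```

With `drop="if_binary"`, the binary feature comes out as a single 0/1 column while the three-category feature keeps all of its indicators, so no base category is imposed in the multinomial case.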
