OneHotEncoder does not output scipy sparse matrix of given dtype #11034

@DanielMorales9

Description

OneHotEncoder ignores the specified dtype when constructing the sparse output if mixed input data are passed, i.e. with both categorical and real-valued columns.

Steps/Code to Reproduce

import numpy as np
from sklearn.preprocessing import OneHotEncoder

# Encode only columns 0 and 1; columns 2 and 3 are passed through unchanged.
enc = OneHotEncoder(dtype=np.float32, categorical_features=[0, 1])

x = np.array([[0, 1, 0, 0], [1, 2, 0, 0]], dtype=int)
sparse = enc.fit(x).transform(x)

Expected Results

sparse: <2x6 sparse matrix of type '<class 'numpy.float32'>'
	with 4 stored elements in COOrdinate format>

Actual Results

sparse: <2x6 sparse matrix of type '<class 'numpy.float64'>'
	with 4 stored elements in COOrdinate format>
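
One plausible explanation, not confirmed in this report, is that the encoded columns are produced as float32 but are then stacked with the remaining columns held as float64, so the combined matrix is upcast. The sketch below reproduces that upcasting with SciPy alone and shows a cast as a possible workaround. The names `encoded` and `passthrough` are hypothetical stand-ins, not scikit-learn internals.

```python
import numpy as np
from scipy import sparse

# Hypothetical stand-in for the one-hot encoded columns (requested dtype: float32).
encoded = sparse.coo_matrix(
    np.array([[1, 0, 1, 0], [0, 1, 0, 1]], dtype=np.float32)
)
# Hypothetical stand-in for the non-categorical columns, assumed to be float64.
passthrough = sparse.coo_matrix(np.zeros((2, 2), dtype=np.float64))

# Stacking mixed dtypes upcasts the result to the wider type.
stacked = sparse.hstack((encoded, passthrough))
print(stacked.dtype)

# Possible workaround: cast the result back to the requested dtype.
fixed = stacked.astype(np.float32)
print(fixed.dtype)
```

Casting after the fact costs an extra copy, so it is only a stopgap until the encoder itself honours `dtype`.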

Versions

Platform: Linux-4.13.0-38-generic-x86_64-with-debian-stretch-sid
Python: 3.6.3 |Anaconda custom (64-bit)| (default, Oct 13 2017, 12:02:49) [GCC 7.2.0]
NumPy: NumPy
SciPy: SciPy 1.0.1
Scikit-Learn: Scikit-Learn 0.19.1
