-
-
Notifications
You must be signed in to change notification settings - Fork 26.9k
MissingIndicator failed with non-numeric inputs #13035
Copy link
Copy link
Closed
Description
Description
sklearn.Imputer.MissingIndicator fails with string and object type numpy arrays
String Types
Steps/Code to Reproduce
import numpy as np
from sklearn.impute import MissingIndicator
a = np.array([[c] for c in 'abcdea'], dtype=str)
MissingIndicator().fit_transform(a)
MissingIndicator(missing_values='a').fit_transform(a)Expected Results
[[False]
[False]
[False]
[False]
[False]
[False]]
[[False]
[False]
[True]
[False]
[False]
[False]]
Actual Results
C:\Users\snowt\Python\scikit-learn\env\Scripts\python.exe C:/Users/snowt/Python/scikit-learn/test.py
[[False]
[False]
[False]
[False]
[False]
[False]]
C:\Users\snowt\Python\scikit-learn\sklearn\utils\validation.py:558: FutureWarning: Beginning in version 0.22, arrays of bytes/strings will be converted to decimal numbers if dtype='numeric'. It is recommended that you convert the array to a float dtype before using it in scikit-learn, for example by using your_array = your_array.astype(np.float64).
FutureWarning)
C:\Users\snowt\Python\scikit-learn\sklearn\utils\validation.py:558: FutureWarning: Beginning in version 0.22, arrays of bytes/strings will be converted to decimal numbers if dtype='numeric'. It is recommended that you convert the array to a float dtype before using it in scikit-learn, for example by using your_array = your_array.astype(np.float64).
FutureWarning)
C:\Users\snowt\Python\scikit-learn\sklearn\utils\validation.py:558: FutureWarning: Beginning in version 0.22, arrays of bytes/strings will be converted to decimal numbers if dtype='numeric'. It is recommended that you convert the array to a float dtype before using it in scikit-learn, for example by using your_array = your_array.astype(np.float64).
FutureWarning)
Traceback (most recent call last):
File "C:/Users/snowt/Python/scikit-learn/test.py", line 7, in <module>
print(MissingIndicator(missing_values='a').fit_transform(a))
File "C:\Users\snowt\Python\scikit-learn\sklearn\impute.py", line 634, in fit_transform
return self.fit(X, y).transform(X)
File "C:\Users\snowt\Python\scikit-learn\sklearn\impute.py", line 570, in fit
if self.features == 'missing-only'
File "C:\Users\snowt\Python\scikit-learn\sklearn\impute.py", line 528, in _get_missing_features_info
imputer_mask = _get_mask(X, self.missing_values)
File "C:\Users\snowt\Python\scikit-learn\sklearn\impute.py", line 52, in _get_mask
return np.equal(X, value_to_mask)
TypeError: ufunc 'equal' did not contain a loop with signature matching types dtype('<U1') dtype('<U1') dtype('bool')
Process finished with exit code 1
Object Types
Steps/Code to Reproduce
import numpy as np
from sklearn.impute import MissingIndicator
a = np.array([[c] for c in 'abcdea'], dtype=object)
MissingIndicator().fit_transform(a)
MissingIndicator(missing_values='a').fit_transform(a)Expected Results
[[False]
[False]
[False]
[False]
[False]
[False]]
[[False]
[False]
[True]
[False]
[False]
[False]]
Actual Results
C:\Users\snowt\Python\scikit-learn\env\Scripts\python.exe C:/Users/snowt/Python/scikit-learn/test.py
Traceback (most recent call last):
File "C:/Users/snowt/Python/scikit-learn/test.py", line 6, in <module>
print(MissingIndicator().fit_transform(a))
File "C:\Users\snowt\Python\scikit-learn\sklearn\impute.py", line 634, in fit_transform
return self.fit(X, y).transform(X)
File "C:\Users\snowt\Python\scikit-learn\sklearn\impute.py", line 555, in fit
force_all_finite=force_all_finite)
File "C:\Users\snowt\Python\scikit-learn\sklearn\utils\validation.py", line 522, in check_array
array = np.asarray(array, dtype=dtype, order=order)
File "C:\Users\snowt\Python\scikit-learn\env\lib\site-packages\numpy\core\numeric.py", line 538, in asarray
return array(a, dtype, copy=False, order=order)
ValueError: could not convert string to float: 'a'
Process finished with exit code 1
Versions
System:
python: 3.6.8 (tags/v3.6.8:3c6b436a57, Dec 24 2018, 00:16:47) [MSC v.1916 64
bit (AMD64)]
executable: C:\Users\snowt\Python\scikit-learn\env\Scripts\python.exe
machine: Windows-10-10.0.17763-SP0
BLAS:
macros:
lib_dirs:
cblas_libs: cblas
Python deps:
pip: 18.1
setuptools: 40.6.3
sklearn: 0.21.dev0
numpy: 1.16.0
scipy: 1.2.0
Cython: 0.29.3
pandas: None
Reactions are currently unavailable
Metadata
Metadata
Assignees
Labels
No labels