Description
DecisionTreeClassifier crashes with "Unknown label type: 'continuous-multioutput'". I've tried loading the CSV file using csv.reader, pandas.read_csv, and other approaches such as parsing it line by line.
Steps/Code to Reproduce
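import os
import pandas as pd
from sklearn import tree

# _PATH is the directory holding features.txt and target.txt (defined elsewhere)
feature_df = pd.read_csv(os.path.join(_PATH, 'features.txt'))
target_df = pd.read_csv(os.path.join(_PATH, 'target.txt'))
# Keep only the numeric columns
feature_df = feature_df._get_numeric_data()
target_df = target_df._get_numeric_data()
# Replace missing values with 0
feature_df = feature_df.fillna(0)
target_df = target_df.fillna(0)
clf = tree.DecisionTreeClassifier()
# Raises ValueError: Unknown label type: 'continuous-multioutput'
clf_o = clf.fit(feature_df, target_df)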
features.txt
target.txt
Expected Results
The error thrown should tell the user what is REALLY wrong, e.g. that the data set does not follow the estimator's assumptions (and what those assumptions are).
Actual Results
Traceback (most recent call last):
  File "D:\Piotr\Documents\uni\bap\BAPFingerprintLocalisation\main.py", line 19, in <module>
    decision_tree.treeClassification()
  File "D:\Piotr\Documents\uni\bap\BAPFingerprintLocalisation\code\decision_tree.py", line 56, in treeClassification
    clf_o = clf.fit(feature_df, target_df)
  File "C:\Python35\lib\site-packages\sklearn\tree\tree.py", line 182, in fit
    check_classification_targets(y)
  File "C:\Python35\lib\site-packages\sklearn\utils\multiclass.py", line 172, in check_classification_targets
    raise ValueError("Unknown label type: %r" % y_type)
ValueError: Unknown label type: 'continuous-multioutput'
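For anyone hitting the same traceback: the label type in the message appears to come from sklearn.utils.multiclass.type_of_target, which can be called directly to see how a target array is categorised. A minimal sketch, using small made-up arrays (not the data from features.txt / target.txt):

```python
import numpy as np
from sklearn.utils.multiclass import type_of_target

# Hypothetical targets, made up for illustration only
y_float_2d = np.array([[0.5, 1.2], [2.3, 0.1], [1.7, 3.4]])  # two real-valued columns
y_float_1d = np.array([0.5, 1.2, 2.3])                       # one real-valued column
y_int_1d = np.array([0, 1, 2, 1])                            # discrete class labels

# Collect the inferred target type for each array
kinds = {
    "float_2d": type_of_target(y_float_2d),
    "float_1d": type_of_target(y_float_1d),
    "int_1d": type_of_target(y_int_1d),
}

for name, kind in kinds.items():
    print(name, "->", kind)
```

Only the last array has a type that a classifier accepts; the first two match the types reported in the errors here.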
Versions
Windows-10-10.0.14393-SP0
Python 3.5.1 (v3.5.1:37a07cee5969, Dec 6 2015, 01:54:25) [MSC v.1900 64 bit (AMD64)]
NumPy 1.11.0
SciPy 0.17.1
Scikit-Learn 0.18
Update:
I've changed the number of target variables to one, just to simplify things:
clf_o = clf.fit(feature_df, target_df.ix[:,1])
Output: Unknown label type: 'continuous'
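Both errors suggest that the classifier is rejecting real-valued targets rather than a badly loaded file. A rough sketch of two possible workarounds, assuming the targets really are continuous: fit a DecisionTreeRegressor instead, or bin the values into discrete classes first. The data is random, and the bin edges 0.33 and 0.66 are arbitrary choices for illustration.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor

# Synthetic stand-ins for the feature and target frames (assumption: real-valued targets)
rng = np.random.RandomState(0)
X = rng.rand(20, 3)
y_continuous = rng.rand(20, 2)

# The classifier refuses continuous targets, as in the report
try:
    DecisionTreeClassifier().fit(X, y_continuous)
    raised = False
except ValueError:
    raised = True

# Option 1: treat the problem as (multi-output) regression
reg = DecisionTreeRegressor(random_state=0).fit(X, y_continuous)
pred = reg.predict(X[:2])

# Option 2: discretise one target column into class labels 0, 1, 2
y_binned = np.digitize(y_continuous[:, 0], bins=[0.33, 0.66])
clf = DecisionTreeClassifier(random_state=0).fit(X, y_binned)
labels = clf.predict(X)
```

Which option fits depends on whether the target columns are meant to be measured quantities or categories.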