Description
Describe the bug
I am trying to classify a multivariate dataset using a ColumnEnsembleClassifier. I use the same classifier for all dimensions.
It fails to fit the dataset when I use KNN classifiers, but works when I use TimeSeriesForestClassifier.
I tried KNN with different metrics ('msm', 'dtw', and 'euclidean' imported from scipy); all of them fail.
I tried it with AtrialFibrillation and BasicMotions datasets, it fails for both.
To Reproduce
from sktime.utils.data_io import load_from_arff_to_dataframe as load_arff
X_train, y_train = load_arff('path_to_dataset_TRAIN.arff')
from sktime.classification.compose import ColumnEnsembleClassifier
from sktime.classification.distance_based._time_series_neighbors import KNeighborsTimeSeriesClassifier
from sktime.classification.compose import TimeSeriesForestClassifier
########## This one works fine ##########
clf = ColumnEnsembleClassifier(estimators=[
    ('TSF_0', TimeSeriesForestClassifier(verbose=0, n_jobs=-1), [0]),
    ('TSF_1', TimeSeriesForestClassifier(verbose=0, n_jobs=-1), [1]),
])
clf.fit(X_train, y_train)
ColumnEnsembleClassifier(estimators=[
('TSF_0',TimeSeriesForestClassifier(n_jobs=-1),[0]),
('TSF_1',TimeSeriesForestClassifier(n_jobs=-1),[1])
])
########## This one fails ##########
clf = ColumnEnsembleClassifier(estimators=[
    ('1NN-MSM_0', KNeighborsTimeSeriesClassifier(metric='msm'), [0]),
    ('1NN-MSM_1', KNeighborsTimeSeriesClassifier(metric='msm'), [1]),
])
clf.fit(X_train, y_train)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
~\Desktop\OVGU\DKE Subjects\Master Thesis\Code\run.py in <module>
----> 1 clf.fit(X_train, y_train)
~\.virtualenvs\Code-OH4xsw-D\lib\site-packages\sktime\classification\compose\_column_ensemble.py in fit(self, X, y)
157 for name, estimator, column in self._iter(replace_strings=True):
158 estimator = clone(estimator)
--> 159 estimator.fit(_get_column(X, column), transformed_y)
160 estimators_.append((name, estimator, column))
161
~\.virtualenvs\Code-OH4xsw-D\lib\site-packages\sktime\classification\distance_based\_time_series_neighbors.py in fit(self, X, y)
243 check_array.__code__ = _check_array_ts.__code__
244
--> 245 fx = self._fit(X)
246
247 if hasattr(check_array, "__wrapped__"):
~\.virtualenvs\Code-OH4xsw-D\lib\site-packages\sklearn\neighbors\_base.py in _fit(self, X, y)
362 if not isinstance(X, (KDTree, BallTree, NeighborsBase)):
363 X, y = self._validate_data(X, y, accept_sparse="csr",
--> 364 multi_output=True)
365
366 if is_classifier(self):
~\.virtualenvs\Code-OH4xsw-D\lib\site-packages\sklearn\base.py in _validate_data(self, X, y, reset, validate_separately, **check_params)
413 if self._get_tags()['requires_y']:
414 raise ValueError(
--> 415 f"This {self.__class__.__name__} estimator "
416 f"requires y to be passed, but the target y is None."
417 )
ValueError: This KNeighborsTimeSeriesClassifier estimator requires y to be passed, but the target y is None.
Expected behavior
ColumnEnsembleClassifier should fit the training data
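Reading the traceback, the error seems to come from the call `fx = self._fit(X)` in `_time_series_neighbors.py`: the sklearn `_fit(self, X, y)` it reaches then validates `y` and finds `None`. If that reading is right, the problem may not be specific to ColumnEnsembleClassifier. Below is a minimal sketch of that call pattern in plain Python. All class names are hypothetical stand-ins, not the sktime or sklearn code, and the assumption that `y` defaults to `None` in the base `_fit` is mine.

```python
class NeighborsBaseSketch:
    """Hypothetical stand-in for the sklearn base class seen in the traceback."""

    # Mirrors the 'requires_y' tag that the traceback shows being checked.
    requires_y = True

    def _validate_data(self, X, y=None):
        # Same check and message shape as in the traceback above.
        if y is None and self.requires_y:
            raise ValueError(
                f"This {self.__class__.__name__} estimator "
                "requires y to be passed, but the target y is None."
            )
        return X, y

    def _fit(self, X, y=None):
        # Assumption: y is optional here, so omitting it is not a TypeError.
        X, y = self._validate_data(X, y)
        self._fit_X = X
        self.classes_ = sorted(set(y))
        return self


class KnnWithoutY(NeighborsBaseSketch):
    """Hypothetical subclass that drops y, like the call at line 245."""

    def fit(self, X, y):
        return self._fit(X)


class KnnWithY(NeighborsBaseSketch):
    """Hypothetical subclass that forwards y to the base _fit."""

    def fit(self, X, y):
        return self._fit(X, y)


X = [[0.0, 1.0], [1.0, 0.0]]
y = ["a", "b"]

try:
    KnnWithoutY().fit(X, y)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

fitted = KnnWithY().fit(X, y)
print(error_message)
print(fitted.classes_)
```

In this sketch the first subclass fails with the same kind of message even though `y` was given to `fit`, while the second one fits. Whether forwarding `y` is the appropriate fix in sktime itself is something I have not verified.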
Versions
System:
python: 3.7.4 (tags/v3.7.4:e09359112e, Jul 8 2019, 20:34:20) [MSC v.1916 64 bit (AMD64)]
executable: C: ...\....\.virtualenvs\Code-OH4xsw-D\Scripts\python.exe
machine: Windows-10-10.0.18362-SP0
Python dependencies:
pip: 20.3.1
setuptools: 50.3.2
sklearn: 0.24.0
sktime: 0.5.0
statsmodels: 0.12.1
numpy: 1.19.4
scipy: 1.5.4
Cython: 0.29.21
pandas: 1.1.5
matplotlib: 3.3.3
joblib: 1.0.0
numba: 0.52.0
pmdarima: None
tsfresh: 0.17.0